Replication question


Replication question

John Fawcett
Hi

I'm currently debugging some replication issues between two Dovecot
2.3.9.2 servers, where one is live and the other is just a copy used for
backup with no IMAP user access. After the initial alignment (which produced
various error messages, such as the stalled I/O and fcntl lock messages
below), I am seeing replication miss messages or stop altogether on
mailboxes, even with no further error messages.

doveadm: Error: dsync(REMOTE_HOSTNAME): I/O has stalled, no activity for
600 seconds (last sent=mail_change (EOL), last recv=mailbox)

doveadm: Error: Couldn't lock
/var/vmail/DOMAIN/USER//.dovecot-sync.lock:
fcntl(/var/vmail/DOMAIN/USER//.dovecot-sync.lock, write-lock, F_SETLKW)
locking failed: Timed out after 30 seconds (WRITE lock held by pid 30307)

I was surprised by this because, although I know there were replication
issues in 2.3.8, I understood these were resolved once both servers were
running 2.3.9.

I am still investigating and will post further if I get any useful insights.

However, I have a question which, despite using Dovecot for many years
in this configuration, has never occurred to me before. I configured
Dovecot following the wiki page https://wiki.dovecot.org/Replication, using
TCP with SSL. Both servers have an identical Dovecot configuration except for:

1. different hostnames

2. on the backup server I have removed expire and quota plugins in the
global mail_plugins

3. in the mail_replica setting (tcps://hostname:port), each server
points to the other server's hostname

What I just realized is that nowhere in the wiki does it state that both
servers should be set up for replication. I had always assumed that was
the logical thing to do. So the question is: for successful replication,
is it sufficient to set up one master configuration and just have a
replication process listening on the other master, or should both
servers be set up for replication in an almost identical way (with the 3
exceptions above)?
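
For reference, a minimal sketch of the kind of per-server replication
settings being asked about, following the layout of the wiki page. The
hostname, port, password and the vmail user are placeholder assumptions,
not values taken from these servers, and the mail_replica form shown is
the one recalled from the wiki, so check it against your version.

```
# Sketch only: all values below are placeholders.
mail_plugins = $mail_plugins notify replication

service aggregator {
  fifo_listener replication-notify-fifo {
    user = vmail                 # assumed mail user
  }
  unix_listener replication-notify {
    user = vmail
  }
}

service replicator {
  process_min_avail = 1          # start the replicator at startup
}

service doveadm {
  inet_listener {
    port = 12345                 # placeholder port
    ssl = yes
  }
}

doveadm_port = 12345
doveadm_password = secret        # placeholder; same on both servers

plugin {
  # points at the OTHER server (exception 3 in the list above)
  mail_replica = tcps:other-server.example.com:12345
}
```

In the symmetric setup described above, each server would carry this
block with only the mail_replica hostname differing.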

thanks for any insights.

John




Re: Replication question

Aki Tuomi-3

On 12.1.2020 13.49, John wrote:

> Hi
>
> I'm currently debugging some replication issues between two Dovecot
> 2.3.9.2 servers, where one is live and the other is just a copy used for
> backup with no IMAP user access. After the initial alignment (which produced
> various error messages, such as the stalled I/O and fcntl lock messages
> below), I am seeing replication miss messages or stop altogether on
> mailboxes, even with no further error messages.
>
> doveadm: Error: dsync(REMOTE_HOSTNAME): I/O has stalled, no activity for
> 600 seconds (last sent=mail_change (EOL), last recv=mailbox)
>
> doveadm: Error: Couldn't lock
> /var/vmail/DOMAIN/USER//.dovecot-sync.lock:
> fcntl(/var/vmail/DOMAIN/USER//.dovecot-sync.lock, write-lock, F_SETLKW)
> locking failed: Timed out after 30 seconds (WRITE lock held by pid 30307)
>
> I was surprised by this because, although I know there were replication
> issues in 2.3.8, I understood these were resolved once both servers were
> running 2.3.9.
>
> I am still investigating and will post further if I get any useful insights.
>
> However, I have a question which, despite using Dovecot for many years
> in this configuration, has never occurred to me before. I configured
> Dovecot following the wiki page https://wiki.dovecot.org/Replication, using
> TCP with SSL. Both servers have an identical Dovecot configuration except for:
>
> 1. different hostnames
>
> 2. on the backup server I have removed expire and quota plugins in the
> global mail_plugins
>
> 3. in the mail_replica setting (tcps://hostname:port), each server
> points to the other server's hostname
>
> What I just realized is that nowhere in the wiki does it state that both
> servers should be set up for replication. I had always assumed that was
> the logical thing to do. So the question is: for successful replication,
> is it sufficient to set up one master configuration and just have a
> replication process listening on the other master, or should both
> servers be set up for replication in an almost identical way (with the 3
> exceptions above)?
>
> thanks for any insights.
>
> John
>
>
Did you check what process 30307 is? It is the one holding the write lock
in the second error message.


It is enough for the backup server to have only the doveadm server
configured.
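
A minimal sketch of what that could look like on the backup side,
assuming the TCP with SSL setup from the wiki. The port and password are
placeholders and would need to match what the live server uses; the
replication plugin, aggregator, replicator and mail_replica settings are
simply left out.

```
# Backup server sketch: only accepts incoming dsync connections.
# Placeholder values; SSL certificate settings are assumed to be
# configured elsewhere.
service doveadm {
  inet_listener {
    port = 12345               # placeholder; must match the live server's target port
    ssl = yes
  }
}

doveadm_password = secret      # placeholder; must match the live server
```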

Aki