Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Eirik Rye-2
Hi,

After upgrading Dovecot from version 2.2.27 to 2.3.9.2, we are
sporadically seeing lots of these errors in the error log on many of our
servers:

   imap-login: Error: master(imap): net_connect_unix(imap) failed:
Resource temporarily unavailable -
http://wiki2.dovecot.org/SocketUnavailable

The issue is causing significant delays and/or timeouts on login.

 From what I can tell, this happens because the imap-login service is
unable to connect to the imap service socket, however I am unable to
identify the root cause of the issue.

We have increased the client_limit for the stats service, as described
in the upgrade document at (https://wiki2.dovecot.org/Upgrading/2.3). It
is set to the same value as the process_limit for imap service.

System CPU usage does not appear to be saturated. The CPU usage sits at
around 50% when these errors appear.

We have doubled the number of file descriptors after upgrading from
2.2.27. It was previously set to 16392 (which worked fine):

~# cat /proc/`pidof dovecot`/limits | grep "open files"
Max open files            30000                30000                files

We have tried increasing default_process_limit and default_client_limit
from the default 1000 to 3000, but this has no effect.

Current configuration (with irrelevant parts removed):

~# doveconf -n

default_client_limit = 3000 # these were raised after upgrading in
attempt to remedy
default_process_limit = 3000
default_vsz_limit = 512 M
[...]
service imap-login {
   client_limit = 1250
   inet_listener imap {
     port = 143
   }
   inet_listener imaps {
     port = 993
     ssl = yes
   }
   process_limit = 16
   process_min_avail = 16
   service_count = 0
   vsz_limit = 512 M
}
service imap {
   client_limit = 1
   process_limit = 20000
}
service stats {
   client_limit = 20000
}
[...]

We appear to be nowhere near the 20000 process_limit set for the imap
service. We are also not seeing any warnings being logged with regards
to the process_limit.

~# doveadm who -1 | wc -l
8765

~# doveadm process status | grep "^imap " | wc -l
7583

~# ps aux | grep "dovecot/imap" | wc -l
7449

~# doveadm process status | grep -v "^imap "
name       pid   available_count total_count idle_start
last_status_update last_kill_sent
stats      19875 12384           582387      0          1578305489
   0
pop3-login 19868 1000            0           0          1578061504
   0
pop3-login 19867 1000            0           0          1578061503
   0
pop3-login 19866 1000            0           0          1578061502
   0
pop3-login 19865 1000            0           0          1578061503
   0
pop3-login 19864 1000            0           0          1578061503
   0
pop3-login 19853 1000            0           0          1578061504
   0
pop3-login 19852 1000            0           0          1578061504
   0
pop3-login 19851 1000            0           0          1578061504
   0
pop3-login 19849 1000            0           0          1578061503
   0
pop3-login 19847 1000            0           0          1578061503
   0
pop3-login 19846 1000            0           0          1578061503
   0
pop3-login 19845 1000            0           0          1578061503
   0
pop3-login 19844 1000            0           0          1578061503
   0
pop3-login 19843 1000            0           0          1578061503
   0
pop3-login 19842 1000            0           0          1578061503
   0
pop3-login 19839 1000            0           0          1578061502
   0
log        19841 2973            27          0          1578061502
   0
imap-login 19873 666             14942       0          1578305488
   0
imap-login 19872 523             32792       0          1578305489
   0
imap-login 19871 316             62876       0          1578305489
   0
imap-login 19870 167             78332       0          1578305489
   0
imap-login 19869 118             97989       0          1578305489
   0
imap-login 19863 0               184726      0          1578305489
   0
imap-login 19862 0               193094      0          1578305489
   0
imap-login 19861 0               186800      0          1578305487
   0
imap-login 19860 675             17806       0          1578305487
   0
imap-login 19859 0               169446      0          1578305489
   0
imap-login 19858 0               143517      0          1578305489
   0
imap-login 19857 0               119215      0          1578305488
   0
imap-login 19856 0               151958      0          1578305489
   0
imap-login 19855 483             47036       0          1578305488
   0
imap-login 19854 1               185240      0          1578305489
   0
imap-login 19840 567             23859       0          1578305485
   0
config     19874 2967            124482      0          1578305489
   0
auth       19885 2967            124185      0          1578305489
   0
anvil      10017 967             7567        0          1578305463
   0

--
Eirik
Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Eirik Rye-2
While I not certain that this is related to the problem we are
experiencing, we are also (more rarely) seeing these:

   dovecot: master: Warning: service(imap-login): process_limit (16)
reached, client connections are being dropped

imap-login was set to process_limit=16, client_limit=1250 (for a total
of 20000 concurrent clients) on 2.2.27. We never had any issues with
this then.

I tried raising it to client_limit=2500, for a total of 40000 concurrent
connections, but we are still seeing the errors logged every now and then.

Almost every single available imap-login socket is reported as being
used! Is something eating up the sockets processes and not freeing them?

~# doveadm process status | grep "^imap-login "
imap-login 8025  2197            419         0          1578406002
   0
imap-login 25790 120             5439        0          1578406002
   0
imap-login 24231 8               8573        0          1578406001
   0
imap-login 15707 0               91617       0          1578405978
   0
imap-login 15706 0               81715       0          1578406002
   0
imap-login 15705 0               57033       0          1578405992
   0
imap-login 15704 0               88287       0          1578405996
   0
imap-login 15703 0               68346       0          1578405999
   0
imap-login 15701 0               24825       0          1578406002
   0
imap-login 15695 0               15796       0          1578406001
   0
imap-login 15693 0               38740       0          1578405997
   0
imap-login 15692 0               11609       0          1578405998
   0
imap-login 15691 0               46832       0          1578406002
   0
imap-login 15690 0               10816       0          1578406001
   0
imap-login 15689 0               84235       0          1578406002
   0
imap-login 15683 0               30898       0          1578406001
   0

I am counting 37 675 used imap-login sockets (according to dovecot), on
a machine with only ~7500 active TCP connections and a similar number of
logged in users:

~# ss -t | wc -l
7430
~# doveadm process status | grep "^imap " | wc -l
7224

Here is `doveconf -n` for service imap-login again (setup in
"high-performance mode" as described in wiki):

service imap-login {
   client_limit = 2500
   inet_listener imap {
     port = 143
   }
   inet_listener imaps {
     port = 993
     ssl = yes
   }
   process_limit = 16
   process_min_avail = 16
   service_count = 0
   vsz_limit = 512 M
}

I would greatly appreciate some pointers, because this is mindboggling
confusing.

On 06/01/2020 11:28, Eirik Rye wrote:

> Hi,
>
> After upgrading Dovecot from version 2.2.27 to 2.3.9.2, we are
> sporadically seeing lots of these errors in the error log on many of our
> servers:
>
>    imap-login: Error: master(imap): net_connect_unix(imap) failed:
> Resource temporarily unavailable -
> http://wiki2.dovecot.org/SocketUnavailable
>
> The issue is causing significant delays and/or timeouts on login.
>
>  From what I can tell, this happens because the imap-login service is
> unable to connect to the imap service socket, however I am unable to
> identify the root cause of the issue.
>
> We have increased the client_limit for the stats service, as described
> in the upgrade document at (https://wiki2.dovecot.org/Upgrading/2.3). It
> is set to the same value as the process_limit for imap service.
>
> System CPU usage does not appear to be saturated. The CPU usage sits at
> around 50% when these errors appear.
>
> We have doubled the number of file descriptors after upgrading from
> 2.2.27. It was previously set to 16392 (which worked fine):
>
> ~# cat /proc/`pidof dovecot`/limits | grep "open files"
> Max open files            30000                30000                files
>
> We have tried increasing default_process_limit and default_client_limit
> from the default 1000 to 3000, but this has no effect.
>
> Current configuration (with irrelevant parts removed):
>
> ~# doveconf -n
>
> default_client_limit = 3000 # these were raised after upgrading in
> attempt to remedy
> default_process_limit = 3000
> default_vsz_limit = 512 M
> [...]
> service imap-login {
>    client_limit = 1250
>    inet_listener imap {
>      port = 143
>    }
>    inet_listener imaps {
>      port = 993
>      ssl = yes
>    }
>    process_limit = 16
>    process_min_avail = 16
>    service_count = 0
>    vsz_limit = 512 M
> }
> service imap {
>    client_limit = 1
>    process_limit = 20000
> }
> service stats {
>    client_limit = 20000
> }
> [...]
>
> We appear to be nowhere near the 20000 process_limit set for the imap
> service. We are also not seeing any warnings being logged with regards
> to the process_limit.
>
> ~# doveadm who -1 | wc -l
> 8765
>
> ~# doveadm process status | grep "^imap " | wc -l
> 7583
>
> ~# ps aux | grep "dovecot/imap" | wc -l
> 7449
>
> ~# doveadm process status | grep -v "^imap "
> name       pid   available_count total_count idle_start
> last_status_update last_kill_sent
> stats      19875 12384           582387      0          1578305489   0
> pop3-login 19868 1000            0           0          1578061504   0
> pop3-login 19867 1000            0           0          1578061503   0
> pop3-login 19866 1000            0           0          1578061502   0
> pop3-login 19865 1000            0           0          1578061503   0
> pop3-login 19864 1000            0           0          1578061503   0
> pop3-login 19853 1000            0           0          1578061504   0
> pop3-login 19852 1000            0           0          1578061504   0
> pop3-login 19851 1000            0           0          1578061504   0
> pop3-login 19849 1000            0           0          1578061503   0
> pop3-login 19847 1000            0           0          1578061503   0
> pop3-login 19846 1000            0           0          1578061503   0
> pop3-login 19845 1000            0           0          1578061503   0
> pop3-login 19844 1000            0           0          1578061503   0
> pop3-login 19843 1000            0           0          1578061503   0
> pop3-login 19842 1000            0           0          1578061503   0
> pop3-login 19839 1000            0           0          1578061502   0
> log        19841 2973            27          0          1578061502   0
> imap-login 19873 666             14942       0          1578305488   0
> imap-login 19872 523             32792       0          1578305489   0
> imap-login 19871 316             62876       0          1578305489   0
> imap-login 19870 167             78332       0          1578305489   0
> imap-login 19869 118             97989       0          1578305489   0
> imap-login 19863 0               184726      0          1578305489   0
> imap-login 19862 0               193094      0          1578305489   0
> imap-login 19861 0               186800      0          1578305487   0
> imap-login 19860 675             17806       0          1578305487   0
> imap-login 19859 0               169446      0          1578305489   0
> imap-login 19858 0               143517      0          1578305489   0
> imap-login 19857 0               119215      0          1578305488   0
> imap-login 19856 0               151958      0          1578305489   0
> imap-login 19855 483             47036       0          1578305488   0
> imap-login 19854 1               185240      0          1578305489   0
> imap-login 19840 567             23859       0          1578305485   0
> config     19874 2967            124482      0          1578305489   0
> auth       19885 2967            124185      0          1578305489   0
> anvil      10017 967             7567        0          1578305463   0
>


--
Eirik
Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

@lbutlr
On 07 Jan 2020, at 07:15, Dovecot Mailing List <[hidden email]> wrote:
>  dovecot: master: Warning: service(imap-login): process_limit (16) reached, client connections are being dropped

Is this an actual problem? Most client connections are idle, so I doubt if the occasional drop is noticed at all.

Also, why a limit of 16? Have you tried, oh, 160 instead?





--
"Master, what is the difference between a humanistic, monastic system
        of belief in which wisdom is sought by means of an apparently
        nonsensical system of questions and answers, and a lot of mystic
        gibberish made up on the spur of the moment?" Wen considered this
        for some time, and a last said: "A fish!”

Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Eirik Rye-2
On 08/01/2020 01:47, @lbutlr wrote:
> On 07 Jan 2020, at 07:15, Dovecot Mailing List <[hidden email]> wrote:
>>   dovecot: master: Warning: service(imap-login): process_limit (16) reached, client connections are being dropped
>
> Is this an actual problem? Most client connections are idle, so I doubt if the occasional drop is noticed at all.

I assume it would mean Dovecot is dropping new incoming connections
(e.g. new users trying to log in), not existing idle connections?
Dropping new incoming connections is a major problem, because some email
clients (e.g. Outlook) are notorious for giving the users password
prompts whenever something unexpected happens.

It is a problem because it is unexpected behaviour. Where are the
imap-login sockets going, if not to serve active TCP connections? On all
servers, the number total-available imap-login processes is
significantly larger than the number of TCP connections on the server.
In some cases, it is about

~# ss -nt | wc -l
7817
~# ss -nt "( sport = :143 or sport = :993 )" | wc -l
7812
~# doveadm process status | grep "^imap-login "
imap-login 17384 1462            44553       0          1578475790
   0
imap-login 17383 550             66830       0          1578475790
   0
imap-login 17382 488             59074       0          1578475789
   0
imap-login 17381 929             58251       0          1578475790
   0
imap-login 17380 712             62790       0          1578475790
   0
imap-login 17377 1608            35143       0          1578475790
   0
imap-login 17376 1718            27955       0          1578475790
   0
imap-login 17375 1880            14903       0          1578475789
   0
imap-login 17374 1965            10754       0          1578475790
   0
imap-login 17373 1796            21137       0          1578475790
   0
imap-login 17372 1194            52360       0          1578475789
   0
imap-login 17371 1994            7482        0          1578475790
   0
imap-login 17370 2061            3793        0          1578475789
   0
imap-login 17369 2127            2992        0          1578475789
   0
imap-login 17368 2014            5152        0          1578475790
   0
imap-login 17362 2144            2723        0          1578475787
   0
~# doveadm process status | grep "^imap-login " | awk "{sum += \$3} END
{print 40000-sum}"
15385

Why is there a (growing) discrepancy between the number of active TCP
connections, and the number of reported free imap-login sockets? Is
something else using imap-login thr

> Also, why a limit of 16? Have you tried, oh, 160 instead?

The limit was set to match the process_limit of the imap service
(16*1250=20000).

We have made the assumption that imap-login should not require more
sockets than the maximum number of concurrent imap sessions. We have not
had any issues with this before upgrading, but it appears to not be the
case in 2.3.x?


--
Eirik
Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Eirik Rye-2
On 08/01/2020 10:49, Eirik Rye wrote:
> [...]

In any case, the biggest issue is the one described in the original
message in this thread, where logins are periodically failing for many
for users:

2020-01-08T10:54:57.436400+01:00 imap dovecot: imap-login: Error:
master(imap): net_connect_unix(imap) failed: Resource temporarily
unavailable - http://wiki2.dovecot.org/SocketUnavailable
2020-01-08T10:54:57.437550+01:00 imap dovecot: imap-login: Internal
login failure

At the same time:

~# ps aux | grep "dovecot/imap$" | wc -l
6637
~# doveadm process status | grep "^imap " | wc -l
6660
~# cat /proc/`pidof dovecot`/limits | grep "open files"
Max open files            32768                32768                files
~# doveconf -n
[...]
service imap-login {
   client_limit = 2500
   inet_listener imap {
     port = 143
   }
   inet_listener imaps {
     port = 993
     ssl = yes
   }
   process_limit = 16
   process_min_avail = 16
   service_count = 0
   vsz_limit = 512 M
}
service imap {
   client_limit = 1
   process_limit = 20000
}

--
Eirik
Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Aki Tuomi-3
Can you enable 'verbose_proctitle=yes' and maybe compare with ss -tp output?

Aki

> On 08/01/2020 11:58 Eirik Rye <[hidden email]> wrote:
>
>  
> On 08/01/2020 10:49, Eirik Rye wrote:
> > [...]
>
> In any case, the biggest issue is the one described in the original
> message in this thread, where logins are periodically failing for many
> for users:
>
> 2020-01-08T10:54:57.436400+01:00 imap dovecot: imap-login: Error:
> master(imap): net_connect_unix(imap) failed: Resource temporarily
> unavailable - http://wiki2.dovecot.org/SocketUnavailable
> 2020-01-08T10:54:57.437550+01:00 imap dovecot: imap-login: Internal
> login failure
>
> At the same time:
>
> ~# ps aux | grep "dovecot/imap$" | wc -l
> 6637
> ~# doveadm process status | grep "^imap " | wc -l
> 6660
> ~# cat /proc/`pidof dovecot`/limits | grep "open files"
> Max open files            32768                32768                files
> ~# doveconf -n
> [...]
> service imap-login {
>    client_limit = 2500
>    inet_listener imap {
>      port = 143
>    }
>    inet_listener imaps {
>      port = 993
>      ssl = yes
>    }
>    process_limit = 16
>    process_min_avail = 16
>    service_count = 0
>    vsz_limit = 512 M
> }
> service imap {
>    client_limit = 1
>    process_limit = 20000
> }
>
> --
> Eirik
Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Eirik Rye-2
On 08/01/2020 11:09, Aki Tuomi wrote:
> Can you enable 'verbose_proctitle=yes' and maybe compare with ss -tp output?

Thanks for the response!

What exactly did you want to compare? `ss -tp` does not appear to show
cmdline/process title. I enabled it for one server, but will have to
wait for users to reconnect and for the issue to reappear there.

In the meantime, on another server experiencing the same issue at this
moment (same configuration), but no proctitles:

root@imap03:~# doveadm process status | grep "^imap " | wc -l
7564
root@imap03:~# ps aux | grep "dovecot/imap$" | wc -l
7570
root@imap03:~# ss -tp | grep '"imap-login"' | wc -l
8009
root@imap03:~# ss -tp | grep '"imap"' | wc -l
536
root@imap03:~# ss -nt "( sport = :143 or sport = :993 )" | wc -l
8739
root@imap03:~# doveadm who -1 | wc -l
7581

The `ss -tp` output appears normal (e.g. 8009+536 ~= 8739).

--
Eirik
Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Eirik Rye-2
With proctitles below.

Strangely, Dovecot is reporting 12679 imap-login sockets in "pre-login"
state, which is about 50% more than there are active TCP connections to
the server (8271, of which 7546 are in "imap-login" process according to
`ss -tp`).

On another server, not currently experiencing this issue, the number of
TCP sockets appears to more closely match the number of imap-login
sockets used (either in pre-login or TLS proxing).

imap:~# ps -f --ppid `pidof dovecot` | egrep -v "dovecot/(imap|pop3) "
UID        PID  PPID  C STIME TTY          TIME CMD
dovenull  5262  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
dovenull  5263  5260  2 11:48 ?        00:03:53 dovecot/imap-login [724
pre-login + 113 TLS proxies]
dovecot   5264  5260  0 11:48 ?        00:01:15 dovecot/anvil [21
connections]
root      5265  5260  1 11:48 ?        00:02:18 dovecot/log
dovenull  5266  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
dovenull  5267  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
dovenull  5268  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
dovenull  5269  5260 20 11:48 ?        00:35:34 dovecot/imap-login [1073
pre-login + 1199 TLS proxies]
dovenull  5270  5260  1 11:48 ?        00:02:25 dovecot/imap-login [615
pre-login + 67 TLS proxies]
dovenull  5271  5260  0 11:48 ?        00:01:39 dovecot/imap-login [489
pre-login + 44 TLS proxies]
dovenull  5272  5260 14 11:48 ?        00:24:44 dovecot/imap-login [938
pre-login + 720 TLS proxies]
dovenull  5273  5260  6 11:48 ?        00:10:30 dovecot/imap-login [845
pre-login + 242 TLS proxies]
dovenull  5274  5260  4 11:48 ?        00:06:54 dovecot/imap-login [817
pre-login + 209 TLS proxies]
dovenull  5275  5260  0 11:48 ?        00:01:23 dovecot/imap-login [445
pre-login + 36 TLS proxies]
dovenull  5276  5260 25 11:48 ?        00:43:14 dovecot/imap-login [1064
pre-login + 1434 TLS proxies]
dovenull  5277  5260  1 11:48 ?        00:02:19 dovecot/imap-login [523
pre-login + 58 TLS proxies]
dovenull  5278  5260  9 11:48 ?        00:16:19 dovecot/imap-login [937
pre-login + 462 TLS proxies]
dovenull  5279  5260 19 11:48 ?        00:33:24 dovecot/imap-login [937
pre-login + 823 TLS proxies]
dovenull  5280  5260  1 11:48 ?        00:03:04 dovecot/imap-login [655
pre-login + 92 TLS proxies]
dovenull  5281  5260 26 11:48 ?        00:45:20 dovecot/imap-login [969
pre-login + 1450 TLS proxies]
dovenull  5282  5260  6 11:48 ?        00:10:55 dovecot/imap-login [917
pre-login + 303 TLS proxies]
dovenull  5283  5260  4 11:48 ?        00:08:36 dovecot/imap-login [731
pre-login + 128 TLS proxies]
root      5284  5260  2 11:48 ?        00:03:55 dovecot/config
dovecot   5285  5260  1 11:48 ?        00:02:19 dovecot/stats [7968
connections]
dovecot   5286  5260  3 11:48 ?        00:05:53 dovecot/auth [138 wait,
0 passdb, 0 userdb]
528246   10322  5260  0 14:36 ?        00:00:00 [imap]
root     16688  5260  0 14:40 ?        00:00:00 dovecot/imap
imap02:~# doveadm process status | grep "^imap-login " | awk "{sum +=
2500-\$3} END {print sum}"
20081
imap:~# ps -f --ppid `pidof dovecot` | grep "dovecot/imap " | wc -l
7937
imap:~# doveadm process status | grep "^imap " | wc -l
7937
imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep "\"imap\"" | wc -l
559
imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep
"\"imap-login\"" | wc -l
7546
imap:~# ss -ntp "( sport = :143 or sport = :993 )" | wc -l
8271

On 08/01/2020 11:33, Eirik Rye wrote:

> On 08/01/2020 11:09, Aki Tuomi wrote:
>> Can you enable 'verbose_proctitle=yes' and maybe compare with ss -tp
>> output?
>
> Thanks for the response!
>
> What exactly did you want to compare? `ss -tp` does not appear to show
> cmdline/process title. I enabled it for one server, but will have to
> wait for users to reconnect and for the issue to reappear there.
>
> In the meantime, on another server experiencing the same issue at this
> moment (same configuration), but no proctitles:
>
> root@imap03:~# doveadm process status | grep "^imap " | wc -l
> 7564
> root@imap03:~# ps aux | grep "dovecot/imap$" | wc -l
> 7570
> root@imap03:~# ss -tp | grep '"imap-login"' | wc -l
> 8009
> root@imap03:~# ss -tp | grep '"imap"' | wc -l
> 536
> root@imap03:~# ss -nt "( sport = :143 or sport = :993 )" | wc -l
> 8739
> root@imap03:~# doveadm who -1 | wc -l
> 7581
>
> The `ss -tp` output appears normal (e.g. 8009+536 ~= 8739).
>


--
Eirik
Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Eirik Rye-2
As a workaround for the titular issue, I have tried enabling the
"imap-hibernate" service on a couple of servers to reduce the number of
running imap processes.

Since ~50-60% of clients are in IDLE at any one time, this allows us to
reduce the number of running imap processes to less than half of what it
was.

After this I have yet to see the "net_connect_unix(imap) failed" errors
on these servers, however instead we are now instead seeing a few of these:

   imap([redacted])<30266><[redacted]>: Error: imap-master: Failed to
import client state: Mailbox INBOX: Couldn't get recently expunged UIDs
(uidnext=13479 highest_modseq=1716)

and

   imap([redacted])<20055><[redacted]>: Error: imap-master: Failed to
import client state: Mailbox INBOX: Couldn't send flag changes
   imap([redacted])<20055><[redacted]>: NOTIFY failed in the middle of
FETCH reply in=0 out=0 deleted=0 expunged=0 trashed=0 hdr_count=0
hdr_bytes=0 body_count=0 body_bytes=0

The former I suspect may be related to our NFS-backed store and
simultaneous access from different, since many of the errors are
accompanied by other corruption related errors:

   imap([redacted])<30266><[redacted]>: Error: Mailbox INBOX: vsize-hdr
has invalid size: 36

This whole upgrade is part of a migration to a director-backed system,
because we are seeing a too large amount of corruption errors
(locking/indexes, etc) assumed to be caused by NFS.

Once we introduce the director cluster in front, these errors should
hopefully be strongly reduced. :-)

--
Eirik

On 08/01/2020 15:14, Eirik Rye wrote:

> With proctitles below.
>
> Strangely, Dovecot is reporting 12679 imap-login sockets in "pre-login"
> state, which is about 50% more than there are active TCP connections to
> the server (8271, of which 7546 are in "imap-login" process according to
> `ss -tp`).
>
> On another server, not currently experiencing this issue, the number of
> TCP sockets appears to more closely match the number of imap-login
> sockets used (either in pre-login or TLS proxing).
>
> imap:~# ps -f --ppid `pidof dovecot` | egrep -v "dovecot/(imap|pop3) "
> UID        PID  PPID  C STIME TTY          TIME CMD
> dovenull  5262  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> dovenull  5263  5260  2 11:48 ?        00:03:53 dovecot/imap-login [724
> pre-login + 113 TLS proxies]
> dovecot   5264  5260  0 11:48 ?        00:01:15 dovecot/anvil [21
> connections]
> root      5265  5260  1 11:48 ?        00:02:18 dovecot/log
> dovenull  5266  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> dovenull  5267  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> dovenull  5268  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> dovenull  5269  5260 20 11:48 ?        00:35:34 dovecot/imap-login [1073
> pre-login + 1199 TLS proxies]
> dovenull  5270  5260  1 11:48 ?        00:02:25 dovecot/imap-login [615
> pre-login + 67 TLS proxies]
> dovenull  5271  5260  0 11:48 ?        00:01:39 dovecot/imap-login [489
> pre-login + 44 TLS proxies]
> dovenull  5272  5260 14 11:48 ?        00:24:44 dovecot/imap-login [938
> pre-login + 720 TLS proxies]
> dovenull  5273  5260  6 11:48 ?        00:10:30 dovecot/imap-login [845
> pre-login + 242 TLS proxies]
> dovenull  5274  5260  4 11:48 ?        00:06:54 dovecot/imap-login [817
> pre-login + 209 TLS proxies]
> dovenull  5275  5260  0 11:48 ?        00:01:23 dovecot/imap-login [445
> pre-login + 36 TLS proxies]
> dovenull  5276  5260 25 11:48 ?        00:43:14 dovecot/imap-login [1064
> pre-login + 1434 TLS proxies]
> dovenull  5277  5260  1 11:48 ?        00:02:19 dovecot/imap-login [523
> pre-login + 58 TLS proxies]
> dovenull  5278  5260  9 11:48 ?        00:16:19 dovecot/imap-login [937
> pre-login + 462 TLS proxies]
> dovenull  5279  5260 19 11:48 ?        00:33:24 dovecot/imap-login [937
> pre-login + 823 TLS proxies]
> dovenull  5280  5260  1 11:48 ?        00:03:04 dovecot/imap-login [655
> pre-login + 92 TLS proxies]
> dovenull  5281  5260 26 11:48 ?        00:45:20 dovecot/imap-login [969
> pre-login + 1450 TLS proxies]
> dovenull  5282  5260  6 11:48 ?        00:10:55 dovecot/imap-login [917
> pre-login + 303 TLS proxies]
> dovenull  5283  5260  4 11:48 ?        00:08:36 dovecot/imap-login [731
> pre-login + 128 TLS proxies]
> root      5284  5260  2 11:48 ?        00:03:55 dovecot/config
> dovecot   5285  5260  1 11:48 ?        00:02:19 dovecot/stats [7968
> connections]
> dovecot   5286  5260  3 11:48 ?        00:05:53 dovecot/auth [138 wait,
> 0 passdb, 0 userdb]
> 528246   10322  5260  0 14:36 ?        00:00:00 [imap]
> root     16688  5260  0 14:40 ?        00:00:00 dovecot/imap
> imap02:~# doveadm process status | grep "^imap-login " | awk "{sum +=
> 2500-\$3} END {print sum}"
> 20081
> imap:~# ps -f --ppid `pidof dovecot` | grep "dovecot/imap " | wc -l
> 7937
> imap:~# doveadm process status | grep "^imap " | wc -l
> 7937
> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep "\"imap\"" |
> wc -l
> 559
> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep
> "\"imap-login\"" | wc -l
> 7546
> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | wc -l
> 8271
>
> On 08/01/2020 11:33, Eirik Rye wrote:
>> On 08/01/2020 11:09, Aki Tuomi wrote:
>>> Can you enable 'verbose_proctitle=yes' and maybe compare with ss -tp
>>> output?
>>
>> Thanks for the response!
>>
>> What exactly did you want to compare? `ss -tp` does not appear to show
>> cmdline/process title. I enabled it for one server, but will have to
>> wait for users to reconnect and for the issue to reappear there.
>>
>> In the meantime, on another server experiencing the same issue at this
>> moment (same configuration), but no proctitles:
>>
>> root@imap03:~# doveadm process status | grep "^imap " | wc -l
>> 7564
>> root@imap03:~# ps aux | grep "dovecot/imap$" | wc -l
>> 7570
>> root@imap03:~# ss -tp | grep '"imap-login"' | wc -l
>> 8009
>> root@imap03:~# ss -tp | grep '"imap"' | wc -l
>> 536
>> root@imap03:~# ss -nt "( sport = :143 or sport = :993 )" | wc -l
>> 8739
>> root@imap03:~# doveadm who -1 | wc -l
>> 7581
>>
>> The `ss -tp` output appears normal (e.g. 8009+536 ~= 8739).
>>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Aki Tuomi-3
Uh. It's not actually supported to access same user concurrently from two servers even if you are using NFS. It will cause problems. You should use dovecot director or some other arrangement to make sure this does not happen.

Aki

> On 09/01/2020 16:56 Eirik Rye <[hidden email]> wrote:
>
>  
> As a workaround for the titular issue, I have tried enabling the
> "imap-hibernate" service on a couple of servers to reduce the number of
> running imap processes.
>
> Since ~50-60% of clients are in IDLE at any one time, this allows us to
> reduce the number of running imap processes to less than half of what it
> was.
>
> After this I have yet to see the "net_connect_unix(imap) failed" errors
> on these servers, however instead we are now instead seeing a few of these:
>
>    imap([redacted])<30266><[redacted]>: Error: imap-master: Failed to
> import client state: Mailbox INBOX: Couldn't get recently expunged UIDs
> (uidnext=13479 highest_modseq=1716)
>
> and
>
>    imap([redacted])<20055><[redacted]>: Error: imap-master: Failed to
> import client state: Mailbox INBOX: Couldn't send flag changes
>    imap([redacted])<20055><[redacted]>: NOTIFY failed in the middle of
> FETCH reply in=0 out=0 deleted=0 expunged=0 trashed=0 hdr_count=0
> hdr_bytes=0 body_count=0 body_bytes=0
>
> The former I suspect may be related to our NFS-backed store and
> simultaneous access from different, since many of the errors are
> accompanied by other corruption related errors:
>
>    imap([redacted])<30266><[redacted]>: Error: Mailbox INBOX: vsize-hdr
> has invalid size: 36
>
> This whole upgrade is part of a migration to a director-backed system,
> because we are seeing a too large amount of corruption errors
> (locking/indexes, etc) assumed to be caused by NFS.
>
> Once we introduce the director cluster in front, these errors should
> hopefully be strongly reduced. :-)
>
> --
> Eirik
>
> On 08/01/2020 15:14, Eirik Rye wrote:
> > With proctitles below.
> >
> > Strangely, Dovecot is reporting 12679 imap-login sockets in "pre-login"
> > state, which is about 50% more than there are active TCP connections to
> > the server (8271, of which 7546 are in "imap-login" process according to
> > `ss -tp`).
> >
> > On another server, not currently experiencing this issue, the number of
> > TCP sockets appears to more closely match the number of imap-login
> > sockets used (either in pre-login or TLS proxing).
> >
> > imap:~# ps -f --ppid `pidof dovecot` | egrep -v "dovecot/(imap|pop3) "
> > UID        PID  PPID  C STIME TTY          TIME CMD
> > dovenull  5262  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> > dovenull  5263  5260  2 11:48 ?        00:03:53 dovecot/imap-login [724
> > pre-login + 113 TLS proxies]
> > dovecot   5264  5260  0 11:48 ?        00:01:15 dovecot/anvil [21
> > connections]
> > root      5265  5260  1 11:48 ?        00:02:18 dovecot/log
> > dovenull  5266  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> > dovenull  5267  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> > dovenull  5268  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> > dovenull  5269  5260 20 11:48 ?        00:35:34 dovecot/imap-login [1073
> > pre-login + 1199 TLS proxies]
> > dovenull  5270  5260  1 11:48 ?        00:02:25 dovecot/imap-login [615
> > pre-login + 67 TLS proxies]
> > dovenull  5271  5260  0 11:48 ?        00:01:39 dovecot/imap-login [489
> > pre-login + 44 TLS proxies]
> > dovenull  5272  5260 14 11:48 ?        00:24:44 dovecot/imap-login [938
> > pre-login + 720 TLS proxies]
> > dovenull  5273  5260  6 11:48 ?        00:10:30 dovecot/imap-login [845
> > pre-login + 242 TLS proxies]
> > dovenull  5274  5260  4 11:48 ?        00:06:54 dovecot/imap-login [817
> > pre-login + 209 TLS proxies]
> > dovenull  5275  5260  0 11:48 ?        00:01:23 dovecot/imap-login [445
> > pre-login + 36 TLS proxies]
> > dovenull  5276  5260 25 11:48 ?        00:43:14 dovecot/imap-login [1064
> > pre-login + 1434 TLS proxies]
> > dovenull  5277  5260  1 11:48 ?        00:02:19 dovecot/imap-login [523
> > pre-login + 58 TLS proxies]
> > dovenull  5278  5260  9 11:48 ?        00:16:19 dovecot/imap-login [937
> > pre-login + 462 TLS proxies]
> > dovenull  5279  5260 19 11:48 ?        00:33:24 dovecot/imap-login [937
> > pre-login + 823 TLS proxies]
> > dovenull  5280  5260  1 11:48 ?        00:03:04 dovecot/imap-login [655
> > pre-login + 92 TLS proxies]
> > dovenull  5281  5260 26 11:48 ?        00:45:20 dovecot/imap-login [969
> > pre-login + 1450 TLS proxies]
> > dovenull  5282  5260  6 11:48 ?        00:10:55 dovecot/imap-login [917
> > pre-login + 303 TLS proxies]
> > dovenull  5283  5260  4 11:48 ?        00:08:36 dovecot/imap-login [731
> > pre-login + 128 TLS proxies]
> > root      5284  5260  2 11:48 ?        00:03:55 dovecot/config
> > dovecot   5285  5260  1 11:48 ?        00:02:19 dovecot/stats [7968
> > connections]
> > dovecot   5286  5260  3 11:48 ?        00:05:53 dovecot/auth [138 wait,
> > 0 passdb, 0 userdb]
> > 528246   10322  5260  0 14:36 ?        00:00:00 [imap]
> > root     16688  5260  0 14:40 ?        00:00:00 dovecot/imap
> > imap02:~# doveadm process status | grep "^imap-login " | awk "{sum +=
> > 2500-\$3} END {print sum}"
> > 20081
> > imap:~# ps -f --ppid `pidof dovecot` | grep "dovecot/imap " | wc -l
> > 7937
> > imap:~# doveadm process status | grep "^imap " | wc -l
> > 7937
> > imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep "\"imap\"" |
> > wc -l
> > 559
> > imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep
> > "\"imap-login\"" | wc -l
> > 7546
> > imap:~# ss -ntp "( sport = :143 or sport = :993 )" | wc -l
> > 8271
> >
> > On 08/01/2020 11:33, Eirik Rye wrote:
> >> On 08/01/2020 11:09, Aki Tuomi wrote:
> >>> Can you enable 'verbose_proctitle=yes' and maybe compare with ss -tp
> >>> output?
> >>
> >> Thanks for the response!
> >>
> >> What exactly did you want to compare? `ss -tp` does not appear to show
> >> cmdline/process title. I enabled it for one server, but will have to
> >> wait for users to reconnect and for the issue to reappear there.
> >>
> >> In the meantime, on another server experiencing the same issue at this
> >> moment (same configuration), but no proctitles:
> >>
> >> root@imap03:~# doveadm process status | grep "^imap " | wc -l
> >> 7564
> >> root@imap03:~# ps aux | grep "dovecot/imap$" | wc -l
> >> 7570
> >> root@imap03:~# ss -tp | grep '"imap-login"' | wc -l
> >> 8009
> >> root@imap03:~# ss -tp | grep '"imap"' | wc -l
> >> 536
> >> root@imap03:~# ss -nt "( sport = :143 or sport = :993 )" | wc -l
> >> 8739
> >> root@imap03:~# doveadm who -1 | wc -l
> >> 7581
> >>
> >> The `ss -tp` output appears normal (e.g. 8009+536 ~= 8739).
> >>
> >
> >
Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Eirik Rye-2
Yes, I realize this. That is why we are introducing the directors to the
cluster. We are also working on doing Dovecot LDA/LMTP delivery, to
improve Dovecot's ability to react to new messages. Currently we are
dropping email directly into the Maildir through other facilities, which
works -fine-.

I realize however that IDLE is most likely not working properly with
"imap-hibernate" in our current setup. I assume the "imap" service is
normally responsible for monitoring the user's mailbox for changes done
outside Dovecot?

Am I correct in my assumption that email must be delivered using
LMTP/LDA from the same server in order for IDLE in imap-hibernate to
work properly? Setting up a separate Dovecot instance purely for
LDA/LMTP will not work, as the imap-hibernate process will not be
notified of user mailbox activity in that case?

--
- Eirik

Den 09.01.2020 15:59, skrev Aki Tuomi:

> Uh. It's not actually supported to access same user concurrently from two servers even if you are using NFS. It will cause problems. You should use dovecot director or some other arrangement to make sure this does not happen.
>
> Aki
>
>> On 09/01/2020 16:56 Eirik Rye <[hidden email]> wrote:
>>
>>  
>> As a workaround for the titular issue, I have tried enabling the
>> "imap-hibernate" service on a couple of servers to reduce the number of
>> running imap processes.
>>
>> Since ~50-60% of clients are in IDLE at any one time, this allows us to
>> reduce the number of running imap processes to less than half of what it
>> was.
>>
>> After this I have yet to see the "net_connect_unix(imap) failed" errors
>> on these servers, however instead we are now instead seeing a few of these:
>>
>>     imap([redacted])<30266><[redacted]>: Error: imap-master: Failed to
>> import client state: Mailbox INBOX: Couldn't get recently expunged UIDs
>> (uidnext=13479 highest_modseq=1716)
>>
>> and
>>
>>     imap([redacted])<20055><[redacted]>: Error: imap-master: Failed to
>> import client state: Mailbox INBOX: Couldn't send flag changes
>>     imap([redacted])<20055><[redacted]>: NOTIFY failed in the middle of
>> FETCH reply in=0 out=0 deleted=0 expunged=0 trashed=0 hdr_count=0
>> hdr_bytes=0 body_count=0 body_bytes=0
>>
>> The former I suspect may be related to our NFS-backed store and
>> simultaneous access from different, since many of the errors are
>> accompanied by other corruption related errors:
>>
>>     imap([redacted])<30266><[redacted]>: Error: Mailbox INBOX: vsize-hdr
>> has invalid size: 36
>>
>> This whole upgrade is part of a migration to a director-backed system,
>> because we are seeing a too large amount of corruption errors
>> (locking/indexes, etc) assumed to be caused by NFS.
>>
>> Once we introduce the director cluster in front, these errors should
>> hopefully be strongly reduced. :-)
>>
>> --
>> Eirik
>>
>> On 08/01/2020 15:14, Eirik Rye wrote:
>>> With proctitles below.
>>>
>>> Strangely, Dovecot is reporting 12679 imap-login sockets in "pre-login"
>>> state, which is about 50% more than there are active TCP connections to
>>> the server (8271, of which 7546 are in "imap-login" process according to
>>> `ss -tp`).
>>>
>>> On another server, not currently experiencing this issue, the number of
>>> TCP sockets appears to more closely match the number of imap-login
>>> sockets used (either in pre-login or TLS proxing).
>>>
>>> imap:~# ps -f --ppid `pidof dovecot` | egrep -v "dovecot/(imap|pop3) "
>>> UID        PID  PPID  C STIME TTY          TIME CMD
>>> dovenull  5262  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
>>> dovenull  5263  5260  2 11:48 ?        00:03:53 dovecot/imap-login [724
>>> pre-login + 113 TLS proxies]
>>> dovecot   5264  5260  0 11:48 ?        00:01:15 dovecot/anvil [21
>>> connections]
>>> root      5265  5260  1 11:48 ?        00:02:18 dovecot/log
>>> dovenull  5266  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
>>> dovenull  5267  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
>>> dovenull  5268  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
>>> dovenull  5269  5260 20 11:48 ?        00:35:34 dovecot/imap-login [1073
>>> pre-login + 1199 TLS proxies]
>>> dovenull  5270  5260  1 11:48 ?        00:02:25 dovecot/imap-login [615
>>> pre-login + 67 TLS proxies]
>>> dovenull  5271  5260  0 11:48 ?        00:01:39 dovecot/imap-login [489
>>> pre-login + 44 TLS proxies]
>>> dovenull  5272  5260 14 11:48 ?        00:24:44 dovecot/imap-login [938
>>> pre-login + 720 TLS proxies]
>>> dovenull  5273  5260  6 11:48 ?        00:10:30 dovecot/imap-login [845
>>> pre-login + 242 TLS proxies]
>>> dovenull  5274  5260  4 11:48 ?        00:06:54 dovecot/imap-login [817
>>> pre-login + 209 TLS proxies]
>>> dovenull  5275  5260  0 11:48 ?        00:01:23 dovecot/imap-login [445
>>> pre-login + 36 TLS proxies]
>>> dovenull  5276  5260 25 11:48 ?        00:43:14 dovecot/imap-login [1064
>>> pre-login + 1434 TLS proxies]
>>> dovenull  5277  5260  1 11:48 ?        00:02:19 dovecot/imap-login [523
>>> pre-login + 58 TLS proxies]
>>> dovenull  5278  5260  9 11:48 ?        00:16:19 dovecot/imap-login [937
>>> pre-login + 462 TLS proxies]
>>> dovenull  5279  5260 19 11:48 ?        00:33:24 dovecot/imap-login [937
>>> pre-login + 823 TLS proxies]
>>> dovenull  5280  5260  1 11:48 ?        00:03:04 dovecot/imap-login [655
>>> pre-login + 92 TLS proxies]
>>> dovenull  5281  5260 26 11:48 ?        00:45:20 dovecot/imap-login [969
>>> pre-login + 1450 TLS proxies]
>>> dovenull  5282  5260  6 11:48 ?        00:10:55 dovecot/imap-login [917
>>> pre-login + 303 TLS proxies]
>>> dovenull  5283  5260  4 11:48 ?        00:08:36 dovecot/imap-login [731
>>> pre-login + 128 TLS proxies]
>>> root      5284  5260  2 11:48 ?        00:03:55 dovecot/config
>>> dovecot   5285  5260  1 11:48 ?        00:02:19 dovecot/stats [7968
>>> connections]
>>> dovecot   5286  5260  3 11:48 ?        00:05:53 dovecot/auth [138 wait,
>>> 0 passdb, 0 userdb]
>>> 528246   10322  5260  0 14:36 ?        00:00:00 [imap]
>>> root     16688  5260  0 14:40 ?        00:00:00 dovecot/imap
>>> imap02:~# doveadm process status | grep "^imap-login " | awk "{sum +=
>>> 2500-\$3} END {print sum}"
>>> 20081
>>> imap:~# ps -f --ppid `pidof dovecot` | grep "dovecot/imap " | wc -l
>>> 7937
>>> imap:~# doveadm process status | grep "^imap " | wc -l
>>> 7937
>>> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep "\"imap\"" |
>>> wc -l
>>> 559
>>> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep
>>> "\"imap-login\"" | wc -l
>>> 7546
>>> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | wc -l
>>> 8271
>>>
>>> On 08/01/2020 11:33, Eirik Rye wrote:
>>>> On 08/01/2020 11:09, Aki Tuomi wrote:
>>>>> Can you enable 'verbose_proctitle=yes' and maybe compare with ss -tp
>>>>> output?
>>>>
>>>> Thanks for the response!
>>>>
>>>> What exactly did you want to compare? `ss -tp` does not appear to show
>>>> cmdline/process title. I enabled it for one server, but will have to
>>>> wait for users to reconnect and for the issue to reappear there.
>>>>
>>>> In the meantime, on another server experiencing the same issue at this
>>>> moment (same configuration), but no proctitles:
>>>>
>>>> root@imap03:~# doveadm process status | grep "^imap " | wc -l
>>>> 7564
>>>> root@imap03:~# ps aux | grep "dovecot/imap$" | wc -l
>>>> 7570
>>>> root@imap03:~# ss -tp | grep '"imap-login"' | wc -l
>>>> 8009
>>>> root@imap03:~# ss -tp | grep '"imap"' | wc -l
>>>> 536
>>>> root@imap03:~# ss -nt "( sport = :143 or sport = :993 )" | wc -l
>>>> 8739
>>>> root@imap03:~# doveadm who -1 | wc -l
>>>> 7581
>>>>
>>>> The `ss -tp` output appears normal (e.g. 8009+536 ~= 8739).
>>>>
>>>
>>>
>
Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Aki Tuomi-3
Yes, the same server should be used for delivery too. For maildir, some leniency is allowed but it will still work suboptimally if you do not use dovecot for delivery. And hibernate will not work.

Aki

> On 09/01/2020 17:57 Eirik Rye <[hidden email]> wrote:
>
>  
> Yes, I realize this. That is why we are introducing the directors to the
> cluster. We are also working on doing Dovecot LDA/LMTP delivery, to
> improve Dovecot's ability to react to new messages. Currently we are
> dropping email directly into the Maildir through other facilities, which
> works -fine-.
>
> I realize however that IDLE is most likely not working properly with
> "imap-hibernate" in our current setup. I assume the "imap" service is
> normally responsible for monitoring the user's mailbox for changes done
> outside Dovecot?
>
> Am I correct in my assumption that email must be delivered using
> LMTP/LDA from the same server in order for IDLE in imap-hibernate to
> work properly? Setting up a separate Dovecot instance purely for
> LDA/LMTP will not work, as the imap-hibernate process will not be
> notified of user mailbox activity in that case?
>
> --
> - Eirik
>
> Den 09.01.2020 15:59, skrev Aki Tuomi:
> > Uh. It's not actually supported to access same user concurrently from two servers even if you are using NFS. It will cause problems. You should use dovecot director or some other arrangement to make sure this does not happen.
> >
> > Aki
> >
> >> On 09/01/2020 16:56 Eirik Rye <[hidden email]> wrote:
> >>
> >>  
> >> As a workaround for the titular issue, I have tried enabling the
> >> "imap-hibernate" service on a couple of servers to reduce the number of
> >> running imap processes.
> >>
> >> Since ~50-60% of clients are in IDLE at any one time, this allows us to
> >> reduce the number of running imap processes to less than half of what it
> >> was.
> >>
> >> After this I have yet to see the "net_connect_unix(imap) failed" errors
> >> on these servers, however instead we are now instead seeing a few of these:
> >>
> >>     imap([redacted])<30266><[redacted]>: Error: imap-master: Failed to
> >> import client state: Mailbox INBOX: Couldn't get recently expunged UIDs
> >> (uidnext=13479 highest_modseq=1716)
> >>
> >> and
> >>
> >>     imap([redacted])<20055><[redacted]>: Error: imap-master: Failed to
> >> import client state: Mailbox INBOX: Couldn't send flag changes
> >>     imap([redacted])<20055><[redacted]>: NOTIFY failed in the middle of
> >> FETCH reply in=0 out=0 deleted=0 expunged=0 trashed=0 hdr_count=0
> >> hdr_bytes=0 body_count=0 body_bytes=0
> >>
> >> The former I suspect may be related to our NFS-backed store and
> >> simultaneous access from different, since many of the errors are
> >> accompanied by other corruption related errors:
> >>
> >>     imap([redacted])<30266><[redacted]>: Error: Mailbox INBOX: vsize-hdr
> >> has invalid size: 36
> >>
> >> This whole upgrade is part of a migration to a director-backed system,
> >> because we are seeing a too large amount of corruption errors
> >> (locking/indexes, etc) assumed to be caused by NFS.
> >>
> >> Once we introduce the director cluster in front, these errors should
> >> hopefully be strongly reduced. :-)
> >>
> >> --
> >> Eirik
> >>
> >> On 08/01/2020 15:14, Eirik Rye wrote:
> >>> With proctitles below.
> >>>
> >>> Strangely, Dovecot is reporting 12679 imap-login sockets in "pre-login"
> >>> state, which is about 50% more than there are active TCP connections to
> >>> the server (8271, of which 7546 are in "imap-login" process according to
> >>> `ss -tp`).
> >>>
> >>> On another server, not currently experiencing this issue, the number of
> >>> TCP sockets appears to more closely match the number of imap-login
> >>> sockets used (either in pre-login or TLS proxing).
> >>>
> >>> imap:~# ps -f --ppid `pidof dovecot` | egrep -v "dovecot/(imap|pop3) "
> >>> UID        PID  PPID  C STIME TTY          TIME CMD
> >>> dovenull  5262  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> >>> dovenull  5263  5260  2 11:48 ?        00:03:53 dovecot/imap-login [724
> >>> pre-login + 113 TLS proxies]
> >>> dovecot   5264  5260  0 11:48 ?        00:01:15 dovecot/anvil [21
> >>> connections]
> >>> root      5265  5260  1 11:48 ?        00:02:18 dovecot/log
> >>> dovenull  5266  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> >>> dovenull  5267  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> >>> dovenull  5268  5260  0 11:48 ?        00:00:00 dovecot/pop3-login
> >>> dovenull  5269  5260 20 11:48 ?        00:35:34 dovecot/imap-login [1073
> >>> pre-login + 1199 TLS proxies]
> >>> dovenull  5270  5260  1 11:48 ?        00:02:25 dovecot/imap-login [615
> >>> pre-login + 67 TLS proxies]
> >>> dovenull  5271  5260  0 11:48 ?        00:01:39 dovecot/imap-login [489
> >>> pre-login + 44 TLS proxies]
> >>> dovenull  5272  5260 14 11:48 ?        00:24:44 dovecot/imap-login [938
> >>> pre-login + 720 TLS proxies]
> >>> dovenull  5273  5260  6 11:48 ?        00:10:30 dovecot/imap-login [845
> >>> pre-login + 242 TLS proxies]
> >>> dovenull  5274  5260  4 11:48 ?        00:06:54 dovecot/imap-login [817
> >>> pre-login + 209 TLS proxies]
> >>> dovenull  5275  5260  0 11:48 ?        00:01:23 dovecot/imap-login [445
> >>> pre-login + 36 TLS proxies]
> >>> dovenull  5276  5260 25 11:48 ?        00:43:14 dovecot/imap-login [1064
> >>> pre-login + 1434 TLS proxies]
> >>> dovenull  5277  5260  1 11:48 ?        00:02:19 dovecot/imap-login [523
> >>> pre-login + 58 TLS proxies]
> >>> dovenull  5278  5260  9 11:48 ?        00:16:19 dovecot/imap-login [937
> >>> pre-login + 462 TLS proxies]
> >>> dovenull  5279  5260 19 11:48 ?        00:33:24 dovecot/imap-login [937
> >>> pre-login + 823 TLS proxies]
> >>> dovenull  5280  5260  1 11:48 ?        00:03:04 dovecot/imap-login [655
> >>> pre-login + 92 TLS proxies]
> >>> dovenull  5281  5260 26 11:48 ?        00:45:20 dovecot/imap-login [969
> >>> pre-login + 1450 TLS proxies]
> >>> dovenull  5282  5260  6 11:48 ?        00:10:55 dovecot/imap-login [917
> >>> pre-login + 303 TLS proxies]
> >>> dovenull  5283  5260  4 11:48 ?        00:08:36 dovecot/imap-login [731
> >>> pre-login + 128 TLS proxies]
> >>> root      5284  5260  2 11:48 ?        00:03:55 dovecot/config
> >>> dovecot   5285  5260  1 11:48 ?        00:02:19 dovecot/stats [7968
> >>> connections]
> >>> dovecot   5286  5260  3 11:48 ?        00:05:53 dovecot/auth [138 wait,
> >>> 0 passdb, 0 userdb]
> >>> 528246   10322  5260  0 14:36 ?        00:00:00 [imap]
> >>> root     16688  5260  0 14:40 ?        00:00:00 dovecot/imap
> >>> imap02:~# doveadm process status | grep "^imap-login " | awk "{sum +=
> >>> 2500-\$3} END {print sum}"
> >>> 20081
> >>> imap:~# ps -f --ppid `pidof dovecot` | grep "dovecot/imap " | wc -l
> >>> 7937
> >>> imap:~# doveadm process status | grep "^imap " | wc -l
> >>> 7937
> >>> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep "\"imap\"" |
> >>> wc -l
> >>> 559
> >>> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | grep
> >>> "\"imap-login\"" | wc -l
> >>> 7546
> >>> imap:~# ss -ntp "( sport = :143 or sport = :993 )" | wc -l
> >>> 8271
> >>>
> >>> On 08/01/2020 11:33, Eirik Rye wrote:
> >>>> On 08/01/2020 11:09, Aki Tuomi wrote:
> >>>>> Can you enable 'verbose_proctitle=yes' and maybe compare with ss -tp
> >>>>> output?
> >>>>
> >>>> Thanks for the response!
> >>>>
> >>>> What exactly did you want to compare? `ss -tp` does not appear to show
> >>>> cmdline/process title. I enabled it for one server, but will have to
> >>>> wait for users to reconnect and for the issue to reappear there.
> >>>>
> >>>> In the meantime, on another server experiencing the same issue at this
> >>>> moment (same configuration), but no proctitles:
> >>>>
> >>>> root@imap03:~# doveadm process status | grep "^imap " | wc -l
> >>>> 7564
> >>>> root@imap03:~# ps aux | grep "dovecot/imap$" | wc -l
> >>>> 7570
> >>>> root@imap03:~# ss -tp | grep '"imap-login"' | wc -l
> >>>> 8009
> >>>> root@imap03:~# ss -tp | grep '"imap"' | wc -l
> >>>> 536
> >>>> root@imap03:~# ss -nt "( sport = :143 or sport = :993 )" | wc -l
> >>>> 8739
> >>>> root@imap03:~# doveadm who -1 | wc -l
> >>>> 7581
> >>>>
> >>>> The `ss -tp` output appears normal (e.g. 8009+536 ~= 8739).
> >>>>
> >>>
> >>>
> >
Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Eirik Rye-2
In reply to this post by Eirik Rye-2
We are unfortunately still seeing a lot of these errors once the machine
reaches a high number of concurrent users/logins (just below 20k
simultaneous IMAP connections on a powerful 24 core machine with 128GB RAM):

2020-01-14T09:18:58.661349+01:00 dovecot: imap-login: Warning:
net_connect_unix(imap) succeeded only after retrying - took 140330 us
2020-01-14T09:18:58.854692+01:00 dovecot: imap-login: Error:
master(imap): net_connect_unix(imap) failed: Resource temporarily
unavailable - http://wiki2.dovecot.org/SocketUnavailable 
(client-pid=107932, client-id=780262, rip=x, created 500 msecs ago,
received 0/4 bytes)
2020-01-14T09:18:58.888228+01:00 dovecot: imap-login: Warning:
net_connect_unix(imap) succeeded only after retrying - took 440762 us

The machine is at insignificant load numbers, and the dovecot process is
somewhere near 25% CPU usage when the problem occurs. It is not close to
saturating its core from what I can tell.

To make sure the issues are not task/fd-limit related, I have set this
in /etc/systemd/system/dovecot.service.d/service.conf:

[Service]
LimitNOFILE=infinity
TasksMax=infinity

~# egrep "processes|files" /proc/`pidof dovecot`/limits
Max processes             514051               514051
Max open files            1048576              1048576
~# cat /proc/sys/net/core/somaxconn
65536
~# cat /proc/sys/kernel/pid_max
278528

Dovecot is configured with NFS backed storage, and locally stored
passwdfile userdb/passdb are used for authentication.

Backends are (now) behind directors. The directors/proxies are having no
issues dealing with the traffic whatsoever.

At the time of failure, the process list looks like this:

~# ps -f --ppid `pidof dovecot` | egrep -v "dovecot/(imap|pop3) "
UID         PID   PPID  C STIME TTY          TIME CMD
274264    44753 138506  0 12:43 ?        00:00:00 [imap]
308665   104852 138506  0 13:01 ?        00:00:00 [imap]
308665   104853 138506  0 13:01 ?        00:00:00 [imap]
dovenull 138508 138506  1 10:31 ?        00:03:00 dovecot/pop3-login [6
pre-login + 36 TLS proxies]
dovenull 138509 138506  0 10:31 ?        00:00:07 dovecot/imap-login
dovecot  138510 138506  0 10:31 ?        00:01:10 dovecot/anvil [20
connections]
root     138511 138506  1 10:31 ?        00:02:14 dovecot/log
dovenull 138512 138506  1 10:31 ?        00:01:48 dovecot/pop3-login [1
pre-login + 15 TLS proxies]
dovenull 138513 138506  0 10:31 ?        00:00:08 dovecot/imap-login
[redacted TLS proxy]
dovenull 138514 138506  0 10:31 ?        00:00:07 dovecot/imap-login [0
pre-login + 3 TLS proxies]
dovenull 138515 138506  0 10:31 ?        00:00:10 dovecot/imap-login
[redacted TLS proxy]
dovenull 138516 138506  0 10:31 ?        00:01:14 dovecot/imap-login [27
pre-login + 12 TLS proxies]
dovenull 138517 138506  0 10:31 ?        00:00:31 dovecot/imap-login [2
pre-login + 2 TLS proxies]
dovenull 138518 138506  0 10:31 ?        00:01:28 dovecot/imap-login [56
pre-login + 20 TLS proxies]
dovenull 138519 138506  0 10:31 ?        00:00:09 dovecot/imap-login [0
pre-login + 4 TLS proxies]
dovenull 138520 138506  0 10:31 ?        00:00:06 dovecot/imap-login
[redacted TLS proxy]
dovenull 138521 138506  0 10:31 ?        00:00:11 dovecot/imap-login [0
pre-login + 3 TLS proxies]
dovenull 138522 138506  0 10:31 ?        00:00:16 dovecot/imap-login [2
pre-login + 2 TLS proxies]
dovenull 138523 138506  0 10:31 ?        00:00:13 dovecot/imap-login [1
pre-login + 2 TLS proxies]
dovenull 138524 138506  0 10:31 ?        00:00:24 dovecot/imap-login [1
pre-login + 3 TLS proxies]
dovenull 138525 138506  0 10:31 ?        00:01:13 dovecot/imap-login [36
pre-login + 23 TLS proxies]
dovenull 138526 138506  0 10:31 ?        00:00:41 dovecot/imap-login [10
pre-login + 12 TLS proxies]
dovenull 138527 138506  0 10:31 ?        00:00:20 dovecot/imap-login [1
pre-login + 7 TLS proxies]
root     138528 138506  2 10:31 ?        00:04:45 dovecot/config
dovecot  138529 138506  1 10:31 ?        00:02:22 dovecot/stats [19389
connections]
dovecot  138530 138506  3 10:31 ?        00:05:53 dovecot/auth [137
wait, 0 passdb, 0 userdb]
root     148675 138506  0 13:13 ?        00:00:00 dovecot/doveadm-server
[redacted]

Other stats:

~# ps -f --ppid `pidof dovecot` | grep "dovecot/imap " | wc -l
19328
~# doveadm process status | grep "^imap " | wc -l
19307
~# ss -ntp "( sport = :143 or sport = :993 )" | grep "\"imap\"" | wc -l
19141
~# ss -ntp "( sport = :143 or sport = :993 )" | grep "\"imap-login\"" |
wc -l
333
~# ss -ntp "( sport = :143 or sport = :993 )" | wc -l
19530

~# doveconf -n

# 2.3.9.2 (cf2918cac): /etc/dovecot/dovecot.conf
# OS: Linux 4.9.0-9-amd64 x86_64 Debian 9.11
# Hostname: [redacted]
default_vsz_limit = 768 M
disable_plaintext_auth = no
imap_id_log = *
log_timestamp = "%F %T %z "
login_trusted_networks = [redacted]
mail_fsync = always
mail_location = maildir:~/Maildir
mail_nfs_index = yes
mail_nfs_storage = yes
mmap_disable = yes
namespace inbox {
   inbox = yes
   location =
   mailbox Drafts {
     auto = subscribe
     special_use = \Drafts
   }
   mailbox Sent {
     auto = subscribe
     special_use = \Sent
   }
   mailbox "Sent Messages" {
     auto = no
     special_use = \Sent
   }
   mailbox Spam {
     auto = create
     special_use = \Junk
   }
   mailbox Trash {
     auto = subscribe
     special_use = \Trash
   }
   prefix =
   separator = /
}
passdb {
   args = username_format=%Lu /etc/dovecot/aliases
   default_fields = noauthenticate
   driver = passwd-file
}
passdb {
   args = /etc/dovecot/passwd
   driver = passwd-file
}
protocols = imap pop3
service doveadm {
   inet_listener {
     port = 24245
   }
   inet_listener http {
     port = 8080
   }
}
service imap-login {
   client_limit = 4096
   inet_listener imap {
     port = 143
   }
   inet_listener imaps {
     port = 993
     ssl = yes
   }
   process_limit = 16
   process_min_avail = 16
   service_count = 0
   vsz_limit = 768 M
}
service imap {
   client_limit = 1
   process_limit = 65536
}
service pop3-login {
   client_limit = 4096
   inet_listener pop3 {
     port = 110
   }
   inet_listener pop3s {
     port = 995
     ssl = yes
   }
   process_limit = 8
   process_min_avail = 2
   service_count = 0
}
service pop3 {
   process_limit = 16384
}
service stats {
   client_limit = 65536
}
ssl_cert = <[redacted]
ssl_cipher_list = ALL:!LOW:!SSLv2:!EXP:!aNULL
ssl_dh = # hidden, use -P to show it
ssl_key = # hidden, use -P to show it
userdb {
   args = /etc/dovecot/passwd
   driver = passwd-file
}
verbose_proctitle = yes
protocol imap {
   mail_max_userip_connections = 20
   rawlog_dir = [redacted]
}

Are there any other tunables either in Dovecot or in the kernel that may
relate to this issue that we may have missed?

--
Eirik

On 06/01/2020 11:28, Eirik Rye wrote:

> Hi,
>
> After upgrading Dovecot from version 2.2.27 to 2.3.9.2, we are
> sporadically seeing lots of these errors in the error log on many of our
> servers:
>
>    imap-login: Error: master(imap): net_connect_unix(imap) failed:
> Resource temporarily unavailable -
> http://wiki2.dovecot.org/SocketUnavailable
>
> The issue is causing significant delays and/or timeouts on login.
>
>  From what I can tell, this happens because the imap-login service is
> unable to connect to the imap service socket, however I am unable to
> identify the root cause of the issue.
>
> We have increased the client_limit for the stats service, as described
> in the upgrade document at (https://wiki2.dovecot.org/Upgrading/2.3). It
> is set to the same value as the process_limit for imap service.
>
> System CPU usage does not appear to be saturated. The CPU usage sits at
> around 50% when these errors appear.
>
> We have doubled the number of file descriptors after upgrading from
> 2.2.27. It was previously set to 16392 (which worked fine):
>
> ~# cat /proc/`pidof dovecot`/limits | grep "open files"
> Max open files            30000                30000                files
>
> We have tried increasing default_process_limit and default_client_limit
> from the default 1000 to 3000, but this has no effect.
>
> Current configuration (with irrelevant parts removed):
>
> ~# doveconf -n
>
> default_client_limit = 3000 # these were raised after upgrading in
> attempt to remedy
> default_process_limit = 3000
> default_vsz_limit = 512 M
> [...]
> service imap-login {
>    client_limit = 1250
>    inet_listener imap {
>      port = 143
>    }
>    inet_listener imaps {
>      port = 993
>      ssl = yes
>    }
>    process_limit = 16
>    process_min_avail = 16
>    service_count = 0
>    vsz_limit = 512 M
> }
> service imap {
>    client_limit = 1
>    process_limit = 20000
> }
> service stats {
>    client_limit = 20000
> }
> [...]
>
> We appear to be nowhere near the 20000 process_limit set for the imap
> service. We are also not seeing any warnings being logged with regards
> to the process_limit.
>
> ~# doveadm who -1 | wc -l
> 8765
>
> ~# doveadm process status | grep "^imap " | wc -l
> 7583
>
> ~# ps aux | grep "dovecot/imap" | wc -l
> 7449
>
> ~# doveadm process status | grep -v "^imap "
> name       pid   available_count total_count idle_start
> last_status_update last_kill_sent
> stats      19875 12384           582387      0          1578305489   0
> pop3-login 19868 1000            0           0          1578061504   0
> pop3-login 19867 1000            0           0          1578061503   0
> pop3-login 19866 1000            0           0          1578061502   0
> pop3-login 19865 1000            0           0          1578061503   0
> pop3-login 19864 1000            0           0          1578061503   0
> pop3-login 19853 1000            0           0          1578061504   0
> pop3-login 19852 1000            0           0          1578061504   0
> pop3-login 19851 1000            0           0          1578061504   0
> pop3-login 19849 1000            0           0          1578061503   0
> pop3-login 19847 1000            0           0          1578061503   0
> pop3-login 19846 1000            0           0          1578061503   0
> pop3-login 19845 1000            0           0          1578061503   0
> pop3-login 19844 1000            0           0          1578061503   0
> pop3-login 19843 1000            0           0          1578061503   0
> pop3-login 19842 1000            0           0          1578061503   0
> pop3-login 19839 1000            0           0          1578061502   0
> log        19841 2973            27          0          1578061502   0
> imap-login 19873 666             14942       0          1578305488   0
> imap-login 19872 523             32792       0          1578305489   0
> imap-login 19871 316             62876       0          1578305489   0
> imap-login 19870 167             78332       0          1578305489   0
> imap-login 19869 118             97989       0          1578305489   0
> imap-login 19863 0               184726      0          1578305489   0
> imap-login 19862 0               193094      0          1578305489   0
> imap-login 19861 0               186800      0          1578305487   0
> imap-login 19860 675             17806       0          1578305487   0
> imap-login 19859 0               169446      0          1578305489   0
> imap-login 19858 0               143517      0          1578305489   0
> imap-login 19857 0               119215      0          1578305488   0
> imap-login 19856 0               151958      0          1578305489   0
> imap-login 19855 483             47036       0          1578305488   0
> imap-login 19854 1               185240      0          1578305489   0
> imap-login 19840 567             23859       0          1578305485   0
> config     19874 2967            124482      0          1578305489   0
> auth       19885 2967            124185      0          1578305489   0
> anvil      10017 967             7567        0          1578305463   0
>

Reply | Threaded
Open this post in threaded view
|

Re: Upgrade 2.2.27 to 2.3.9.2: master(imap): net_connect_unix(imap) failed: Resource temporarily unavailable

Eirik Rye-2
Embarrassingly, the issue does appear to have been
/proc/sys/net/core/somaxconn setting all along. It was increased from
128, to 512, to 1024 with full restarts of Dovecot inbetween. However,
this was apparently not enough to handle our login bursts.

The latest increase, from 1024 to 65536, only had a `doveadm reload`
issued, which seemingly does not cause the socket to rebind and
therefore still used the old settings.

After fully restarting dovecot, the issue -appears- to be gone. At least
we survived through the morning spikes today without errors.

Apologies for the list spam.

On 14/01/2020 13:38, Eirik Rye wrote:

> We are unfortunately still seeing a lot of these errors once the machine
> reaches a high number of concurrent users/logins (just below 20k
> simultaneous IMAP connections on a powerful 24 core machine with 128GB
> RAM):
>
> 2020-01-14T09:18:58.661349+01:00 dovecot: imap-login: Warning:
> net_connect_unix(imap) succeeded only after retrying - took 140330 us
> 2020-01-14T09:18:58.854692+01:00 dovecot: imap-login: Error:
> master(imap): net_connect_unix(imap) failed: Resource temporarily
> unavailable - http://wiki2.dovecot.org/SocketUnavailable 
> (client-pid=107932, client-id=780262, rip=x, created 500 msecs ago,
> received 0/4 bytes)
> 2020-01-14T09:18:58.888228+01:00 dovecot: imap-login: Warning:
> net_connect_unix(imap) succeeded only after retrying - took 440762 us
>
> The machine is at insignificant load numbers, and the dovecot process is
> somewhere near 25% CPU usage when the problem occurs. It is not close to
> saturating its core from what I can tell.
>
> To make sure the issues are not task/fd-limit related, I have set this
> in /etc/systemd/system/dovecot.service.d/service.conf:
>
> [Service]
> LimitNOFILE=infinity
> TasksMax=infinity
>
> ~# egrep "processes|files" /proc/`pidof dovecot`/limits
> Max processes             514051               514051
> Max open files            1048576              1048576
> ~# cat /proc/sys/net/core/somaxconn
> 65536
> ~# cat /proc/sys/kernel/pid_max
> 278528
>
> Dovecot is configured with NFS backed storage, and locally stored
> passwdfile userdb/passdb are used for authentication.
>
> Backends are (now) behind directors. The directors/proxies are having no
> issues dealing with the traffic whatsoever.
>
> At the time of failure, the process list looks like this:
>
> ~# ps -f --ppid `pidof dovecot` | egrep -v "dovecot/(imap|pop3) "
> UID         PID   PPID  C STIME TTY          TIME CMD
> 274264    44753 138506  0 12:43 ?        00:00:00 [imap]
> 308665   104852 138506  0 13:01 ?        00:00:00 [imap]
> 308665   104853 138506  0 13:01 ?        00:00:00 [imap]
> dovenull 138508 138506  1 10:31 ?        00:03:00 dovecot/pop3-login [6
> pre-login + 36 TLS proxies]
> dovenull 138509 138506  0 10:31 ?        00:00:07 dovecot/imap-login
> dovecot  138510 138506  0 10:31 ?        00:01:10 dovecot/anvil [20
> connections]
> root     138511 138506  1 10:31 ?        00:02:14 dovecot/log
> dovenull 138512 138506  1 10:31 ?        00:01:48 dovecot/pop3-login [1
> pre-login + 15 TLS proxies]
> dovenull 138513 138506  0 10:31 ?        00:00:08 dovecot/imap-login
> [redacted TLS proxy]
> dovenull 138514 138506  0 10:31 ?        00:00:07 dovecot/imap-login [0
> pre-login + 3 TLS proxies]
> dovenull 138515 138506  0 10:31 ?        00:00:10 dovecot/imap-login
> [redacted TLS proxy]
> dovenull 138516 138506  0 10:31 ?        00:01:14 dovecot/imap-login [27
> pre-login + 12 TLS proxies]
> dovenull 138517 138506  0 10:31 ?        00:00:31 dovecot/imap-login [2
> pre-login + 2 TLS proxies]
> dovenull 138518 138506  0 10:31 ?        00:01:28 dovecot/imap-login [56
> pre-login + 20 TLS proxies]
> dovenull 138519 138506  0 10:31 ?        00:00:09 dovecot/imap-login [0
> pre-login + 4 TLS proxies]
> dovenull 138520 138506  0 10:31 ?        00:00:06 dovecot/imap-login
> [redacted TLS proxy]
> dovenull 138521 138506  0 10:31 ?        00:00:11 dovecot/imap-login [0
> pre-login + 3 TLS proxies]
> dovenull 138522 138506  0 10:31 ?        00:00:16 dovecot/imap-login [2
> pre-login + 2 TLS proxies]
> dovenull 138523 138506  0 10:31 ?        00:00:13 dovecot/imap-login [1
> pre-login + 2 TLS proxies]
> dovenull 138524 138506  0 10:31 ?        00:00:24 dovecot/imap-login [1
> pre-login + 3 TLS proxies]
> dovenull 138525 138506  0 10:31 ?        00:01:13 dovecot/imap-login [36
> pre-login + 23 TLS proxies]
> dovenull 138526 138506  0 10:31 ?        00:00:41 dovecot/imap-login [10
> pre-login + 12 TLS proxies]
> dovenull 138527 138506  0 10:31 ?        00:00:20 dovecot/imap-login [1
> pre-login + 7 TLS proxies]
> root     138528 138506  2 10:31 ?        00:04:45 dovecot/config
> dovecot  138529 138506  1 10:31 ?        00:02:22 dovecot/stats [19389
> connections]
> dovecot  138530 138506  3 10:31 ?        00:05:53 dovecot/auth [137
> wait, 0 passdb, 0 userdb]
> root     148675 138506  0 13:13 ?        00:00:00 dovecot/doveadm-server
> [redacted]
>
> Other stats:
>
> ~# ps -f --ppid `pidof dovecot` | grep "dovecot/imap " | wc -l
> 19328
> ~# doveadm process status | grep "^imap " | wc -l
> 19307
> ~# ss -ntp "( sport = :143 or sport = :993 )" | grep "\"imap\"" | wc -l
> 19141
> ~# ss -ntp "( sport = :143 or sport = :993 )" | grep "\"imap-login\"" |
> wc -l
> 333
> ~# ss -ntp "( sport = :143 or sport = :993 )" | wc -l
> 19530
>
> ~# doveconf -n
>
> # 2.3.9.2 (cf2918cac): /etc/dovecot/dovecot.conf
> # OS: Linux 4.9.0-9-amd64 x86_64 Debian 9.11
> # Hostname: [redacted]
> default_vsz_limit = 768 M
> disable_plaintext_auth = no
> imap_id_log = *
> log_timestamp = "%F %T %z "
> login_trusted_networks = [redacted]
> mail_fsync = always
> mail_location = maildir:~/Maildir
> mail_nfs_index = yes
> mail_nfs_storage = yes
> mmap_disable = yes
> namespace inbox {
>    inbox = yes
>    location =
>    mailbox Drafts {
>      auto = subscribe
>      special_use = \Drafts
>    }
>    mailbox Sent {
>      auto = subscribe
>      special_use = \Sent
>    }
>    mailbox "Sent Messages" {
>      auto = no
>      special_use = \Sent
>    }
>    mailbox Spam {
>      auto = create
>      special_use = \Junk
>    }
>    mailbox Trash {
>      auto = subscribe
>      special_use = \Trash
>    }
>    prefix =
>    separator = /
> }
> passdb {
>    args = username_format=%Lu /etc/dovecot/aliases
>    default_fields = noauthenticate
>    driver = passwd-file
> }
> passdb {
>    args = /etc/dovecot/passwd
>    driver = passwd-file
> }
> protocols = imap pop3
> service doveadm {
>    inet_listener {
>      port = 24245
>    }
>    inet_listener http {
>      port = 8080
>    }
> }
> service imap-login {
>    client_limit = 4096
>    inet_listener imap {
>      port = 143
>    }
>    inet_listener imaps {
>      port = 993
>      ssl = yes
>    }
>    process_limit = 16
>    process_min_avail = 16
>    service_count = 0
>    vsz_limit = 768 M
> }
> service imap {
>    client_limit = 1
>    process_limit = 65536
> }
> service pop3-login {
>    client_limit = 4096
>    inet_listener pop3 {
>      port = 110
>    }
>    inet_listener pop3s {
>      port = 995
>      ssl = yes
>    }
>    process_limit = 8
>    process_min_avail = 2
>    service_count = 0
> }
> service pop3 {
>    process_limit = 16384
> }
> service stats {
>    client_limit = 65536
> }
> ssl_cert = <[redacted]
> ssl_cipher_list = ALL:!LOW:!SSLv2:!EXP:!aNULL
> ssl_dh = # hidden, use -P to show it
> ssl_key = # hidden, use -P to show it
> userdb {
>    args = /etc/dovecot/passwd
>    driver = passwd-file
> }
> verbose_proctitle = yes
> protocol imap {
>    mail_max_userip_connections = 20
>    rawlog_dir = [redacted]
> }
>
> Are there any other tunables either in Dovecot or in the kernel that may
> relate to this issue that we may have missed?
>


--
Eirik