Re: Sorting a folder (by THREAD) takes a long time

Asheesh Laroia
On Fri, 7 Mar 2008, Timo Sirainen wrote:

> How large is your INBOX? Unless you've disabled index files the threading
> should go pretty fast even for thousands of messages.

(now it's "archive" and "Lists" that are so big)

ca. 100K messages

> But if it really is because of THREAD, you're in luck. I've almost
> finished thread index implementation and I'm also considering making it
> available for v1.1 as a plugin.

That would rule.  (Unless I can already get it as part of the 1.2 hg tree,
in which case I'm not averse to upgrading)

On Fri, 7 Mar 2008, Benjamin R. Haskell wrote:

> I'd do the traces if I were you. Maybe it's something to do with "Thread
> Sorts By Arrival" or similar Alpine threading quirks. There was also traffic
> a couple months ago about threading scaling non-linearly in Alpine depending
> on that setting. I assume you're running a recent snapshot, but I don't
> recall whether that issue was resolved.

Good question about "Thread sorts by arrival".  I disabled that for now.

Attaching to the imap process in gdb and interrupting to get a stack trace
every once in a while, I find it's mostly (80%) in this:

#2  0xb7e28ff3 in readdir64 () from /lib/tls/i686/cmov/libc.so.6
#3  0x0806fbe1 in maildir_scan_dir (ctx=0x80f6170, new_dir=false) at maildir-sync.c:421
#4  0x0807060d in maildir_sync_context (ctx=0x80f6170, forced=<value optimized out>, lost_files_r=0xbf9a7aab) at maildir-sync.c:778
#5  0x080706f6 in maildir_storage_sync_init (box=0x81076a0, flags=<value optimized out>) at maildir-sync.c:837
#6  0x080ba49f in mailbox_sync (box=0x81076a0, flags=MAILBOX_SYNC_FLAG_FULL_READ, status_items=111, status_r=0xbf9a7b18) at mail-storage.c:536
#7  0x0805fb75 in cmd_select_full (cmd=0x8100c78, readonly=false) at cmd-select.c:39
#8  0x0805fd19 in cmd_select (cmd=0x8100c78) at cmd-select.c:87
#9  0x08061389 in client_command_input (cmd=0x8100c78) at client.c:505

and perhaps 20% (maybe less) in this:

(gdb) bt
#0  mail_cache_lookup_iter_next (ctx=0xbf9a78b0, field_r=0xbf9a78d0) at mail-cache-lookup.c:226
#1  0x0809e208 in mail_cache_field_exists (view=0x8109820, seq=106065, field=1) at mail-cache-lookup.c:260
#2  0x080929fe in index_mail_set_seq (_mail=0x8116390, seq=106065) at index-mail.c:1054
#3  0x0809656a in index_storage_search_next_nonblock (_ctx=0x8108358, mail=0x8116390, tryagain_r=0xbf9a7a7b) at index-search.c:1039
#4  0xb7f0d441 in fts_mailbox_search_next_nonblock (ctx=0x8108358, mail=0x8116390, tryagain_r=0xbf9a7a7b) at fts-storage.c:617
#5  0x080ba642 in mailbox_search_next (ctx=0x8108358, mail=0x8116390) at mail-storage.c:633
#6  0x080680e9 in imap_thread (cmd=0x8100c78, charset=0x8104f50 "US-ASCII", args=0x8109bf8, type=MAIL_THREAD_REFERENCES) at imap-thread.c:142
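
For anyone who wants to turn this kind of manual sampling into numbers, here is a minimal Python sketch that tallies which code path each saved backtrace was in. It assumes the traces were collected by hand as text in the gdb format shown above; the helper names (hot_function, tally) and the choice of "interesting" functions are made up for illustration and are not part of gdb or Dovecot.

```python
import re
from collections import Counter

# Matches the function name on gdb frame lines such as
# "#3  0x0806fbe1 in maildir_scan_dir (ctx=...)" or "#0  some_function (...)".
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+ in )?(\w+) \(", re.MULTILINE)


def hot_function(backtrace, interesting):
    """Return the innermost frame whose name is in `interesting`, or None."""
    for name in FRAME_RE.findall(backtrace):
        if name in interesting:
            return name
    return None


def tally(samples, interesting):
    """Return the fraction of sampled backtraces attributed to each function."""
    counts = Counter(hot_function(bt, interesting) for bt in samples)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}
```

Calling tally(samples, {"maildir_scan_dir", "imap_thread"}) on a list of saved backtraces gives a rough split between time spent scanning the maildir and time spent in the THREAD command. With only a handful of samples the percentages are very coarse.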

Opening the folder takes some 10 seconds.  FWIW, I know that only one
message has changed in the folder since I last SELECTed it - that was an
APPEND, which Dovecot knew about - maybe it could have resynchronized the
folder in the background during that APPEND so the next open would be
instantaneous.

Could Dovecot maybe show me the last cached version of the folder so I can
see the index immediately, and re-sync it in the background?  That way
opening folders could be instantaneous, and each subsequent IMAP action
could give me a chance to see the synchronized state once it's ready.
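
To make the idea concrete, here is a rough Python sketch of "serve the cached view now, swap in the fresh one later". It is purely illustrative and says nothing about how Dovecot is or could be structured internally; the class name CachedFolderView and the rescan callable are hypothetical.

```python
import threading


class CachedFolderView:
    """Serve a cached listing immediately while a rescan runs in the background."""

    def __init__(self, cached_listing, rescan):
        # `rescan` is a hypothetical callable that performs the slow full scan
        # and returns the up-to-date listing.
        self._listing = list(cached_listing)
        self._lock = threading.Lock()
        self._worker = threading.Thread(
            target=self._resync, args=(rescan,), daemon=True
        )
        self._worker.start()

    def _resync(self, rescan):
        fresh = rescan()
        with self._lock:
            self._listing = list(fresh)

    def listing(self):
        """Return whatever is currently known: cached at first, fresh later."""
        with self._lock:
            return list(self._listing)

    def wait_synced(self, timeout=None):
        """Block until the background rescan finishes; True if it has."""
        self._worker.join(timeout)
        return not self._worker.is_alive()
```

Each later call to listing() plays the role of "the next IMAP action": it returns the stale view until the rescan has finished, and the synchronized one afterwards.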

-- Asheesh.

--
we:
  The single most important word in the world.

Timo Sirainen
On Mon, 2008-08-18 at 01:42 -0400, Asheesh Laroia wrote:
> > But if it really is because of THREAD, you're in luck. I've almost
> > finished thread index implementation and I'm also considering making it
> > available for v1.1 as a plugin.
>
> That would rule.  (Unless I can already get it as part of the 1.2 hg tree,
> in which case I'm not averse to upgrading)

The v1.2 hg tree does have thread indexing code, but it's a bit buggy and
I've already rewritten most of it. I did several tests yesterday and
it's beginning to look pretty good, although there are still some bugs.
I'll try to get those fixed soon and commit the code. After that it's
actually time for the v1.2.alpha1 release.

> Attaching to the imap process in gdb and interrupting to get a stack trace
> every once in a while, I find it's mostly (80%) in this:
>
> #2  0xb7e28ff3 in readdir64 () from /lib/tls/i686/cmov/libc.so.6
> #3  0x0806fbe1 in maildir_scan_dir (ctx=0x80f6170, new_dir=false) at maildir-sync.c:421

I guess the problem then isn't threading but maildir syncing. There
are two ways to avoid this:

1) Switch to dbox format. :)

2) Add a new mail_assume_only_dovecots=yes setting that makes Dovecot
not rescan the maildir unless it clearly sees that the timestamp has
changed under it. That also means that if there was an external
non-Dovecot modification to cur/ at the same second as Dovecot was doing
its own changes the external modification wouldn't be seen (until
something forced a rescan).
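
The trade-off in 2) can be illustrated with a small Python sketch. This is not Dovecot code: the class name MaildirScanCache is invented, and it assumes the directory timestamp is compared at whole-second granularity, as in the description above.

```python
import os


class MaildirScanCache:
    """Skip the directory scan when the directory's mtime has not changed."""

    def __init__(self, cur_dir):
        self.cur_dir = cur_dir
        self.last_mtime = None  # whole seconds, taken just before the last scan
        self.filenames = []

    def sync(self):
        """Rescan only if the timestamp changed; return True if a scan ran."""
        mtime = int(os.stat(self.cur_dir).st_mtime)
        if mtime == self.last_mtime:
            # Assumption: nobody but us touched the directory, so skip the
            # expensive scan. A change made in the same second as the
            # previous scan is invisible here.
            return False
        self.filenames = sorted(os.listdir(self.cur_dir))
        self.last_mtime = mtime
        return True
```

The stat is taken before the listing, so a change that lands during the scan in a later second still triggers a rescan next time. A change in the same second as the recorded timestamp is missed until something else bumps the mtime, which is exactly the caveat described above.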

I guess 2) would be nice, but I don't know when I'll have time to
implement it.

> Could Dovecot maybe show me the last cached version of the folder so I can
> see the index immediately, and re-sync it in the background?  That way
> opening folders could be instantaneous, and each subsequent IMAP action
> could give me a chance to see the synchronized state once it's ready.

I guess that would be possibility 3). :) But it's way too much trouble.
