mbox: extra linefeed after Content-Length header in 1.1.rc8

Diego Liziero-2
mbox messages get header corruption caused by an extra linefeed after
Content-Length

Users see the mail in their Sent mbox folder without the From and To
fields, without attachments, and with a date of 1/1/1970.

Diego.
---
Here is an anonymized header:

>From [hidden email]  Tue Jun 03 09:14:33 2008
Message-ID: <[hidden email]>
X-UID: 3913
Status: RO
X-Keywords:
Content-Length: 6817

xxxx: xxx, xx xxx xxxx xx:xx:xx +xxxx
xxxx: xxxxxxx xxxxxxxx <[hidden email]>
xxxx-xxxxx: xxxxxxxxxxx x.x.x.x (xxxxxxx/xxxxxxxx)
xxxx-xxxxxxx: x.x
xx: "[hidden email]" <[hidden email]>
xx:  [hidden email],
 xxxxxx xxxxx <[hidden email]>,
 xxxxxxx xxxxxxxxxxx <[hidden email]>
xxxxxxx: xx: x: xx: xxxxxxxxx
xxxxxxxxxx: <[hidden email]>
xx-xxxxx-xx: <[hidden email]>
xxxxxxx-xxxx: xxxx/xxxxx; xxxxxxx=xxx-x; xxxxxx=xxxxxx
xxxxxxx-xxxxxxxx-xxxxxxxx: xxxx
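
A minimal sketch of why the blank line hides the From and To fields: a mail parser treats the first empty line as the end of the header block, so everything after it becomes body text. The sample message and address below are made up for illustration, and Python's standard `email` parser is used only as a stand-in; it is not what Dovecot or the mail client in this report does internally.

```python
from email import message_from_string

# Hypothetical sample message; the address and size are invented for illustration.
MESSAGE_OK = (
    "Content-Length: 6\n"
    "Subject: test\n"
    "From: alice@example.com\n"
    "\n"
    "hello\n"
)

# The same message with the extra linefeed after Content-Length.
MESSAGE_BROKEN = MESSAGE_OK.replace(
    "Content-Length: 6\n", "Content-Length: 6\n\n", 1
)

ok = message_from_string(MESSAGE_OK)
broken = message_from_string(MESSAGE_BROKEN)

print(ok["From"])             # the header is found
print(broken["From"])         # None: the header block ended at the blank line
print(broken.get_payload())   # the remaining headers are now part of the body
```

A client that finds no Date header may fall back to the Unix epoch, which would be consistent with the 1/1/1970 date reported above.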

Re: mbox: extra linefeed after Content-Length header in 1.1.rc8

Timo Sirainen
On Tue, 2008-06-03 at 10:34 +0200, Diego Liziero wrote:
> mbox messages get header corruption caused by an extra linefeed after
> Content-Length

Fixed: http://hg.dovecot.org/dovecot-1.1/rev/e043135e971d

I guess 1.1.rc9 will still come. But I'll wait a couple of days before
releasing it to see if there are more bugs..



Re: mbox: extra linefeed after Content-Length header in 1.1.rc8

Diego Liziero-2
On Tue, Jun 3, 2008 at 3:05 PM, Timo Sirainen <[hidden email]> wrote:
> On Tue, 2008-06-03 at 10:34 +0200, Diego Liziero wrote:
>> mbox messages get header corruption caused by an extra linefeed after
>> Content-Length
>
> Fixed: http://hg.dovecot.org/dovecot-1.1/rev/e043135e971d

Works, thank you.

Now I have to fix the users' mbox files.

As the extra linefeed is between Content-Length and Subject headers,
I'm thinking about using a regexp based replace such as
s/(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/s
but I can't find how to make multiple lines matching work.

Any suggestion?

Regards,
Diego.

Re: mbox: extra linefeed after Content-Length header in 1.1.rc8

Asheesh Laroia
On Wed, 4 Jun 2008, Diego Liziero wrote:

> On Tue, Jun 3, 2008 at 3:05 PM, Timo Sirainen <[hidden email]> wrote:
>> On Tue, 2008-06-03 at 10:34 +0200, Diego Liziero wrote:
>>> mbox messages get header corruption caused by an extra linefeed after
>>> Content-Length
>>
>> Fixed: http://hg.dovecot.org/dovecot-1.1/rev/e043135e971d
>
> Works, thank you.
>
> Now I have to fix the users' mbox files.
>
> As the extra linefeed is between Content-Length and Subject headers,
> I'm thinking about using a regexp based replace such as
> s/(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/s
> but I can't find how to make multiple lines matching work.

Python has an re.MULTILINE option you can pass to the regular expression
so that it can cross lines.  Perhaps Perl or your favorite regular
expression toolkit has something similar?

If not, Python it is! (-;

-- Asheesh.

--
Do not drink coffee in early A.M.  It will keep you awake until noon.
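
For reference, a small Python sketch of the substitution discussed in this thread, assuming the damaged text is already held in memory as a single string (the sample text is hypothetical). In Python's `re` module, a literal `\n` in the pattern matches a newline without any flag; `re.MULTILINE` only changes where `^` and `$` match, and `re.DOTALL` is the flag that lets `.` match newlines.

```python
import re

# Hypothetical fragment of a damaged message, held in memory as one string.
text = "Content-Length: 6817\n\nSubject: test\n\nbody\n"

# "\n\n" in the pattern matches the two newlines directly; no flag is needed
# for that. re.MULTILINE only makes ^ match at the start of every line
# instead of only at the start of the string.
pattern = re.compile(r"^(Content-Length: [0-9]+)\n\n(Subject: )", re.MULTILINE)

# Keep both captured parts and put a single newline between them.
fixed = pattern.sub(r"\1\n\2", text)
print(repr(fixed))
```

This works only when the regex sees both lines at once, so the file has to be read in chunks larger than a line (or whole), which is not practical for very large mailboxes.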

Re: mbox: extra linefeed after Content-Length header in 1.1.rc8

Timo Sirainen
In reply to this post by Diego Liziero-2
On Wed, 2008-06-04 at 23:59 +0200, Diego Liziero wrote:

> On Tue, Jun 3, 2008 at 3:05 PM, Timo Sirainen <[hidden email]> wrote:
> > On Tue, 2008-06-03 at 10:34 +0200, Diego Liziero wrote:
> >> mbox messages get header corruption caused by an extra linefeed after
> >> Content-Length
> >
> > Fixed: http://hg.dovecot.org/dovecot-1.1/rev/e043135e971d
>
> Works, thank you.
>
> Now I have to fix the users' mbox files.
>
> As the extra linefeed is between Content-Length and Subject headers,
> I'm thinking about using a regexp based replace such as
> s/(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/s
> but I can't find how to make multiple lines matching work.
>
> Any suggestion?
Perl maybe? Something like (not tested):

perl -pe 'BEGIN { $/ = ""; } s/^(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/g' < mbox > mbox2

$/ is the input record separator; setting it to "" makes Perl read in
paragraph mode (one blank-line-delimited block at a time).



Re: mbox: extra linefeed after Content-Length header in 1.1.rc8

Tomas Zerolo
In reply to this post by Asheesh Laroia
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, Jun 04, 2008 at 03:03:34PM -0700, Asheesh Laroia wrote:

[...]

> Python has an re.MULTILINE option you can pass to the regular expression so
> that it can cross lines.  Perhaps Perl or your favorite regular expression
> toolkit has something similar?

That would be the s modifier for a Perl regexp (treat string as a single
line):

  $x =~ /.../s

(This basically changes the meaning of . to also match end-of-line
chars. To control whether ^ and $ match beginning/end of string or
beginning/end of line within the string, see the m modifier).

> If not, Python it is! (-;

Nah ;-)

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFIR4SmBcgs9XrR2kYRAiiXAJ43v4e7kJcztLeET+6DUfKYxgZGHgCeJ1zi
YGYHYtPMsd8W2wy6M2tQOPA=
=lbOV
-----END PGP SIGNATURE-----

Re: [solved] mbox: extra linefeed after Content-Length header in 1.1.rc8

Diego Liziero-2
In reply to this post by Timo Sirainen
> On Wed, 2008-06-04 at 23:59 +0200, Diego Liziero wrote:
> As the extra linefeed is between Content-Length and Subject headers,
> I'm thinking about using a regexp based replace such as
> s/(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/s
> but I can't find how to make multiple lines matching work.
>
> Any suggestion?

Thank you everyone for your help.
After some quick tries, and following your suggestions, I ended up
writing a silly perl script that matched each of the three lines one by
one and printed only the first and the third.

On Thu, Jun 5, 2008 at 12:07 AM, Timo Sirainen <[hidden email]> wrote:
>
> Perl maybe? Something like (not tested):
>
> perl -pe 'BEGIN { $/ = ""; } s/^(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/g' < mbox > mbox2
>
> $/ changes the line separator.

Almost right. But this deletes all empty lines, not just the ones in
the header. I didn't take a deeper look.

On Thu, Jun 5, 2008 at 8:16 AM,  <[hidden email]> wrote:
> That would be the s modifier for a Perl regexp (treat string as a single
> line):
>
>  $x =~ /.../s

This should be the right way.. see below.

On Thu, Jun 5, 2008 at 12:03 AM, Asheesh Laroia <[hidden email]> wrote:
> Python has an re.MULTILINE option you can pass to the regular expression so
> that it can cross lines.  Perhaps Perl or your favorite regular expression
> toolkit has something similar?

Yes, but with perl I couldn't quickly find a way to read multiple
lines from a file without filling all system memory when the files are
several gigabytes in size.

Regards,
Diego.
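
The line-by-line approach described above (match the three lines one at a time, print only the first and third) can be sketched in Python as follows. This is not the original perl script, which was not posted. The names `fix_lines` and `fix_mbox` are hypothetical, and the sketch assumes, as in the thread, that the stray blank line always sits between a Content-Length line and a Subject line. It holds at most one line in memory, so file size should not matter.

```python
import re
from typing import Iterable, Iterator

# A complete "Content-Length: <digits>" line, with LF or CRLF line ending.
CONTENT_LENGTH = re.compile(rb"Content-Length: [0-9]+\r?\n")
BLANK_LINES = (b"\n", b"\r\n")


def fix_lines(lines: Iterable[bytes]) -> Iterator[bytes]:
    """Yield lines, dropping a blank line that sits between a
    Content-Length line and a Subject line."""
    held_blank = None        # blank line held back until the next line is seen
    after_length = False     # True if the previous line was Content-Length
    for line in lines:
        if held_blank is not None:
            # Emit the held blank line unless a Subject header follows it.
            if not line.startswith(b"Subject: "):
                yield held_blank
            held_blank = None
        if after_length and line in BLANK_LINES:
            held_blank = line
            after_length = False
            continue
        after_length = CONTENT_LENGTH.fullmatch(line) is not None
        yield line
    if held_blank is not None:
        yield held_blank     # file ended right after the blank line


def fix_mbox(src_path: str, dst_path: str) -> None:
    """Stream src_path to dst_path, one line at a time."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.writelines(fix_lines(src))
```

Usage would be along the lines of `fix_mbox("mbox", "mbox2")`, writing to a new file while the mailbox is not in use. Like the regexp in the thread, this would also alter a message body that happens to contain the same three-line sequence, and it does not re-check the Content-Length values.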