Taking email messages from a POP server and putting each into its own file?

Folks: I’d like to create a file system archive of an email account. Suppose that I have 100 messages in an email account (for concreteness’ sake, let’s assume they are in Gmail). I can configure the account for POP or IMAP access. From a Windows, Mac, or Linux machine, I would like to fetch all 100 of the email messages and write each one out to a file. I would like it if the file were sensibly named, but that isn’t critical. How to do this?

I think that I’ve tried the obvious solution, which is to pull the messages down into Microsoft Outlook and then “export” a folder. The result is a .csv file containing all the messages from the folder.

A friend suggested using Thunderbird and the following extension: http://www.nic-nac-project.de/~kaosmos/mboximport-en.html

Better ideas?

20 thoughts on “Taking email messages from a POP server and putting each into its own file?”

  1. I use a not too complex method for this on Linux.

    1. Configure mail system to use “Maildir” Format. This is one file per message.
    2. “Fetchmail” to retrieve mail from pop/imap.

    Fetchmail can be configured to handle multiple mail sources with various setup scripts. I use fetchmail because I want manual control over when it polls for mail, rather than having email fetched automatically.

    The filenames are randomly chosen gibberish.
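
To illustrate the Maildir layout mentioned above: a Maildir is a directory with tmp, new, and cur subdirectories, and a message is delivered by writing it into tmp and then renaming it into new. The Ruby sketch below shows that delivery step. The helper name deliver_to_maildir and the exact filename pattern are assumptions for illustration, not what fetchmail or any particular delivery agent does.

```ruby
require 'fileutils'
require 'socket'
require 'securerandom'

# Hypothetical helper: deliver one raw message into a Maildir.
# Returns the path of the delivered file under new/.
def deliver_to_maildir(maildir, raw_message)
  # A Maildir consists of three subdirectories.
  %w[tmp new cur].each { |sub| FileUtils.mkdir_p(File.join(maildir, sub)) }

  # Unique name built from time, process ID, random bytes, and hostname.
  # (Illustrative pattern; real delivery agents use their own variants.)
  host = Socket.gethostname.tr('/:', '__')
  name = "#{Time.now.to_i}.P#{Process.pid}R#{SecureRandom.hex(4)}.#{host}"

  tmp_path = File.join(maildir, 'tmp', name)
  new_path = File.join(maildir, 'new', name)

  # Write the complete message to tmp/ first, then rename into new/,
  # so readers never see a partially written file.
  File.binwrite(tmp_path, raw_message)
  File.rename(tmp_path, new_path)
  new_path
end
```

Names built this way are unique but not meaningful, which is why the files in a Maildir look like gibberish.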

  2. Mail.app on the Mac stores them this way by default (so spotlight can index them). If you’re already using it, just look in ~/Library/Mail/MAILBOXNAME.

    Otherwise just set up the IMAP (or POP) account in Mail.app. It will download all the messages in the account and store them in ~/Library/Mail/ in their IMAP folder structure. The filenames for the individual messages are just integers, but the files are basically plaintext (there’s a parallel attachments directory which contains all the decoded attachments with the same index IDs as the messages).

    Looks like ~/Library/Mail/Envelope Index probably has the indexing info, and it’s just a SQLite database, so you might be able to easily script something to rename the files after the subject line if you -had- to have better filenames. (interesting link: http://www.javarants.com/2008/12/26/build-your-own-mail-analyzer-for-mac-mailapp/#more-943)
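
The renaming idea above does not strictly need the index database, since each stored message carries its own headers. Here is a rough Ruby sketch that reads the Subject: header and turns it into a safe filename. The names subject_of and safe_filename are hypothetical, and the sketch deliberately ignores folded (multi-line) and MIME-encoded subjects.

```ruby
# Hypothetical helper: return the Subject header of a raw message,
# or nil if the header block has none. Scanning stops at the first
# blank line, which separates the headers from the body.
def subject_of(raw_message)
  raw_message.each_line do |line|
    line = line.chomp
    break if line.empty?
    return Regexp.last_match(1).strip if line =~ /\ASubject:(.*)\z/i
  end
  nil
end

# Hypothetical helper: build a filesystem-safe name from a subject.
# Runs of other characters become "_"; long names are cut at 80 chars.
def safe_filename(subject, fallback: 'no-subject', ext: '.eml')
  base = subject.to_s.gsub(/[^A-Za-z0-9._-]+/, '_').gsub(/\A_+|_+\z/, '')
  base = fallback if base.empty?
  base[0, 80] + ext
end
```

For a file at path, safe_filename(subject_of(File.binread(path))) gives a candidate new name. Two messages with the same subject would still collide, so a real script would need to add a counter or the original integer.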

  3. A number of years ago I wrote a Python script to back up my Gmail into folders, because I wanted to back up my email and also make it easy to archive (i.e., not have to use a specific email client). I went ahead and uploaded the script to GitHub since I thought you might find it useful:

    http://github.com/Jaymon/Popbak

  4. You can do this so easily in Thunderbird without any special extensions. Pull all the emails down, then drag ’em into a new folder, right-click on that folder, and check out the export options. As I recall, you can export as txt files, eml files, html files (with html index), a big csv file… Anyway, you have several options right out of the box for exporting as individual files.

  5. It seems to me a case where a five-line Ruby script is far easier than any app. From the cookbook:

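    require 'net/pop'

    # Connect to the POP3 server and log in.
    conn = Net::POP3.new('mail.myhost.com')
    conn.start('username', 'password')
    # Write each message to a file named after its server-assigned unique ID.
    conn.mails.each { |msg| File.open(msg.uidl, 'w') { |f| f.write msg.pop } }
    conn.finish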
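
Gmail’s POP service is reached at pop.gmail.com on port 995 over SSL, so the script above needs SSL enabled before it can log in there. Below is a hedged variant that uses Net::POP3#enable_ssl and writes into a chosen directory with an .eml suffix. The method names eml_path and archive_pop, the directory name, and the placeholder credentials are assumptions for illustration. On recent Ruby versions net/pop may need to be installed as the net-pop gem.

```ruby
require 'fileutils'

# Hypothetical helper: map a POP3 unique ID to a file path, replacing
# characters that are unsafe in filenames.
def eml_path(dir, uidl)
  File.join(dir, uidl.gsub(/[^A-Za-z0-9._-]/, '_') + '.eml')
end

# Hypothetical helper: download every message on the server into dir,
# one file per message. Returns the number of messages newly written.
def archive_pop(host, user, password, dir, port: 995)
  require 'net/pop' # loaded here so eml_path is usable without it
  FileUtils.mkdir_p(dir)
  pop = Net::POP3.new(host, port)
  pop.enable_ssl
  count = 0
  # The block form of start closes the session when the block exits.
  pop.start(user, password) do |session|
    session.each_mail do |msg|
      path = eml_path(dir, msg.unique_id)
      next if File.exist?(path) # skip messages saved on an earlier run
      File.binwrite(path, msg.pop)
      count += 1
    end
  end
  count
end

# Example call (placeholder credentials):
#   archive_pop('pop.gmail.com', 'username', 'password', 'mail-archive')
```

Naming files after the unique ID keeps reruns idempotent, at the cost of names that say nothing about the message.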

  6. I’ve used RJH’s method, but with getmail (which has had one of the world’s most attentive developers working on it for way too long) and maildrop instead of the more popular fetchmail + procmail combination, and used Mutt to read the mail. I think with maildrop you can write a few regexes to take the subject line and use that as a filename.

    Other than the naming thing, any combo of the two sets is very easy on Unix.

  7. There’s already a program that does it for Gmail. Google “back up gmail” and you’ll find it. It creates a single file for each message in a specific folder.

  8. Gee, I do this all the time with good ol’ Outlook Express 6. Within OE6, just highlight all the messages you want to save, then click-and-drag them to a folder. Each message is saved as a separate file, with the subject as a filename. It intelligently adds (n) where n is an integer to prevent same-name conflicts. Each file has an .eml suffix but is basically a text file. It uses the “received” time for the timestamp of the file. Couldn’t be simpler. I believe this works in OE6’s successors (Windows Mail and Windows Live Mail), but I haven’t thoroughly tested this.
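
The collision handling described above, appending (n) when a name is already taken, is easy to reproduce when scripting an export yourself. A small Ruby sketch follows. The name unique_path is a hypothetical helper, and the exact suffix format Outlook Express uses may differ from the one shown.

```ruby
# Hypothetical helper: return a path in dir for filename that does not
# exist yet, appending " (1)", " (2)", ... before the extension if needed.
def unique_path(dir, filename)
  ext  = File.extname(filename)
  base = File.basename(filename, ext)
  candidate = File.join(dir, filename)
  n = 0
  while File.exist?(candidate)
    n += 1
    candidate = File.join(dir, "#{base} (#{n})#{ext}")
  end
  candidate
end
```

The caller writes the message to the returned path right away, so the next message with the same subject picks the next free number.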
