Did you know it was possible to import a Yahoo Group into Confluence?
Previously, our main arena of collaboration was a Yahoo Group that went back to 1998. It currently has more than 27,000 messages in it.
Here's how I ported them over to Confluence:
1. Use yahoo2mbox, a small Perl script, to "rip" all your messages from Yahoo. You may need to spread this out over several days, because Yahoo imposes a download limit (about 1,000 messages every 24 hours, I think).
2. Load the resulting file into Mozilla Thunderbird. You can do this in one of two ways:
- Copy the message file(s) directly into Thunderbird's email directory and they will be processed when the program is restarted. (In Windows XP, this is something like C: -> Documents and Settings -> Owner -> Application Data -> Thunderbird -> Profiles -> xxx.default -> Mail -> Local Folders)
- Use Thunderbird's MBOX import extension to select and import the files from any directory.
(I'm sure there's another way to do this, but I had trouble importing directly from the original file to Confluence.)
This step is neat in itself because then you have a fully searchable, manipulable, and threaded database of your Yahoo group on your computer.
If you have a lot of emails, you might want to break them up into separate folders. I'm not sure what the optimum number is. I kept experimenting, but mostly went in increments of a couple thousand. I had more problems when I had less memory (no surprise there). I encountered occasional crashes (memory issues/timeouts?) and some glitches (one message had the current date rather than the original date; I looked at the source and still can't understand why!), but basically the process worked.
3. The final step is simply going to one of the Space Admin pages and importing the files from the directory where Thunderbird stores its messages. Again, on Windows, it's something like C: -> Documents and Settings -> Owner -> Application Data -> Thunderbird -> Profiles -> xxx.default -> Mail -> Local Folders. Choose the regularly named file (not the .sbd or .msf files).
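The folder splitting mentioned in step 2 could probably also be scripted instead of done by hand in Thunderbird. Below is a minimal sketch using Python's standard mailbox module, assuming the file produced by yahoo2mbox is a regular mbox file. The function name, the chunk file names, and the default of 2,000 messages per chunk are made up for illustration, and this has not been tested against the Confluence importer.

```python
import mailbox
from pathlib import Path


def split_mbox(source, out_dir, chunk_size=2000):
    """Copy the messages of one mbox file into numbered chunk files.

    Returns the list of chunk paths that were written. The source file
    is only read, never modified.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    src = mailbox.mbox(str(source), create=False)
    written = []
    chunk = None
    try:
        for index, message in enumerate(src):
            # Start a new chunk file every chunk_size messages.
            if index % chunk_size == 0:
                if chunk is not None:
                    chunk.close()
                # Hypothetical naming scheme; no extension, like the
                # files in Thunderbird's Local Folders directory.
                path = out_dir / f"yahoo-group-{index // chunk_size + 1:03d}"
                chunk = mailbox.mbox(str(path))
                written.append(path)
            chunk.add(message)
    finally:
        if chunk is not None:
            chunk.close()  # close() also flushes pending messages to disk
        src.close()
    return written
```

Calling split_mbox("group.mbox", "chunks") would write chunks/yahoo-group-001, chunks/yahoo-group-002, and so on. Run it against an empty output directory, because an existing chunk file would be appended to rather than overwritten.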
- Idea: It would be nice to be able to search within or view email message headers (as options only!). E.g. Yahoo identifies each message with a unique ID #, and this can be picked up by the yahoo2mbox script. This provides a nice "primary key" with which to quickly identify a particular message in the archives.
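Until something like that exists inside Confluence, the same lookup can be approximated outside it. This is a rough sketch, again with Python's mailbox module, that builds a small index from a message-ID header to the message subject. The header name X-Yahoo-Message-Num is an assumption, not something confirmed here; check the raw output of yahoo2mbox to see which header, if any, actually carries the Yahoo ID.

```python
import mailbox

# Assumed header name. Inspect a raw message from your own archive
# and replace this if yahoo2mbox writes the ID somewhere else.
ID_HEADER = "X-Yahoo-Message-Num"


def index_by_header(mbox_path, header=ID_HEADER):
    """Map each value of the given header to that message's subject.

    Messages that do not carry the header are skipped.
    """
    index = {}
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            value = message.get(header)
            if value is not None:
                index[str(value).strip()] = str(message.get("Subject", ""))
    finally:
        box.close()
    return index
```

With the index in hand, index.get("12345") returns the subject of message 12345, or None if no message has that ID.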
------
Was this useful to you?

Comments (2)
Jul 18, 2005
Charles Miller says:
Thanks for this, John. Hopefully other people will find it useful as well.
For the searching, you might want to investigate Extractor Plugins which allow you to add custom content to Confluence's search index. It wouldn't be hard to write an extractor plugin that looked through a Mail message for its Yahoo! ID, and stored it in the index. You could then type yahooId:12345632 into Confluence's search field to find the message.
Jul 23, 2005
John S. says:
Thanks for the tip. I'll keep this in mind as the site grows.