Word import perl script

Name Word import perl script
Author(s) Matthew Janulewicz (email)
Categories File Importers
Version 1.0.3
State Production
Price Free
License GPL v3.0
Download word2confluence.pl

Description/Features

Perl script to import Microsoft Word documents (including graphics).

Exports a text file with Confluence wiki formatting, as well as the graphics. The final output directory is suitable for importing via WebDAV.

Properly recognizes and converts:

Headings (1-6)
Tables
Images
Bold text
Italic text
Underlined text
Strikethrough text
Superscript text
Subscript text

Installation

Note that this script uses Word itself for part of the conversion, so it is Windows-only.

Requirements:

Microsoft Word (tested with Office 2003 SP2).
Perl with Win32 and Win32::OLE packages installed.

Both ActiveState Perl and Cygwin's Perl (be sure to install all Perl packages) should work 'out of the box'.

Usage

perl word2confluence.pl MyDoc.doc 

Examples

Note that this script will work in a DOS window as well as in Cygwin. Cygwin is especially fun because you can easily batch convert many files:

find . -name '*.doc' -exec perl word2confluence.pl {} \;

All output for each document is saved to its own directory, which can be cut and pasted into a Confluence page or simply copied to any location via WebDAV (the recommended method).
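For repeated batch runs, the find command can be wrapped in a small shell function. This is only a sketch for a Cygwin (or other POSIX-style) shell: the function name convert_all and the CONVERTER variable are hypothetical, and the default assumes word2confluence.pl is in the current directory.

```shell
# Assumption: the script is in the current directory.
# Set CONVERTER beforehand to point somewhere else.
CONVERTER="${CONVERTER:-perl word2confluence.pl}"

# Hypothetical helper: convert every .doc file under a directory
# (defaults to the current directory).
convert_all() {
    dir="${1:-.}"
    # The pattern is quoted so the shell does not expand *.doc itself.
    # $CONVERTER is deliberately unquoted so that "perl script.pl"
    # is split into a command and its argument.
    find "$dir" -type f -name '*.doc' -exec $CONVERTER {} \;
}
```

Calling convert_all ./specs would then run the converter once per Word document found under ./specs, including subdirectories.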

To Do/Limitations

Drop extensions from folder/file names.
Properly parse out lists.
Detect and import colored text.
