The Office Connector in Confluence allows you to import an Office document into Confluence, so that the document's content is copied onto one or more Confluence pages.
This is just one of the ways Confluence can interact with Office documents. For an overview of all Office Connector features, please refer to Working with the Office Connector.
Your System Administrator can enable or disable the Office Connector or parts of it. The Office Connector options can appear in different places on your Confluence page, depending on the themes and configuration of your Confluence site. Please refer to Configuring the Office Connector in the Confluence Administration Guide and discuss any configuration problems with your administrator.
The simplest way to import an Office document is to import the entire content of the document into a single wiki page. By default, the content of the document will be created as a new wiki page.
More advanced options allow you to import the content into a new page, to split a single document into more than one wiki page, and to resolve conflicts in the titles of your pages.
The simplest way to import an Office document is to import the entire content of the document into a single wiki page.
This method will replace any existing content on the wiki page.
To import an Office document onto a single wiki page:
Create a page in Confluence (see Creating a Page) or go to an existing page whose content you want replaced.
Choose Tools > Import Word Document.
Click Browse and find the Office document on your local drive or network.
Click the Open or Upload button provided by your browser. The path and file name of the document will now appear in the text box on the Office Connector import screen.
Click Next on the Office Connector import screen. The import document options screen will display.
The import document options are:
| Option | Description |
| --- | --- |
| Root page title | The title of the wiki page that will contain the information from your imported document. |
| Import as a new page in the current space | A new wiki page will be created with the page title specified above. |
| Replace <pagename> | The contents of the existing page will be replaced. The page will be renamed to the page title specified above. |
| Delete existing children of <pagename> | The existing child pages of the page you are replacing will be deleted. |
| Rename imported pages if page name already exists | Assign new names to any new page which would otherwise have a duplicate name. The content of existing pages will remain unchanged. |
| Replace existing pages with imported pages of the same title | If imported pages have the same titles as existing pages, the content of the Office document will overwrite the content on the existing page. Page history will be preserved. |
| Remove existing pages with the same title as imported pages | If imported pages have the same titles as existing pages, the existing pages will be deleted. This will remove the page history as well as the content. |
| Split by heading | The content of the Office document will be split over multiple wiki pages. If you don't want to split your document into multiple wiki pages, leave the default Don't split option selected. For more information on splitting your document, please see below. |
Click Import.
When the upload has finished, the content of the Office document will have been transformed into Confluence page content. You can now view and edit this page in the usual way. There is no connection between the original Office document and this wiki page.
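The three title-conflict options in the table above can be modelled as follows. This is only an illustrative sketch, not the Office Connector's implementation: the `ConflictPolicy` enum, the `import_page` function, the dict-based "space" and the numeric rename suffix are all assumptions made for the example.

```python
from enum import Enum


class ConflictPolicy(Enum):
    RENAME = "rename"    # "Rename imported pages if page name already exists"
    REPLACE = "replace"  # "Replace existing pages with imported pages of the same title"
    REMOVE = "remove"    # "Remove existing pages with the same title as imported pages"


def import_page(space, title, content, policy):
    """Store one imported page in `space`, a dict of title -> list of versions.

    The version list stands in for page history; its last item is the
    current content. Returns the title the page was stored under.
    """
    if title not in space:
        space[title] = [content]
        return title
    if policy is ConflictPolicy.RENAME:
        # The existing page stays unchanged; the import gets an unused name.
        # The numeric suffix is an assumed naming scheme.
        n = 1
        while f"{title} {n}" in space:
            n += 1
        new_title = f"{title} {n}"
        space[new_title] = [content]
        return new_title
    if policy is ConflictPolicy.REPLACE:
        # Content is overwritten, earlier versions are kept as history.
        space[title].append(content)
        return title
    # REMOVE: the existing page is deleted together with its history.
    space[title] = [content]
    return title


space = {"Setup": ["old text"]}
import_page(space, "Setup", "new text", ConflictPolicy.REPLACE)
print(space["Setup"])  # ['old text', 'new text']
```

The difference between the last two options is what happens to the page history: it is kept when a page is replaced and lost when a page is removed.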
Screenshot: Empty page ready for import
Screenshot: Selecting Office document to import
Screenshot: Importing an Office document
Screenshot: Page after import
Splitting an Office Document into Multiple Wiki Pages
When importing an Office document, you can split a single document into more than one wiki page, based on the heading styles in the document.
By default, the page names will be the same as the heading text. This may result in a conflict, if a page already exists with the same title. You can instruct the importer how to handle such conflicts.
To import an Office document into multiple wiki pages:
Import an Office document as described above. On the import document options screen, choose how to split your document in the 'Split by heading' field:
'Split by heading': To split the content under each heading in your document into separate child pages, select the heading level to split by. A preview of the page hierarchy that the split will create is displayed under 'Document Outline'. Each bullet point in the 'Document Outline' represents a new page after import into Confluence.
Click 'Import' to import your document.
When the upload has finished, the content of the Office document will have been transformed into Confluence page content. You can now view and edit these pages in the usual way. There is no connection between the original Office document and these wiki pages.
Screenshot: Splitting a single Office document into multiple wiki pages
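The idea behind splitting by heading can be sketched in a few lines. This is a simplified model, not the importer's code: the `(level, text)` representation of a document, the `split_by_heading` name, and the rule that headings at or above the chosen level each start a new page are assumptions made for the example. The result is a flat list of pages rather than a page hierarchy.

```python
def split_by_heading(paragraphs, split_level):
    """Group paragraphs into pages at headings of `split_level` or higher.

    `paragraphs` is a list of (level, text) tuples, where level is 1, 2, ...
    for a heading ("Heading 1" -> 1) and 0 for body text. Returns a list of
    (title, body_lines) tuples in document order. Text before the first
    split heading goes to a page titled None, standing in for the root page.
    """
    pages = [(None, [])]
    for level, text in paragraphs:
        if 0 < level <= split_level:
            pages.append((text, []))   # this heading starts a new page
        else:
            pages[-1][1].append(text)  # body text and deeper headings stay put
    if not pages[0][1]:
        pages.pop(0)                   # nothing preceded the first heading
    return pages


doc = [
    (0, "Intro text"),
    (1, "Install"),
    (0, "Run the installer."),
    (2, "Windows"),
    (0, "Use the MSI."),
    (1, "Configure"),
    (0, "Edit the settings."),
]
for title, body in split_by_heading(doc, 1):
    print(title, body)
```

With `split_level=1` the "Windows" subsection stays on the "Install" page; with `split_level=2` it becomes a page of its own.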
When importing, if you split to multiple pages, it doesn't seem to link from the parent page to the newly created child pages. It has them marked as child pages, but the ordering is lost - they become sorted alphabetically, rather than in the order they should be.
When I try to create as a new page, I get a Page Not Found error. Why not just set the default to create a new page, which makes more sense.
I tried importing a document but there were no line breaks shown at all. Everything was condensed into one big paragraph.
My document had "headings" but not using the headings styles but just with text that was underlined, so I could not break into multiple pages. Perhaps give ppl the option of selecting where to break up a page and what headings to use in the TOC, even if you don't detect headings.
Will you provide the ability to import PDF files? I have already voted in favor of Word 2007 files.
38 Comments
Mar 30, 2009
TroyGrosfield
Are there any APIs associated with the doc importer? We are trying to move all of our documentation out of Word docs and into Confluence, but a manual process would take way too long. Is there a programmatic way to handle the content migration?
Mar 31, 2009
Azwandi Mohd Aris [Atlassian]
I am sorry that I can't think of any workaround, but I am pretty sure that you are not alone. Please add your comments in the request to truly reflect your requirements. Hope that helps!
Apr 17, 2009
Anonymous
Do you know when you will be providing support for .docx, as well as for docs that are created using Office for Mac? You still cannot save a document as a .doc file on the Mac and have it import; you get the same error as you do when you import a .docx. It would be very helpful, given the number of documents that are now 2007 or Mac, to be able to utilize this feature.
Jun 23, 2009
Giles Gaskell [Atlassian Technical Writer]
Hello there,
At Atlassian, we usually implement new features based on their level of popularity (i.e. the number of votes which have been cast on a feature request on our JIRA site). Feature requests with more votes often have a greater chance of being implemented than those with fewer votes.
We currently have a feature request on our JIRA site for Office 2007 file format support. Please feel free to cast your vote on it to improve its chances of being implemented.
Best regards,
Giles Gaskell
Technical Writer
ggaskell@atlassian.com
ATLASSIAN - http://www.atlassian.com
Jun 09, 2009
Anonymous
The page introduction says "import an Office document" but further down the page it says "can import documents of the file type .doc. These must be valid binary Word 97-2003 documents".
There are a lot of office documents that aren't word documents. Are there plans to expand the range of documents that can be imported?
Jun 15, 2009
Azwandi Mohd Aris [Atlassian]
There is a similar request in regards to this issue:
As far as I am aware, there is no work being done to incorporate this feature into Confluence at the moment. However, your vote and comments may increase the chances of it being implemented.
Cheers,
Azwandi
Jun 15, 2009
Ben Aveling
Is there any workaround for the following:
Jun 15, 2009
Ben Aveling
You can use the children macro as a workaround for my point #1 above, if you use sort=creation. It's a bit of a kludge, but it seems to work...
Eg {children:sort=creation}
Jun 15, 2009
Ben Aveling
And if the word document has a TOC before the first Heading, that gets turned into a bunch of links - the formatting is stunningly ugly - no indenting of subheadings, too much use of bold - but it might be a workaround for some people.
Jan 11, 2010
Gina Wadley
The children display macro was very helpful. But if you sort by creation, it doesn't automatically change the order of your table of contents (we're using the Documentation theme). So you have to browse pages, then drag the titles around in the tree view. This is definitely not efficient when you've imported 100 pages. Does anyone have a better way to do this? I've requested this in the Confluence item tracker: Import Adobe RoboHelp and Madcap Flare files http://jira.atlassian.com/browse/CONF-18210
Jun 15, 2009
Ben Aveling
FWIW: There is bulk import of text files: http://confluence.atlassian.com/display/DOC/Importing+Pages+from+Disk
Jun 15, 2009
Ben Aveling
It seems to fail for large documents - with no explanation of why... (Where 'large' is perhaps approx 10M)
Question: how to tell what the maximum import size is? Can it be increased? How?
Jun 15, 2009
Ben Aveling
Large is indeed 10M. This is configurable, in Dashboard -> Administration -> General Configuration. I did a test update of 30M (and yes, if we proceed, we will have doc files this big) and it seems to have worked when split to child pages. Uploading 15M as a single page failed, not that we'd actually want to do that.
So that's good. But not good that too-large uploads just get a silent fail with no explanation of what has gone wrong.
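Since an over-sized upload fails without a message, it may help to check file sizes before importing. A minimal sketch, assuming the 10M limit mentioned above means 10 MiB and that the documents are on a local disk; the `oversized_documents` name and the default value are made up for this example, and the real limit is whatever the administrator has configured.

```python
import os

# Assumed default: "10M" read as 10 MiB. Adjust to the configured limit.
DEFAULT_LIMIT_BYTES = 10 * 1024 * 1024


def oversized_documents(paths, limit_bytes=DEFAULT_LIMIT_BYTES):
    """Return (path, size_in_bytes) for every file larger than the limit."""
    result = []
    for path in paths:
        size = os.path.getsize(path)
        if size > limit_bytes:
            result.append((path, size))
    return result
```

Calling `oversized_documents(["manual.doc", "faq.doc"])` returns an empty list when both (hypothetical) files fit under the limit.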
Aug 12, 2009
Azwandi Mohd Aris [Atlassian]
Hi Ben,
In case this problem remains unresolved, feel free to raise a support ticket.
Cheers,
Azwandi
Jun 19, 2009
Anonymous
I've imported a Word doc using the defaults (single page) but would like to divide that page into several pages after import (at the Heading 1 level). Is there a way to easily do this without manually creating child pages and copying content into them? I've searched and can't figure out how to do this. I'm using the evaluation edition (3.0) and have admin access. Thanks. Like what I'm seeing so far.
Jun 19, 2009
Ben Aveling
Is it an option to re-import it using the advanced options, and this time tell it to split it into child pages?
Jul 02, 2009
Zed Yap [Atlassian]
Hi,
Yes, Ben you are right. You will be able to Split an Office Document into Multiple Wiki Pages based on the heading styles in the document. Please refer to this documentation:
Hope that I did not misinterpret your question.
Best rgds,
Zed
Oct 18, 2009
Anonymous
Hi, We also want to start using Confluence for our technical writing so need to import all our current Word documents. You said that you can create individual pages (which I would like to do) by the Word Heading. However, the Word template we're using doesn't use Headings per se, but rather has mapped them to MT1, MT2, MT3 etc. Does that matter or should it still work?
Thanks, Esther
Jul 22, 2009
Mark Salamon
I am simply having too many problems with this plugin:
Jan 11, 2010
Gina Wadley
Make sure to remove the index of any Word docs before you import them. If not I'm getting a system error: java.lang.IndexOutOfBoundsException: Index: 15, Size: 15
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
Nov 03, 2011
Anonymous
To do this, you cannot simply go to the end of the document and delete the index. You must 'Show paragraph marks' in Microsoft Word and find and delete all of the index tags. They will look similar to this:
{xe.} plus some text that you indexed. You must delete that line and save your document. Then try to import it again.
Jan 11, 2010
Gina Wadley
How do you preserve hyperlinks in a Word document after it's been imported? I generated a 200 page Word doc from RoboHelp 7, then imported it into Confluence (split into different pages by the 6 headings). However, the internal hyperlinks don't work (the external links are fine). Is there a way to preserve the hyperlinks?
Feb 23, 2010
Jack Low [Atlassian]
Hi Gina,
Confluence supports hyperlinks in a valid MS Word 97-2007 document. I haven't tested any Word document exported by RoboHelp 7. Would you be able to open the exported Word document with MS Word, re-save it as another document, and attach that to Confluence to try again?
Please note that there is a known bug in Confluence 3.0.x. As a workaround, open the hyperlink's edit dialog in the Word document and click OK without making any changes; this appears to reapply the link so that the hyperlink works. This issue has been fixed in Confluence 3.1.
I hope it helps. Please feel free to ask if you need further assistance.
Thanks & regards,
Jack
Mar 22, 2010
Gina Wadley
When you split a Word doc during the Doc Import process, the pages are imported in alphabetical order, not in the order of the document's headings. When I used ePublisher to import the Word doc I had generated from RoboHelp 7, then exported from ePublisher to Confluence, ePublisher kept the page ordering. I use Confluence's Documentation Theme and the left-hand navigation pane shows the page titles in the correct order.
Is Atlassian working on correcting the page ordering issue?
Thanks.
May 14, 2010
Anonymous
We're using 3.1.2 - and when I import a Word document that already has some wiki mark-up in it, the conversion seems to cancel the code:
comes out on the page, then I have to go back into the page and take out the '/'.
Does anyone know why this happens/how to get around it?
Thanks!
AAP
May 17, 2010
Husein Alatas [Atlassian]
Hi AAP,
This happens because Confluence expects the content of a Word document to be plain text, not wiki markup that contains macros. Confluence therefore renders all text in the Word document as normal text, and inserts an escape character before any macro so that it is shown as text on the page.
But if you are keen on this, feel free to raise a feature request at http://jira.atlassian.com and describe in detail how you require this feature to work. For further details on how we include new features and improvements, you might want to read this page.
Hope that clarifies
Cheers,
Husein
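The effect described above can be illustrated with a short sketch. It assumes the escape character is a backslash placed before each curly brace, which is how special characters are escaped in wiki markup; it is not the importer's actual code, and both function names are invented for the example.

```python
import re


def escape_macros(text):
    """Prefix every unescaped curly brace with a backslash."""
    return re.sub(r"(?<!\\)([{}])", r"\\\1", text)


def unescape_macros(text):
    """Remove the backslash in front of escaped curly braces."""
    return re.sub(r"\\([{}])", r"\1", text)


print(escape_macros("{toc}"))
print(unescape_macros(escape_macros("{code}x = 1{code}")))
```

The second function reverses the first, which corresponds to the manual clean-up of removing the escape characters after an import.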
Aug 26, 2010
Anonymous
Thanks Husein,
I think it would be helpful to be able to do this, in the case of adding in both the table of contents, and the 'code' macros.
Especially the code/no format macros - as the support teams had to go in after the import and remove all the code that got marked up in the import, and 'copy/paste' it back in - and adding the code or noformat macro at that time.
This is an issue for people who are converting/importing their documentation from word files into wiki pages.
I'll look into adding the feature request.
Thanks again for the response.
Kind regards,
AAP
Jul 10, 2011
David Stephensen
I just added it - https://jira.atlassian.com/browse/OFFCONN-47
Oct 19, 2010
Lis Riba
Is there any way to batch-edit imported pages? Find and Replace?
I'm trying to import a glossary from Word (splitting pages) and requiring tags to be manually added to every page is not terribly appealing.
Jun 16, 2010
Anonymous
Hi,
I imported a Word document and split it.
Now I want to print it. Is there a way to show the split pages as a single page (like on the original document), so i can print it as a single file?
Aug 26, 2010
Anonymous
I used the 'Export to Word' function to open a few of them, and then cut and paste to put them back into one big Word file. There is also the 'Export to PDF' function that Administrators have. I wonder if you could select the pages you want, and then tell it to pdf them?
Not sure. I know in the Export to HTML function, that you can select which pages to send.
AAP
Nov 16, 2010
Jon Garfunkel
Importing a document once is nice... but, consider that our Office documents tend to live on, and be updated in, an ECM server. Importing a document once could make it out-of-date.
So it would be useful as well to store a URL to that document, and have Confluence download and render it dynamically.
Feb 16, 2011
Alastair Bain
Is there any way to control the generation of images when importing from a word document? We have a lot of documents with embedded visio diagrams, and the import creates images with very low resolution/size, the images are so small that they're not really usable.
May 09, 2011
Anonymous
Same here! This renders the Word importer close to useless for us.
Jun 16, 2011
Anonymous
Is it possible to import a Word document and use a global Confluence template for the page formatting?
Jun 22, 2011
S. Dineen
So as I import Word documents in, My old headers (and now page titles) are appended with numbers (0,1,2 over and over).
Any ideas on this?
Oct 07, 2011
nijingjie
I imported a Word document (MS Word) like the one below.
-----------------------------------------------------------------------
-----------------------------------------------------------------------
But the titles become
atitle
btitle
ctitle
after the import.
How can I keep the titles in the same order as in the Word document ("ctitle", "btitle", "atitle") when importing?
Jan 11, 2012
Dan Milton
Is there a way to specify which style to split the Word document on? When my documents are generated from another tool headings are assigned h1, h2, and h3. However, Confluence seems to be looking for Heading 1, Heading 2, and Heading 3. Can I tell Confluence to treat h1 as Heading 1?