Removing invalid characters from XML backups
JIRA 3.1 and above should not suffer from this problem unless you are migrating to PostgreSQL from another database such as MySQL. Otherwise, invalid characters are automatically stripped from imported data.
In older versions of JIRA it was possible to cut & paste text containing control characters into JIRA issue fields. This causes problems because JIRA's backup format is XML, and XML does not allow most control characters to be stored. When XML containing control characters is imported into JIRA, the import fails with an error:
To fix this, remove the control characters from the JIRA backup file as follows:
- Download atlassian-xml-cleaner-0.1.jar
- Open a command prompt and locate the XML or ZIP backup file on your computer. If the backup is a ZIP file, extract it first. In this example, we will use entities.xml.
- Run the application with the following command:
java -jar atlassian-xml-cleaner-0.1.jar entities.xml > entities-clean.xml
This will create a copy named entities-clean.xml with the invalid characters removed.
- Copy the entities-clean.xml file into another directory, rename it back to entities.xml, and create a new ZIP containing the newly created entities.xml file and the other XML file from the original backup.
- Import the new ZIP file, ensuring that it contains both XML files.
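The cleaning and repackaging steps above can also be sketched in Python. This is a rough illustration only, not the implementation of atlassian-xml-cleaner: it assumes the backup is UTF-8 encoded, treats any character outside the XML 1.0 `Char` production as invalid, and reads each file fully into memory, which may not suit very large backups. The names `strip_invalid_xml_chars` and `rebuild_backup` are hypothetical.

```python
import re
import zipfile

# Characters NOT permitted by the XML 1.0 "Char" production, i.e. anything
# outside #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD and #x10000-#x10FFFF.
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with every character that XML 1.0 forbids removed."""
    return INVALID_XML_CHARS.sub("", text)


def rebuild_backup(original_zip, new_zip, entities_name="entities.xml"):
    """Copy every member of original_zip into new_zip, cleaning entities_name.

    Assumption: the entities file is UTF-8. Undecodable bytes are replaced
    with U+FFFD (which is a valid XML character) rather than raising.
    """
    with zipfile.ZipFile(original_zip) as src, \
            zipfile.ZipFile(new_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == entities_name:
                text = data.decode("utf-8", errors="replace")
                data = strip_invalid_xml_chars(text).encode("utf-8")
            # All other members (including the second XML file) are copied as-is.
            dst.writestr(info.filename, data)
```

For example, `rebuild_backup("backup.zip", "backup-clean.zip")` (both file names are placeholders) would write a new ZIP that still contains every file from the original, so the import step can find both XML files.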
If the error specifically reports 0xffff as the affected character, use this Perl command to fix the file:
perl -i -pe 's/\xef\xbf\xbf//g' entities.xml
If the error reports 0xfffe instead, use this Perl command:
perl -i -pe 's/\xef\xbf\xbe//g' entities.xml
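The byte sequences in the two Perl commands are the UTF-8 encodings of U+FFFF (`EF BF BF`) and U+FFFE (`EF BF BE`). Where Perl is not available, a Python sketch along the following lines may serve the same purpose. It is an illustration under assumptions, not a supported tool: the function name `remove_noncharacters` is hypothetical, and unlike `perl -i` it writes to a separate output file instead of editing in place.

```python
# UTF-8 byte sequences targeted by the two Perl commands.
NONCHARACTER_BYTES = (
    b"\xef\xbf\xbf",  # U+FFFF
    b"\xef\xbf\xbe",  # U+FFFE
)


def remove_noncharacters(src_path, dst_path):
    """Copy src_path to dst_path without the sequences above.

    The file is processed line by line in binary mode. Neither sequence
    contains a newline byte, so a match can never straddle two lines.
    Returns the number of sequences removed.
    """
    removed = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            for seq in NONCHARACTER_BYTES:
                removed += line.count(seq)
                line = line.replace(seq, b"")
            dst.write(line)
    return removed
```

A call such as `remove_noncharacters("entities.xml", "entities-clean.xml")` leaves the original file untouched, which makes it easy to compare the two files before rebuilding the ZIP.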