Confluence Throws 'The XML Content Could Not Be Parsed' Error When Viewing A Page


Symptoms

When viewing a page, the page fails to render and Confluence displays an error in the user interface.

The following appears in the atlassian-confluence.log:

Error: The XML content could not be parsed. There is a problem at line x, column y. Parser message: String ']]>' not allowed in textual content, except as the end marker of CDATA section at [row,col {unknown-source}]: [x,y]

Cause 1

This happens when the page content contains a character sequence that is not valid XML. The offending sequence is reported in the parser message of the log entry above, here String ']]>'. In this example, '>' is the character that must be escaped.
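The rule behind this parser message can be reproduced outside Confluence. The following is a minimal sketch that uses Python's standard library XML parser, not the parser Confluence itself uses; the helper name is_well_formed and the sample strings are illustrative assumptions. It shows that ']]>' in ordinary text is rejected, while the same text with '>' written as &gt; is accepted.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Hypothetical page text: a code snippet whose nested index ends in ']]>'.
raw = "<p>if (a[b[0]]> 1) return;</p>"

# The same text with the '>' written as the &gt; entity.
escaped = "<p>if (a[b[0]]&gt; 1) return;</p>"

print("raw parses:", is_well_formed(raw))
print("escaped parses:", is_well_formed(escaped))
```

A '>' on its own is legal in XML text; it only becomes a problem when it directly follows ']]', which is why escaping that one character is enough.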

Cause 2

This error can also be caused by a NULL value in bodytypeid after the XHTML Migration when upgrading Confluence. Running the following query will identify pages that are affected by this problem:

SELECT *
FROM BODYCONTENT
WHERE contentid IN (
    SELECT c.contentid FROM CONTENT AS c WHERE CONTENTTYPE IN ('PAGE', 'COMMENT', 'BLOGPOST')
)
  AND bodytypeid IS NULL;

Resolution 1

(warning) This resolution specifically covers > and replacing it with its XML entity, &gt;. For a list of other characters and their encodings, see List of XML and HTML character entity references.


As an initial troubleshooting step, run the following SQL query against the Confluence database to review what is stored for the affected page (the database is the source of truth):
(info) Replace <pageName> and <spaceName> with the title and space name of the affected page.

SELECT c.contentid, c.title, s.spacekey, bc.body
FROM CONTENT c
JOIN BODYCONTENT bc ON c.contentid = bc.contentid
JOIN SPACES s ON c.spaceid = s.spaceid
WHERE c.prevver IS NULL
  AND c.contenttype IN ('PAGE', 'BLOGPOST')
  AND c.title LIKE '<pageName>'
  AND s.spacename LIKE '<spaceName>';

To correct the page content:

  1. Install the Confluence Source Editor plugin.
  2. Navigate to the offending page and click ... > View Storage Format to display the storage format of the affected page.
  3. Copy the resulting text to the clipboard.
  4. On the same page, click Edit > <> to open the source editor.
  5. Delete the existing content in the source editor.
  6. Paste the contents of the clipboard into the source editor.
  7. Find every instance of > that is not part of an XML tag or the end marker of a CDATA section, and replace it with &gt;
  8. Save the changes.
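
For pages with many occurrences, the find-and-replace step can be scripted on the copied storage format before pasting it back. The following is a rough sketch, assuming the problem is limited to stray ']]>' sequences as in the log entry above; the function name escape_stray_cdata_end and the sample input are hypothetical. It leaves complete CDATA sections untouched and escapes only the '>' of a ']]>' found outside them. Review the output before saving it.

```python
import re

# Matches a complete CDATA section, which must be left exactly as it is.
CDATA_SECTION = re.compile(r"(<!\[CDATA\[.*?\]\]>)", re.DOTALL)


def escape_stray_cdata_end(storage_format: str) -> str:
    """Replace ']]>' with ']]&gt;' everywhere except inside CDATA sections."""
    parts = CDATA_SECTION.split(storage_format)
    # Because the pattern has one capturing group, even indexes hold the text
    # between CDATA sections and odd indexes hold the CDATA sections themselves.
    for i in range(0, len(parts), 2):
        parts[i] = parts[i].replace("]]>", "]]&gt;")
    return "".join(parts)


# Hypothetical storage format: one stray ']]>' in a paragraph, one real CDATA section.
sample = (
    "<p>if (a[b[0]]> 1) return;</p>"
    "<ac:plain-text-body><![CDATA[x > y]]></ac:plain-text-body>"
)
print(escape_stray_cdata_end(sample))
```

This only handles the ']]>' case; other invalid characters still need to be located from the line and column given in the error message.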

Resolution 2

(info) Before making any changes to your database, ensure that you have a full database backup and that Confluence is shut down. Then, execute the following query:

UPDATE
    BODYCONTENT
SET bodytypeid = 0
WHERE contentid IN (
    SELECT c.contentid FROM CONTENT AS c WHERE CONTENTTYPE IN ('PAGE', 'COMMENT', 'BLOGPOST')
)
  AND bodytypeid IS NULL;
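
To see which rows the statement above touches before running it on a real instance, the same WHERE clause can be tried on throwaway data. The following is a small sketch using an in-memory SQLite database; the two tables are simplified stand-ins with made-up rows, not the real Confluence schema, and SQLite is used only for convenience. It counts the affected rows, applies the update, and counts again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-ins for the Confluence tables (assumption: only the
# columns used by the queries in this article are modelled).
conn.executescript("""
    CREATE TABLE CONTENT (contentid INTEGER PRIMARY KEY, contenttype TEXT);
    CREATE TABLE BODYCONTENT (
        bodycontentid INTEGER PRIMARY KEY,
        contentid INTEGER,
        bodytypeid INTEGER
    );
    INSERT INTO CONTENT VALUES (1, 'PAGE'), (2, 'COMMENT'), (3, 'ATTACHMENT');
    INSERT INTO BODYCONTENT VALUES (10, 1, NULL), (11, 2, 2), (12, 3, NULL);
""")

# Same filter as the identification query and the UPDATE in this article.
WHERE_AFFECTED = """
    WHERE contentid IN (
        SELECT c.contentid FROM CONTENT AS c
        WHERE c.contenttype IN ('PAGE', 'COMMENT', 'BLOGPOST')
    )
    AND bodytypeid IS NULL
"""


def count_affected(connection: sqlite3.Connection) -> int:
    """Count BODYCONTENT rows that match the filter."""
    query = "SELECT COUNT(*) FROM BODYCONTENT" + WHERE_AFFECTED
    return connection.execute(query).fetchone()[0]


before = count_affected(conn)
cursor = conn.execute("UPDATE BODYCONTENT SET bodytypeid = 0" + WHERE_AFFECTED)
conn.commit()
after = count_affected(conn)

print("affected before:", before)
print("rows updated:", cursor.rowcount)
print("affected after:", after)
```

In this toy data the row belonging to the ATTACHMENT content keeps its NULL value, because the filter restricts the update to pages, comments, and blog posts.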



Last modified on Apr 24, 2023
