Non-English characters, umlauts and diaereses are missing or appear as boxes in Confluence Data Center PDF export

Platform Notice: Data Center Only - This article only applies to Atlassian products on the data center platform.

Summary

When exporting a Confluence page to PDF, non-English characters, umlauts and diaereses might be missing, or appear as boxes or garbled text in the PDF content.

Environment

Confluence Data Center installation on a Linux Server.


Diagnosis

  1. The following encoding parameters are already set to UTF-8:
    1. -Dsun.jnu.encoding=UTF-8
    2. -Dfile.encoding=UTF-8
  2. The database encoding is set correctly.
  3. The issue does not occur when the PDF conversion sandbox process for Confluence Data Center is disabled:
    • -Dpdf.export.sandbox.disable=true
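
The first diagnosis check can be sketched as a small shell helper that looks for both encoding flags in a JVM argument string. This is only an illustrative sketch: the function name has_utf8_flags and the CONFLUENCE_PID variable are assumptions, not part of Confluence.

```shell
# Hypothetical helper: succeeds only if both UTF-8 encoding flags
# appear in the JVM argument string passed as $1.
has_utf8_flags() {
  case "$1" in
    *-Dfile.encoding=UTF-8*) ;;
    *) return 1 ;;
  esac
  case "$1" in
    *-Dsun.jnu.encoding=UTF-8*) ;;
    *) return 1 ;;
  esac
  return 0
}

# Example usage (assumes CONFLUENCE_PID holds the PID of the Confluence JVM):
#   has_utf8_flags "$(ps -o args= -p "$CONFLUENCE_PID")" && echo "encoding flags present"
```

A successful check here only confirms the flags on the main Confluence JVM; it says nothing about the arguments the sandbox process receives.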


Cause

  1. The PDF conversion process in Confluence Data Center is handled by a separate sandbox process, even on a DC instance running on a single node.
  2. In some cases this process might not pick up the encoding that is set for Confluence, so these values need to be passed to it manually.


Solution

  1. Add the following parameter to the setenv.sh file for each node and restart Confluence.
    • CATALINA_OPTS="-Dconversion.sandbox.java.options=-Xmx512m,-Xss2m,-Dsun.jnu.encoding=UTF-8,-Dfile.encoding=UTF-8 ${CATALINA_OPTS}"
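
For readability, the same setting could be written in setenv.sh with the sandbox options split into their own variable. This is a sketch under assumptions: SANDBOX_OPTS is a hypothetical variable name introduced for illustration, and the option values are the ones given in the step above.

```shell
# Hypothetical variable: comma-separated JVM options handed to the
# PDF conversion sandbox (values copied from the step above).
SANDBOX_OPTS="-Xmx512m,-Xss2m,-Dsun.jnu.encoding=UTF-8,-Dfile.encoding=UTF-8"

# Prepend the sandbox setting to any existing CATALINA_OPTS.
# The ':-' default keeps the expansion safe if CATALINA_OPTS is unset.
CATALINA_OPTS="-Dconversion.sandbox.java.options=${SANDBOX_OPTS} ${CATALINA_OPTS:-}"

export CATALINA_OPTS
```

The resulting CATALINA_OPTS value should match the single-line form above; the change needs to be repeated on every node, followed by a restart.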




Last modified on Feb 17, 2020
