Non-English characters, umlauts and diaereses missing or appearing as boxes in Confluence Data Center PDF export
Platform Notice: Data Center Only - This article only applies to Atlassian products on the data center platform.
Summary
When exporting a Confluence page to PDF, non-English characters, umlauts and diaereses might be missing, appear as boxes, or be garbled in the PDF content.
Environment
Confluence Data Center installation on a Linux Server.
Diagnosis
- The following encodings are already set to UTF-8:
- -Dsun.jnu.encoding=UTF-8
- -Dfile.encoding=UTF-8
- Database encoding set correctly.
- The issue does not occur when the PDF conversion sandbox process for Confluence Data Center is disabled with the following parameter:
-Dpdf.export.sandbox.disable=true
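As a rough aid for the first diagnosis point, the sketch below pulls the value of a single -D system property out of a JVM command line, so you can confirm what the running Confluence process was actually started with. The function name jvm_prop and the sample command line are hypothetical and only for illustration. Reading /proc/<pid>/cmdline, as mentioned in the comment, is one Linux-specific way to get the real command line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Confluence): print the value of a
# -D<name>=<value> flag found in a JVM command line.
#   $1 = property name, for example file.encoding
#   $2 = full command line as one space-separated string
jvm_prop() {
  # Escape dots so the property name is matched literally by sed.
  name=$(printf '%s' "$1" | sed 's/\./\\./g')
  # One argument per line, then strip the "-D<name>=" prefix from the match.
  printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^-D${name}=//p" | head -n 1
}

# Sample command line (an assumption for illustration). On a real node you
# could build it with something like: tr '\0' ' ' < /proc/<pid>/cmdline
cmdline='java -Dsun.jnu.encoding=UTF-8 -Dfile.encoding=UTF-8 -Xmx2g'
jvm_prop file.encoding "$cmdline"      # prints UTF-8
jvm_prop sun.jnu.encoding "$cmdline"   # prints UTF-8
```

An empty result means the flag is not present on that command line.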
Cause
- The PDF conversion process in Confluence Data Center is handled by a separate sandbox process, even on a Data Center instance running on a single node.
- In some cases this process might not pick up the encoding that is set for Confluence, so these values need to be passed to it manually.
Solution
- Add the following parameter to the setenv.sh file for each node and restart Confluence.
CATALINA_OPTS="-Dconversion.sandbox.java.options=-Xmx512m,-Xss2m,-Dsun.jnu.encoding=UTF-8,-Dfile.encoding=UTF-8 ${CATALINA_OPTS}"
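Before restarting, it can help to check that the edited CATALINA_OPTS value really carries both encoding flags inside the sandbox option. The following is a minimal sketch under that assumption. The function name sandbox_has_utf8 is hypothetical, and the string it inspects is whatever you pass in, for example the value assigned in setenv.sh.

```shell
#!/bin/sh
# Hypothetical check (not part of Confluence): succeed only if the
# -Dconversion.sandbox.java.options value inside the given options string
# contains both UTF-8 encoding flags.
#   $1 = the CATALINA_OPTS string to inspect
sandbox_has_utf8() {
  # Extract the comma-separated value of the sandbox option, if present.
  opts=$(printf '%s\n' "$1" | tr ' ' '\n' \
    | sed -n 's/^-Dconversion\.sandbox\.java\.options=//p' | head -n 1)
  # Wrap in commas so each flag can be matched as a whole list entry.
  case ",$opts," in
    *,-Dsun.jnu.encoding=UTF-8,*) ;;
    *) return 1 ;;
  esac
  case ",$opts," in
    *,-Dfile.encoding=UTF-8,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with the value from the solution above.
value='-Dconversion.sandbox.java.options=-Xmx512m,-Xss2m,-Dsun.jnu.encoding=UTF-8,-Dfile.encoding=UTF-8'
if sandbox_has_utf8 "$value"; then
  echo "sandbox encoding flags present"
else
  echo "sandbox encoding flags missing"
fi
```

Note that the check only looks inside the sandbox option. Encoding flags set at the top level of CATALINA_OPTS apply to the main Confluence JVM and are deliberately not counted here.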
Last modified on Feb 17, 2020