Accented or extended UTF-8 characters cause "Malformed input or input contains unmappable characters" error

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Problem

Extended UTF-8 or accented characters can cause unexpected behavior in Bitbucket Server. For example, a branch whose name contains such characters can trigger an error similar to the following in atlassian-bitbucket.log:

java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: <repo path>/1052/refs/heads/大家好
        at java.base/sun.nio.fs.UnixPath.encode(UnixPath.java:145)
        at java.base/sun.nio.fs.UnixPath.<init>(UnixPath.java:69)
        at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:280)
        at java.base/java.io.File.toPath(File.java:2290)
        at com.atlassian.stash.internal.scm.git.RawGitAgent.execute(RawGitAgent.java:437)
        at com.atlassian.stash.internal.scm.git.RawGitAgent.execute(RawGitAgent.java:433)
        at com.atlassian.stash.internal.scm.git.RawGitAgent.resolveBranch(RawGitAgent.java:585)
        at com.atlassian.stash.internal.scm.git.RawGitAgent.resolveHead(RawGitAgent.java:222)
        at com.atlassian.stash.internal.scm.git.DefaultGitCommandFactory$2.call(DefaultGitCommandFactory.java:297)
        at com.atlassian.stash.internal.scm.git.DefaultGitCommandFactory$2.call(DefaultGitCommandFactory.java:293)
        at com.atlassian.stash.internal.repository.DefaultRefService.getDefaultBranch(DefaultRefService.java:191)
..

Diagnosis

Environment

  • Bitbucket hosted on Windows or macOS is unaffected.

  • Affects Bitbucket Server 6.0+ installed on Linux servers, running Java 11 with the LANG environment variable set to a non-UTF-8 locale.

Cause

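Java 11 does not support setting sun.jnu.encoding to UTF-8 on the command line to make the JVM use UTF-8 when encoding file paths. The setting is silently ignored and has no effect, so the JVM keeps the encoding derived from the process locale. Under a non-UTF-8 locale, a path containing characters outside that encoding, such as the branch name above, cannot be encoded and the InvalidPathException is thrown.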
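To confirm that a given JVM is affected, a small diagnostic can be run as the same user and with the same environment as Bitbucket. The following is a minimal sketch, not part of Bitbucket: the class name PathEncodingCheck and the sample path are hypothetical, and the check simply attempts the same java.nio path conversion that fails in the stack trace above.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

public class PathEncodingCheck {
    // Hypothetical sample ref name; the last three characters are the ones from the
    // log excerpt, written as Unicode escapes so the file compiles under any source encoding.
    static final String SAMPLE = "refs/heads/\u5927\u5BB6\u597D";

    // Returns true if this JVM can convert the string into a file system path.
    static boolean canEncode(String name) {
        try {
            Paths.get(name);
            return true;
        } catch (InvalidPathException e) {
            // On Linux this is raised by UnixPath.encode when the name
            // cannot be represented in the JVM's file path encoding.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
        System.out.println("LANG             = " + System.getenv("LANG"));
        System.out.println("can encode sample path: " + canEncode(SAMPLE));
    }
}
```

If the last line reports false, the JVM is running with a file path encoding that cannot represent the sample name, which matches the situation described in this article.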

Workaround

  • If Bitbucket runs as a service, set LANG="en_US.UTF-8" in /etc/init.d/atlbitbucket. The service will honour this setting.
  • Set LANG="en_US.UTF-8" in the environment of the user that starts Bitbucket.
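After applying the workaround, it can help to verify that the locale seen by the Bitbucket user really is a UTF-8 one. The sketch below is illustrative only: the class LocaleCheck and its helper methods are hypothetical names, and it assumes the usual POSIX precedence in which LC_ALL overrides LC_CTYPE, which in turn overrides LANG.

```java
import java.util.Locale;

public class LocaleCheck {
    // Returns true if a locale string such as "en_US.UTF-8" names a UTF-8 codeset.
    static boolean isUtf8Locale(String value) {
        if (value == null) {
            return false;
        }
        String v = value.toLowerCase(Locale.ROOT);
        return v.contains("utf-8") || v.contains("utf8");
    }

    // Picks the first non-empty value, mirroring the LC_ALL > LC_CTYPE > LANG precedence.
    static String effectiveLocale(String lcAll, String lcCtype, String lang) {
        if (lcAll != null && !lcAll.isEmpty()) {
            return lcAll;
        }
        if (lcCtype != null && !lcCtype.isEmpty()) {
            return lcCtype;
        }
        return lang;
    }

    public static void main(String[] args) {
        String effective = effectiveLocale(
                System.getenv("LC_ALL"), System.getenv("LC_CTYPE"), System.getenv("LANG"));
        System.out.println("effective locale = " + effective);
        System.out.println("UTF-8 locale     = " + isUtf8Locale(effective));
    }
}
```

Run it as the user that starts Bitbucket. Note that a service started from an init script may see a different environment than an interactive shell, so a result from a login session is only indicative.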


Description: Extended UTF-8 characters cause "Malformed input or input contains unmappable characters"
Product: Bitbucket Server
Last modified on Feb 11, 2019
