Branch names with emoji or 4-byte UTF-8 characters create hundreds of plan branches in Bamboo

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Problem

When a plan is configured to detect new repository branches and one of those branches contains a supplementary Unicode character, such as an emoji or another 4-byte UTF-8 character, Bamboo continuously creates duplicate plan branches for that branch. This may result in tens or hundreds of duplicate plan branches for the branch containing the special character.

Diagnosis

Environment

  • MySQL database

Bamboo displays the branch name with ?? in place of each 4-byte Unicode character.

Cause

This issue is caused by the way MySQL handles the utf8 character set. The default utf8 encoding stores a maximum of 3 bytes per character, which is not enough for supplementary Unicode characters such as emoji or other symbols. As a result, the branch name is stored incorrectly in MySQL. When your plan next runs and checks the repository for new branches, the branch containing the 4-byte character is always seen as new because its name does not match any existing entry in the database.
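The byte widths involved can be checked with a short Python sketch. It is only an illustration of UTF-8 encoding lengths, not anything Bamboo or MySQL runs; the function names and sample characters are assumptions chosen for the example.

```python
def utf8_byte_length(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_three_bytes(text: str) -> bool:
    """True if every character in text needs at most 3 bytes in UTF-8."""
    return all(utf8_byte_length(ch) <= 3 for ch in text)


# Sample characters (illustrative): ASCII, Latin accent, euro sign, emoji.
for ch in ["a", "é", "€", "🚀"]:
    print(repr(ch), utf8_byte_length(ch))  # widths: 1, 2, 3, 4

# A hypothetical branch name containing an emoji does not fit.
print(fits_in_three_bytes("feature/🚀-launch"))  # False
```

Any character outside the Basic Multilingual Plane (code point U+10000 or higher) encodes to 4 bytes, which is why such branch names cannot be stored intact in a 3-byte utf8 column.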

Resolution

To resolve this issue, you must either:

  • Remove 4-byte characters from branch names so that they can be stored in your MySQL database
  • Migrate to another RDBMS that supports the full unicode character set
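For the first option, a script along the following lines could help locate affected branches before renaming them. This is a minimal sketch: the helper names and the sample branch list are hypothetical, and in practice the list would come from your repository (for example, the output of git branch -r).

```python
import re

# Matches any supplementary character (U+10000 to U+10FFFF); these are
# exactly the characters that need 4 bytes in UTF-8.
SUPPLEMENTARY = re.compile("[\U00010000-\U0010FFFF]")


def branches_with_4_byte_chars(branch_names):
    """Return the branch names that contain at least one 4-byte character."""
    return [name for name in branch_names if SUPPLEMENTARY.search(name)]


def strip_4_byte_chars(name: str) -> str:
    """Suggest a replacement name with the 4-byte characters removed."""
    return SUPPLEMENTARY.sub("", name)


# Hypothetical branch names used only for illustration.
branches = ["master", "feature/login", "feature/🚀-launch", "bugfix/café"]

for name in branches_with_4_byte_chars(branches):
    print(f"{name!r} -> {strip_4_byte_chars(name)!r}")
```

The suggested name is only a starting point; check that it does not collide with an existing branch before renaming.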

Note: Although MySQL 5.5.3+ supports additional character set and collation options such as utf8mb4, which allow more than 3 bytes per character to be stored in the database, utf8mb4 is not supported in Bamboo. Changing the character set and collation to utf8mb4 also means that column indexes and keys must allow for 4 bytes per character. This pushes many table keys over the default 767-byte length limit. While you can work around this by setting innodb_large_prefix = ON, this only raises the maximum length to 3072 bytes, which may still be exceeded. In addition, the default ROW_FORMAT limits the maximum column index size to 767 bytes, and this limit is also exceeded when using utf8mb4. Therefore, we do not recommend using anything other than the utf8 character set and utf8_bin collation when using MySQL.
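The key-length arithmetic behind this note can be worked through in a few lines of Python. The 767-byte and 3072-byte limits come from the note above; the 255-character and 1000-character column lengths are assumptions picked for illustration and are not taken from Bamboo's actual schema.

```python
DEFAULT_LIMIT = 767        # default index key limit in bytes (from the note)
LARGE_PREFIX_LIMIT = 3072  # limit with innodb_large_prefix = ON (from the note)


def index_bytes(column_chars: int, bytes_per_char: int) -> int:
    """Worst-case bytes an index needs for a column of column_chars characters."""
    return column_chars * bytes_per_char


def fits(column_chars: int, bytes_per_char: int, limit: int) -> bool:
    """True if the worst-case index size stays within the given limit."""
    return index_bytes(column_chars, bytes_per_char) <= limit


# Assumed 255-character indexed column (illustrative only).
print(index_bytes(255, 3), fits(255, 3, DEFAULT_LIMIT))  # 765 True  (utf8)
print(index_bytes(255, 4), fits(255, 4, DEFAULT_LIMIT))  # 1020 False (utf8mb4)

# Largest utf8mb4 column that fits under the raised limit, in characters.
print(LARGE_PREFIX_LIMIT // 4)  # 768

# Assumed 1000-character indexed column exceeds even the raised limit.
print(fits(1000, 4, LARGE_PREFIX_LIMIT))  # False
```

Under these assumptions, a column that fits comfortably with 3 bytes per character overflows the default limit at 4 bytes per character, and longer columns overflow the raised limit as well.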

 

Last modified on May 17, 2016
