How to Fix the Collation and Character Set of a MySQL Database manually
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the server and data center platforms.
Direct database manipulation is not covered by our Atlassian Support Offerings and should be up to your DBAs discretion.
Our recommended method for migrating databases is as follows
- Create a new database with the required collation as per the appropriate documentation (for example Connecting JIRA to a Database)
- Follow our Switching Databases using an XML backup to migrate from the old database (with the incorrect collation) to the new one, with the correct collation.
If the recommended method for some reason is not suitable for your scenario, please follow this article to manually fix the collation at the database server side. After the solution is implemented, please test the application thoroughly to ensure everything works correctly and as expected.
What is Collation?
The collation determines how results are sorted and ordered. In newer versions of Atlassian applications, collation changes may become more strict - i.e, an application requires a certain collation. You must ensure your database has the correct collation for the application it will be used with.
Collation in MySQL can be complicated because you can have a separate collation set at:
- The database level
- The table level
- The column level
Additionally, the information inside a column may be encoded incorrectly as well - causing the data in that column to be displayed incorrectly.
Which collation can you use?
Not all versions of Jira and Confluence support utf8mb4 (which provides support for 4-byte characters). You may need to use utf8.
Must use utf8mb4 | Can use utf8mb4 | Must use utf8 |
---|---|---|
|
|
|
Setup Guides for MySQL
To setup your MySQL database correctly, see the following resources for each product:
Always back up your data before performing any modifications to the database. If possible, test any alter, insert, update, or delete SQL commands on a staging server first.
You may wish to add all the ALTER TABLE statements to a single file for easier execution.
Checking the collation and character set
It may be necessary throughout the process to re-check the settings at the database, table, and column level.
Here are the commands for checking it:
To check database collation: use jiradb; SELECT @@character_set_database, @@collation_database; To check Table collation: SELECT TABLE_SCHEMA , TABLE_NAME , TABLE_COLLATION FROM INFORMATION_SCHEMA.TABLES To check Column collation: SELECT TABLE_NAME , COLUMN_NAME , COLLATION_NAME FROM INFORMATION_SCHEMA.COLUMNS
Dealing with Foreign Key constraints
It may be necessary to ignore foreign key constraints when making changes to a large number of columns.
mysql> ALTER TABLE AO_1FA2A8_SCRUM_POKER_SESSION CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
ERROR 3780 (HY000): Referencing column 'SESSION_ID' and referenced column 'ISSUE_KEY' in foreign key constraint 'fk_ao_1fa2a8_scrum_poker_vote_session_id' are incompatible.
mysql> ALTER TABLE AO_1FA2A8_SCRUM_POKER_VOTE CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
ERROR 3780 (HY000): Referencing column 'SESSION_ID' and referenced column 'ISSUE_KEY' in foreign key constraint 'fk_ao_1fa2a8_scrum_poker_vote_session_id' are incompatible.
You can use the SET FOREIGN_KEY_CHECKS
command to ignore foreign key constraints while you update the database.
SET FOREIGN_KEY_CHECKS=0;
-- Insert your other SQL Queries here...
SET FOREIGN_KEY_CHECKS=1;
This should be done for each session that is opened to the database
Changing the database collation
In the example below, change:
<yourDB>
to your actual database name<charset>
to eitherutf8
orutf8mb4
<collation>
to eitherutf8_bin
orutf8mb4_bin
To change the database collation:
ALTER DATABASE <yourDB> CHARACTER SET <charset> COLLATE <collation>
Changing table collation
Please note, the query below will produce a series of ALTER TABLE
statements, which you must then run against your database.
In the example below, change:
<yourDB>
to your actual database name<charset>
to eitherutf8
orutf8mb4
<collation>
to eitherutf8_bin
orutf8mb4_bin
To create the ALTER SQLs, run the below SQL and use the ALTER SQLs in the resultset to update the table collation:
SELECT CONCAT('ALTER TABLE `', table_name, '` CHARACTER SET <charset> COLLATE <collation>;')
FROM information_schema.TABLES AS T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` AS C
WHERE C.collation_name = T.table_collation
AND T.table_schema = '<yourDB>'
AND
(
C.CHARACTER_SET_NAME != '<charset>'
OR
C.COLLATION_NAME != '<collation>'
);
Changing column collation
Please note, similar to the query above, the queries below (one for varchar
columns, and one for non-varchar
columns) will produce a series of ALTER TABLE
statements, which you must then run against your database.
In the examples below, change:
<yourDB>
to your actual database name<charset>
to eitherutf8
orutf8mb4
<collation>
to eitherutf8_bin
orutf8mb4_bin
To create the ALTER SQLs, run the below SQL and use the ALTER SQLs in the resultset to update the collation for varchar
columns:
SELECT CONCAT('ALTER TABLE `', table_name, '` MODIFY `', column_name, '` ', DATA_TYPE, '(', CHARACTER_MAXIMUM_LENGTH, ') CHARACTER SET <charset> COLLATE <collation>', (CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE '' END), ';')
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = '<yourDB>'
AND DATA_TYPE = 'varchar'
AND
(
CHARACTER_SET_NAME != '<charset>'
OR
COLLATION_NAME != '<collation>'
);
To create the ALTER SQLs, run the below SQL and use the ALTER SQLs in the resultset to update the collation for non-varchar
columns:
SELECT CONCAT('ALTER TABLE `', table_name, '` MODIFY `', column_name, '` ', DATA_TYPE, ' CHARACTER SET <charset> COLLATE <collation>', (CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE '' END), ';')
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = '<yourDB>'
AND DATA_TYPE != 'varchar'
AND
(
CHARACTER_SET_NAME != '<charset>'
OR
COLLATION_NAME != '<collation>'
);
Optional Step
Upon executing the above queries, a list of individual ALTER statements is generated for each table and column. For performance reasons you may wish to optimize the resultant queries by hand before execution on the database, particularly if the tables being modified have hundreds of thousands to millions of rows. Query execution time can be reduced by combining multiple ALTER statements for the same TABLE (but different columns) into a singular statement, which avoids MySQL having to process the whole table multiple times. For example,
ALTER TABLE `changeitem` MODIFY `FIELDTYPE` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
ALTER TABLE `changeitem` MODIFY `FIELD` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
...
and so on, would become
ALTER TABLE `changeitem`
MODIFY `FIELDTYPE` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin,
MODIFY `FIELD` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin, ...
The "DATA_TYPE" query currently fails for enum columns. When "DATA_TYPE" is replaced with "COLUMN_TYPE", the generated SQL is also valid for enum columns.
Server character set considerations for Confluence
If you want to use utf8mb4, and character_set_server
is not set to utf8mb4
in the my.cnf
or my.ini
file on your MySQL Server and you can't change this (for example utf8
is required for a database used by another application) you will need to add the connectionCollation=utf8mb4_bin
parameter to your connection URL in order to use utf8mb4
. See the Connector/J 8.0 or Connector/J 5.1 documentation for more information.
You might also need to make sure that your collation and character set are properly defined at your my.cnf file (e.g. using character-set-server = utf8mb4 collation-server = utf8mb4_bin)