Cluster Index Replication health check fails in Jira Data Center due to Jira Charting Plugin
Platform Notice: Data Center Only - This article only applies to Atlassian products on the data center platform.
Jira Data Center raises a warning that the Cluster Index Replication health check is failing. Indexes on different nodes may fall out of sync with each other, resulting in inconsistent Issue Navigator search results, gadget results, and other issue-related symptoms.
The Cluster Index Replication health check may report delays like the following:
Name: Cluster Index Replication
Is healthy: false
Failure reason: ["Index replication for cluster node 'node2' is behind by 26,067 seconds.","Index replication for cluster node 'node3' is behind by 33,220 seconds.","Index replication for cluster node 'node4' is behind by 9,658 seconds."]
Severity: WARNING
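When several nodes are reported, it can help to turn the failure reasons into per-node delays and see which node is furthest behind. The following is a minimal sketch, assuming the health check text has been copied into a string in the format shown above; the function names `parse_replication_delays` and `format_delay` are illustrative and not part of Jira.

```python
import re

# Matches entries such as:
#   Index replication for cluster node 'node2' is behind by 26,067 seconds.
DELAY_PATTERN = re.compile(
    r"Index replication for cluster node '([^']+)' is behind by ([\d,]+) seconds"
)


def parse_replication_delays(health_check_text):
    """Return {node_id: delay_in_seconds} extracted from the health check text."""
    delays = {}
    for node, seconds in DELAY_PATTERN.findall(health_check_text):
        # The health check prints thousands separators, so strip them first.
        delays[node] = int(seconds.replace(",", ""))
    return delays


def format_delay(seconds):
    """Render a delay in seconds as hours, minutes and seconds."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


sample = """Failure reason: ["Index replication for cluster node 'node2' is behind by 26,067 seconds.","Index replication for cluster node 'node3' is behind by 33,220 seconds.","Index replication for cluster node 'node4' is behind by 9,658 seconds."]"""

delays = parse_replication_delays(sample)
# List the nodes with the largest delay first.
for node, seconds in sorted(delays.items(), key=lambda item: item[1], reverse=True):
    print(f"{node}: {format_delay(seconds)} behind")
```

For the sample above, node3 is the furthest behind, at a little over nine hours.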
Jira is configured as a multi-node Data Center
The Jira Charting Plugin is installed
Capture thread dumps from the affected nodes (see, for example, Troubleshooting Performance Issues with thread dumps)
Verify whether the NodeReindexServiceThread thread shows a stack trace similar to the following:
"NodeReindexServiceThread:thread-1" prio=5 tid=0x000000000000015f nid=0 runnable
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    ...
    at com.sun.proxy.$Proxy390.updateValues(Unknown Source)
    at com.atlassian.jira.ext.charting.field.TimeInStatusCFType.storeDatabaseValue(TimeInStatusCFType.java:98)
    at com.atlassian.jira.ext.charting.field.TimeInStatusCFType.getValueFromIssue(TimeInStatusCFType.java:77)
    at com.atlassian.jira.issue.fields.ImmutableCustomField.getValue(ImmutableCustomField.java:350)
    ...
    at com.sun.proxy.$Proxy41.reIndexIssueObjects(Unknown Source)
    at com.atlassian.jira.index.ha.DefaultNodeReindexService.updateIssueIndex(DefaultNodeReindexService.java:453)
    at com.atlassian.jira.index.ha.DefaultNodeReindexService.updateAffectedIndexes(DefaultNodeReindexService.java:341)
    at com.atlassian.jira.index.ha.DefaultNodeReindexService.applyIndexOperations(DefaultNodeReindexService.java:279)
    at com.atlassian.jira.index.ha.DefaultNodeReindexService.reIndex(DefaultNodeReindexService.java:265)
    at com.atlassian.jira.index.ha.DefaultNodeReindexService$$Lambda$352/669392084.run(Unknown Source)
    ...
    at java.lang.Thread.run(Thread.java:748)
- The key indicator in the stack trace is a method call in the "com.atlassian.jira.ext.charting" package.
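Searching many thread dumps by hand is tedious, so the check above can be scripted. The following is a minimal sketch, assuming a jstack-style text dump in which each thread block starts with a line beginning with a quoted thread name. The helper names and the second thread in the sample are hypothetical and used only for illustration.

```python
THREAD_NAME = "NodeReindexServiceThread"
CHARTING_PACKAGE = "com.atlassian.jira.ext.charting"


def split_threads(dump_text):
    """Split a thread dump into blocks, one per thread.

    Assumption: every thread block starts with a line whose first
    character is a double quote (the quoted thread name).
    """
    blocks, current = [], []
    for line in dump_text.splitlines():
        if line.startswith('"') and current:
            blocks.append("\n".join(current))
            current = []
        # Ignore any preamble that appears before the first thread header.
        if line.startswith('"') or current:
            current.append(line)
    if current:
        blocks.append("\n".join(current))
    return blocks


def charting_frames_in_reindex_thread(dump_text):
    """Return charting plugin frames found in NodeReindexServiceThread stacks."""
    frames = []
    for block in split_threads(dump_text):
        lines = block.splitlines()
        if THREAD_NAME in lines[0]:
            frames.extend(
                line.strip() for line in lines[1:] if CHARTING_PACKAGE in line
            )
    return frames


# Shortened sample; the second thread is a made-up example of an unrelated thread.
sample_dump = '''"NodeReindexServiceThread:thread-1" prio=5 tid=0x000000000000015f nid=0 runnable
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at com.atlassian.jira.ext.charting.field.TimeInStatusCFType.storeDatabaseValue(TimeInStatusCFType.java:98)
    at com.atlassian.jira.index.ha.DefaultNodeReindexService.reIndex(DefaultNodeReindexService.java:265)

"hypothetical-worker-1" prio=5 tid=0x0000000000000160 nid=0 waiting on condition
    at java.lang.Thread.sleep(Native Method)
'''

for frame in charting_frames_in_reindex_thread(sample_dump):
    print(frame)
```

An empty result for a given dump means this particular pattern was not found in it; comparing several dumps taken a few seconds apart gives a more reliable picture than a single one.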
The Jira Charting Plugin is an experimental plugin developed by Atlassian that is no longer maintained or supported. It is also not classified as Data Center Compatible, and is not recommended for any Production environment (see JCHART-479).
The plugin is unsafe for use in Data Center environments because it may cause a deadlock in the database when multiple nodes attempt to perform the same operation at the same time. The issue manifests in the following scenario:
- An issue operation is performed on an issue on Node A
- Node A replicates the index operation to Node B and Node C
- Node B and Node C attempt to reindex the issue simultaneously
- When the Jira Charting Plugin is installed, reindexing an issue also reindexes the Time in Status Custom Field. The field is recalculated so that the new value can be written into the node's index, and this recalculation begins by deleting the custom field's value from the database.
- The same delete statement is issued against the database's customfieldvalue table from multiple nodes, resulting in a deadlock
Resolving the immediate index replication delays requires clearing the database deadlock so that the NodeReindexServiceThread can proceed with replicating the remaining index operations.
Consult your Database Administrators for assistance in identifying the deadlocked queries and terminating them. Typically, these appear as long-running delete queries against the customfieldvalue table.
Alternatively, shut down all but one node in the cluster, because the blocked queries are released when the node that owns them is shut down. The nodes may then be brought back up.
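To help with the database-side check described above, the sketch below filters a list of active sessions for long-running delete statements against the customfieldvalue table. It assumes your Database Administrator has already exported the active sessions using the database's own tooling. The record fields (`pid`, `duration_seconds`, `query`), the sample rows, and the 300-second threshold are all hypothetical and should be adapted to your database.

```python
import re

# Matches statements that start with DELETE FROM [schema.]customfieldvalue.
DELETE_FROM_CFV = re.compile(
    r'^\s*delete\s+from\s+(?:\S+\.)?"?customfieldvalue\b', re.IGNORECASE
)


def suspect_queries(sessions, min_duration_seconds=300):
    """Return sessions running a long DELETE against customfieldvalue.

    `sessions` is a list of dicts with the hypothetical keys
    'pid', 'duration_seconds' and 'query'.
    """
    return [
        session
        for session in sessions
        if session["duration_seconds"] >= min_duration_seconds
        and DELETE_FROM_CFV.search(session["query"])
    ]


# Hypothetical export of active sessions, for illustration only.
sample_sessions = [
    {"pid": 101, "duration_seconds": 9600,
     "query": "DELETE FROM customfieldvalue WHERE issue = ? AND customfield = ?"},
    {"pid": 102, "duration_seconds": 2,
     "query": "SELECT id FROM jiraissue WHERE project = ?"},
    {"pid": 103, "duration_seconds": 9590,
     "query": "delete from public.customfieldvalue where issue = ?"},
    {"pid": 104, "duration_seconds": 1,
     "query": "DELETE FROM customfieldvalue WHERE issue = ?"},
]

for session in suspect_queries(sample_sessions):
    print(session["pid"], session["duration_seconds"], session["query"])
```

The output is only a list of candidates for review; whether a session is actually part of a deadlock, and whether it is safe to terminate, remains a decision for the Database Administrator.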
Follow either of the workaround steps above to clear the index replication backlog, and then permanently remove the Jira Charting Plugin.