Index Replication Jira Data Center Troubleshooting


Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.

Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Jira keeps all copies of the indexes up to date automatically. The synchronization aims for eventual consistency and is not synchronous, so there can be some delay before index changes are visible on other nodes in the cluster. For instance, during busy hours or during an issue or project import, a temporary gap in the index count between nodes is expected, but the gap should close over time. For details on how nodes keep the index in sync, see Keeping Lucene Index Synchronised.

Where indexes are stored

Local Home

As in Jira Server, the indexes are stored in the Jira Home (Jira Local Home) directory, which is local to each node. This directory contains the index files. Place it on the fastest available disk, one that rates between “Excellent” and “OK” in the Disk Access Speed reference.

Shared Home

Jira Data Center introduces the Jira Shared Home, which is shared among the nodes in the same cluster. It contains the Index Snapshot ZIP files. The shared location can be on an NFS filesystem, provided the user running the Jira application on each node has sufficient read/write permissions. Its location is configured in cluster.properties.
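
The permission requirement above can be checked on each node with a short script. This is a minimal sketch, not an official tool: it assumes the Shared Home is set by a jira.shared.home line in cluster.properties, and the function names are made up for illustration. Run it as the user that runs Jira.

```shell
# Print the value of jira.shared.home from a cluster.properties file.
# Assumption: the property is written on one line as "jira.shared.home = /path".
shared_home_from_properties() {
    # $1: path to cluster.properties
    sed -n 's/^[[:space:]]*jira\.shared\.home[[:space:]]*=[[:space:]]*//p' "$1" \
        | sed 's/[[:space:]]*$//' \
        | tail -n 1
}

# Report whether the current user can read and write the given directory.
check_shared_home_access() {
    # $1: Shared Home directory
    if [ -d "$1" ] && [ -r "$1" ] && [ -w "$1" ]; then
        echo "OK: $1 is readable and writable"
    else
        echo "PROBLEM: $1 is missing or not readable/writable"
        return 1
    fi
}
```

For example: check_shared_home_access "$(shared_home_from_properties /path/to/cluster.properties)", where the path is a placeholder for the file on your node.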

Common Problems

The following are common problems with index replication in Jira Data Center.

  • Health-check failing on Index Replication

  • Inconsistency of search results between the nodes in the same cluster. Examples:

    • Issue "A" appears on Node-1 but not on Node-3

    • Searching "Project Taco" returns 100 issues on Node-1 but only 89 issues on Node-2

    • Gadget returning different results between nodes

Sending Information to Support

Please raise a case at https://getsupport.atlassian.com and provide the following information:

  1. Describe which of the behaviors listed under “Common Problems” you are seeing.

  2. Produce thread dumps with the command provided in Generating a thread dump. The command below takes six samples, 10 seconds apart.

    for i in $(seq 6); do top -b -H -p "$Jira_PROCESSID" -n 1 > "app_cpu_usage.$(date +%s).txt"; kill -3 "$Jira_PROCESSID"; sleep 10; done

    Note: In the example above, replace $Jira_PROCESSID with the process ID of Jira.

  3. Generate a Jira support zip for every node in the cluster.

  4. Share a screenshot of the Jira Data Center Health Check results.

  5. Share the output of http://<node-url>/rest/api/2/index/summary from each node.

    1. You can use the indexsummary.zip script to gather the data.

      Index Summary Script Details

      Update nodeurl.config to list the direct URL of each node, bypassing the load balancer.

      Run ./index.sh -m to gather the index summary data for each node.

      Share the generated indexsummary.node#.txt files with support.

Index Summary REST Endpoint

The Index Summary endpoint /rest/api/2/index/summary gives a more detailed view of the index status of each node. It shows where a particular node stands relative to the database and to the other nodes.

The Index Summary page gives an example of the JSON output and explains what each value represents.

As a starting point, check the following values:

  • countInDatabase and countInIndex should match, or the gap between them should be closing steadily

  • lastConsumedOperation and lastOperationInQueue should match, or the gap between them should be closing steadily

  • queueSize should not be increasing drastically
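
The first and last checks above can be run against a saved copy of a node's summary output. The sketch below is an illustration under assumptions: it finds fields purely by name with grep rather than a JSON parser, it assumes the values are plain integers, and the function names are hypothetical.

```shell
# Print the first integer value of a named JSON field.
# $1: field name, $2: file holding the saved /rest/api/2/index/summary output
json_number() {
    grep -o "\"$1\"[[:space:]]*:[[:space:]]*[0-9][0-9]*" "$2" \
        | head -n 1 \
        | grep -o '[0-9][0-9]*$'
}

# Print how many issues are counted in the database but not in the node's index.
index_gap() {
    _db=$(json_number countInDatabase "$1")
    _idx=$(json_number countInIndex "$1")
    [ -n "$_db" ] && [ -n "$_idx" ] || { echo "counts not found in $1" >&2; return 1; }
    echo $((_db - _idx))
}

# Print every queueSize value found in the file, one per line.
queue_sizes() {
    grep -o '"queueSize"[[:space:]]*:[[:space:]]*[0-9][0-9]*' "$1" \
        | grep -o '[0-9][0-9]*$'
}
```

Running index_gap on summaries taken a few minutes apart shows whether the gap is closing. A single non-zero value is not a problem on its own.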

Database Validation Endpoint

Index replication relies on communication between the nodes and the database. The following checks verify that the nodes are continuously sending heartbeats to the database.

  1. Check the CLUSTERNODE table to confirm that the node is registered in the cluster

    1. A node with ACTIVE status should have a recent timestamp

    2. Many inactive nodes may cause a delay in Index replication due to

  2. Check the CLUSTERNODEHEARTBEAT table to confirm that the node is responding to heartbeat messages
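
As a starting point for the two table checks, the sketch below prints read-only queries that can be piped into a database client. It selects all columns because this article does not list the column names. The function name is hypothetical, and the client command you pipe into depends on your database.

```shell
# Print read-only queries for the two cluster tables named above.
# Table names are left unquoted so the database applies its own case rules.
cluster_check_sql() {
    cat <<'SQL'
-- Nodes registered in the cluster, with their status and timestamp
SELECT * FROM clusternode;
-- Heartbeats recorded for each node
SELECT * FROM clusternodeheartbeat;
SQL
}
```

For example, on PostgreSQL you might run cluster_check_sql | psql -d jiradb, where jiradb is a placeholder for your Jira database name.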

Index Replication on Re-index

A re-index, whether foreground, background, or project, runs on the node where it is triggered. Once it completes, it is replicated to the other nodes as follows (based on Jira 7.13.1):

Foreground (locked) re-indexing

When a foreground re-index completes, an Index Snapshot is created and stored in the Jira Shared Home directory. The other nodes are informed that the re-index has completed through the replicatedindexoperation database table, with an operation column value of FULL_REINDEX_END. They then copy the Index Snapshot from Shared Home to their own Local Home and unpack the Lucene indexes.

Background re-indexing

The same process is used for background re-indexing. An Index Snapshot is created and stored in the Jira Shared Home directory. The other nodes are informed that the re-index has completed through the replicatedindexoperation database table, with an operation column value of BACKGROUND_REINDEX_END.

Project re-indexing

Project re-indexing works differently: it does not create an Index Snapshot. The node where the project re-index is triggered informs the other nodes that a project re-index is needed through the replicatedindexoperation database table, with an operation column value of PROJECT_REINDEX. The other nodes then trigger their own local project re-index.
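
When reading rows from the replicatedindexoperation table, the three operation values described above map to two catch-up paths. The helper below is only an illustrative lookup of that mapping as described for Jira 7.13.1. The function name and its output labels are made up for this sketch.

```shell
# Map an "operation" column value to how the other nodes catch up.
reindex_replication_mode() {
    case "$1" in
        FULL_REINDEX_END|BACKGROUND_REINDEX_END)
            # Other nodes copy the Index Snapshot from Shared Home and unpack it.
            echo "snapshot" ;;
        PROJECT_REINDEX)
            # Other nodes run their own local project re-index; no snapshot is created.
            echo "local-project-reindex" ;;
        *)
            # Any other value is not one of the re-index operations covered here.
            echo "other" ;;
    esac
}
```

If a node stays behind after a "snapshot" operation, the Shared Home and its permissions are the first place to look. After a "local-project-reindex" operation, look at that node's own indexing logs.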

Using Jira Statistics for troubleshooting

Since Jira 8.12, Jira periodically writes statistics for document-based replication (DBR) to atlassian-jira.log. These entries help you understand the replication operations.

Look for entries with [JIRA-STATS] [DBR] in the logs.

For more details about Jira stats, see Troubleshooting performance with Jira Stats.

For more details on DBR and the metrics in the logs, see Document-based replication in Jira Data Center.
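
The DBR entries can be pulled out of the log with a fixed-string search. This is a minimal sketch with hypothetical function names. It takes the path to atlassian-jira.log as an argument and does not interpret the metrics.

```shell
# Print the periodic DBR statistics entries from a Jira log file.
dbr_stats() {
    # $1: path to atlassian-jira.log
    # -F treats the pattern as a fixed string, so the brackets are matched literally.
    grep -F '[JIRA-STATS] [DBR]' "$1"
}

# Print only the most recent N entries (default 5).
latest_dbr_stats() {
    # $1: path to atlassian-jira.log, $2: optional number of entries
    dbr_stats "$1" | tail -n "${2:-5}"
}
```

Run it on each node and compare the entries side by side, since every node writes its own log.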

Enable Additional Logging for Index Replication

To get detailed logging of what Jira is doing, add the following loggers on Jira’s Logging and Profiling page.

Logging for Replication communication

level: DEBUG
package: com.atlassian.jira.index.ha.DefaultReplicatedIndexManager

Logging for Local indexing

level: DEBUG
package: com.atlassian.jira.issue.index.DefaultIndexManager

Logging for Scheduler for indexing needs

level: DEBUG
package: com.atlassian.jira.index.ha.DefaultNodeReindexService



Last modified on Jun 7, 2022
