Jira node inaccessible for Maintenance but node is not re-indexing
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
Some or all of Jira Data Center's nodes are inaccessible and cannot serve users.
Normally, a node in a Jira Data Center cluster shows the "maintenance" status while it is being re-indexed, as explained in Jira cluster monitoring:
{"state":"MAINTENANCE"}
However, the node is showing the "maintenance" status even though it is running and not performing a re-indexing operation.
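A quick way to read a node's state is to query its /status endpoint directly from a shell. The sketch below is illustrative: the `node_state` helper is not part of Jira, and in practice the JSON payload would come from `curl -s http://<node-ip>:<port>/status` against each node.

```shell
# node_state pulls the value of the "state" field out of a /status
# JSON payload, e.g. {"state":"MAINTENANCE"} -> MAINTENANCE.
node_state() {
  echo "$1" | sed -n 's/.*"state":"\([A-Z]*\)".*/\1/p'
}

# In production, fetch the payload per node, for example:
#   STATUS_JSON=$(curl -s "http://jira-node1.example.com:8080/status")
STATUS_JSON='{"state":"MAINTENANCE"}'

node_state "$STATUS_JSON"   # prints MAINTENANCE
```

A healthy node that can serve users returns `{"state":"RUNNING"}` instead.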
Environment
Any Jira Data Center 7.x or 8.x version.
Diagnosis and resolution
Check for cache replication failures
Check the Jira application log to see if you can find any trace of cache replication failure:
Example of error 1
2021-11-10 08:59:35,390-0800 localq-reader-16 ERROR [c.a.j.c.distribution.localq.LocalQCacheOpReader] [LOCALQ] [VIA-COPY] Abandoning sending: LocalQCacheOp{cacheName='com.atlassian.jira.crowd.embedded.ofbiz.EagerOfBizUserCache.userCache', action=PUT, key={10100,brian_campbell}, value == null ? false, replicatePutsViaCopy=true, creationTimeInMillis=1636563571212} from cache replication queue: [queueId=queue_node2_5_78882aaeb08e9a4c81687b5de2add74f_put, queuePath=/vxxxx/atlassian/application-data/jira/localq/queue_node2_5_78882aaeb08e9a4c81687b5de2add74f_put], failuresCount: 1/1. Removing from queue. Error: java.rmi.NoSuchObjectException: no such object in table
com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpSender$UnrecoverableFailure: java.rmi.NoSuchObjectException: no such object in table
	at com.atlassian.jira.cluster.distribution.localq.rmi.LocalQCacheOpRMISender.send(LocalQCacheOpRMISender.java:90)
	at com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpReader.run(LocalQCacheOpReader.java:96)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.rmi.NoSuchObjectException: no such object in table
Example of error 2
2021-11-10 18:34:18,323-0800 HealthCheck:thread-6 WxxxxN ServiceRunner [c.a.t.j.healthcheck.cluster.ClusterReplicationHealthCheck] Node node3 does not seem to replicate its cache
2021-11-10 18:34:18,324-0800 HealthCheck:thread-6 WxxxxN ServiceRunner [c.a.t.j.healthcheck.cluster.ClusterReplicationHealthCheck] Node node1 does not seem to replicate its cache
2021-11-10 18:34:18,328-0800 support-zip ERROR [c.a.t.healthcheck.concurrent.SupportHealthCheckProcess] Health check 'Cluster Cache Replication' failed with severity 'critical': '["The node node3 is not replicating","The node node1 is not replicating"]'
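On each node, the two failure signatures above can be searched for in the Jira application log (typically <JIRA_HOME>/log/atlassian-jira.log). A minimal sketch, where the `find_replication_errors` helper name and the sample file are illustrative:

```shell
# find_replication_errors counts log lines matching the two cache
# replication failure signatures shown in the examples above.
find_replication_errors() {
  grep -c -E 'Abandoning sending: LocalQCacheOp|does not seem to replicate its cache' "$1"
}

# Demo against a small sample file; on a real node, run it against
# <JIRA_HOME>/log/atlassian-jira.log instead.
SAMPLE_LOG=$(mktemp)
cat > "$SAMPLE_LOG" <<'EOF'
2021-11-10 08:59:35,390-0800 localq-reader-16 ERROR [LocalQCacheOpReader] Abandoning sending: LocalQCacheOp{...} from cache replication queue
2021-11-10 18:34:18,323-0800 HealthCheck:thread-6 Node node3 does not seem to replicate its cache
2021-11-10 18:34:20,001-0800 http-nio-8080-exec-1 INFO an unrelated log line
EOF
find_replication_errors "$SAMPLE_LOG"   # prints 2
rm -f "$SAMPLE_LOG"
```

A non-zero count on any node is a strong hint that cache replication, not re-indexing, is the cause of the MAINTENANCE status.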
For each Jira node, check whether the -Djava.rmi.server.hostname JVM startup parameter is in use. If it is, verify that it is set to a correct IP address or a resolvable hostname.
If the IP address is incorrect or the hostname is not resolvable, then for each affected node:
- Remove the -Djava.rmi.server.hostname parameter from the JVM startup parameters (provided a correct hostname value is already set in the <JIRA_HOME>/cluster.properties file) and restart the node
- OR
- Change the value of this parameter to a correct IP address or resolvable hostname
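The check can be done from a shell on each node by inspecting the running JVM's arguments and testing whether the configured value resolves. This is a sketch for a typical Linux node; `getent` is assumed to be available:

```shell
# Extract java.rmi.server.hostname from the running Jira JVM, if set.
RMI_HOST=$(ps -ef | grep -o 'java\.rmi\.server\.hostname=[^ ]*' | head -1 | cut -d= -f2)

# resolvable succeeds (exit 0) only when the OS can resolve the
# given hostname or IP address.
resolvable() {
  getent hosts "$1" > /dev/null
}

if [ -z "$RMI_HOST" ]; then
  echo "-Djava.rmi.server.hostname is not set"
elif resolvable "$RMI_HOST"; then
  echo "$RMI_HOST resolves correctly"
else
  echo "$RMI_HOST does NOT resolve -- fix or remove the parameter"
fi
```

If the value does not resolve, apply one of the two fixes listed above and restart the node.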
For more detailed information, refer to the article JIRA Data Center Asynchronous Cache replication failing health check.
Check for unsupported database collation
- Navigate to ⚙ > System > Troubleshooting and support tools
- Select Instance health > Database
- Check for warnings or errors regarding unsupported collation
If the health check flags unsupported collations, the bug JRASERVER-65708 applies.
Within the bug, we suggest following this documentation to resolve the issue: Database Collation Health Check fails in Jira.
Jira 8.19.1+: Index Consistency failures
For Jira 8.19.1 and later, the index consistency check can fail by design: a node reports the MAINTENANCE status when its index is in an inconsistent state (out of sync with the Jira database).
The following WARN/INFO entries will be found in the Jira application logs:
2021-11-22 14:29:38,069+0000 http-nio-8080-exec-10 url: /status WARN anonymous XXXxXXXxX - XX.XXX.X.XXX /status [c.a.j.issue.index.IndexConsistencyUtils] Index consistency check failed for index 'Issue': expectedCount=875155; actualCount=713032
2021-11-22 14:29:38,070+0000 http-nio-8080-exec-10 url: /status INFO anonymous XXXxXXXxX - XX.XXX.X.XXX /status [c.a.jira.servlet.ApplicationStateResolverImpl] Checking index consistency. Time taken: 160.9 ms
2021-11-22 14:29:38,070+0000 http-nio-8080-exec-10 url: /status WARN anonymous XXXxXXXxX - XX.XXX.X.XXX /status [c.a.jira.servlet.ApplicationStateResolverImpl] The issue index is inconsistent. This node will report its status as MAINTENANCE. You will find information on how to resolve this problem here: https://jira.atlassian.com/browse/JRASERVER-66970
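The WARN line above already shows how far the index has drifted from the database. As a small illustration (the `index_drift` helper name is mine), the expected/actual counts can be extracted and compared:

```shell
# index_drift extracts expectedCount and actualCount from the
# consistency-check WARN line and prints how many documents the
# index is missing relative to the database.
index_drift() {
  expected=$(echo "$1" | grep -o 'expectedCount=[0-9]*' | cut -d= -f2)
  actual=$(echo "$1" | grep -o 'actualCount=[0-9]*' | cut -d= -f2)
  echo $((expected - actual))
}

WARN_LINE="Index consistency check failed for index 'Issue': expectedCount=875155; actualCount=713032"
index_drift "$WARN_LINE"   # prints 162123
```

A large drift such as this one confirms that a full re-index is needed, as described in the steps below.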
To fix index inconsistencies:
- Access the problematic node using its IP address via a browser
- Go to ⚙ > System > Indexing
- Select Full re-index and click Re-index
- Wait until the re-indexing completes and confirm that the status of this node changes to RUNNING
Note that it is possible to prevent the node from going into MAINTENANCE mode when the indexes are out of sync, as explained in the Current status section of the feature request JRASERVER-66970 - /status should indicate when indexes are broken on a node. If you want the node to remain in RUNNING mode while having inconsistent indexes (the expected behavior prior to Jira 8.19.1), add the following JVM startup parameter to each Jira node and restart each node:
-Dcom.atlassian.jira.status.index.check=false
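One way to apply the flag, assuming a standard Tomcat-based install where startup options live in <JIRA_INSTALL>/bin/setenv.sh, is to append it to JVM_SUPPORT_RECOMMENDED_ARGS; the exact file and variable may vary with your deployment:

```shell
# In <JIRA_INSTALL>/bin/setenv.sh -- append the flag so the node keeps
# reporting RUNNING even when its index is inconsistent (pre-8.19.1
# behavior). The :- default guards against the variable being unset;
# a node restart is required for the change to take effect.
JVM_SUPPORT_RECOMMENDED_ARGS="${JVM_SUPPORT_RECOMMENDED_ARGS:-} -Dcom.atlassian.jira.status.index.check=false"
```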
Note: Even if a Jira node is in MAINTENANCE mode, that node remains accessible when Jira is browsed directly via the node's IP address or hostname. The node only becomes inaccessible when Jira is browsed through the base URL bound to the load balancer, since the load balancer will not route any requests to it.