Failed getting index on start
JIRA 9.1 DC
Platform Notice: Data Center - This article applies to Atlassian products on the Data Center platform.
Note that this knowledge base article was created for the Data Center version of the product. Data Center knowledge base articles for non-Data Center-specific features may also work for Server versions of the product; however, they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Learn more about the general concept of index management on startup: Changes to index management on the Jira startup in version 9.1
Troubleshooting
Acquiring cluster lock com.atlassian.jira.start.index.lock
Each node has to acquire the com.atlassian.jira.start.index.lock cluster lock for the whole duration of the index startup procedure. If the lock is not taken by any other node, Jira will claim it and hold it until the index startup procedure is complete. Otherwise, it will wait for the current holder of the lock to finish the startup procedure and then acquire the lock.
If you kill the node holding the lock, other nodes will take 5 minutes by default to consider the lock expired.
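The waiting behaviour can be pictured with a toy model. This is only a sketch: how Jira tracks whether the lock holder is still alive is not described in this article, so the `holder_last_seen` timestamp and the `can_acquire` helper are hypothetical names introduced for illustration. Only the lock name and the 5-minute default come from the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

LOCK_NAME = "com.atlassian.jira.start.index.lock"
DEFAULT_EXPIRY = timedelta(minutes=5)  # default stated in this article


@dataclass
class LockState:
    holder: Optional[str]                 # node id holding the lock, or None if free
    holder_last_seen: Optional[datetime]  # hypothetical liveness timestamp (assumption)


def can_acquire(state: LockState, node_id: str, now: datetime,
                expiry: timedelta = DEFAULT_EXPIRY) -> bool:
    """Return True if node_id may take the lock in this toy model."""
    # Free lock, or lock already held by this node: proceed.
    if state.holder is None or state.holder == node_id:
        return True
    # Another node holds it: wait until it is considered expired.
    if state.holder_last_seen is None:
        return False
    return now - state.holder_last_seen >= expiry
```

In this model, a node that starts four minutes after the holder was killed still has to wait; one minute later it can take the lock.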
Step 1 - Local index rebuild
If a local index exists, Jira checks how far it lags behind the database:
- If the lag is less than 10%, Jira will rebuild the missing index locally
- If the lag is greater than 10%, Jira will try recovering the index via step 2
Since Jira 9.3, the threshold is configurable with the system property:
-Dcom.atlassian.jira.index.consistency.tolerance.percentage
This step can be skipped with the system property:
-Dcom.atlassian.jira.startup.rebuild.local.index=false
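The decision in step 1 can be sketched as a small function. This is an illustration, not Jira's code: the function and parameter names are hypothetical, the lag is taken as an already computed percentage (the article does not say how it is measured), and treating a lag of exactly the threshold as "go to step 2" is an assumption, since the text only covers the less-than and greater-than cases.

```python
DEFAULT_TOLERANCE_PERCENT = 10.0  # default threshold stated in this article


def choose_recovery_path(local_index_exists: bool,
                         lag_percent: float,
                         tolerance_percent: float = DEFAULT_TOLERANCE_PERCENT,
                         rebuild_local_enabled: bool = True) -> str:
    """Return "rebuild_local" or "snapshot" (step 2) for this toy model."""
    # Step 1 is skipped when disabled or when there is no local index.
    if not rebuild_local_enabled or not local_index_exists:
        return "snapshot"
    # Small lag: catch up locally.
    if lag_percent < tolerance_percent:
        return "rebuild_local"
    # Assumption: a lag equal to or above the threshold falls through to step 2.
    return "snapshot"
```

For example, a node whose index is 3% behind would rebuild locally, while a node that is 25% behind, or one started with the rebuild step disabled, would move on to the snapshot step.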
Step 2 - Get index snapshot from shared and catch-up synchronously
If a fresh snapshot can be found in shared-home, Jira loads it and catches up with the most recent changes in the DB.
By default, the maximum accepted age of a snapshot is 24 hours (8 days since Jira 9.4.1). It can be configured with the system property:
-Dcom.atlassian.jira.startup.max.age.of.usable.index.snapshot.in.hours
This step can be skipped with the system property:
-Dcom.atlassian.jira.startup.pick.indexsnapshot.from.shared=false
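The freshness rule can be expressed as a short check. The sketch below assumes the snapshot's creation time is already known and that a snapshot exactly at the age limit still counts as fresh; both helper names are hypothetical, and only the 24-hour and 8-day defaults come from this article.

```python
from datetime import datetime, timedelta
from typing import Tuple


def default_max_age_hours(jira_version: Tuple[int, ...]) -> int:
    """Default maximum snapshot age: 8 days from Jira 9.4.1, otherwise 24 hours."""
    return 8 * 24 if jira_version >= (9, 4, 1) else 24


def snapshot_is_fresh(created: datetime, now: datetime, max_age_hours: int) -> bool:
    """Return True if the snapshot is young enough to be used in step 2."""
    # Assumption: the boundary is inclusive.
    return now - created <= timedelta(hours=max_age_hours)
```

With these defaults, a two-day-old snapshot would be rejected on Jira 9.1 but accepted on Jira 9.4.1 or later.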
Step 3 - Trigger full-reindex synchronously
If the previous steps fail, Jira triggers a full foreground reindex.
This step can be skipped with the system property:
-Dcom.atlassian.jira.startup.allow.full.reindex=false
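Taken together, the three steps form a fallback chain: each enabled step is tried in order until one succeeds. The following is a minimal sketch of that idea, not Jira's implementation. The `retrieve_index` function is hypothetical, and recording a disabled step as `NOT_RUN` is an assumption; the result strings themselves mirror those in the NODE-START summary log later in this article.

```python
from typing import Callable, Dict, List, Tuple

# (step name, enabled flag, callable returning True on success)
Step = Tuple[str, bool, Callable[[], bool]]


def retrieve_index(steps: List[Step]) -> Tuple[bool, Dict[str, str]]:
    """Try each enabled step in order and stop attempting after the first success."""
    results: Dict[str, str] = {}
    succeeded = False
    for name, enabled, attempt in steps:
        # Steps after a success, and disabled steps, are not attempted.
        if succeeded or not enabled:
            results[name] = "NOT_RUN"
            continue
        succeeded = attempt()
        results[name] = "SUCCESS" if succeeded else "FAILED"
    return succeeded, results


if __name__ == "__main__":
    ok, results = retrieve_index([
        ("rebuildLocalIndex", True, lambda: False),
        ("pickIndexSnapshotFromSharedHome", True, lambda: True),
        ("performFullForegroundReindex", True, lambda: True),
    ])
    print(ok, results)
```

If every enabled step fails, the function returns `False`, which corresponds to the "Retrieving index failed" end state.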
End state: Retrieving index successful
Logs confirming successful index retrieval:
INFO Local index is healthy. Jira can proceed with the start procedure.
Jira will also check if there is a healthy snapshot in shared-home. If one exists, it will log:
INFO Current node: [NODE-ID]. Ensuring that a fresh enough index snapshot exists.
INFO Current node: [NODE-ID]. A fresh snapshot already exists.
If a snapshot is missing or is not fresh, Jira will produce a new one and copy it to shared-home:
INFO Current node: [NODE-ID]. An index snapshot does not exist, or is not fresh. Creating a fresh index snapshot.
INFO Current node: [NODE-ID]. Created index snapshot at [INDEX-SNAPSHOT-FILE-NAME].
End state: Retrieving index failed
Logs indicating that index retrieval failed:
ERROR Failed to prepare local index. Jira is in an unhealthy state.
This Jira instance will be blocked, and a warning explaining the problem is presented to the admin.
Users should not be allowed to access this node, since at this point the /status endpoint returns MAINTENANCE.
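A health check can use this state to keep traffic away from the node. The sketch below only interprets a response body that has already been fetched from /status. It assumes the body is either the bare state name or a JSON object with a `state` field; that shape is an assumption and should be verified against your own instance. The helper names are hypothetical.

```python
import json


def node_state(status_body: str) -> str:
    """Extract the state name from a /status response body (format assumed)."""
    text = status_body.strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON: treat the body as the bare state name.
        return text.upper()
    if isinstance(parsed, dict) and "state" in parsed:
        return str(parsed["state"]).upper()
    if isinstance(parsed, str):
        return parsed.upper()
    return text.upper()


def is_blocked(status_body: str) -> bool:
    """Return True when the node reports MAINTENANCE."""
    return node_state(status_body) == "MAINTENANCE"
```

A load balancer probe built on this check would stop routing users to the node for as long as `is_blocked` returns `True`.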
To get a fresh index on this node, you can use the following methods:
- Request the index from another node via the admin panel / copy the search index from another node
- Restore the index from index backup
- Bypass any proxy and go into the node directly to trigger a reindex
- Restart the node
If it continues to fail, please reach out to our Support teams for further assistance.
NODE-START summary log JIRA 9.1.1
Once the index is ready, the following will be logged:
{
  "_statsName": "NODE-START",
  "_statsType": "total",
  "_time": "2022-07-08T11:06:15.120Z",
  "_timestamp": 1657278375120,
  "_duration": "PT5M0.016S",
  "_invocations": 5,
  "_statsOverhead": "n/a",
  "configuration": {
    "isRebuildLocalIndex": true,
    "isPickSnapshotFromSharedHome": true,
    "isRequestIndexSnapshotFromAnotherNode": false,
    "isTriggerFullReindex": true
  },
  "checkIndexOnStart": {
    "time": "2022-07-08T11:03:09.938Z",
    "result": "SUCCESS",
    "timeInSeconds": 39
  },
  "getIndexBy": {
    "rebuildLocalIndex": {
      "result": "FAILED",
      "timeInSeconds": 0
    },
    "pickIndexSnapshotFromSharedHome": {
      "snapshotName": "IndexSnapshot_12400_220707-235024.tar.sz",
      "result": "SUCCESS",
      "timeInSeconds": 39
    },
    "requestIndexSnapshotFromAnotherNode": {
      "result": "NOT_RUN",
      "timeInSeconds": 0
    },
    "performFullForegroundReindex": {
      "result": "NOT_RUN",
      "timeInSeconds": 0
    }
  },
  "ensureFreshIndexSnapshot": {
    "result": "SUCCESS",
    "snapshotExisted": true,
    "snapshotCreated": false,
    "timeInSeconds": 0
  }
}
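When troubleshooting, it can help to reduce this summary to the few fields that tell you which step produced the index. The following sketch assumes the JSON payload has already been extracted from the log line into a string; the `summarize_node_start` helper and its output keys are hypothetical, and the field names it reads are the ones shown in the sample above.

```python
import json
from typing import Any, Dict


def summarize_node_start(raw: str) -> Dict[str, Any]:
    """Summarize which index retrieval steps failed and which one succeeded."""
    stats = json.loads(raw)
    steps = stats.get("getIndexBy", {})
    failed = [name for name, d in steps.items() if d.get("result") == "FAILED"]
    winner = next(
        (name for name, d in steps.items() if d.get("result") == "SUCCESS"), None
    )
    return {
        "successful_step": winner,
        "failed_steps": failed,
        # Only the snapshot step carries a snapshotName in the sample log.
        "snapshot": steps[winner].get("snapshotName") if winner else None,
        "duration": stats.get("_duration"),
    }


# Trimmed copy of the sample log above, used only for demonstration.
SAMPLE = """
{
  "_statsName": "NODE-START",
  "_duration": "PT5M0.016S",
  "getIndexBy": {
    "rebuildLocalIndex": {"result": "FAILED", "timeInSeconds": 0},
    "pickIndexSnapshotFromSharedHome": {
      "snapshotName": "IndexSnapshot_12400_220707-235024.tar.sz",
      "result": "SUCCESS",
      "timeInSeconds": 39
    },
    "performFullForegroundReindex": {"result": "NOT_RUN", "timeInSeconds": 0}
  }
}
"""

if __name__ == "__main__":
    print(summarize_node_start(SAMPLE))
```

For the sample above, the summary shows that the local rebuild failed and the index was restored from the snapshot in shared-home, so no full reindex was needed.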