Lucene Index Synchronization in Jira Data Center
Platform Notice: Data Center Only - This article only applies to Atlassian products on the data center platform.
This page explains, at a high level, how Lucene indexes are managed and kept in sync by JIRA applications in a Data Center environment.
Where are the Lucene indexes
The Lucene indexes are held in a number of directories under "caches" in the local JIRA home:
| |-- portalpage
| `-- searchrequest
Each node in the cluster will have its own set of indexes.
JIRA automatically keeps all copies of the index up to date. This synchronisation is not fully synchronous, but aims for eventual consistency. This means that there is some delay before index changes are seen on other nodes in the cluster.
How often are the indexes synchronised?
The indexes are synchronised continuously: each node polls for changes once per second. If there is a lot of activity, index synchronisation could start to fall behind. We don't expect this to happen, but it could occur.
How does it work?
Each indexing operation writes a row to the database table replicatedindexoperation. All of the nodes then look for entries in this table that were inserted by nodes other than themselves. They then apply the changes to their local Lucene index.
Each node also keeps a record in nodeindexcounter of the latest operation it processed, so that next time it only needs to read new operations.
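The replication loop described above can be sketched as a small in-memory simulation. This is only an illustration, not JIRA's implementation: the class names, the `op_id` and `issue_key` fields, and the single per-node counter are all assumptions made for the example. Only the two table names come from this article.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    op_id: int       # hypothetical, monotonically increasing id
    node_id: str     # node that performed the indexing operation
    issue_key: str   # hypothetical payload: what was indexed


@dataclass
class SharedDatabase:
    """Stands in for the replicatedindexoperation table."""
    operations: list = field(default_factory=list)

    def record(self, node_id, issue_key):
        op = Operation(len(self.operations) + 1, node_id, issue_key)
        self.operations.append(op)
        return op


@dataclass
class Node:
    node_id: str
    last_processed: int = 0                         # stands in for nodeindexcounter
    local_index: set = field(default_factory=set)   # stands in for the local Lucene index

    def index_locally(self, db, issue_key):
        # Update the local index and publish the operation for other nodes.
        self.local_index.add(issue_key)
        db.record(self.node_id, issue_key)

    def poll(self, db):
        """Apply operations newer than the counter that came from other nodes."""
        applied = 0
        for op in db.operations:
            if op.op_id <= self.last_processed:
                continue  # already seen on an earlier poll
            if op.node_id != self.node_id:
                self.local_index.add(op.issue_key)
                applied += 1
            self.last_processed = op.op_id
        return applied


db = SharedDatabase()
node_a, node_b = Node("node-a"), Node("node-b")
node_a.index_locally(db, "TEST-1")
node_a.index_locally(db, "TEST-2")
first_poll = node_b.poll(db)    # picks up both operations from node-a
second_poll = node_b.poll(db)   # nothing new since the counter was advanced
```

In this sketch, node-b only catches up when it polls, which is the delay that makes the scheme eventually consistent rather than synchronous.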
Won't this table get very big?
There is a background task (Replicated index flush service - ReplicatedIndexCleaningService) that runs on each node and removes records that have been in the table for longer than a set period of time. Currently that period is set to 2880 minutes (2 days).
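As a rough illustration of the retention rule, the sketch below filters a list of operations by age. The function name, the dictionary layout and the `created` field are hypothetical. Only the 2880-minute value is taken from this article.

```python
from datetime import datetime, timedelta

RETENTION_PERIOD_MINUTES = 2880  # value quoted above (2 days)


def purge_old_operations(operations, now, retention_minutes=RETENTION_PERIOD_MINUTES):
    """Return only the operations that are still inside the retention window."""
    cutoff = now - timedelta(minutes=retention_minutes)
    return [op for op in operations if op["created"] >= cutoff]


now = datetime(2024, 1, 10, 12, 0)
operations = [
    {"id": 1, "created": now - timedelta(days=3)},     # older than 2 days: removed
    {"id": 2, "created": now - timedelta(hours=47)},   # just inside the window: kept
    {"id": 3, "created": now - timedelta(minutes=5)},  # recent: kept
]
kept_ids = [op["id"] for op in purge_old_operations(operations, now)]
```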
How can I modify how long records are kept in the table?
You can modify the Replicated index flush service (com.atlassian.jira.service.services.index.ReplicatedIndexCleaningService) and change its RETENTION_PERIOD value.
What happens if a node is offline for an extended period?
If a node goes offline and so has to recover more than 2880 minutes' worth of changes, it can tell from the last index operation it has recorded in nodeindexcounter and from the current operations to apply in the replicatedindexoperation table that it is a long way out of date. In this case it will request an active node to send it a full index replica.
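One way to picture this check is as a gap test between the node's counter and the oldest operation still in the table. The sketch below assumes operation ids are contiguous integers, which this article does not state. The function name and parameters are hypothetical, and JIRA's actual check may differ.

```python
def needs_full_replica(last_processed_id, pending_ids):
    """Return True when operations the node never saw have already been purged.

    Assumption: ids are contiguous, so a gap between the node's counter and
    the oldest remaining operation means intermediate rows were cleaned up.
    """
    if not pending_ids:
        return False  # nothing to apply, so no gap can be detected this way
    return min(pending_ids) > last_processed_id + 1


caught_up = needs_full_replica(100, [101, 102, 103])   # next operation is still there
too_far_behind = needs_full_replica(100, [250, 251])   # 101..249 were purged
```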
When a node first joins the cluster, or if it has been offline for an extended period, it will get a copy of an up-to-date index from another node. To do this:
- The requesting node sends a "Backup Index" message to the cluster.
- An active node (other than the sender) claims the message, removing it from the message queue, and creates a backup of the index in the shared home.
- The node that created the backup will then send an "Index Backed Up" message to the node requesting the backup.
- The requesting node will then replace its current index with the backed-up index.
- The requesting node will then reapply any changes that have occurred since the backup was requested.
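The steps above can be sketched as a toy message exchange. Everything here is a simplified stand-in: the `Cluster` and `ClusterNode` classes, the in-memory queue, and the dictionary used for the shared home are hypothetical, and only the two message names come from this article.

```python
from collections import deque


class Cluster:
    def __init__(self):
        self.queue = deque()    # hypothetical cluster message queue
        self.shared_home = {}   # hypothetical stand-in for the shared home


class ClusterNode:
    def __init__(self, name, index):
        self.name = name
        self.index = set(index)

    def request_backup(self, cluster):
        # Step 1: ask the cluster for an index backup.
        cluster.queue.append(("Backup Index", self.name))

    def claim_backup_request(self, cluster):
        """Steps 2 and 3: claim another node's request, back up, and notify it."""
        for message in list(cluster.queue):
            kind, requester = message
            if kind == "Backup Index" and requester != self.name:
                cluster.queue.remove(message)
                cluster.shared_home["index-backup"] = set(self.index)
                cluster.queue.append(("Index Backed Up", requester))
                return True
        return False

    def restore_from_backup(self, cluster, changes_since_request):
        """Steps 4 and 5: swap in the backup, then reapply later changes."""
        message = ("Index Backed Up", self.name)
        if message not in cluster.queue:
            return False
        cluster.queue.remove(message)
        self.index = set(cluster.shared_home["index-backup"])
        self.index.update(changes_since_request)
        return True


cluster = Cluster()
active = ClusterNode("node-a", {"TEST-1", "TEST-2"})
joining = ClusterNode("node-b", set())

joining.request_backup(cluster)
self_claim = joining.claim_backup_request(cluster)   # a node cannot claim its own request
claimed = active.claim_backup_request(cluster)
restored = joining.restore_from_backup(cluster, {"TEST-3"})
```

Reapplying the changes made after the request is what lets the restored index catch up with activity that happened while the backup was being taken.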
Should I copy an index to a new node I'm setting up?
Never copy an index from a running JIRA. If you try this, you will probably end up with an unusable, corrupted index. A new node obtains its index automatically, as described above.
What happens when I do a foreground reindex?
Reindexing occurs on the node where it was triggered, and this node stops actively serving requests in the cluster while it runs. The other nodes remain active as normal. When the reindex completes, the new index is copied to all the other nodes.
I'm running low on disk space on my shared-home server. Can I delete the caches directory?
No, at least not without being very careful. Be aware that the directory could be replaced at any time, and that whatever space it currently uses may be required again in the future.
Can I copy the Lucene index from a running node to a new one I am building?
No. Copying Lucene indexes from a running node is inherently unreliable, because they are constantly being updated and the result of such a copy is usually inconsistent.
JIRA will make a consistent copy of the index from a snapshot in time when a new instance is installed and then apply any changes that have occurred since the snapshot was taken.
One node has a corrupted index.
- Shut down the node.
- Delete the index directories.
- Restart the node.