Jira Data Center search indexing
To provide fast searching, Jira creates indexes of the text entered into issue fields. These indexes are stored on the file system, and updated whenever a piece of text is added or modified. They're called Lucene indexes, because they are provided by a third-party library with that name. This page explains how indexes are managed and kept in sync in Jira Data Center.
Where are the indexes stored?
The indexes are stored in a number of directories in the local Jira home directory under caches. Each node in the cluster has its own set of indexes.
| |-- portalpage
| `-- searchrequest
Synchronizing the indexes
Jira keeps all the copies of the indexes up to date automatically. The synchronization is not fully synchronous but aims for eventual consistency, which means that there is some delay before the index changes are seen on other nodes in the cluster.
The indexes are synchronized continuously – each node polls for the changes once per second. But where are these changes recorded?
Indexes and database
- Database table:
Each index operation writes a row to this database table. All nodes then look for entries that were written by other nodes in the cluster. After finding such changes, the nodes apply them to their local Lucene index.
- Database table:
To avoid the nodes checking for all possible operations all the time, they always record the latest processed operation in this table, so that during the next check, they only need to read new operations.
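The interplay between the two tables can be illustrated with a small simulation. This is only a sketch of the idea described above, not Jira's implementation: the class names, field names, and in-memory "tables" are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    op_id: int     # increasing row id in the operations table
    node_id: str   # node that performed the index operation
    change: str    # what was indexed, e.g. "update ISSUE-1"


@dataclass
class SharedDatabase:
    """In-memory stand-in for the two tables (hypothetical layout)."""
    operations: list = field(default_factory=list)      # one row per index operation
    last_processed: dict = field(default_factory=dict)  # node id -> last op_id it has read

    def record(self, node_id, change):
        op = Operation(len(self.operations) + 1, node_id, change)
        self.operations.append(op)
        return op


@dataclass
class Node:
    node_id: str
    db: SharedDatabase
    local_index: list = field(default_factory=list)

    def index(self, change):
        # Update the local index, then write a row for the other nodes.
        self.local_index.append(change)
        self.db.record(self.node_id, change)

    def poll(self):
        # Read only rows newer than the last one this node processed.
        last = self.db.last_processed.get(self.node_id, 0)
        applied = 0
        for op in self.db.operations:
            if op.op_id <= last:
                continue
            if op.node_id != self.node_id:
                # Written by another node: apply it to the local index.
                self.local_index.append(op.change)
                applied += 1
            last = op.op_id
        # Remember where we stopped so the next poll starts from here.
        self.db.last_processed[self.node_id] = last
        return applied


db = SharedDatabase()
node_a, node_b = Node("a", db), Node("b", db)
node_a.index("update ISSUE-1")
print(node_b.poll(), node_b.local_index)
```

In this sketch, a second call to node_b.poll() applies nothing, because the recorded position already covers every row in the operations table.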
Because every index operation adds a row, these database tables can grow very large. To avoid that, we've introduced a service that runs on each node and removes messages that have been there longer than a set period of time. The default retention period is set to 2880 minutes (2 days), which works well with indexing, but you can customize it in Jira.
- In Jira, go to Administration > System.
- In the Advanced section, select Services.
- Edit the com.atlassian.jira.service.services.index.ReplicatedIndexCleaningService service, and enter a new retention period.
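To make the effect of the retention period concrete, here is a minimal sketch of time-based cleanup. It assumes each stored operation carries a creation timestamp. The function name, the dictionary layout, and the two-day default constant are illustrative and are not taken from Jira's code.

```python
from datetime import datetime, timedelta

# Illustrative default of two days; the real value is configured in the service.
DEFAULT_RETENTION = timedelta(days=2)


def prune(operations, now, retention=DEFAULT_RETENTION):
    """Return only the operations that are still inside the retention window."""
    cutoff = now - retention
    return [op for op in operations if op["created"] >= cutoff]


now = datetime(2024, 1, 10, 12, 0)
operations = [
    {"id": 1, "created": now - timedelta(days=3)},   # older than the window
    {"id": 2, "created": now - timedelta(hours=1)},  # recent
]
print([op["id"] for op in prune(operations, now)])
```

A shorter retention period keeps the tables smaller, but a node that has been offline for longer than that period can no longer catch up from the database alone.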
Replicating the indexes
When a node joins the cluster for the first time, or if it has been offline for an extended period of time, it will receive a copy of an up-to-date index from another active node instead of applying all these changes from the database. It's just simpler and more effective. The indexes are replicated in the following way:
- A node sends a "backup index request" to the cluster.
- One of the active nodes receives the request, removes it from the message queue, and creates a backup of the index in the shared home directory.
- The node that created the backup sends an "index backed up" message to the node that requested the backup.
- The requesting node replaces its current index with the backup.
- The requesting node also applies any changes that have occurred since the backup was created.
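The steps above can be sketched as a small in-memory simulation. Everything here is a hypothetical model of the flow: the message queue, the shared home directory, and the change log are plain Python containers, and none of the names come from Jira itself.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    queue: list = field(default_factory=list)        # pending cluster messages
    shared_home: dict = field(default_factory=dict)  # backups, keyed by requester
    change_log: list = field(default_factory=list)   # every index change, in order


@dataclass
class Node:
    name: str
    cluster: Cluster
    index: list = field(default_factory=list)

    def apply(self, change):
        # A normal index update on an active node.
        self.index.append(change)
        self.cluster.change_log.append(change)

    def request_backup(self):
        # Step 1: ask the cluster for an index backup.
        self.cluster.queue.append(("backup index request", self.name))

    def serve_backup_request(self):
        # Step 2: take the request off the queue and write a backup,
        # remembering how many changes it already contains.
        kind, requester = self.cluster.queue.pop(0)
        if kind != "backup index request":
            raise ValueError(f"unexpected message: {kind}")
        position = len(self.cluster.change_log)
        self.cluster.shared_home[requester] = (list(self.index), position)
        # Step 3: tell the requester that the backup is ready.
        return ("index backed up", requester)

    def restore(self, message):
        kind, requester = message
        if kind != "index backed up" or requester != self.name:
            raise ValueError("message is not addressed to this node")
        snapshot, position = self.cluster.shared_home[requester]
        # Step 4: replace the current index with the backup.
        self.index = list(snapshot)
        # Step 5: apply changes made after the backup was created.
        self.index.extend(self.cluster.change_log[position:])


cluster = Cluster()
active, joining = Node("node1", cluster), Node("node2", cluster)
active.apply("A")
active.apply("B")
joining.request_backup()
reply = active.serve_backup_request()
active.apply("C")  # a change that lands after the backup was taken
joining.restore(reply)
print(joining.index == active.index)
```

Recording the position in the change log at backup time is what lets the requesting node pick up only the changes it is missing in the last step.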
Checking the health of the indexes
Jira Data Center provides a health check that helps you make sure the indexes are replicated without any issues. The knowledge base article describing it also contains some troubleshooting information, as well as links to specific issues related to indexes. For more info, see HealthCheck: Cluster Index Replication.
You can also use the API to check the condition of the index on a particular node. For more info, see Get index summary.
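As a rough illustration, the sketch below builds the URL for an index summary request and interprets a response. Treat the details as assumptions to verify against the Get index summary reference: the resource path rest/api/2/index/summary, the field names issueIndex, indexReadable, countInDatabase, and countInIndex, and the sample payload are not confirmed by this page. The example host is made up, and no request is actually sent.

```python
from urllib.parse import urljoin

# Assumed resource path; check the REST API reference for your Jira version.
SUMMARY_PATH = "rest/api/2/index/summary"


def summary_url(base_url):
    """Build the index summary URL for one node's base URL."""
    return urljoin(base_url.rstrip("/") + "/", SUMMARY_PATH)


def index_matches_database(summary):
    """Return True if the index is readable and its issue count equals the database count.

    The field names used here are assumptions about the response format.
    """
    issue_index = summary["issueIndex"]
    return bool(issue_index["indexReadable"]) and (
        issue_index["countInDatabase"] == issue_index["countInIndex"]
    )


# Hypothetical response body, already parsed from JSON.
sample = {
    "nodeId": "node1",
    "issueIndex": {"indexReadable": True, "countInDatabase": 1200, "countInIndex": 1200},
}
print(summary_url("https://jira.example.com/jira"))
print(index_matches_database(sample))
```

Because each node has its own index, a check like this would need to be sent to each node directly rather than through the load balancer.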
We've also gathered some basic questions that we often get about the indexes.
I'm running low on disk space on the server that stores the shared home directory. Can I delete the
The deleted indexes will be replaced soon, so for the purpose of getting some extra space, there's no point in that. You can delete this directory only if the indexes on one node are corrupted, and you need a fresh copy.
One of the nodes in the cluster has a corrupted index.
The easiest way to fix this is to copy the index from one node to another. To do this:
Can I copy the indexes from a running Jira instance?
No, copying the indexes is unreliable, because they're being constantly updated. Such a copy would be inconsistent.
Jira makes a consistent copy of the index from a snapshot in time when a new instance is added to the cluster, and then applies all changes that have occurred since then.