Background reindex is slow after upgrading to Jira 8.10 and later


Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

If you're experiencing other problems with background reindexing, such as issue creation hanging or being slow and user requests timing out, you might be affected by JRASERVER-72045.

Problem

After upgrading to Jira 8.10.0 or later, including the LTS release 8.13.0, background reindex operations may be slower (by roughly 0-20%) than in previous Jira versions. The extra cost buys higher index consistency: in versions earlier than 8.10.0, background indexing could overwrite concurrent user data updates in the local index. The (recommended) foreground indexing is not affected.

Diagnosis

The slowdown is caused by index versioning, which was introduced in Jira 8.10 in preparation for the DBR functionality available from Jira 8.12 onwards. While DBR is mainly for Jira Data Center, index versioning is also used by Jira Server to keep the index consistent.

The background reindex process now differs from the one in Jira 8.8.1 because all index updates check the version of the entity already in the index. This guarantees index consistency when there are concurrent index updates. Previously, concurrent user actions modifying those entities could be overwritten (in the index) by the old data, so a task was executed at the end of the background reindex to try to repair that possibly inconsistent index.

However, with the versioned background reindex introduced in Jira 8.10, the reindex performs a conditional update when writing to the index, just like any other user action that modifies the index. The index is therefore consistent, but at a slightly higher cost: the version check happens for each item, and during the process the index is constantly flushed to disk.
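Conceptually, the conditional update is a compare-and-set against a per-document version number. The sketch below only illustrates that pattern and is not Jira's actual code; the class and method names are invented:

# Illustrative sketch of a versioned ("conditional") index update.
# This is NOT Jira's implementation; names and structures are invented.

class VersionedIndex:
    def __init__(self):
        self.documents = {}  # issue_id -> (version, document)

    def update_conditionally(self, issue_id, new_version, document):
        """Write the document only if it is newer than what the index already holds."""
        current = self.documents.get(issue_id)
        if current is not None and current[0] >= new_version:
            # A concurrent user action already indexed newer data; skip the stale write.
            return False
        self.documents[issue_id] = (new_version, document)
        return True

A background reindex that uses such a conditional update for every issue can no longer overwrite fresher data written by concurrent user actions, at the cost of an extra version check (and flush) per document.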

During our benchmarks, we identified that the new background reindex process can be 1 to 20% slower than in previous Jira versions. The results below come from a test performed with 60k issues and ~15 comments per issue. The time per issue is roughly constant regardless of the number of issues in the environment (i.e. it is not aggravated by having many issues in the database):

Jira Version      Background reindex time
8.8.1             7 milliseconds per issue
8.13.0            16 milliseconds per issue
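As a rough sanity check, you can translate these per-issue figures into an estimated total background reindex time for your own issue count. This is a back-of-the-envelope calculation only; real runs are affected by hardware, installed apps, and issue complexity:

# Rough estimate only: issue count multiplied by the measured per-issue time.
issues = 60_000
for version, ms_per_issue in {"8.8.1": 7, "8.13.0": 16}.items():
    total_minutes = issues * ms_per_issue / 1000 / 60
    print(f"Jira {version}: ~{total_minutes:.0f} minutes for {issues} issues")
# -> Jira 8.8.1: ~7 minutes for 60000 issues
# -> Jira 8.13.0: ~16 minutes for 60000 issues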


Cause

If you're experiencing a 1-20% increase in the time required to perform a background reindex, this is expected due to the new index versioning feature. If the difference is significantly larger, a more thorough investigation may be required to identify performance bottlenecks.


Root Cause #1


We have observed that on instances with poor disk performance, the background reindex process is significantly impacted. Before the upgrade, such instances very likely showed other symptoms of poor disk throughput, such as slow Jira issue searches. You can verify your instance's disk performance using our Test disk access speed for Jira server performance troubleshooting documentation.
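The linked documentation is the supported way to benchmark disk access. Purely as an illustration, the hypothetical probe below (not an Atlassian tool) times small fsynced writes to a directory on the same filesystem as your Jira home, which can make obviously slow storage visible:

# Very rough disk-latency probe -- not a replacement for the
# "Test disk access speed" documentation linked above.
import os, time, tempfile

TARGET_DIR = "/var/atlassian/application-data/jira"  # assumption: adjust to your Jira home filesystem
WRITES = 100
BLOCK = b"x" * 4096  # 4 KiB per write

with tempfile.NamedTemporaryFile(dir=TARGET_DIR) as f:
    start = time.monotonic()
    for _ in range(WRITES):
        f.write(BLOCK)
        f.flush()
        os.fsync(f.fileno())  # force each write to disk, similar to an index flush
    elapsed_ms = (time.monotonic() - start) * 1000

print(f"{WRITES} fsynced 4 KiB writes took {elapsed_ms:.0f} ms "
      f"({elapsed_ms / WRITES:.1f} ms per write)")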

To help troubleshoot this, new logging was introduced in Jira. After the background reindex completes, search the $JIRA_HOME/log/atlassian-jira.log file for the following log trace:

2020-10-21 16:33:20,057+0000 index-writer-stats-ISSUE-9-0 INFO rbaldasso 761x67822x1 sfdsi932 127.0.0.1 /secure/admin/jira/IndexReIndex!reindex.jspa [c.a.jira.index.WriterWithStats] [index-writer-stats] ISSUE : total stats:
{"addDocumentsMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0,"distributionCounter":{}},
"deleteDocumentsMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0,"distributionCounter":{}},
"updateDocumentsMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0,"distributionCounter":{}},
"updateDocumentConditionallyMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0,"distributionCounter":{}},
"updateDocumentsWithVersionMillis":{"count":25884,"min":24,"max":5674,"sum":13765800,"avg":531,"distributionCounter":{"10":0,"100":1,"500":19264,"1000":3838,"5000":2780,"10000":1,"30000":0,"60000":0}},
"updateDocumentsWithVersionSize":{"count":25884,"min":1,"max":1,"sum":25884,"avg":1,"distributionCounter":{"1":25884,"10":0,"100":0,"1000":0,"10000":0}},
"replaceDocumentsWithVersionMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0,"distributionCounter":{"10":0,"100":0,"500":0,"1000":0,"5000":0,"10000":0,"30000":0,"60000":0}},
"replaceDocumentsWithVersionSize":{"count":0,"min":0,"max":0,"sum":0,"avg":0,"distributionCounter":{"1":0,"10":0,"100":0,"1000":0,"10000":0}},
"optimizeMillis":{"count":1,"min":0,"max":0,"sum":0,"avg":0,"distributionCounter":{}},
"closeMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0,"distributionCounter":{}},
"commitMillis":{"count":460,"min":916,"max":4329,"sum":970038,"avg":2108,"distributionCounter":{}}},
index writer version cache stats: {"put":25884,"get":25885,"getFound":0,"clear":0}


During the investigation of one such poor disk throughput case, it was identified that Jira was taking too long to update the index files:

"updateDocumentsWithVersionMillis": {
    "count": 1036,
    "min": 24,
    "max": 2439,
    "sum": 579572,
    "avg": 559,
    "distributionCounter": {
      "10": 0,
      "100": 1,
      "500": 729,
      "1000": 190,
      "5000": 116,
      "10000": 0,
      "30000": 0,
      "60000": 0
    }
  }


updateDocumentsWithVersionMillis measures the time taken to update the Lucene index with a single issue (a single document). We normally expect the average to be below 10 milliseconds and 99% of updates to take less than 100 milliseconds.
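If you want to extract these numbers automatically, a small script along the following lines can pull the total stats JSON out of the log and compare updateDocumentsWithVersionMillis against the expectations above. The log path and the exact line layout are assumptions based on the sample trace shown earlier:

# Extract index-writer-stats from atlassian-jira.log and check
# updateDocumentsWithVersionMillis against the rough expectations above.
import json, re

LOG_FILE = "/var/atlassian/application-data/jira/log/atlassian-jira.log"  # adjust to your $JIRA_HOME

with open(LOG_FILE, encoding="utf-8") as log:
    content = log.read()

# The writer stats are logged between "total stats:" and the version cache stats.
for match in re.finditer(r"total stats: (\{.*?\}), index writer version cache stats",
                         content, re.DOTALL):
    stats = json.loads(match.group(1))
    update = stats["updateDocumentsWithVersionMillis"]
    print(f"updateDocumentsWithVersionMillis: count={update['count']}, "
          f"avg={update['avg']} ms, max={update['max']} ms")
    if update["avg"] > 10:
        print("  -> average above ~10 ms; disk throughput may be a bottleneck")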

You can also collect thread dumps during the background reindex process and look for stuck threads that might be delaying the process.
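For example, assuming a host where the JDK's jstack tool is available and you know the PID of the Jira JVM, something like the following captures a series of dumps while the reindex runs:

# Capture several thread dumps of the Jira JVM during the background reindex.
# Assumes the JDK's jstack tool is on the PATH; the PID is a placeholder.
import subprocess, time

JIRA_PID = 12345  # assumption: replace with the PID of your Jira JVM
DUMPS = 6
INTERVAL_SECONDS = 10

for i in range(DUMPS):
    dump = subprocess.run(["jstack", str(JIRA_PID)],
                          capture_output=True, text=True, check=True).stdout
    with open(f"jira-threaddump-{i}.txt", "w", encoding="utf-8") as out:
        out.write(dump)
    time.sleep(INTERVAL_SECONDS)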

Resolution

Root Cause #1

  • Increase disk throughput by migrating to faster storage, e.g. SSD disks.


For any further assistance, please contact Support.

