Reindex progresses very slowly after Postgres upgrade or restore

Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.

Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Summary

After upgrading the Postgres database version, or restoring from a database backup, Jira's reindex progresses very slowly.

Environment

Jira Data CenteR
PostgreSQL database (running on AWS RDS or on-premise)

Diagnosis

CPU usage in the database is usually high but on the application server, it's fairly low. After collecting thread dumps, it can be confirmed that Jira is reading data from the database and that all indexing threads are runnable. To verify which table Jira is reading data from we can run the following SQL query:

SELECT pid, age(clock_timestamp(), query_start), usename, query 
FROM pg_stat_activity 
WHERE query != '<IDLE>' AND query NOT ILIKE '%pg_stat_activity%' 
ORDER BY query_start desc;

In the output, we can notice that all the threads are reading data from a specific table, and, in the example below, from the changeitem and changegroup tables:

jiradb   | 26403 |    16386 | postgres | PostgreSQL JDBC Driver | active | SELECT CG.ID, CG.issueid, CG.AUTHOR, CG.CREATED, CI.ID, CI.groupid, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING,CI.NEWVALUE, CI.NEWSTRING FROM public.changegroup CG INNER JOIN public.changeitem CI ON CG.ID = CI.groupid WHERE CG.issueid=$1 ORDER BY CG.CREATED ASC, CI.ID ASC
jiradb   | 10757 |    16386 | postgres | PostgreSQL JDBC Driver | active | SELECT CG.ID, CG.issueid, CG.AUTHOR, CG.CREATED, CI.ID, CI.groupid, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING FROM public.changegroup CG INNER JOIN public.changeitem CI ON CG.ID = CI.groupid WHERE CG.issueid=$1 ORDER BY CG.CREATED ASC, CI.ID ASC
jiradb   | 27169 |    16386 | postgres | PostgreSQL JDBC Driver | active | SELECT CG.ID, CG.issueid, CG.AUTHOR, CG.CREATED, CI.ID, CI.groupid, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING FROM public.changegroup CG INNER JOIN public.changeitem CI ON CG.ID = CI.groupid WHERE CG.issueid=$1 ORDER BY CG.CREATED ASC, CI.ID ASC
jiradb   | 10752 |    16386 | postgres | PostgreSQL JDBC Driver | active | SELECT CG.ID, CG.issueid, CG.AUTHOR, CG.CREATED, CI.ID, CI.groupid, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING FROM public.changegroup CG INNER JOIN public.changeitem CI ON CG.ID = CI.groupid WHERE CG.issueid=$1 ORDER BY CG.CREATED ASC, CI.ID ASC
jiradb   | 27168 |    16386 | postgres | PostgreSQL JDBC Driver | active | SELECT CG.ID, CG.issueid, CG.AUTHOR, CG.CREATED, CI.ID, CI.groupid, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING FROM public.changegroup CG INNER JOIN public.changeitem CI ON CG.ID = CI.groupid WHERE CG.issueid=$1 ORDER BY CG.CREATED ASC, CI.ID ASC
jiradb   |  1686 |    16386 | postgres | PostgreSQL JDBC Driver | active | SELECT CG.ID, CG.issueid, CG.AUTHOR, CG.CREATED, CI.ID, CI.groupid, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING FROM public.changegroup CG INNER JOIN public.changeitem CI ON CG.ID = CI.groupid WHERE CG.issueid=$1 ORDER BY CG.CREATED ASC, CI.ID ASC

Cause

During the upgrade or restore, the database data and indexes get fragmented, and SELECT queries use a non-optimized query plan.

Also, stale data are left in the database that won't be cleared up until the autovacuum is executed (by the RDS which happens only on a scheduled basis or manually in on-premise DBs).

This leads to queries taking much more time than they should, leading to a slow reindex on tables that have a large amount of data.

Solution

To avoid conflicting database updates or corrupted data, it is preferable to run these commands during a maintenance window when the application is stopped.

VACUUM FULLrequires exclusive lock on the table it is working on, and therefore cannot be done in parallel with other use of the table.

It is necessary to perform 2 steps to fix the poor database performance:

1. To VACUUM the database to reclaim space used by stale data, with the analyze option, so a new query plan is created for Jira's database, and queries are optimized to use the best indexes available.

2. To REINDEX the database to rebuild the index using the data stored in the index's table, replacing the old copy of the index.

Option 1 - Using Postgres tools (vacuumdb and reindexdb) in shell command line

VACUUM

The vacuumdb tool provided by Postgres. This tool is provided by the Postgres client, which can be installed in Linux environments by installing the package using yum or apt (it's not required to install the Postgres server).

The command below will run the vacuum on all tables in the specified database:

vacuumdb --echo --full --verbose --analyze -h <RDS-IP> -p <PORT> -U <database-username> -d <jira-database>

REINDEX

The reindexdb tool provided by Postgres. This tool is provided by the Postgres client, which can be installed in Linux environments by installing the package using yum or apt (it's not required to install the Postgres server).

The command below will reindex the specified database:

reindexdb --echo --verbose -h <RDS-IP> -p <PORT> -U <database-username> -d <jira-database>

Option 2 - If the Postgres tools (vacuumdb and indexdb) can't be installed, it is possible to run PSQL queries directly in the database.

VACUUM

The PSQL below will run the vacuum on every table in the current database that the current user has permission to vacuum.

1
2

\c <jiradb>;
VACUUM FULL VERBOSE ANALYZE;

Alternative vacuum commands for individual tables - The following SQL statements will create a set of vacuum statements, one for each table in the database, and you need to execute them in your PSQL after having entered Jira's database.

SELECT 
   'VACUUM(FULL, ANALYZE, VERBOSE) '|| '"' || table_name || '"' || ';'
FROM 
   information_schema.tables
WHERE 
  table_catalog = '<jira-database>' AND 
  table_schema = 'public';

REINDEX

The PSQL below will run the reindex operation:

REINDEX (VERBOSE) DATABASE <jira-database>;

After the commands finish their execution, you may trigger a new Jira Full reindex to verify the speed.

In case the problem persists, please contact Atlassian Support for further investigation.

Atlassian Support

Reindex progresses very slowly after Postgres upgrade or restore

Summary

Environment

Diagnosis

Cause

Solution

Related documents

Still need help?