High CPU usage on Jira DB server
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
High CPU usage is observed on the RDS (Relational Database Service) instance configured as the database server for Jira, which may cause an outage.
Environment
AWS RDS for Postgres 11.16
Diagnosis
- High CPU usage is observed on the RDS instance hosting Jira's database, and the following traces appear in the application logs:
Caused by: org.postgresql.util.PSQLException: FATAL: remaining connection slots are reserved for non-replication superuser and rds_superuser connections
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2565)
at org.postgresql.core.v3.QueryExecutorImpl.readStartupMessages(QueryExecutorImpl.java:2677)
at org.postgresql.core.v3.QueryExecutorImpl.<init>(QueryExecutorImpl.java:147)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:273)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:51)
at org.postgresql.Driver.makeConnection(Driver.java:465)
at org.postgresql.Driver.connect(Driver.java:264)
- There are other errors suggesting that connection close calls are failing:
ERROR anonymous 437x158417x1 - 185.63.75.17,64.252.144.158,172.28.116.65 /rest/com-spartez-support-chat/1.0/presence/count [o.o.core.entity.GenericDAO] Error closing SQLProcessor[commitMode=READONLY,connection=null,sql=SELECT NODE_ID, NODE_STATE, TIMESTAMP, IP, CACHE_LISTENER_PORT, NODE_BUILD_NUMBER, NODE_VERSION FROM public.clusternode,parameters=[]]; info=[SELECT NODE_ID, NODE_STATE, TIMESTAMP, IP, CACHE_LISTENER_PORT, NODE_BUILD_NUMBER, NODE_VERSION FROM public.clusternode]
java.lang.NullPointerException
at com.atlassian.jira.ofbiz.sql.ConnectionWrapper.close(ConnectionWrapper.java:77)
at com.atlassian.jira.diagnostic.connection.DiagnosticConnection.close(DiagnosticConnection.java:91)
at org.ofbiz.core.entity.jdbc.SQLProcessor.closeConnection(SQLProcessor.java:287)
at org.ofbiz.core.entity.jdbc.SQLProcessor.close(SQLProcessor.java:242)
at org.ofbiz.core.entity.GenericDAO.closeSafely(GenericDAO.java:1501)
- Lastly, there are multiple "Dangerous use of multiple connections: taken/replaced" warnings for different plugins:
WARN [c.a.jira.ofbiz.ConnectionPoolHealthSqlInterceptor] Dangerous use of multiple connections: taken => count=2; marks=[1-0]; pool=71/40
WARN [c.a.jira.ofbiz.ConnectionPoolHealthSqlInterceptor] Dangerous use of multiple connections: replaced => count=1; marks=[0-0]; pool=70/40
- Determine the number of connections available within Postgres by executing the following SQL:
SELECT name, current_setting(name)
FROM pg_settings
WHERE name = 'max_connections';
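To gauge how close the instance is to this limit, the following query (a minimal sketch based on the standard pg_stat_activity view) compares the current number of connections against max_connections; if the two values are close, connection slots are being exhausted:
-- Compare the current connection count against the configured limit
SELECT count(*) AS current_connections,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity;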
Understand the hardware needs of the instance based on Infrastructure recommendations for enterprise Jira instances on AWS. This page provides recommendations for deploying appropriate infrastructure for Large and XLarge instances.
Check if there is a JMX exporter configured as an agent in the setenv file:
JAVA_OPTS="-javaagent:/var/atlassian/application-data/jira/jmx-exporter.jar=8888:/var/atlassian/application-data/jira/jmx-exporter-config.yml ${JAVA_OPTS}"
Cause
There can be multiple contributing causes:
- By default, the max_connections setting within Postgres is set to 100 connections. If Jira is configured to use more connections than this setting allows, the available slots can be exhausted and result in a Jira outage (see the query sketch after this list).
- The RDS instance might not have sufficient hardware resources, and you might need to scale it up.
- In a large instance with many users and groups, the JMX exporter can cause this problem.
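To confirm whether Jira's connection pool (rather than another client) is consuming the available slots, a query along the following lines (a sketch against the standard pg_stat_activity view) breaks the current connections down by client application and state:
-- Show which applications hold connections, and in what state
SELECT application_name, state, count(*) AS connections
FROM pg_stat_activity
GROUP BY application_name, state
ORDER BY connections DESC;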
Solution
- You will need to increase the max_connections setting in the Postgres configuration (see the sketch after this list). For more information, see Tuning Your PostgreSQL Server or contact a DBA to assist.
- Instead of increasing max_connections, you can also consider scaling up the RDS instance's hardware, both CPU and, more importantly, RAM; RDS derives the default max_connections value from the instance memory, so a larger instance automatically raises the limit.
- If the above two steps do not help, disable the JMX exporter temporarily to see if this improves the situation. This step is only to rule out whether the JMX exporter is causing the issue.
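As an illustration of the first solution, on a self-managed PostgreSQL server the limit can be raised with ALTER SYSTEM followed by a restart; on RDS, max_connections is changed through a custom DB parameter group rather than a configuration file. The value 200 below is only an example and should be sized to Jira's connection pool plus headroom:
-- Example only: raise the limit on a self-managed PostgreSQL server,
-- then restart PostgreSQL for the change to take effect
ALTER SYSTEM SET max_connections = 200;
-- After the restart, verify the new value
SHOW max_connections;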