Jira applications stall due to a lost lock in the DBCP pool caused by StackOverflowError
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Problem
The Jira application stops responding (stalls) because threads cannot obtain a connection from the DBCP pool. The cause is a lost lock: a thread that held the pool's lock never released it, which deadlocks every thread that reaches this part of the code. Note that there is no problem with the DB configuration or the network.
The thread dump may show a large number of threads waiting at the same point in the code with the same stack trace:
"http-nio-8000-exec-1003" #388843 daemon prio=5 tid=0x00007fbd7c0e3000 nid=0x7e80 waiting on condition [0x00007fbcfe331000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000006810eba00> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at org.apache.commons.pool2.impl.LinkedBlockingDeque.size(LinkedBlockingDeque.java:989)
    at org.apache.commons.pool2.impl.GenericObjectPool.getNumIdle(GenericObjectPool.java:693)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:415)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
    at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
    at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
...
In this case, the lost lock is <0x00000006810eba00> (a java.util.concurrent.locks.ReentrantLock$NonfairSync).
Diagnosis
Diagnostic Steps
- Collect a thread dump and check the waiting threads; see the example stack trace above.
- Check the logs for a StackOverflowError exception.
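The first diagnostic step can be partly automated. The following is a minimal sketch, not an Atlassian tool: it assumes the thread dump is plain text in the usual jstack format and counts how many threads are parked on each lock address. The class name ParkedThreadCounter and the method countParkedByLock are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts threads parked on each lock address in a thread dump.
class ParkedThreadCounter {

    // Matches thread dump lines such as:
    // - parking to wait for <0x00000006810eba00> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    private static final Pattern PARKED =
            Pattern.compile("parking to wait for\\s+<(0x[0-9a-fA-F]+)>");

    /** Returns a map of lock address to the number of threads parked on it. */
    static Map<String, Integer> countParkedByLock(String threadDump) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher matcher = PARKED.matcher(threadDump);
        while (matcher.find()) {
            counts.merge(matcher.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java ParkedThreadCounter <thread-dump-file>");
            return;
        }
        String dump = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        // Print one line per lock address: address, then the number of waiting threads.
        countParkedByLock(dump).forEach((lock, n) -> System.out.println(lock + " " + n));
    }
}
```

A high count alone is not conclusive, because idle worker threads also park while waiting for work. The pattern described in this article is many threads parked on the same ReentrantLock$NonfairSync address with org.apache.commons.pool2 frames in their stack traces.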
Cause
The problem is caused by the following chain of events:
- A thread obtains the lock for the DB pool.
- The thread continues executing code while holding the lock.
- The thread then hits a StackOverflowError (see Jira applications stall due to StackOverflowError exception).
- The thread still holds the lock because it was unable to release it.
- For the rest of the application the lock is lost, which deadlocks every thread that reaches this part of the code.
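The chain of events above can be reproduced in isolation. The sketch below is a deliberately simplified illustration, not the commons-pool code: the class LostLockDemo and its methods are hypothetical, and the unlock call is placed after the failing code with no finally block so that the effect is deterministic. In real code that uses try/finally, one plausible way to lose the lock is for the StackOverflowError to strike again inside the unlock call itself while the stack is almost exhausted.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo: a thread dies from StackOverflowError while holding a lock.
class LostLockDemo {

    // Unbounded recursion; the JVM eventually throws StackOverflowError.
    static int recurse(int depth) {
        return recurse(depth + 1) + 1;
    }

    /** Returns true if the lock is still held after the worker thread has finished. */
    static boolean lockIsLostAfterOverflow() {
        final ReentrantLock lock = new ReentrantLock();

        Thread worker = new Thread(() -> {
            lock.lock();            // step 1: the thread obtains the lock
            try {
                recurse(0);         // steps 2 and 3: code runs and overflows the stack
                lock.unlock();      // never reached, so the lock is not released
            } catch (StackOverflowError e) {
                // Swallowed here only to keep the demo output quiet.
            }
        }, "worker");

        worker.start();
        try {
            worker.join();
            // Steps 4 and 5: the owner thread is gone, but the lock is still marked as held.
            boolean stillHeld = lock.isLocked();
            boolean acquired = lock.tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.unlock();
            }
            return stillHeld && !acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the worker", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("lock lost: " + lockIsLostAfterOverflow());
    }
}
```

A ReentrantLock can only be released by the thread that owns it, so once that thread has ended, no other thread can free the lock. This is why the only workaround is to restart the application.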
Workaround
Restart the application.
Resolution
Fix the code that caused the StackOverflowError.
See the related KB article: Jira applications stall due to StackOverflowError exception