Performance issue when Confluence is connecting to its Database via Kerberos-based authentication
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Problem
Confluence experiences performance degradation that leads to application unresponsiveness when it connects to its Microsoft SQL Server database using a Kerberos-based authentication scheme and the Microsoft JDBC Driver for SQL Server. This issue was observed only after upgrading Confluence from version 6.4.x to 6.6.x.
Diagnosis
Capture a set of thread dumps and analyze them using the following command:
$ grep -A1 -e '-exec-' conf_threads.* | grep State | sort | uniq -c
Here is an example result set from the thread dumps (taken at the time when Confluence was unresponsive):
22 conf_threads.1521599075.txt- java.lang.Thread.State: TIMED_WAITING (parking)
127 conf_threads.1521599075.txt- java.lang.Thread.State: WAITING (on object monitor)
1 conf_threads.1521599075.txt- java.lang.Thread.State: WAITING (parking)
2 conf_threads.1521599087.txt- java.lang.Thread.State: RUNNABLE
147 conf_threads.1521599087.txt- java.lang.Thread.State: TIMED_WAITING (parking)
1 conf_threads.1521599087.txt- java.lang.Thread.State: WAITING (parking)
149 conf_threads.1521599098.txt- java.lang.Thread.State: TIMED_WAITING (parking)
1 conf_threads.1521599098.txt- java.lang.Thread.State: WAITING (parking)
149 conf_threads.1521599109.txt- java.lang.Thread.State: TIMED_WAITING (parking)
1 conf_threads.1521599109.txt- java.lang.Thread.State: WAITING (parking)
149 conf_threads.1521599119.txt- java.lang.Thread.State: TIMED_WAITING (parking)
1 conf_threads.1521599119.txt- java.lang.Thread.State: WAITING (parking)
149 conf_threads.1521599130.txt- java.lang.Thread.State: TIMED_WAITING (parking)
1 conf_threads.1521599130.txt- java.lang.Thread.State: WAITING (parking)
Above, most of the HTTP threads are in a WAITING or TIMED_WAITING state.
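The same tally can be reproduced outside the shell. The sketch below is a minimal Python illustration, not part of the original diagnosis. It assumes that HTTP worker thread names contain -exec- followed by a number (for example http-nio-8090-exec-12, a hypothetical name) and that the state line directly follows the thread header, which is the same assumption the grep -A1 call relies on. The function name count_exec_states is made up for this example.

```python
import glob
import re
from collections import Counter

# Header line of an HTTP worker thread, e.g. "http-nio-8090-exec-12" (assumed naming).
EXEC_HEADER = re.compile(r'^"[^"]*-exec-\d+"')
# State line that follows the header in a HotSpot thread dump.
STATE_LINE = re.compile(r'java\.lang\.Thread\.State:\s+(.+?)\s*$')


def count_exec_states(dump_text):
    """Count thread states of HTTP worker threads in one thread dump."""
    counts = Counter()
    lines = dump_text.splitlines()
    for i, line in enumerate(lines[:-1]):
        if EXEC_HEADER.match(line):
            match = STATE_LINE.search(lines[i + 1])
            if match:
                counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Summarize every dump in the current directory, one block per file.
    for path in sorted(glob.glob("conf_threads.*")):
        with open(path, encoding="utf-8", errors="replace") as handle:
            states = count_exec_states(handle.read())
        for state, count in sorted(states.items()):
            print(f"{count:5d} {path} {state}")
```

A dump in which nearly all worker threads are waiting and only a handful are RUNNABLE, as in the example output above, is the pattern to look for.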
In the above example there are 2 RUNNABLE threads in the second thread dump. Opening the second thread dump in a text editor and searching for the RUNNABLE threads shows that the C3P0 thread below was blocking all of the user HTTP threads. This thread is used to acquire connections and interact with the database. The C3P0 threads were stuck waiting to acquire a connection to the SQL Server database; more precisely, they were waiting on the Kerberos server (KDC) to authenticate the connection between the application and the database:
"C3P0PooledConnectionPoolManager[identityToken->adswerwasafgd|214ftew34]-HelperThread-#2" #2242 daemon prio=5 os_prio=0 tid=0x00007f6a44027800 nid=0x2282 runnable [0x00007f6961332000]
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
- locked <0x000000073d9f13b0> (a java.net.SocksSocketImpl)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.security.krb5.internal.TCPClient.<init>(NetClient.java:63)
at sun.security.krb5.internal.NetClient.getInstance(NetClient.java:43)
at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:393)
at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:364)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.krb5.KdcComm.send(KdcComm.java:348)
at sun.security.krb5.KdcComm.sendIfPossible(KdcComm.java:253)
at sun.security.krb5.KdcComm.send(KdcComm.java:229)
at sun.security.krb5.KdcComm.send(KdcComm.java:200)
at sun.security.krb5.KrbTgsReq.send(KrbTgsReq.java:246)
at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:261)
at sun.security.krb5.KrbCred.<init>(KrbCred.java:86)
at sun.security.jgss.krb5.InitialToken$OverloadedChecksum.<init>(InitialToken.java:114)
at sun.security.jgss.krb5.InitSecContextToken.<init>(InitSecContextToken.java:59)
at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:741)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.microsoft.sqlserver.jdbc.KerbAuthentication.intAuthHandShake(KerbAuthentication.java:163)
at com.microsoft.sqlserver.jdbc.KerbAuthentication.GenerateClientContext(KerbAuthentication.java:401)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.sendLogon(SQLServerConnection.java:4265)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.logon(SQLServerConnection.java:3374)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.access$100(SQLServerConnection.java:85)
at com.microsoft.sqlserver.jdbc.SQLServerConnection$LogonCommand.doExecute(SQLServerConnection.java:3338)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7342)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2688)
The following configuration is present in <confluence-home>/confluence.cfg.xml:
<property name="hibernate.connection.driver_class">com.microsoft.sqlserver.jdbc.SQLServerDriver</property>
...
<property name="hibernate.connection.url">jdbc:sqlserver://localhost:1433;databaseName=confluence;domain=global;integratedSecurity=true;authenticationScheme=JavaKerberos;</property>
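To check quickly whether a given connection URL uses this Kerberos setup, the properties after the server address can be split on semicolons and inspected. The following is a minimal sketch under stated assumptions: the helper names parse_sqlserver_url and uses_kerberos are hypothetical, property names are compared case-insensitively, and values escaped with braces are not handled.

```python
def parse_sqlserver_url(url):
    """Split a jdbc:sqlserver:// URL into its server part and a dict of properties."""
    prefix = "jdbc:sqlserver://"
    if not url.startswith(prefix):
        raise ValueError("not a Microsoft JDBC SQL Server URL")
    server, _, rest = url[len(prefix):].partition(";")
    props = {}
    for part in rest.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            # Lower-case the key so lookups do not depend on the spelling used.
            props[key.strip().lower()] = value.strip()
    return server, props


def uses_kerberos(url):
    """Return True if the URL requests integrated security with JavaKerberos."""
    _, props = parse_sqlserver_url(url)
    return (
        props.get("integratedsecurity", "").lower() == "true"
        and props.get("authenticationscheme", "").lower() == "javakerberos"
    )


if __name__ == "__main__":
    url = (
        "jdbc:sqlserver://localhost:1433;databaseName=confluence;"
        "domain=global;integratedSecurity=true;authenticationScheme=JavaKerberos;"
    )
    print(uses_kerberos(url))
```

An instance whose URL matches this check and whose driver class is com.microsoft.sqlserver.jdbc.SQLServerDriver fits the configuration described in this article.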
Cause
The diagnosis above indicates that the performance degradation was caused by HTTP thread exhaustion: the database connection pool threads were blocked, waiting for the Kerberos server to authenticate the connections from Confluence.
However, this occurred only after the Confluence upgrade from version 6.4.x to version 6.6.x. Between these versions there is one major change that specifically affects customers using SQL Server: when upgrading to Confluence 6.6 or later, Confluence automatically switches the database driver from the open source jTDS driver for Microsoft SQL Server to the official Microsoft JDBC Driver for SQL Server.
In conclusion, this suggests that the issue affects only Confluence instances that connect to a SQL Server database using a Kerberos-based authentication scheme and the Microsoft JDBC Driver for SQL Server.
Workaround
To work around this issue, use one of the following options:
- Use native SQL Server authentication to connect Confluence to its MS SQL Server database instead of a Kerberos-based authentication scheme.
- Connect Confluence to its database through a datasource that uses the jTDS driver instead of the Microsoft SQL Server driver.
Automatic fallback to the older jTDS driver is not supported. Confluence 6.6 and later enforces an automatic migration to the Microsoft SQL Server driver at the code level whenever the application starts with a jTDS driver configured.
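For the first workaround, the connection URL would no longer carry the Kerberos-related properties. The sketch below shows one way to derive such a URL from the existing one. It is an illustration only: strip_kerberos_props is a hypothetical helper, it removes just integratedSecurity and authenticationScheme and leaves every other property untouched, and it does not handle brace-escaped values. The database user name and password then need to be supplied separately (assumed here to be the hibernate.connection.username and hibernate.connection.password properties in confluence.cfg.xml; verify this against your own configuration).

```python
# Properties that switch the Microsoft JDBC driver to Kerberos authentication.
KERBEROS_PROPS = {"integratedsecurity", "authenticationscheme"}


def strip_kerberos_props(url):
    """Return the JDBC URL without the Kerberos-related connection properties."""
    head, *props = url.split(";")
    kept = [
        prop
        for prop in props
        # Skip empty segments (trailing ';') and the Kerberos properties.
        if prop and prop.split("=", 1)[0].strip().lower() not in KERBEROS_PROPS
    ]
    return ";".join([head] + kept) + ";"


if __name__ == "__main__":
    original = (
        "jdbc:sqlserver://localhost:1433;databaseName=confluence;"
        "domain=global;integratedSecurity=true;authenticationScheme=JavaKerberos;"
    )
    print(strip_kerberos_props(original))
```

Back up confluence.cfg.xml and stop Confluence before changing the URL, then restart and confirm that new database connections are established without calls into the Kerberos classes shown in the stack trace.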