Troubleshooting Confluence performance issues with thread dumps

Platform Notice: Server and Data Center Only - This article only applies to Atlassian products on the server and data center platforms.

Problem

Confluence is running slowly, and you need to find out which part of it is responsible. You may also have noticed very high CPU usage on your application server.

This page describes how to collect thread dumps (a breakdown of what every thread in a Java process is doing) together with the output of top (which shows the resources each native OS thread is consuming). The same breakdown can normally be collected with a profiler such as jProfiler or other options, as discussed here. This example uses native (free) tools to collect the information.

This only works on *nix systems and requires jstack to be installed (it normally is by default, as jstack is part of the JDK). For Windows, please see Generating thread dumps on windows or Generating a Thread Dump.
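
Before starting, it can be worth confirming that the required tools are on the PATH of the user that runs Confluence. The following is a minimal sketch, assuming a POSIX shell; the helper names have_tool and check_tools are hypothetical and not part of any Atlassian tooling:

```shell
# Succeeds (exit status 0) if the named command can be found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report every missing tool on stderr; returns 1 if any tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! have_tool "$tool"; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_tools top jstack
```

If jstack is reported as missing, it can usually be found in the bin directory of the JDK installation.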

Capturing CPU Diagnostics

Resolution

  1. Run the following command to find your Confluence instance. If this returns multiple results, then you have more than one Tomcat instance running on your machine. You'll need to identify your Confluence instance from these results manually:

    ps aux | grep -i catalina.startup.Bootstrap
  2. Set the variable CONF_PID to the process ID returned. This is the second field of the output from step 1:

    CONF_PID=<your_process_id>
    Alternative methods for obtaining the PID

    You may be able to automatically identify and set the Confluence PID variable by using a command such as:

    CONF_PID=`ps aux | grep -i confluence | grep -i java | awk  -F '[ ]*' '{print $2}'`
  3. After the CONF_PID variable is set, run the following script as the user that runs the Confluence application. The script generates 6 sets of CPU usage info and thread dumps at 10-second intervals, running for a total of 60 seconds:

    Option 1 - JSTACK
    for i in $(seq 6); do top -b -H -p $CONF_PID -n 1 > conf_cpu_usage.`date +%s`.txt; jstack -l $CONF_PID > conf_threads.`date +%s`.txt; sleep 10; done
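    Option 2 - KILL -3 (the JVM writes the thread dumps to its standard output, usually logs/catalina.out in the Confluence installation directory, rather than to separate files)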
    for i in $(seq 6); do top -b -H -p $CONF_PID -n 1 > conf_cpu_usage.`date +%s`.txt; kill -3 $CONF_PID; sleep 10; done

    If this gives you the error "Unable to open socket file: target process not responding or HotSpot VM not loaded", or if any of the files generated are completely empty, please make sure you're executing this script as the same user that started your Confluence process.

    Alternative methods for capturing thread dumps

    There are a few scripts which automatically grab the PID, then generate the CPU usage info and thread dumps. However, these scripts assume that Confluence is the only Java application on the host. If you have multiple Java processes running, the automatic method will not work, as you must manually identify the correct Java process for the Confluence application:

    Alternative 1:

    for i in $(seq 6); do top -b -H -p `ps -ef | grep java | awk 'FNR == 1 {print $2}'` -n 1 > conf_cpu_usage.`date +%s`.txt; jstack -l `ps -ef | grep java | awk 'FNR == 1 {print $2}'` > conf_threads.`date +%s`.txt; sleep 10; done

    Alternative 2:

    for i in $(seq 6); do top -b -H -p `pgrep -f java` -n 1 > conf_cpu_usage.`date +%s`.txt; jstack -l `pgrep -f java` > conf_threads.`date +%s`.txt; sleep 10; done

    Alternative 3:

    A set of scripts has been designed to make the generation of thread dumps a little easier and can be found at https://bitbucket.org/atlassianlabs/atlassian-support/.
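
The manual steps above can also be wrapped in a small script. The following is a minimal sketch, not an official Atlassian script: the function names require_single_pid and capture_diagnostics are hypothetical, and it assumes a POSIX shell plus the same top and jstack commands used in Option 1. It refuses to continue unless exactly one process matches, and it reuses one timestamp per sample so that each CPU snapshot can be paired with its thread dump.

```shell
# Print the PID if exactly one was passed in; otherwise fail with a message.
# This guards against hosts running more than one matching Java process.
require_single_pid() {
    if [ "$#" -ne 1 ]; then
        echo "expected exactly one matching process, found $#" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Capture CPU usage and a thread dump for a PID.
# Arguments: PID, number of samples (default 6), seconds between samples (default 10).
capture_diagnostics() {
    pid=$1
    samples=${2:-6}
    interval=${3:-10}
    i=0
    while [ "$i" -lt "$samples" ]; do
        ts=$(date +%s)   # shared by both files of this sample
        top -b -H -p "$pid" -n 1 > "conf_cpu_usage.$ts.txt"
        jstack -l "$pid" > "conf_threads.$ts.txt"
        sleep "$interval"
        i=$((i + 1))
    done
}

# Example (run as the user that started Confluence):
#   CONF_PID=$(require_single_pid $(pgrep -f catalina.startup.Bootstrap)) || exit 1
#   capture_diagnostics "$CONF_PID"
```

The unquoted $(pgrep ...) in the example is deliberate: each PID returned becomes a separate argument, so the helper can count the matches.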

Reading CPU Diagnostics

Two types of files will be generated by the script:

  • CPU usage info - this shows how much CPU each Confluence thread is consuming at that snapshot in time
  • Thread dump - this shows what each thread was doing at that snapshot in time

Looking at both sets of data together can help locate a problematic process:

  1. Look in the resulting CPU usage files to identify which threads are consistently using a lot of CPU time.

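  2. Take the thread IDs of the threads using the most CPU time (shown in the PID column of the top -H output) and convert them to hexadecimal. For example, 11159 becomes 0x2b97.

  3. Search for the hex values in the thread dumps, where they appear in each thread's nid field (for example, nid=0x2b97), to figure out what these high-CPU threads are doing.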
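
The three steps above can be sketched in shell as follows. This is an illustrative sketch rather than an official tool: the function names busiest_threads, tid_to_nid and find_thread are hypothetical, it assumes the top output contains a header row with PID and %CPU columns, and it assumes the thread dump records native thread IDs in hexadecimal as nid=0x... (the format the steps above rely on).

```shell
# List the busiest threads in one `top -b -H` snapshot, one per line, as
# "<%CPU> <thread id> <hex thread id>". The %CPU column is located by its
# header name because the column order of top can be customised.
# Arguments: snapshot file, number of threads to show (default 5).
busiest_threads() {
    awk '
        !cols && $1 == "PID" {
            for (i = 1; i <= NF; i++) if ($i == "%CPU") cols = i
            next
        }
        cols && NF >= cols { printf "%s %d 0x%x\n", $cols, $1, $1 }
    ' "$1" | sort -rn | head -n "${2:-5}"
}

# Convert a decimal thread ID to the hexadecimal form used in thread dumps.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Print the thread header lines whose nid matches the given decimal thread ID.
# Arguments: decimal thread ID, then one or more thread dump files.
find_thread() {
    tid=$1
    shift
    grep -w -- "nid=$(tid_to_nid "$tid")" "$@"
}

# Example:
#   busiest_threads conf_cpu_usage.1611100000.txt
#   find_thread 11159 conf_threads.*.txt
```

A thread that appears near the top of busiest_threads in several consecutive snapshots is the one worth looking up with find_thread.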


Performance Data Collector

The Performance Data Collector is a server-side, standalone application that exposes a number of REST APIs for collecting performance data. It can be used to collect data, such as thread dumps, disk speed and CPU usage information, to troubleshoot performance problems.

See How to use the Performance Data Collector for more information. 

Last modified on Jan 20, 2021
