Hipchat Server CPU Spike caused by _ohai_btf.py processes after 1.4.1 upgrade

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the server and data center platforms.

This is for an outdated version of Hipchat Server

 This article applies to a version of Hipchat Server which will be deprecated soon. After that period the version will no longer be supported.

When will my version be deprecated?

The following versions have been deprecated:

  • Hipchat Server 1.3 (EOL Date: Aug 17, 2017)
  • Hipchat Server 2.0 (EOL Date: Jun 17, 2018)
  • Hipchat Server 2.1 (EOL Date: Dec 8, 2018)

The following version will be deprecated soon:

  • Hipchat Server 2.2 (EOL Date: May 30, 2019)

You can read more about Atlassian's End of Life policy here.

You should upgrade to a more recent version of Hipchat Server as soon as you can to take advantage of new features, and security and bug fixes.

Problem

CPU usage on the machine hosting Hipchat Server stays constantly high after upgrading to version 1.4.1, causing the following problems:

  • Hipchat Server is unresponsive / slow
  • Unable to log in to the web interface via https://<fqdn>
  • Unable to SSH to the Hipchat Server

The CPU utilisation can be observed in runtime.log, specifically in the output of the + /opt/atlassian/hipchat/sbin/_stats.py --show section:

runtime.log.1:CPU: 98.0% of 2 cores
runtime.log.1-Clock: 2016-06-14 08:25:23 UTC
..
runtime.log.1:CPU: 99.1% of 2 cores
runtime.log.1-Clock: 2016-06-14 09:40:36 UTC
..
runtime.log.1:CPU: 100.0% of 2 cores
runtime.log.1-Clock: 2016-06-14 11:13:30 UTC
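An excerpt like the one above can be pulled out of the logs with grep. The following is a minimal sketch, not an official Atlassian tool. The log directory /var/log/hipchat and the helper name show_cpu_samples are assumptions, so adjust LOG_DIR to wherever runtime.log lives on your server. It also assumes the Clock line directly follows the CPU line, as it does in the excerpt.

```shell
#!/bin/bash
# LOG_DIR is an assumption; set it to the directory that holds runtime.log.
LOG_DIR="${LOG_DIR:-/var/log/hipchat}"

# Hypothetical helper: print every "CPU:" line plus the line after it.
show_cpu_samples() {
    # -H  always prefix each line with the file name
    # -A1 also print one line of trailing context (the "Clock:" line)
    # -s  stay quiet about missing or unreadable files
    grep -s -H -A1 'CPU:' "$@" || true
}

show_cpu_samples "$LOG_DIR"/runtime.log*
```

Passing several rotated logs at once (runtime.log, runtime.log.1, and so on) makes it easier to see when the CPU usage started to climb.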

Diagnosis

Environment

  • Hipchat Server 1.4.1

Diagnostic Steps

The following appears in runtime.log under the + ps auxwww section. The process list of the Hipchat Server is flooded with _ohai_btf.py processes:

root     21022  0.0  0.0  11112   192 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21025  0.0  0.0  11112   188 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21026  0.0  0.0  11112   184 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21028  0.0  0.0  11112   192 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21030  0.0  0.0  11112   184 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21031  0.0  0.0  11112   184 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21032  0.0  0.0  11112   188 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21033  0.0  0.0  11112   192 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21035  0.0  0.0  11112   184 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21036  0.0  0.0  11112   184 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21037  0.0  0.0  11112   188 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21038  0.0  0.0  11112   184 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21040  0.0  0.0  11112   180 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21041  0.0  0.0  11112   188 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
root     21042  0.0  0.0  11112   192 ?        S    May20   0:00 /bin/bash /opt/atlassian/hipchat/sbin/_ohai_btf.py
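If you still have shell access, the size of the problem can be checked on the live system instead of in the log. The sketch below counts the lines of a ps listing that mention _ohai_btf.py. The helper name count_ohai is hypothetical and introduced only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: count lines on stdin that mention _ohai_btf.py.
count_ohai() {
    # awk exits 0 and prints 0 when nothing matches, unlike grep -c.
    # The escaped dot also keeps awk's own command line from matching.
    awk '/_ohai_btf\.py/ { n++ } END { print n + 0 }'
}

# Count the _ohai_btf.py processes currently running.
ps auxwww | count_ohai
```

The same function can be fed a saved listing, for example the + ps auxwww section copied out of runtime.log, to compare counts over time.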

Because the _ohai_btf.py process is related to Hipchat's Phone-Home feature, we tried switching it off by following Disabling Phone-Home Signal, but that did not help.

Terminating the processes with kill -9 was also unsuccessful, as _ohai_btf.py processes continued to fill the runtime.log entries. The commands used were:

sudo dont-blame-hipchat
ps aux | grep ohai
kill -9 <ohai_pid>

Cause

The specific cause of the issue is unknown. One possibility is corruption in the cron configuration or in the ohai script, given that thousands of _ohai_btf.py processes are being run.
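To look into the cron theory, it can help to list every cron entry that mentions ohai. The sketch below is only a starting point: it searches the standard Linux cron locations, and it is an assumption that Hipchat Server schedules _ohai_btf.py from one of them. The helper name find_ohai_cron is hypothetical. Run it as root so that per-user crontabs are readable.

```shell
#!/bin/bash
# Hypothetical helper: list cron lines that mention "ohai".
find_ohai_cron() {
    # -r recurse into directories, -s skip missing or unreadable paths,
    # -H print the file name, -n print the line number
    grep -rsHn 'ohai' "$@" || true
}

# Standard cron locations; which of these Hipchat Server uses is an assumption.
find_ohai_cron /etc/crontab /etc/cron.d /etc/cron.hourly /etc/cron.daily \
    /var/spool/cron/crontabs
```

Duplicate or malformed entries in the output would be consistent with the corruption described above, though an empty result does not rule it out.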

Workaround

Run the following commands in your Hipchat Server terminal or SSH console:

  1. Obtain root access to your Hipchat Server:

    sudo dont-blame-hipchat
  2. Navigate to the startup_scripts directory:

    cd ~/startup_scripts/
  3. Download the shell script remove-ohai-fix.sh to the directory:

    wget https://s3.amazonaws.com/hipchat-server-stable/utils/remove-ohai-fix.sh
  4. Grant execute permission on the shell script:

    chmod +x remove-ohai-fix.sh
  5. Execute the shell script by running the command:

    ./remove-ohai-fix.sh

Last modified on Jan 19, 2018
