HipChat Server Performance Tweaks

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Compiled below are known performance tweaks that you can implement in HipChat Server to help improve server health and performance. As always, please take a backup (or a VM-level snapshot) before making any of these changes. These changes are only recommended for the HipChat Server 


Upgrading

Upgrading to the latest HipChat Server version (currently v2.4.3; see the HipChat Server Release Notes) gives you the latest improvements and security features, as well as roster improvements. It will be supported until End of Life or the expiry of your license.

Adding additional hardware resources

Depending on how many users you have, the quickest ways to get a performance boost in HipChat Server are:

  • Adding more vCPU cores

  • Using faster vCPU cores (2.8GHz or faster)

  • Adding more RAM

A quick guide for scaling up hardware is to look at the HipChat Server System Requirements and move one step up from the recommended settings:

  • Under 100 users: 2 CPU cores (2GHz or higher) and 4GB RAM

  • Under 500 users: 4 CPU cores (2.8GHz or higher) and 8GB RAM

  • Under 1000 users: 8 CPU cores (2.8GHz or higher) and 16GB RAM

  • Under 3000 users: 16 CPU cores (2.8GHz or higher) and 32GB RAM

  • Under 5000 users: 16 CPU cores (2.8GHz or higher) and 64GB RAM

For instance, if your group has under 100 registered users, then moving to the next step up to 4 vCPU and 8GB RAM should provide a noticeable increase in stability.
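
As a rough illustration of the "step up" rule, the sketch below encodes the tiers listed above in two shell functions. The function names and the "cores ram_gb" output format are assumptions made for this example; they are not part of HipChat Server.

```shell
# Recommended sizing for a registered-user count, using the tiers listed above.
# Prints "<cpu_cores> <ram_gb>"; returns non-zero for 5000 users or more.
recommended_spec() {
    users=$1
    if   [ "$users" -lt 100 ];  then echo "2 4"
    elif [ "$users" -lt 500 ];  then echo "4 8"
    elif [ "$users" -lt 1000 ]; then echo "8 16"
    elif [ "$users" -lt 3000 ]; then echo "16 32"
    elif [ "$users" -lt 5000 ]; then echo "16 64"
    else echo "unlisted"; return 1
    fi
}

# One tier above the recommended one (the "step up" described above).
# The top tier has no listed step up, so it returns non-zero.
step_up_spec() {
    users=$1
    if   [ "$users" -lt 100 ];  then echo "4 8"
    elif [ "$users" -lt 500 ];  then echo "8 16"
    elif [ "$users" -lt 1000 ]; then echo "16 32"
    elif [ "$users" -lt 3000 ]; then echo "16 64"
    else echo "unlisted"; return 1
    fi
}

step_up_spec 80   # prints "4 8", i.e. 4 vCPU and 8GB RAM
```

The CPU speed recommendation (2GHz or 2.8GHz and higher) is left out of the output to keep the sketch short.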

Alternatively, if you have access to faster CPUs (our recommendation is 2.8GHz; however, the faster the better), this would also give a net increase in performance across the board, especially with single-process services like Punjab.

Why does this work?

Several reasons. HipChat Server scales out more Tetra (HipChat logic center) and Coral (internal/external API handling) processes as more CPUs are detected. More processes can handle more traffic and keep the load on each individual Tetra and Coral process lower, which leads to a more stable environment. Additionally, single-threaded services such as Punjab (HTTP to XMPP translation service) and embedded Crowd (internal authentication system) will also see a benefit.

When should you not do this?

If your issues are related to network speed (i.e. bandwidth), then increasing the hardware specifications will have a negligible effect.

How do I do this?

    • For AWS instances, please refer to Changing the Instance Type.
    • For OVA deployments, please refer to the VM hypervisor documentation from the VM vendor.

Disable XMPP connections directly to Tetra

In HipChat Server v2.0.6, we introduced a feature to disable direct XMPP connections with Tetra. In testing, we found that using Tetra to manage direct XMPP connections is expensive and can cause delays with processing data upstream. Offloading XMPP translation to Punjab (via BOSH) can have a noticeable cumulative effect on the Server, depending on how many bots or third-party XMPP chat clients Tetra is managing at any one time.

Why does this work?

Tetra-app is the 'logic' of HipChat and has a hand in everything from processing events (like user logins) to sending out presence information to rooms and creating new sessions for HipChat clients connecting. To manage processing queues and events in/out of Tetra-app, a second set of processes is needed. This set of processes is called Tetra-proxy and it acts like a 'mesh' which load balances traffic to Tetra-app when needed.

Adding SSL handshaking and managing direct XMPP traffic to Tetra-app will cause delays in processing other events and will have an impact upstream if Tetra-app (or Tetra-proxy) gets bogged down during peak times.

Bots or third-party XMPP clients that are BOSH compatible will flow through NGINX to Punjab then to Tetra, completely offloading the session handling and SSL handshaking from Tetra to NGINX and Punjab. Check the documentation for your bot or third-party XMPP chat client to confirm whether it is BOSH compliant.

When should you not do this?

Do not do this if you run XMPP bots or have third-party XMPP clients that are not BOSH compatible.

How do I do this?

XMPP ports are disabled by default on all HipChat Server versions 2.0.6 and newer. To check if these ports are enabled, log into the HipChat Server CLI (command line interface) and run the following check:

    • If the XMPP ports are disabled, you'll see this set to 'False'
    • If the ports are enabled, you'll see this set to 'True'

To enable/disable the ports, you can use the following flags for the hipchat network command:

--disable-xmpp-ports - Disables external XMPP (BOSH still available)
--enable-xmpp-ports - Enables external XMPP port access
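
Combined with the hipchat network command, the two flags above give the invocations printed by the sketch below. The helper name xmpp_ports_command is hypothetical, and it only prints the command so that it can be reviewed before being run in the HipChat Server CLI.

```shell
# Prints the `hipchat network` invocation for the requested XMPP port state.
# Hypothetical helper for illustration; it does not change any settings itself.
xmpp_ports_command() {
    case "$1" in
        enable)  echo "hipchat network --enable-xmpp-ports" ;;
        disable) echo "hipchat network --disable-xmpp-ports" ;;
        *)       echo "usage: xmpp_ports_command enable|disable" >&2; return 2 ;;
    esac
}

xmpp_ports_command disable   # prints: hipchat network --disable-xmpp-ports
```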

Additional Info

You can read more about XMPP-BOSH implementation at XEP-0124: Bidirectional-streams Over Synchronous HTTP (BOSH).

Update the Tetra cleanup script to run more frequently (every 6 hours)

Aside from upgrading to more/faster CPUs and disabling direct XMPP connections to Tetra, you can also configure crontab to run the Tetra cleanup script more often. This script terminates dead/stale user sessions that Tetra still considers 'open'. Running it will temporarily increase the load on the server (~10-30 seconds).

Why does this work?

In HipChat Server v2.4.x, the cleanup script runs once a day at 3am UTC, but it can be set to run more frequently to clean up sessions at a more aggressive pace. Cleaning up these rogue sessions more often helps both Tetra and Punjab focus on processing only sessions that are active (and not old sessions that are waiting to time out). A good starting point is to set it to run at 6-hour intervals: this will clean up sessions before users log in (6AM), at lunch time (12PM), in the evening (6PM) and again at midnight (12AM). Again, these sessions are stale (not tied to an active client) but still take up processing cycles until they time out (which sometimes they do not), so removing them should not affect existing, active sessions.

When should you not do this?

Do not do this if your HipChat Server VM is under-resourced either in CPU cores or CPU speed or you have high RAM utilization when the server is not under load (>70%).
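
One way to check idle RAM utilization against the 70% mark is sketched below. It assumes a Linux /proc/meminfo that exposes both MemTotal and MemAvailable, and it treats "used" as MemTotal minus MemAvailable. Both are assumptions of this example rather than an official HipChat Server measurement, and the function names are illustrative.

```shell
# Prints RAM utilization as a whole-number percentage, read from a
# meminfo-style file (defaults to /proc/meminfo).
# Prints nothing if MemTotal or MemAvailable is missing from the file.
ram_used_percent() {
    awk '/^MemTotal:/     { total = $2; have_total = 1 }
         /^MemAvailable:/ { avail = $2; have_avail = 1 }
         END {
             if (have_total && have_avail && total > 0)
                 printf "%d\n", (total - avail) * 100 / total
         }' "${1:-/proc/meminfo}"
}

# Succeeds when utilization is above the 70% mark mentioned above.
ram_too_high() {
    pct=$(ram_used_percent "$1")
    [ -n "$pct" ] && [ "$pct" -gt 70 ]
}
```

Run `ram_used_percent` with no arguments while the server is idle. If `ram_too_high` succeeds, the guidance above suggests leaving the default schedule in place.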

How do I do this?

  1. Log into the HipChat Server CLI
  2. Gain root:

    sudo dont-blame-hipchat
  3. From here, switch to the 'hipchat' user.

    sudo su hipchat
  4. Now, let's edit the Crontab:

    crontab -e

    At this point you will be asked to choose an editor; pick whichever one you prefer.


  5. Under the "# Chef Name: nightly tetra sessions cleanup" section, you'll see the schedule for the tetra_sessions_cleanup.py script. It'll look like this:

    0 3 * * *


    What you'll want to do is change it so it runs every 6 hours. Update the schedule so it reads this:

    0 */6 * * *
  6. Once this is set, save the file and exit the editor.
  7. Now restart the cron service (as the hipchat user):

    sudo service cron restart
  8. Finally, exit to get back to the root user account in the CLI:

    exit
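
The schedule edit in step 5 can also be expressed as a text substitution, which makes it easy to preview before touching the real crontab. The sketch below is an illustration only: reschedule_cleanup is a hypothetical helper, and the script path in the sample entry is a placeholder, not the actual path used on HipChat Server.

```shell
# Rewrites the cleanup schedule in crontab text read from stdin.
# Only lines that mention tetra_sessions_cleanup.py and start with the
# default "0 3 * * *" schedule are changed; all other lines pass through.
reschedule_cleanup() {
    sed '/tetra_sessions_cleanup\.py/ s|^0 3 \* \* \*|0 */6 * * *|'
}

# Sample crontab text (the script path is a placeholder).
printf '%s\n' \
    '# Chef Name: nightly tetra sessions cleanup' \
    '0 3 * * * /path/to/tetra_sessions_cleanup.py' \
  | reschedule_cleanup
```

In the hour field, `*/6` matches hours 0, 6, 12 and 18, which gives the four daily runs described above. On the server, running `crontab -l | reschedule_cleanup` as the hipchat user would preview the change without applying it.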

Additional Info

From here, you'll want to monitor the health of the server after this script fires and especially during times where there is load. If you see any issues after this change that seem to be related, you can reverse the steps above and set the schedule back to default values.

Please note, on a reboot of the server or on an upgrade, this value will be reverted to default.

Trim licensed user counts down

In HipChat Server v2.4.2 and newer, we've added a licensing feature that allows you to delicense users individually (as opposed to deactivating them).

Why does this work?

This stops the user from taking up a seat on the license and has the bonus of excluding the user from the roster push when the HipChat clients connect. Removing a significant number of users from the roster will greatly reduce the amount of data that needs to be processed by Tetra and moved per roster update. The performance gain is cumulative: the roster gets smaller, and performance improves, as more users are delicensed.

When should you not do this?

Do not do this if most of your licensed users are actively using the service. For example, if you have 1000 users and 990 of them use HipChat daily, delicensing the other 10 users will not have a big enough impact to be noticeable. On the other hand, if you have 1000 users and only 500 of them use HipChat daily, delicensing the other 500 users would have a larger impact.

How do I do this?

  1. Log into the HipChat Server CLI
  2. In the CLI, use the following commands to license or de-license a user:

    hipchat license --license-user n
    hipchat license --delicense-user n

    Replace n with either the user id or the email address of the user.
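
If many users need to be delicensed, the command above can be generated once per user from a list. The sketch below is a dry run under stated assumptions: delicense_commands is a hypothetical helper, the email addresses are made up, and nothing is executed, so the output can be reviewed before it is run in the HipChat Server CLI.

```shell
# Prints one `hipchat license --delicense-user` command per user read from
# stdin (one user id or email address per line). Blank lines are skipped.
delicense_commands() {
    while IFS= read -r user; do
        [ -n "$user" ] || continue
        printf 'hipchat license --delicense-user %s\n' "$user"
    done
}

# Example input with made-up addresses.
printf '%s\n' 'alice@example.com' '' 'bob@example.com' | delicense_commands
```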

Additional Info

Delicensing a user does not prevent the user from logging in, but instead removes the user from taking up a seat. If the user logs in after they have been delicensed, then the user will be viewed as active and thus will take a license seat at that time. If you want to prevent the user from logging in, you can simply deactivate them using the web interface (Group Admin -> Users -> <NAME> -> Deactivate).

Archive inactive rooms

Along the same lines as trimming licensed user count down, archiving rooms that are no longer active will also help reduce the amount of roster data that is pushed from the Server to Client.

Why does this work?

This removes rooms from the roster push when the HipChat clients connect. Removing a significant number of rooms from the roster will greatly reduce the amount of data that needs to be processed by Tetra and moved per roster update. The performance gain is cumulative: the roster gets smaller, and performance improves, as more rooms are archived.

When should you not do this?

When a room is archived, it disappears from the HipChat client sidebar for any users that were subscribed to it. After that, the room's history can only be searched via the HipChat web interface under the 'Rooms' link. If you rely on searching in-app for the history of a particular room, you may not want to archive it. Additionally, archiving the room will prevent any users from posting messages, as described in Chat history retention and other privacy options.

How do I do this?

You can archive rooms two ways:

    • While in the HipChat app (as a room owner or admin): Open the room, select the three dots icon, then select the 'Archive' option.
    • While in the HipChat web interface (as a room owner or admin): Log into the web interface, select 'Rooms', search for and select the room, then select the 'Archive' option on the left hand side.

Additional Info

Please see Using the Auto-Archive script for HipChat Server if you want to auto-archive rooms based on room inactivity (via days).

Reduce Autojoins

Autojoin lists contain the names of the rooms and individual user chats that load when a HipChat user logs in with the HipChat client. These lists are stored in the Redis cache.

Why does this work?

Much like reducing the roster size, reducing the total number of autojoins per user will help lower the amount of data Tetra has to process when a client logs in. Reducing these to 50 or fewer across the end user base has a cumulative effect: it reduces the time it takes for Tetra to process and handle the login requests (and by extension shortens any 'reconnection' storms that may occur).

When should you not do this?

HipChat Server does not have a 'hard' limit on autojoins; however, it is recommended to keep these at 50 or below (rooms and private chats). In especially busy instances, some users may need to keep more than 50 chats open in the sidebar for various reasons, but the general rule of thumb is to stay at or below 50. If this is the case, you can back up the autojoin list using How to backup Autojoin data from Redis in HipChat Server and refer to it later.

How do I do this?

We have a script which will sort through the Redis cache, count all the autojoin entries per user and add them to a report file. No changes are made; the script simply queries Redis and generates a report.

  1. Run through the How to backup Autojoin data from Redis in HipChat Server documentation to back up the autojoins for each user.
  2. Log into the HipChat Server command line interface (CLI)
  3. Next, download the query_autojoinv2.sh script to your HipChat Server. 

  4. Make it executable:

    chmod +x query_autojoinv2.sh
  5. Run the script:

    ./query_autojoinv2.sh
  6. Review the report and reach out to the users that have more than 50 autojoins and ask them to remove some items from their sidebar to get under this limit.

Additional Info

You can modify the numeric threshold that the script uses to determine who is listed in the report. The default is '50'; the value is labeled THRESHOLD and resides on line 30 of the script.

Alternatively, the script can also automatically send an email to everyone listed on the report, using the email address saved under their user account. To enable this, uncomment lines 64 through 71, save the script, then re-run it.

Note: This uses sendmail and if your environment uses a mail relay, then this will not work (sendmail will not honor the relay settings as they are specific to Postfix).



Last modified on Apr 30, 2019
