Troubleshooting Hipchat Data Center

Before you allow users into your Hipchat Data Center deployment, make sure everything is configured and working correctly. This page contains a checklist to help you verify the deployment and catch errors before you open it to users.

System monitoring and alerts

In the Hipchat Data Center admin UI, you can configure a recipient email address for system alerts.

An alert email is sent when any of the following conditions are met:

System Utilization

  • Memory utilization over 98% for three cycles
  • Swap file utilization over 10% for three cycles
  • CPU (user) utilization over 95% for three cycles
  • CPU (system) utilization over 95% for three cycles
  • CPU (wait) utilization over 99% for three cycles

Services

  • 'gearman' over 30% CPU utilization
  • 'nginx' over 20% CPU utilization for five cycles
  • 'ntpd' becomes unavailable
  • 'php5' restarts three times within five cycles
  • 'punjab' unavailable for three cycles, or over 45% CPU for three cycles
  • 'rsyslog' over 75% CPU for three cycles
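
Most of the conditions above have the same shape: a value stays over a threshold for a number of consecutive cycles. The sketch below illustrates that rule only; it is not how Hipchat Data Center implements its alerts, the function name is hypothetical, and the length of a "cycle" is not stated on this page.

```shell
#!/bin/sh
# Hypothetical helper (not part of the hipchat CLI): succeeds when the last
# CYCLES samples are all strictly greater than THRESHOLD.
# Usage: over_threshold_for_cycles THRESHOLD CYCLES SAMPLE...
over_threshold_for_cycles() {
    threshold=$1
    cycles=$2
    shift 2
    # With fewer samples than cycles, the condition cannot be met yet.
    [ "$#" -ge "$cycles" ] || return 1
    # Drop older samples so that only the last $cycles remain.
    while [ "$#" -gt "$cycles" ]; do
        shift
    done
    for sample in "$@"; do
        # Integer comparison; one sample at or below the threshold breaks the streak.
        [ "$sample" -gt "$threshold" ] || return 1
    done
    return 0
}
```

For example, `over_threshold_for_cycles 98 3 97 99 99 100` succeeds because the last three memory samples are all over 98, while `over_threshold_for_cycles 98 3 99 97 99` fails.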

Set up SNMP monitoring

Hipchat Data Center implements SNMP v2c using standard Ubuntu MIBs that can be enabled at the command line. 

  • To turn SNMP on or off:
    hipchat service -n on
    hipchat service -n off
  • To set up the community string:
    hipchat service -c <communitystring>
    Example: hipchat service -c public
  • To add a TRAP recipient server list:
    hipchat service -t trap.server.com
    Example with multiple servers: hipchat service -t snmp1.example.com,snmp2.example.com

    Note: Add \ before a special character, as in dollar\$ign
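
The escaping rule in the note is easy to get wrong by hand, so here is a minimal sketch that prints the setup commands with the community string escaped. The helper names and the "safe" character set are assumptions made for illustration; the script only prints the commands and does not run them.

```shell
#!/bin/sh
# Hypothetical helper: put a backslash before every character outside a
# conservative safe set (letters, digits, and _ . , : @ / -).
escape_special() {
    printf '%s' "$1" | sed 's|[^A-Za-z0-9_.,:@/-]|\\&|g'
}

# Print (do not run) the SNMP setup commands described above.
# Usage: print_snmp_setup COMMUNITY TRAP_SERVERS
print_snmp_setup() {
    printf 'hipchat service -n on\n'
    printf 'hipchat service -c %s\n' "$(escape_special "$1")"
    printf 'hipchat service -t %s\n' "$2"
}
```

Calling `print_snmp_setup 'dollar$ign' 'snmp1.example.com,snmp2.example.com'` prints three commands that you can review before pasting them into a shell on the node.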

Troubleshooting and logs

Log files are available in the /var/log/ directory of each node. The Hipchat service logs can be found inside /var/log/hipchat/.

Once per day, the log files from each node are copied to the /file_store/shared/logs subdirectory of your network-attached storage volume. They follow a /YYYYMMDD/machineid/log-files naming convention.
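
To locate a node's archived logs for a given day, you can build the path from the convention above. This is a small sketch; the function name and the machine ID used in the example (node-1) are hypothetical.

```shell
#!/bin/sh
# Base directory of the daily log copies, as described above.
ARCHIVE_ROOT=/file_store/shared/logs

# Print the archive directory for one day and one node.
# Usage: archive_dir YYYYMMDD MACHINE_ID
archive_dir() {
    printf '%s/%s/%s\n' "$ARCHIVE_ROOT" "$1" "$2"
}
```

For example, `ls "$(archive_dir "$(date +%Y%m%d)" node-1)"` would list today's copies for a node whose machine ID is node-1, assuming the daily copy has already run.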

Configuration is managed by chef-solo, which runs at boot, at upgrade, and during service restarts. You can find the chef-solo log file at /var/log/chef.log

To retrieve all your logs, run hipchat log -r on each node. This copies the logs to the /file_store/shared/logs folder, which you can then compress and include with your support request.

If you need to open a Support request, make sure you download and attach your logs if possible. This helps us speed up the troubleshooting process.
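
One way to compress the collected logs for a support request is a gzip-compressed tar archive. The sketch below assumes tar is available on the node; the function name and the output file name in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: pack a log directory into a .tar.gz archive.
# Usage: bundle_logs LOG_DIR OUTPUT_TARBALL
bundle_logs() {
    log_dir=$1
    out=$2
    [ -d "$log_dir" ] || { echo "no such directory: $log_dir" >&2; return 1; }
    # -C switches to the parent directory so archive paths start at the
    # directory name (for example logs/...) instead of the absolute path.
    tar -czf "$out" -C "$(dirname "$log_dir")" "$(basename "$log_dir")"
}
```

After running hipchat log -r on each node, a call such as `bundle_logs /file_store/shared/logs /tmp/hipchat-logs.tar.gz` produces a single file to attach.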

Log commands

| Command | Use | Notes |
| --- | --- | --- |
| hipchat log --rotate | Forces a log rotation | Forces all logs to conform to the log rotation configuration specified in /etc/logrotate.conf and /etc/logrotate.d |
| hipchat log --purge | Truncates the contents of all logs in /var/log | Be sure to back up any logs required for troubleshooting before executing this command. |

Log file reference

| Resource | Use | Notes |
| --- | --- | --- |
| /var/log/chef.log | chef runs for installing/updating/configuring | Logging starts from first boot. Most system configuration changes will trigger a chef run. |
| /var/log/hipchat/nginx.log | nginx logs and coral logs | Includes nginx-access entries alongside coral entries. nginx.err.log only logs ERROR and above. Any entries in nginx.err.log are indicative of a problem. |
| /var/log/hipchat/kern.log | Ubuntu kernel logging | |
| /var/log/schema_upgrade.log | Logs any schema changes that occur during upgrades | Useful for seeing upgrade history. |
| /var/log/hipchat/atlassian-crowd.log | External directory (Crowd/AD/LDAP) integration and authentication | Related to user authentication and external directory synchronization. |
| /var/log/hipchat/coral.log | APIv2 logs | Many services rely on coral for authentication, so this log is often referenced while tracing a problem. coral.err.log only logs ERROR and above. Any entries in coral.err.log are indicative of a problem. |
| /var/log/hipchat/cron.log | Entries related to cron job schedules on the server | |
| /var/log/hipchat/web.log | WebUI logging (i.e. the PHP-based administration) | Good starting point for any error messages or stack traces occurring in the web interface. web.err.log only logs ERROR and above. Any entries in web.err.log are indicative of a problem. |
| /var/log/hipchat/update.log | Detailed output of upgrades (and errors) | Critical for troubleshooting upgrade issues, along with chef.log. |
| /var/log/hipchat/tetra.log | Core chat service log | Errors here are often critical. tetra.err.log only logs ERROR and above. Any entries in tetra.err.log are indicative of a problem. |
| /var/log/hipchat/hup.log | Logs when services are restarted | Helpful for troubleshooting a broken service/upgrade. The "services starting" state prevents access to the system before it is fully initialized. hup.log records the orderly startup; the last statement should be "maintenance_mode now OFF". |
| /var/log/hipchat/hcapp.log | Hipchat-specific subprocesses: Barb (manages mobile push notifications) and Scissortail (import/export jobs) | Entries include the associated service name for easy parsing, such as: grep scissortail hcapp.log |
| /var/log/hipchat/database.log | redis master log (there is another redis log for stats) | If this file is very large, then most likely sudo /bin/dont-blame-hipchat; chown redis /mnt is required. |
| /var/log/hipchat/daemon.log | Logs for the various daemons, including monit and ntpd | Useful for observing emergency service restarts via monit. Entries include daemon names for parsing, similar to hcapp.log. |
| /var/log/hipchat/runtime.log | Lists server processes, disk space, server status (including CPU, memory, active user counts, etc.) | This is a great place to start for root cause analysis. |
| /var/log/hipchat/mypsql.log | Output related to the connection with the external PostgreSQL database | |
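
Two notes in the table lend themselves to a quick scripted check: any entry in a *.err.log file indicates a problem, and the last statement in hup.log should be "maintenance_mode now OFF". The sketch below uses hypothetical function names and assumes that the final hup.log line contains that exact phrase; the full line format is not documented here.

```shell
#!/bin/sh
# List every *.err.log file in a directory that contains at least one entry.
# Usage: nonempty_err_logs LOG_DIR
nonempty_err_logs() {
    for f in "$1"/*.err.log; do
        # -s is true only for files that exist and are not empty, so an
        # unmatched glob pattern is skipped as well.
        [ -s "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Succeed when the last line of hup.log reports that maintenance mode is off.
# Usage: startup_completed HUP_LOG
startup_completed() {
    tail -n 1 "$1" | grep -q 'maintenance_mode now OFF'
}
```

On a node, `nonempty_err_logs /var/log/hipchat` prints the error logs worth opening first, and `startup_completed /var/log/hipchat/hup.log` can gate a script that should only run once the services have finished starting.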
Last modified on Nov 30, 2017
