Troubleshooting Hipchat Data Center

Before you allow users into your Hipchat Data Center deployment, make sure everything is configured and working correctly. This page contains a checklist to help you verify the deployment and catch any errors before you open it to users.

System monitoring and alerts

You can configure Hipchat Data Center with a recipient email address for system alerts from the Hipchat Data Center admin UI.

An alert email is sent when any of the following conditions are met:

System Utilization

  • Memory utilization over 98% for three cycles
  • Swap file utilization over 10% for three cycles
  • CPU (user) utilization over 95% for three cycles
  • CPU (system) utilization over 95% for three cycles
  • CPU (wait) utilization over 99% for three cycles

Services

  • 'gearman' over 30% CPU utilization
  • 'nginx' over 20% CPU utilization for five cycles
  • 'ntpd' becomes unavailable
  • 'php5' restarts three times within five cycles
  • 'punjab' unavailable for three cycles, or over 45% CPU for three cycles
  • 'rsyslog' over 75% CPU for three cycles
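This page does not define a "cycle" or say whether the cycles must be consecutive. As an illustration only, the sketch below assumes a cycle is one monitoring poll and that the threshold must be exceeded on consecutive polls. The function name exceeds_for_cycles and the sample values are hypothetical and are not part of Hipchat.

```shell
# exceeds_for_cycles THRESHOLD CYCLES SAMPLE...
# Hypothetical helper: succeeds (exit 0) when at least CYCLES consecutive
# integer samples are strictly greater than THRESHOLD, otherwise fails (exit 1).
exceeds_for_cycles() {
    threshold="$1"
    cycles="$2"
    shift 2
    streak=0
    for sample in "$@"; do
        if [ "$sample" -gt "$threshold" ]; then
            streak=$((streak + 1))
            if [ "$streak" -ge "$cycles" ]; then
                return 0
            fi
        else
            # A single reading at or below the threshold resets the count.
            streak=0
        fi
    done
    return 1
}
```

Under these assumptions, memory readings of 99, 99 and 99 would satisfy "over 98% for three cycles" (exceeds_for_cycles 98 3 99 99 99 succeeds), while 99, 97, 99, 99 would not, because the dip to 97 resets the streak.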

Set up SNMP monitoring

Hipchat Data Center implements SNMP v2c using standard Ubuntu MIBs that can be enabled at the command line. 

  • To turn SNMP on or off:
    hipchat service -n "on"
    hipchat service -n "off"
  • To set up the community string:
    hipchat service -c <communitystring>
    Example: hipchat service -c public
  • To add a list of TRAP recipient servers:
    hipchat service -t trap.server.com
    Example server list: snmp1.example.com,snmp2.example.com

    Note: Add a backslash (\) before any special character, as in dollar\$ign
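Once SNMP is enabled, one way to confirm that a node answers is to query it from a monitoring host. The sketch below assumes the net-snmp snmpwalk client is installed on that host. The function name check_snmp and the host name in the usage line are placeholders, and nothing here is a Hipchat command.

```shell
# check_snmp HOST COMMUNITY
# Hypothetical helper: walks the standard MIB-2 "system" subtree
# (OID 1.3.6.1.2.1.1) over SNMP v2c using the net-snmp snmpwalk client.
check_snmp() {
    host="$1"
    community="$2"
    snmpwalk -v 2c -c "$community" "$host" 1.3.6.1.2.1.1
}
```

For example, check_snmp hipchat-node1.example.com public would query a node configured with the community string "public". A timeout usually points to SNMP being off, a wrong community string, or a firewall between the two hosts.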

Troubleshooting and logs

Log files are available in the /var/log/ directory of each node. The Hipchat service logs are in /var/log/hipchat/.

Once per day, the log files from each node are copied to the /file_store/shared/logs subdirectory of your network-attached storage volume. They follow a /YYYYMMDD/machineid/log-files naming convention.
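To make the naming convention concrete, the sketch below builds the archive directory for a given node and day. The function name archive_dir, the LOG_ROOT override and the machine id "node1" are hypothetical; the actual machine id format is not stated on this page.

```shell
# archive_dir MACHINE_ID [YYYYMMDD]
# Hypothetical helper: prints the directory that should hold one node's
# copied logs for one day, following /YYYYMMDD/machineid/ under the
# shared log folder. Defaults to today's date.
archive_dir() {
    machine_id="$1"
    day="${2:-$(date +%Y%m%d)}"
    printf '%s/%s/%s\n' "${LOG_ROOT:-/file_store/shared/logs}" "$day" "$machine_id"
}
```

Calling archive_dir node1 20171130 prints /file_store/shared/logs/20171130/node1, which you could then pass to ls to see the log files copied for that day.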

Configuration management is handled by chef-solo, which runs at boot, on upgrade, and during service restarts. You can find the chef-solo log file at /var/log/chef.log.

To retrieve all your logs, run hipchat log -r on each node. This copies the logs to the /file_store/shared/logs folder, which you can then compress and include with your support request.

If you need to open a Support request, make sure you download and attach your logs if possible. This helps us speed up the troubleshooting process.
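The compression step is not spelled out on this page. One possible approach, assuming standard tar with gzip support is available on the node, is sketched below. The function name bundle_logs and the default archive name are hypothetical.

```shell
# bundle_logs [SOURCE_DIR] [OUTPUT_FILE]
# Hypothetical helper: packs the shared log folder into one gzip-compressed
# tarball. Paths inside the archive are relative to the folder's parent,
# so they start with "logs/...".
bundle_logs() {
    src="${1:-/file_store/shared/logs}"
    out="${2:-hipchat-logs.tar.gz}"
    tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")"
}
```

Run it after hipchat log -r has finished on every node, then attach the resulting file to the support request.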

Log commands

Command | Use | Notes
hipchat log --rotate | Force a log rotation | This will force all logs to conform to the log rotation configuration specified in /etc/logrotate.conf and /etc/logrotate.d
hipchat log --purge | Truncates the contents of all logs in /var/log | Be sure to back up any logs required for troubleshooting before executing this command.
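Because a purge truncates every log, it can help to take a timestamped copy first. The sketch below is one way to do that with standard tools. The function name backup_logs and the default destination /file_store/shared/log-backups are assumptions, not documented Hipchat paths.

```shell
# backup_logs [SOURCE_DIR] [DEST_ROOT]
# Hypothetical helper: copies the contents of SOURCE_DIR into a new
# timestamped directory under DEST_ROOT, preserving modes and timestamps,
# and prints the directory it created.
backup_logs() {
    src="${1:-/var/log}"
    dest_root="${2:-/file_store/shared/log-backups}"
    dest="$dest_root/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" && cp -Rp "$src"/. "$dest"/ && printf '%s\n' "$dest"
}
```

Only run the purge once the helper has printed the backup directory and you have confirmed that the files you need are in it.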

Log file reference

Resource | Use | Notes
/var/log/chef.log | chef runs for installing/updating/configuring | Logging starts from first boot. Most system configuration changes will trigger a chef run.
/var/log/hipchat/nginx.log | nginx logs AND coral logs | Includes nginx-access entries alongside coral entries. nginx.err.log only logs ERROR and above. Any entries in nginx.err.log are indicative of a problem.
/var/log/hipchat/kern.log | Ubuntu kernel logging |
/var/log/schema_upgrade.log | Logs any schema upgrade changes that occur during upgrades | Useful for seeing upgrade history.
/var/log/hipchat/atlassian-crowd.log | External directory (Crowd/AD/LDAP) integration and authentication | Related to user authentication and external directory synchronization.
/var/log/hipchat/coral.log | APIv2 logs | Many services rely on coral for authentication, so this log is often referenced while tracing a problem. coral.err.log only logs ERROR and above. Any entries in coral.err.log are indicative of a problem.
/var/log/hipchat/cron.log | Entries related to cron job schedules on the server |
/var/log/hipchat/web.log | WebUI logging (i.e. the php-based administration) | Good starting point for any error messages or stack traces occurring in the web interface. web.err.log only logs ERROR and above. Any entries in web.err.log are indicative of a problem.
/var/log/hipchat/update.log | Detailed output of upgrades (and errors) | Critical for troubleshooting upgrade issues, along with chef.log.
/var/log/hipchat/tetra.log | Core chat service log | Errors here are often critical. tetra.err.log only logs ERROR and above. Any entries in tetra.err.log are indicative of a problem.
/var/log/hipchat/hup.log | Logs when services are restarted | Helpful for troubleshooting a broken service/upgrade. The "services starting" state prevents access to the system before it is fully initialized. hup.log records the orderly start; the last statement should be "maintenance_mode now OFF".
/var/log/hipchat/hcapp.log | Hipchat-specific subprocesses: Barb (manages mobile push notifications) and Scissortail (import/export jobs) | Entries include the associated service name for easy parsing, such as: grep scissortail hcapp.log
/var/log/hipchat/database.log | redis master log (there is a separate redis log for stats) | If this file is very large, then most likely sudo /bin/dont-blame-hipchat; chown redis /mnt is required.
/var/log/hipchat/daemon.log | Logs for the various daemons, including monit and ntpd | Useful for observing emergency service restarts via monit. Entries include daemon names for parsing, similar to hcapp.log
/var/log/hipchat/runtime.log | Lists server processes, disk space, server status (including CPU, memory, active user counts, etc.) | This is a great place to start for root cause analysis.
/var/log/hipchat/mypsql.log | Output related to connection with external PostgreSQL database. |
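Two of the notes above lend themselves to a quick first-pass check: any content in a *.err.log file indicates a problem, and hup.log should end with "maintenance_mode now OFF" after an orderly start. The sketch below shows one way to script both checks. The function names are hypothetical, and the exact format of the hup.log line is assumed only to contain that phrase.

```shell
# nonempty_err_logs [LOG_DIR]
# Hypothetical helper: prints every *.err.log file in LOG_DIR that has content.
nonempty_err_logs() {
    dir="${1:-/var/log/hipchat}"
    for f in "$dir"/*.err.log; do
        # -s is true only for files that exist and are larger than zero bytes.
        if [ -s "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}

# startup_finished [HUP_LOG]
# Hypothetical helper: succeeds when the last line of hup.log contains
# the phrase that marks the end of an orderly start.
startup_finished() {
    hup_log="${1:-/var/log/hipchat/hup.log}"
    tail -n 1 "$hup_log" | grep -q 'maintenance_mode now OFF'
}
```

An empty result from nonempty_err_logs together with a successful startup_finished is a reasonable sign that services came up cleanly; otherwise, start with the files it lists.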
Last modified on Nov 30, 2017
