Fisheye and Crucible Process Dies Unexpectedly Due to Linux OOM-Killer

Problem

Fisheye and Crucible do not terminate their own process unless stop.sh is executed, which writes shutdown messages to the log. An unexpected process termination therefore indicates that an external entity ended the process. Some examples include:

  •  kill -9 (user triggered or via script) 
  • Linux OOM-Killer

This KB article focuses on the Linux OOM-Killer, a feature on some Linux installations that sacrifices processes to free up memory when the operating system runs short of memory for its own operations.  Note that this is different from Java running out of memory. In this case, the OS itself is in danger of running out of memory and starts terminating processes to avoid it.

When Fisheye or Crucible is shut down correctly, the application log contains an entry like the example below.  If this entry exists, Fisheye or Crucible was most likely shut down using the stop.sh command and this article does not apply.

2019-05-28 19:16:18,693 INFO  [Thread-65 ] fisheye ShutdownService-stopImpl - Shutdown requested
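
To check quickly whether such an entry exists, a small helper along the following lines can search the logs. This is only a sketch: the function name last_shutdown_request is hypothetical, and it assumes the entry always contains the text "Shutdown requested", as in the example above.

```shell
# Hypothetical helper: print the most recent "Shutdown requested" entry
# found in the log files passed as arguments.
last_shutdown_request() {
    for log_file in "$@"; do
        # Skip files that are missing or unreadable on this host.
        [ -r "$log_file" ] || continue
        # -F treats the pattern as a fixed string rather than a regex.
        grep -F 'Shutdown requested' "$log_file" || true
    done | tail -n 1
}
```

Pass it the paths of your Fisheye log files. An empty result around the time of the outage suggests the process was not stopped through stop.sh.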

Diagnosis

Environment

  • Fisheye and/or Crucible is installed on a Linux host.
  • The entire Fisheye and/or Crucible process suddenly terminates without warning. That is to say, the process ID (pid) is no longer running and there is no log message like the example above.
  • The browser shows a generic "cannot connect" or similar error, indicating that it is not able to reach the webpage.
  • Nothing out of the ordinary appears in the Fisheye application logs ($FISHEYE_INST/var/log/fisheye.out), since the application was terminated without properly shutting down.
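
Whether the pid is still running can be confirmed from a shell. The helper below is a minimal sketch; is_running is a hypothetical name, and it assumes you already know the pid the application was started with.

```shell
# Return success if a process with the given pid is still running.
# kill -0 sends no signal; it only checks that the pid exists and can be
# signalled, so run this as the user that owns the process or as root.
is_running() {
    kill -0 "$1" 2>/dev/null
}
```

For example, `is_running 1386 && echo running || echo gone` reports the state of pid 1386 (a pid used purely for illustration).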

Diagnostic Steps

  • Check the /var/log/ directory for the syslog or messages file, and locate the entries around the approximate time the process was terminated.  If you see entries similar to the following, the process was a victim of the OOM-Killer:

    • May 28 11:29:48 fecru01 kernel: Out of memory: Kill process 1386 (java) score 358 or sacrifice
      May 28 11:29:48 fecru01 kernel: Killed process 5388, UID 4048, (java) total-vm:1331564kB, anon
      May 28 11:29:51 fecru01 kernel: java invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0,
  • Use dmesg and search for lines around "Killed process":

    • dmesg -T | grep -C 5 -i "killed process"
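
The log check described above can be sketched as a small shell function. The name find_oom_kills is hypothetical, and the patterns are taken only from the sample entries shown in this article, so the wording on your kernel version may differ.

```shell
# Hypothetical helper: print OOM-Killer entries from the given log files.
find_oom_kills() {
    for log_file in "$@"; do
        # Which of these files exists depends on the distribution.
        [ -r "$log_file" ] || continue
        # -i ignores case, -E enables alternation between the patterns.
        grep -iE 'out of memory|killed process|invoked oom-killer' "$log_file" || true
    done
}
```

Running `find_oom_kills /var/log/messages /var/log/syslog` as root prints any matching lines. No output means no OOM-Killer entries were found in those files.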

Cause

The Linux system experiences memory exhaustion for its own operations.  In an attempt to save the system from crashing, Linux kills processes to free enough memory to continue running.

Workaround

One way to mitigate this type of issue is to add swap space to the system.  This keeps Fisheye and/or Crucible from crashing while you work on the resolution, and it can be done live without stopping Linux, Fisheye, or Crucible. Swapping degrades the performance of Fisheye and/or Crucible, so use it only to prevent a full outage.

  • Decide where on the disk to place the swap file, making sure enough free space is available.  This example puts the swap file in the root user's home directory.

  • Create a file on the local disk.  This example creates a 2 GB swap file:

    dd if=/dev/zero of=/root/myswapfile bs=1M count=2048
  • Change the permission of the swap file so that only root can access it. 

    chmod 600 /root/myswapfile
  • Format the file as swap space with the mkswap command, then enable it with the swapon command.

    mkswap /root/myswapfile
    swapon /root/myswapfile
  • Verify whether the newly created swap area is available for your use. 

    swapon -s
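
To review the steps above as one sequence before running them, a dry-run sketch like the following may help. The function name swap_commands and the variables SWAP_FILE and SWAP_MB are illustrative assumptions. The script only prints the commands and executes nothing.

```shell
#!/bin/sh
# Sketch: print the swap-file commands from the steps above for review.
# SWAP_FILE and SWAP_MB are illustrative; the defaults mirror the example.
SWAP_FILE="${SWAP_FILE:-/root/myswapfile}"
SWAP_MB="${SWAP_MB:-2048}"

# $1 = path of the swap file, $2 = size in MiB (dd writes 1 MiB blocks).
swap_commands() {
    echo "dd if=/dev/zero of=$1 bs=1M count=$2"
    echo "chmod 600 $1"
    echo "mkswap $1"
    echo "swapon $1"
    echo "swapon -s"
}

swap_commands "$SWAP_FILE" "$SWAP_MB"
```

Once the printed commands look right for your host, run them as root one at a time so that a failure in an early step is noticed before the next one runs.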

Resolution

In the case of the OOM-Killer, the possible resolutions would be to:

  • Increase the amount of memory available on the host machine itself.

  • Decrease the amount of memory allocated to Fisheye and/or Crucible or competing processes on the machine.
  • Migrate other applications to another system.
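
When weighing these options, it helps to know how much memory the host has compared with what the Java process is allowed to use. The sketch below reads a field from /proc/meminfo and converts it to MiB. The function name meminfo_mb is hypothetical, and the JVM maximum heap (-Xmx) you compare it against depends on how your installation is configured.

```shell
# Hypothetical helper: print a field from a meminfo-style file in MiB.
# $1 = field name such as MemTotal or SwapTotal
# $2 = file to read (defaults to /proc/meminfo, whose values are in kB)
meminfo_mb() {
    awk -v key="$1:" '$1 == key { print int($2 / 1024) }' "${2:-/proc/meminfo}"
}
```

For example, `meminfo_mb MemTotal` and `meminfo_mb SwapTotal` show the physical memory and swap on the host. If the combined memory settings of Fisheye and/or Crucible and the other processes on the machine approach these figures, the host is a likely candidate for the OOM-Killer.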

For additional information on how the OOM-Killer operates, please see: http://prefetch.net/blog/index.php/2009/09/30/how-the-linux-oom-killer-works/



Description: Fisheye and/or Crucible becomes unresponsive. Application just stops.
Product: Fisheye, Crucible, fecru
Last modified on May 29, 2019
