Bitbucket process dies unexpectedly due to Linux OOM killer
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Problem
Bitbucket does not terminate its own process unless stop-bitbucket.sh is executed, which writes shutdown messages to the log. An unexpected process termination therefore indicates the work of an external entity. Some examples include:
- kill -9 (user-triggered or via script)
- Linux OOM killer
This KB article focuses on the Linux OOM killer, which is a feature on some Linux installations that will sacrifice processes to free up memory if the operating system experiences memory exhaustion for its own operations. Please note that this is different from Bitbucket running out of memory. In this case, the OS itself is in danger of running out of memory and thus starts terminating processes to avoid it.
When Bitbucket is shut down correctly, it writes a log message like the following example. If this message exists, Bitbucket was most likely shut down using the stop-bitbucket.sh command, and this article may not apply.
2024-10-15 08:55:05,093 INFO [SpringApplicationShutdownHook] c.a.b.i.boot.log.BuildInfoLogger Bitbucket 9.2.1 has shut down
However, if a Bitbucket admin confirms that the application wasn't manually stopped, the system's OOM killer may still have been involved. To find out what really happened, you'll need to check the OS system logs.
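The two checks described above can be sketched as a pair of shell functions. This is a minimal sketch rather than an official Atlassian tool: the function names was_clean_shutdown and find_oom_events are hypothetical, and the log paths in the usage comment are assumptions that vary by installation and distribution.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of Bitbucket.

# Succeeds (exit status 0) if the given Bitbucket application log contains a
# "has shut down" message like the example above.
was_clean_shutdown() {
    grep -q 'Bitbucket .* has shut down' "$1"
}

# Prints the lines of the given system log that typically accompany an
# OOM kill (case-insensitive match).
find_oom_events() {
    grep -i -E 'out of memory|killed process|invoked oom-killer' "$1"
}

# Example usage (paths are assumptions; adjust to your system):
#   was_clean_shutdown "$BITBUCKET_HOME/log/atlassian-bitbucket.log" \
#       || find_oom_events /var/log/messages
```

Even when the shutdown message is present, it is still worth running the second check, because Bitbucket can be stopped in an orderly way after one of its child processes was killed.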
Resolution
In the case of the OOM killer, the possible resolutions would be to:
- Increase the amount of memory available on the host machine itself.
- Decrease the amount of memory allocated to Bitbucket or competing processes on the machine.
- Migrate other applications to another system.
- If the heaviest consumers are Git processes, ensure appropriate tuning is carried out for the throttling sub-system.
For additional info on how the OOM killer operates, please see: http://prefetch.net/blog/index.php/2009/09/30/how-the-linux-oom-killer-works/
Monitoring
If the OOM killer keeps getting activated, monitoring has to be put in place so that the problem can be diagnosed further. The output of the following commands should be retrieved when memory usage is unusually high.
# Display the amount of free and used memory in the system, in megabytes.
free -m
# Display the system's processes sorted by resident memory usage (top 20)
top -bn1 -oRES -w 500 | head -n 20
# Display processes sorted by their resident set size (RSS) in descending order
ps -eo pid,user:20,rss,ppid,pri,ni,vsize,pcpu,pmem,time,command --sort=-rss
# If the heavy consumer is a git process, run the pwdx command to find the repository path. Get the size of the repository.
# Git processes such as git pack-objects and git repack on large repositories may consume a huge amount of memory.
pwdx <pid>
cd <repo-path>
git count-objects -v --human-readable
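To make these outputs easy to collect at the moment memory usage spikes, the commands can be wrapped in a small function that writes them to a timestamped file. This is a minimal sketch: the function name capture_memory_snapshot and the output file naming are hypothetical, and the ps output is truncated to 30 lines here only to keep each snapshot small.

```shell
#!/bin/sh
# Hypothetical helper; the name and file layout are illustrative.
# Usage: capture_memory_snapshot [OUTPUT_DIR]
# Writes the output of the monitoring commands above to a timestamped file
# and prints the path of that file.
capture_memory_snapshot() {
    out_dir=${1:-.}
    out_file="$out_dir/memory-snapshot-$(date +%Y%m%d-%H%M%S).txt"
    {
        echo "=== snapshot taken $(date) ==="
        echo "--- free -m ---"
        free -m 2>&1
        echo "--- top, sorted by resident memory ---"
        top -bn1 -oRES -w 500 2>&1 | head -n 20
        echo "--- ps, sorted by RSS (first 30 lines) ---"
        ps -eo pid,user:20,rss,ppid,pri,ni,vsize,pcpu,pmem,time,command --sort=-rss 2>&1 | head -n 30
    } > "$out_file"
    echo "$out_file"
}
```

Calling the function from a scheduled job at a short interval gives a history of snapshots that can be compared with the timestamp of the OOM kill.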
Workaround
One way to mitigate this type of issue is to add swap space to the system. This keeps Bitbucket from crashing while you work on the resolution. Swapping degrades Bitbucket's performance, so it should only be used to prevent a full outage.
Decide where on the disk there is available space. This example will put the swap space in the root user's home directory.
Create a file on the local disk. This example creates a 2 GB swap file:
dd if=/dev/zero of=/root/myswapfile bs=1M count=2048
Change the permission of the swap file so that only root can access it.
chmod 600 /root/myswapfile
Format the file as swap with the mkswap command, then enable it with the swapon command.
mkswap /root/myswapfile
swapon /root/myswapfile
Verify whether the newly created swap area is available for your use.
swapon -s
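The verification step can also be scripted. The sketch below sums the Size column, which is in KiB, of a /proc/swaps-style table; this is the same layout that swapon -s prints. The function name swap_total_kb is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative.
# Usage: swap_total_kb [FILE]
# Sums the Size column (third column, in KiB) of a /proc/swaps-style table,
# skipping the header line. Reads /proc/swaps when no file is given.
swap_total_kb() {
    awk 'NR > 1 { sum += $3 } END { printf "%.0f\n", sum }' "${1:-/proc/swaps}"
}
```

A result of 0 after running swapon would suggest that the swap file was not enabled.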
Diagnosis
Environment
- Bitbucket is installed on a Linux host.
- The entire Bitbucket process suddenly terminates without warning. That is, the process ID (pid) is no longer running and there is no shutdown log message like the example above.
- The browser shows a generic "cannot connect" or similar error, indicating that it is not able to reach the webpage.
- Nothing out of the ordinary appears in the Bitbucket application log ($BITBUCKET_HOME/log/atlassian-bitbucket.log), since the application was terminated without properly shutting down.
Diagnostic Steps
- Use dmesg and search for lines around "killed process".
dmesg -T | grep -C 5 -i "killed process"
- Check the /var/log/ directory for the syslog or messages file, and locate the entries spanning the approximate time when the process was terminated. If you see entries similar to the following, then you know the process was a victim of the OOM killer:
May 28 11:29:48 Bitbucket01 kernel: Out of memory: Kill process 1386 (java) score 358 or sacrifice
May 28 11:29:48 Bitbucket01 kernel: Killed process 5388, UID 4048, (java) total-vm:1331564kB, anon
May 28 11:29:51 Bitbucket01 kernel: java invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0,
- Sometimes a child process started by Bitbucket, such as Git, is the one terminated by the OOM killer. In the example below, systemd then reports that a process of the bitbucket.service unit was killed, and stop-bitbucket.sh stops Bitbucket itself in an orderly manner. The oom_reaper entry only shows the kernel reclaiming the memory of the killed Git process.
Sep 10 00:03:11 Bitbucket01 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/bitbucket.service,task=git,pid=8846,uid=501
Sep 10 00:03:11 Bitbucket01 kernel: Out of memory: Killed process 8846 (git) total-vm:45875804kB, anon-rss:42999636kB, file-rss:872kB, shmem-rss:0kB, UID:501 pgtables:85704kB oom_score_adj:0
Sep 10 00:03:11 Bitbucket01 systemd[1]: bitbucket.service: A process of this unit has been killed by the OOM killer.
Sep 10 00:03:12 Bitbucket01 kernel: oom_reaper: reaped process 8846 (git), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Sep 10 00:03:20 Bitbucket01 stop-bitbucket.sh[44720]: Stopping Atlassian Bitbucket as the current user
Sep 10 00:03:20 Bitbucket01 stop-bitbucket.sh[48187]: Stopping Bitbucket webapp
Sep 10 00:04:01 Bitbucket01 stop-bitbucket.sh[48187]: The Bitbucket webapp did not stop in time
Sep 10 00:04:01 Bitbucket01 stop-bitbucket.sh[48187]: Check /var/atlassian/application-data/bitbucket/log/launcher.log for a thread dump
Sep 10 00:04:01 Bitbucket01 stop-bitbucket.sh[48187]: Killing Bitbucket webapp
- The kernel logs include detailed information about the state of the tasks (processes) that were running at the time. This information includes memory usage values, often represented in pages. Each page generally corresponds to a fixed amount of memory, typically 4 KB on many systems.
Sep 10 00:03:11 Bitbucket01 kernel: Tasks state (memory values in pages):
Sep 10 00:03:11 Bitbucket01 kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Sep 10 00:03:11 Bitbucket01 kernel: [ 997] 0 997 63229 49825 536576 0 -250 systemd-journal
Sep 10 00:03:11 Bitbucket01 kernel: [ 1226] 32 1226 3312 720 61440 0 0 rpcbind
Sep 10 00:03:11 Bitbucket01 kernel: [ 1227] 0 1227 23548 1053 77824 0 -1000 auditd
Sep 10 00:03:11 Bitbucket01 kernel: [ 1229] 0 1229 756 360 45056 0 0 audisp-syslog
Sep 10 00:03:11 Bitbucket01 kernel: [ 1251] 81 1251 2699 720 61440 0 -900 dbus-broker-lau
Sep 10 00:03:11 Bitbucket01 kernel: [ 1253] 81 1253 1315 576 49152 0 -900 dbus-broker
Sep 10 00:03:11 Bitbucket01 kernel: [ 1257] 0 1257 32674 6912 159744 0 0 firewalld
Sep 10 00:03:11 Bitbucket01 kernel: [ 1258] 0 1258 19806 1102 57344 0 0 irqbalance
Sep 10 00:03:11 Bitbucket01 kernel: [ 1259] 990 1259 1265 360 49152 0 0 lsmd
Sep 10 00:03:11 Bitbucket01 kernel: [ 1260] 2 1260 78179 2006 114688 0 0 rngd
Sep 10 00:03:11 Bitbucket01 kernel: [ 1262] 0 1262 5410 864 86016 0 0 systemd-logind
Sep 10 00:03:11 Bitbucket01 kernel: [ 1269] 0 1269 918250 66693 1081344 0 0 ampdaemon
Sep 10 00:03:11 Bitbucket01 kernel: [ 1271] 996 1271 3300 653 65536 0 0 chronyd
Sep 10 00:03:11 Bitbucket01 kernel: [ 1281] 0 1281 39409 1483 77824 0 0 ampcreport
Sep 10 00:03:11 Bitbucket01 kernel: [ 1283] 0 1283 65052 2387 139264 0 0 NetworkManager
Sep 10 00:03:11 Bitbucket01 kernel: [ 1353] 0 1353 30132 1409 94208 0 0 gssproxy
Sep 10 00:03:11 Bitbucket01 kernel: [ 1354] 0 1354 4516 936 73728 0 -1000 sshd
Sep 10 00:03:11 Bitbucket01 kernel: [ 1360] 0 1360 267413 3155 229376 0 0 TaniumClient
Sep 10 00:03:11 Bitbucket01 kernel: [ 1361] 0 1361 64670 4606 135168 0 0 tuned
Sep 10 00:03:11 Bitbucket01 kernel: [ 1384] 0 1384 4615 1123 73728 0 0 rpc.gssd
Sep 10 00:03:11 Bitbucket01 kernel: [ 1430] 0 1430 42883 3689 319488 0 0 adclient
Sep 10 00:03:11 Bitbucket01 kernel: [ 1431] 0 1431 21783 1388 86016 0 0 cdcwatch
Sep 10 00:03:11 Bitbucket01 kernel: [ 1648] 998 1648 744769 2740 245760 0 0 polkitd
Sep 10 00:03:11 Bitbucket01 kernel: [ 1753] 0 1753 10119 864 86016 0 0 master
Sep 10 00:03:11 Bitbucket01 kernel: [ 1776] 89 1776 10139 864 81920 0 0 qmgr
Sep 10 00:03:11 Bitbucket01 kernel: [ 1796] 0 1796 1395 720 49152 0 0 dhclient
Sep 10 00:03:11 Bitbucket01 kernel: [ 1925] 0 1925 960667 14060 757760 0 0 metricbeat
Sep 10 00:03:11 Bitbucket01 kernel: [ 1946] 0 1946 129264 17181 499712 0 0 rsyslogd
Sep 10 00:03:11 Bitbucket01 kernel: [ 2061] 29 2061 2436 387 53248 0 0 rpc.statd
Sep 10 00:03:11 Bitbucket01 kernel: [ 2183] 0 2183 16498 3649 139264 0 0 ds_agent
Sep 10 00:03:11 Bitbucket01 kernel: [ 2184] 0 2184 390403 179785 2502656 0 0 ds_agent
Sep 10 00:03:11 Bitbucket01 kernel: [ 2349] 0 2349 1757 432 49152 0 0 atd
Sep 10 00:03:11 Bitbucket01 kernel: [ 2350] 0 2350 2728 792 69632 0 0 crond
Sep 10 00:03:11 Bitbucket01 kernel: [ 2351] 0 2351 761 360 40960 0 0 agetty
Sep 10 00:03:11 Bitbucket01 kernel: [ 2353] 0 2353 1403 432 49152 0 0 agetty
Sep 10 00:03:11 Bitbucket01 kernel: [ 2404] 0 2404 3485 757 65536 0 0 mgsusageag
Sep 10 00:03:11 Bitbucket01 kernel: [ 2405] 0 2405 1931 1306 49152 0 0 ndtask
Sep 10 00:03:11 Bitbucket01 kernel: [ 2610] 0 2610 309231 3030 118784 0 0 fnms-docker-mon
Sep 10 00:03:11 Bitbucket01 kernel: [ 3942] 0 3942 7928 815 114688 0 0 ds_am
Sep 10 00:03:11 Bitbucket01 kernel: [ 6111] 0 6111 585574 5670 323584 0 0 dsa-connect
Sep 10 00:03:11 Bitbucket01 kernel: [ 6131] 0 6131 678299 7796 376832 0 0 dsa-connect
Sep 10 00:03:11 Bitbucket01 kernel: [ 6992] 498 6992 46365 5881 135168 0 0 ampscansvc
Sep 10 00:03:11 Bitbucket01 kernel: [ 37240] 39520 37240 5231 1152 77824 0 100 systemd
Sep 10 00:03:11 Bitbucket01 kernel: [ 37243] 39520 37243 44332 1588 110592 0 100 (sd-pam)
Sep 10 00:03:11 Bitbucket01 kernel: [ 40863] 501 40863 7699619 4423665 49029120 0 0 java
Sep 10 00:03:11 Bitbucket01 kernel: [ 42584] 501 42584 4097606 737671 22220800 0 0 java
Sep 10 00:03:11 Bitbucket01 kernel: [1245347] 499 1245347 2313 576 57344 0 0 uuidd
Sep 10 00:03:11 Bitbucket01 kernel: [1005312] 0 1005312 113176 63582 819200 0 0 splunkd
Sep 10 00:03:11 Bitbucket01 kernel: [1005317] 0 1005317 31290 4078 131072 0 0 splunkd
Sep 10 00:03:11 Bitbucket01 kernel: [2499694] 89 2499694 11510 1080 94208 0 0 tlsmgr
Sep 10 00:03:11 Bitbucket01 kernel: [ 364704] 0 364704 181489 3631 122880 0 0 tm_netagent
Sep 10 00:03:11 Bitbucket01 kernel: [ 364725] 0 364725 185283 7357 196608 0 0 tm_netagent
Sep 10 00:03:11 Bitbucket01 kernel: [1231817] 0 1231817 1636 504 57344 0 0 TPython
Sep 10 00:03:11 Bitbucket01 kernel: [1231826] 0 1231826 1636 432 53248 0 0 python
Sep 10 00:03:11 Bitbucket01 kernel: [1231831] 0 1231831 32877 25380 331776 0 0 pybin
Sep 10 00:03:11 Bitbucket01 kernel: [2057921] 0 2057921 1436645 25020 757760 0 0 telegraf
Sep 10 00:03:11 Bitbucket01 kernel: [2522310] 0 2522310 1915813 87588 1634304 0 0 ds_am
Sep 10 00:03:11 Bitbucket01 kernel: [2542797] 0 2542797 8788 1138 86016 0 -1000 systemd-udevd
Sep 10 00:03:11 Bitbucket01 kernel: [ 918554] 0 918554 194858 3671 221184 0 0 TaniumCX
Sep 10 00:03:11 Bitbucket01 kernel: [ 918790] 0 918790 1146618 17645 741376 0 0 TaniumTSDB
Sep 10 00:03:11 Bitbucket01 kernel: [ 918856] 0 918856 1636 504 57344 0 0 TPython
Sep 10 00:03:11 Bitbucket01 kernel: [ 918870] 0 918870 1636 504 53248 0 0 python
Sep 10 00:03:11 Bitbucket01 kernel: [ 918873] 0 918873 22733 15252 233472 0 0 pybin
Sep 10 00:03:11 Bitbucket01 kernel: [ 935500] 0 935500 170968 11551 262144 0 0 TaniumClient
Sep 10 00:03:11 Bitbucket01 kernel: [3465694] 89 3465694 10128 648 81920 0 0 pickup
Sep 10 00:03:11 Bitbucket01 kernel: [ 1644] 0 1644 2728 536 65536 0 0 crond
Sep 10 00:03:11 Bitbucket01 kernel: [ 1650] 0 1650 1781 432 57344 0 0 sh
Sep 10 00:03:11 Bitbucket01 kernel: [ 1681] 0 1681 1782 576 61440 0 0 CW_FS_MEM_monit
Sep 10 00:03:11 Bitbucket01 kernel: [ 5284] 0 5284 185713 37824 802816 0 0 fluent-bit
Sep 10 00:03:11 Bitbucket01 kernel: [ 8828] 501 8828 1565 504 45056 0 0 git
Sep 10 00:03:11 Bitbucket01 kernel: [ 8830] 501 8830 1202 432 45056 0 0 git-http-backen
Sep 10 00:03:11 Bitbucket01 kernel: [ 8832] 501 8832 3026 576 57344 0 0 git
Sep 10 00:03:11 Bitbucket01 kernel: [ 8844] 501 8844 184 0 40960 0 0 pack-objects
Sep 10 00:03:11 Bitbucket01 kernel: [ 8846] 501 8846 11468951 10750127 87760896 0 0 git
Sep 10 00:03:11 Bitbucket01 kernel: [ 18496] 0 18496 2728 536 65536 0 0 crond
Sep 10 00:03:11 Bitbucket01 kernel: [ 18500] 0 18500 1782 576 69632 0 0 run-parts
Sep 10 00:03:11 Bitbucket01 kernel: [ 18584] 0 18584 1781 504 57344 0 0 ciscoampconnect
Sep 10 00:03:11 Bitbucket01 kernel: [ 18585] 0 18585 1636 360 45056 0 0 sed
Sep 10 00:03:11 Bitbucket01 kernel: [ 18587] 0 18587 1396 360 49152 0 0 sleep
Sep 10 00:03:11 Bitbucket01 kernel: [ 44607] 0 44607 9803 7704 122880 0 0 mon-put-instanc
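- Store these lines in a file such as processes.log, one task entry per line, and run the following command to total the memory of the Git processes. In this example, Git processes are the heaviest consumers of system memory.
awk 'BEGIN {sum=0} /git/ {sum+=$10} END {print sum*4/1024/1024, "GB"}' processes.log
Result: 43.7727 GB
- For the Git entries shown above, field $10 is the total_vm column, so the 43.77 GB reported is the virtual memory size of the Git processes. Their resident memory (the rss column, field $11 for these entries) is about 41 GB, which matches the anon-rss value in the "Killed process" line. Check if this is a reasonable usage in your Bitbucket application.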
- If Git memory usage is low, review the other processes and locate the ones with the highest rss value.
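Counting whitespace-separated fields is fragile here, because the pid column is padded inside its brackets and the field numbers shift for seven-digit pids. The following sketch locates the columns relative to the closing bracket instead and reports both total_vm and rss. The function name sum_task_memory is hypothetical. The sketch assumes the eight-column task layout shown above (other kernel versions may print different columns) and a 4 KB page size unless another value is passed.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative.
# Usage: sum_task_memory FILE [NAME_REGEX] [PAGE_KB]
# Sums total_vm and rss (both in pages) for task entries whose name matches
# NAME_REGEX (default: git) and prints the totals in GB.
sum_task_memory() {
    awk -v pat="${2:-git}" -v page_kb="${3:-4}" '
    {
        # Everything after the first "]" holds the columns:
        # uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
        i = index($0, "]")
        if (i == 0) next
        n = split(substr($0, i + 1), f, " ")
        # Skip the header row and unrelated log lines.
        if (n < 8 || f[1] !~ /^[0-9]+$/) next
        if (f[8] ~ pat) { vm += f[3]; rss += f[4] }
    }
    END {
        printf "total_vm: %.2f GB, rss: %.2f GB\n",
            vm * page_kb / 1024 / 1024, rss * page_kb / 1024 / 1024
    }' "$1"
}
```

With the task list above stored one entry per line, sum_task_memory processes.log prints total_vm: 43.77 GB, rss: 41.01 GB. Passing a different pattern, for example sum_task_memory processes.log java, gives the same totals for other processes.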
Cause
The Linux system experiences memory exhaustion for its own operations. In an attempt to save the system from crashing, Linux kills processes to free enough memory to continue running.
Feature Requests
These feature requests will help reduce the likelihood of the issue happening and also help you monitor Bitbucket Data Center better.
- BSERV-13611: Investigate limiting repack memory usage via --windowMemory or --threads
- BSERV-19648: Fair-share queuing of hosting operations
- BSERV-19649: Adaptive throttling for memory usage
- BSERV-19651: Git operations that consume a huge amount of RAM are not logged