Bitbucket Server is reaching resource limits
Platform Notice: Server and Data Center Only - This article only applies to Atlassian products on the server and data center platforms.
This article aims to explain the meaning of the "Bitbucket Server is reaching resource limits..." banner that you might be seeing on your instance.
It is important to note that you might have two different banners displayed by Bitbucket Server:
- The yellow banner, indicating Bitbucket Server is queuing requests, or
- The red banner, indicating that requests are actually being rejected.
A yellow banner under heavy load is normal. Do not change any of your configuration parameters at this stage. Before increasing any configuration limits, monitor the CPU, memory, and I/O load on the server and verify that the machine isn't thrashing, because increasing the ticket limits can make your instance's performance worse. This is explained further in the sections below.
When deciding how much memory to allocate for Bitbucket Server, the most important factor to consider is the amount of memory required by git operations. In our experience, Bitbucket Server's default JVM memory configuration (-Xms512m -Xmx1024m) works well even for customers with large instances, because Bitbucket Server's heap size has no impact on hosting operations. Most of the memory used on a server running Bitbucket Server is consumed by the forked git processes – git operations are very expensive in terms of memory. So allocating more memory to the Bitbucket Server JVM won't help your instance scale or perform better. The effect can be exactly the opposite: if the Bitbucket Server JVM is using most of the memory and the server doesn't have enough free memory left for the forked git operations, then as soon as concurrent git operations start, you will experience a performance loss because the server will not have enough memory to fork out more git processes.
OK, I get that I shouldn't tweak Bitbucket Server's default memory in most cases. How should I budget my server memory, then?
The formula is simple: to budget memory usage, we need to know how many concurrent hosting tickets Bitbucket Server allows and how much memory a single clone uses.
- Memory usage by a single clone operation
As a rule of thumb, 1.5 x the repository size on disk (the contents of the .git/objects directory) is a rough estimate of the memory required for a single clone operation, for repositories up to about 400 MB. For larger repositories, memory usage tends to flatten out at about 700 MB (though there is theoretically no maximum on how much memory Git could use). For example, during a single hosting operation, Bitbucket Server's own memory usage remains constant at about 800 MB (default) for the entire operation, while Git's memory usage climbs according to the rule just described.
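As an illustration, the rule of thumb above can be sketched as a small shell helper. The 1.5x factor and ~700 MB cap come straight from the estimate above; the function name and example path are ours:

```shell
# est_clone_mb SIZE_MB -> rough memory estimate (MB) for one clone, per the rule of thumb
est_clone_mb() {
    local est=$(( $1 * 3 / 2 ))       # 1.5 x repository size on disk
    [ "$est" -gt 700 ] && est=700     # usage tends to flatten out around 700 MB
    echo "$est"
}

# Size on disk is the contents of .git/objects, e.g.:
#   du -sm /path/to/repo/objects | cut -f1
est_clone_mb 200   # 200 MB repository -> ~300 MB per clone
est_clone_mb 600   # larger repository -> capped at ~700 MB
```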
For a detailed analysis of the server's resource usage, please read through Scaling Bitbucket Server - Clones examined and Bitbucket Caches.
Number of concurrent operations allowed by Bitbucket Server
From version 4.11, Bitbucket Server uses an adaptive throttling mechanism that monitors system resources to define the number of Git operations that can be executed.
The following message will be logged when the server resources don't allow the full amount of Git tickets to be available:
INFO [spring-startup] c.a.s.i.t.ResourceThrottleStrategyProvider [scm-hosting] This machine's total available memory cannot safely support a maximum of XXX tickets. Reducing maximum tickets to XX instead
Until version 4.10, Bitbucket Server limits the number of Git operations that can be executed concurrently in order to prevent performance for all clients from dropping below acceptable levels. These limits can be adjusted – see Configuration properties.
The parameter used for this (throttle.resource.scm-hosting) is based on the number of CPUs on your server, and its formula is 1.5 x cpu.
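For illustration, the 1.5 x cpu formula works out as follows (integer arithmetic; the helper name is ours):

```shell
# default_tickets CPUS -> default scm-hosting ticket count under the 1.5 x cpu formula
default_tickets() { echo $(( $1 * 3 / 2 )); }

default_tickets 4    # 4 CPUs  -> 6 tickets
default_tickets 8    # 8 CPUs  -> 12 tickets
```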
- Awesome! Now I can calculate how much memory I need on my server to safely run Bitbucket Server
That's all you need to know. So, for a common Bitbucket Server environment with 4 CPUs, the budget would be:
- Bitbucket Server: 768MB
- Git: 1.5 * (4 CPUs) * 700 MB = 4200 MB
- Operating System: 1024 MB
- Total: 5992 MB
- 40% safety margin: ~ 8 GB
- Bundled Elasticsearch: 1GB (add this in case you are running Bitbucket Server 4.6.0+ with the bundled Elasticsearch, which is the default configuration)
- Please refer to Scaling Bitbucket Server for more details.
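The budget above can be reproduced with a few lines of shell; the figures are the same ones used in the list, and only the variable names are ours:

```shell
CPUS=4
BITBUCKET_MB=768
OS_MB=1024
GIT_MB=$(( CPUS * 3 / 2 * 700 ))                 # 1.5 x cpu tickets x ~700 MB per clone = 4200 MB
TOTAL_MB=$(( BITBUCKET_MB + GIT_MB + OS_MB ))    # 5992 MB
WITH_MARGIN_MB=$(( TOTAL_MB * 140 / 100 ))       # 40% safety margin -> ~8 GB
echo "Total: ${TOTAL_MB} MB, with 40% margin: ${WITH_MARGIN_MB} MB"
```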
Questions and answers
On the UI, the message "Bitbucket Server is queuing requests" appears and it is followed later by "Bitbucket Server is reaching resource limits". What does it mean?
Let's take the example described in the Memory budget section. At "Stage 1", with 4 CPUs, Bitbucket Server will allow 6 SCM hosting operations to execute concurrently, forking each one of them out into a Git process. If the server receives more than these 6 SCM requests, the extra requests are queued and not forked out into Git processes, in order to avoid memory exhaustion on your server. At "Stage 2", some of the Git processes initially forked out finish processing, and Bitbucket Server takes requests from the SCM queue to fill those slots. At "Stage 3", Bitbucket Server forks them out into new Git processes.
It is important to note that if requests are queued up for more than a minute, Bitbucket Server will display the "Bitbucket Server is queueing requests..." message.
If requests are queued up for more than 5 minutes (throttle.resource.scm-hosting.timeout), they are rejected and the clone/fetch/push operation will fail. At that point, the "Bitbucket Server is reaching resource limits..." message is displayed. The message disappears once 5 minutes have passed in which no requests have been rejected (server.busy.on.ticket.rejected.within for Bitbucket Server 3.0+). These parameters can be adjusted – see Configuration properties.
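For reference, the defaults mentioned above would look like this in bitbucket.properties. These are the default values, so you only need to add them if you intend to change them; verify the exact names and units against Configuration properties for your version:

```
# Seconds a hosting request may wait in the queue before being rejected (default: 300 = 5 minutes)
throttle.resource.scm-hosting.timeout=300
# Minutes that must pass without a rejection before the banner is cleared (default: 5)
server.busy.on.ticket.rejected.within=5
```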
See an example of the data that is logged for each type of denied SCM request below:
A [scm-hosting] ticket could not be acquired (0/12)
2015-08-28 11:41:00,327 WARN [ssh-scm-request-handler] Access key user (drohan@localhost) @16NVUP5x701x17836x333 1z0lim3 10.10.10.122 SSH - git-upload-pack '/alpha/apple.git' c.a.s.i.t.SemaphoreThrottleService A [scm-hosting] ticket could not be acquired (0/24)
2015-08-28 11:41:00,334 INFO [ssh-scm-request-handler] Access key user (drohan@localhost) @16NVUP5x701x17836x333 1z0lim3 10.10.10.122 SSH - git-upload-pack '/alpha/apple.git' c.a.s.s.t.ThrottledScmRequestFactory A scm-hosting request was denied due to heavy server load. Please see http://docs.atlassian.com/bitbucket/docs-0212/Scaling+Bitbucket+Server for performance guidelines.
A [scm-command] ticket could not be acquired (0/1)
2015-08-28 11:41:05,327 WARN [http-nio-7990-exec-9] usera @16NVUP5x701x17836x3 0:0:0:0:0:0:0:1 "GET /projects/TEST/repos/test/commits HTTP/1.1" c.a.s.i.t.SemaphoreThrottleService A [scm-command] ticket could not be acquired (0/1)
A [scm-refs] ticket could not be acquired (0/1)
2015-08-28 11:41:05,327 WARN [ssh-scm-request-handler] usera @16NVUP5x701x17836x333 1z0lim3 10.10.10.122 SSH - git-upload-pack '/alpha/apple.git' c.a.s.i.t.SemaphoreThrottleService A [scm-refs] ticket could not be acquired (0/1)
To illustrate these stages, see picture below:
This message should be taken seriously: requests are being rejected, and if this happens regularly it is an indication that your instance is not sized for its load.
Which actions should I take to get around this issue?
The best way to understand what is happening is to implement JMX monitoring.
As explained above, the goal is to have a hosting-ticket queue that your system processes quickly. Even if a small queue forms, as long as it empties before tickets start getting rejected, you should be fine. Below are common mistakes customers make that actually result in worse performance – don't do these:
- Increasing the amount of memory for the Bitbucket Server JVM. Don't do that! Usually, this action allocates a lot of the server's memory to the JVM and leaves little free memory available to the Git processes being forked out by Bitbucket Server. This is bad and can have side effects like the one described in git push fails - Out of memory, malloc failed, as the Git processes are choked by the lack of free memory.
- Increasing the number of hosting tickets by tweaking throttle.resource.scm-hosting. Don't do that! The reason lies in what was previously explained: the system will wait up to 5 minutes (the default throttle.resource.scm-hosting.timeout) for a ticket to free up. Reducing the number of hosting tickets may result in some queuing, but the individual clones tend to be processed faster due to reduced I/O contention, which increases the likelihood that a ticket frees up before the timeout. Conversely, increasing the number of hosting tickets Bitbucket Server can handle means your CPUs spend more time processing the forked-out Git processes, which decreases the likelihood that a ticket frees up before the timeout.
Below are a few actions that should help you get your Bitbucket Server processing hosting tickets faster. Please consider these:
Continuous Integration polling
Reduce the frequency with which your CI servers check out repositories from Bitbucket Server. This is a common cause reported by customers hitting this issue: if you have a large number of Git operations happening on Bitbucket Server, you're likely to see the "Bitbucket Server is reaching resource limits..." message. Make sure your CI servers aren't keeping Bitbucket Server unnecessarily busy, and reduce polling where possible. Also make sure the SCM caching plugin is turned on and up to date, as described in Scaling Bitbucket Server for Continuous Integration performance.
To further reduce the number of calls from a CI server, Post-receive hooks can be set up to replace the polling mechanism.
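As a sketch of that idea, a post-receive hook can notify the CI server on each push so it doesn't have to poll. The example below builds a Jenkins-style /git/notifyCommit URL; the CI base URL, clone URL, and helper name are placeholders for your environment:

```shell
# notify_url CI_BASE CLONE_URL -> the notification URL a post-receive hook would call
notify_url() { echo "$1/git/notifyCommit?url=$2"; }

CI_BASE="http://ci.example.com"                      # placeholder CI server
CLONE_URL="ssh://git@mybbs:7999/alpha/apple.git"     # placeholder repository URL

# In a real post-receive hook you would then run something like:
#   curl -s "$(notify_url "$CI_BASE" "$CLONE_URL")" >/dev/null
notify_url "$CI_BASE" "$CLONE_URL"
```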
SSL termination
Running SSL all the way through to Bitbucket Server is another common issue we see with customers hitting this problem. For better performance, set up a proxy in front of Bitbucket Server and terminate SSL there: the Java SSL stack is nowhere near as efficient as a dedicated proxy, and removing SSL from Bitbucket Server itself results in much more efficient processing. Please see Proxy and secure Bitbucket for how to make this change.
Ref advertisement caching
The ref advertisement cache is disabled by default, as explained in great detail on the Bitbucket Caches page and in the Caching section of Scaling Bitbucket Server for Continuous Integration performance. However, enabling it can produce a noticeable reduction in load. You shouldn't need to restart Bitbucket Server to do this; you can change it at any time using the REST API calls detailed in Enabling and disabling caching.
REST API (no restart required):
- GET: retrieve whether ref advertisement caching is enabled (true) or disabled (false).
- PUT: enable (status = true) or disable (status = false) ref advertisement caching.
Below are sample commands against a local instance. Make sure you adjust the Bitbucket Server URL and the "username:password" parameters accordingly.
# Check Ref advertisement caching STATUS
curl -H "Content-Type:application/json" -H "Accept:application/json" --user charlie:charlie -X GET http://mybbs:7990/rest/scm-cache/latest/config/refs/enabled
false
# Enable Ref advertisement caching
curl -H "Content-Type:application/json" -H "Accept:application/json" --user charlie:charlie -X PUT http://mybbs:7990/rest/scm-cache/latest/config/refs/enabled/true
# Disable Ref advertisement caching
curl -H "Content-Type:application/json" -H "Accept:application/json" --user charlie:charlie -X PUT http://mybbs:7990/rest/scm-cache/latest/config/refs/enabled/false
Alternatively, this can be set in the bitbucket.properties file, in the shared folder of your Bitbucket Server home directory. Create the file if it doesn't exist, add the following property, and restart your application:
It is important to ask yourself the questions below:
- How much memory?
- How many CPUs?
- Plugins: As explained, this issue has a lot to do with processing. Make sure you don't have any plugins affecting your performance. For example, Awesome Graphs is a nice plugin, but the indexing it does is CPU and I/O intensive and can dramatically affect the performance of a Bitbucket Server instance. If your system is already under heavy load, we advise disabling all user-installed plugins for a period of observation. Follow the instructions in Set UPM to safe mode.
- Processing: A strategy here could be adding more CPUs to the machine while keeping the concurrency max at your current system's default. So let's say you have 4 CPUs reported by your system. Your current SCM hosting ticket number is 6. Adding more CPUs and keeping the concurrency queue on its previous state, will help you process those 6 hosting tickets quicker, leaving the extra requests queued up for a shorter period, thus, yielding a better flow of tickets. Hence, this will give you more capacity to handle the same load and improve the overall performance while the server is busy performing hosting operations.
You can keep the current concurrency limit by adding the following configuration to your bitbucket.properties file (create the file if it doesn't exist, and restart Bitbucket Server so it loads the new configuration):
# Limits the number of SCM hosting operations, meaning pushes and pulls over HTTP or SSH, which may be running concurrently.
# This is intended primarily to prevent pulls, which can be very memory-intensive, from pinning a server's resources.
# There is limited support for mathematical expressions; +,-,*,/ and () are supported. You can also use the 'cpu'
# variable which is resolved to the number of cpus that are available.
throttle.resource.scm-hosting=6
How can I monitor SCM hosting ticket usage without resorting to thread dump analysis?
Enable JMX counters for performance monitoring and watch the Ticket Statistics.
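Assuming a reasonably recent Bitbucket Server version, the JMX counters are switched on via bitbucket.properties (a restart is required); check Enabling JMX counters for performance monitoring for the exact steps for your version:

```
# Expose Bitbucket Server's JMX counters, including Ticket Statistics
jmx.enabled=true
```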
I can't identify my performance culprit. What's the next step?
Please open an issue with Atlassian Support. Make sure of the following:
- You downloaded the logparser tool and ran generate-access-logs-graph.sh against your Bitbucket Server atlassian-bitbucket-access*.log files, as per the instructions on its page. Attach the graphs to the issue. If you can't run the tool, that's OK – we will run it for you based on your Support Zip.
- You have your instance debug logging on
- You have your instance profile logging on: this is very important, as it lets us see which process is taking a long time to finish
- You generated the Support Zip (Administration -> Atlassian Support Tools -> Support Zip -> Create) after the problem happened. Please attach it to the support issue and make sure Limit File Sizes? is unchecked.