Recommendations for running Bitbucket in AWS

 

This page provides general sizing and configuration recommendations for running self-managed Bitbucket Data Center and Bitbucket Server instances on Amazon Web Services.To get the best performance out of your Bitbucket deployment in AWS, it's important to not under-provision your instance's CPU, memory, or I/O resources. Note that the very smallest instance types provided by AWS do not meet Bitbucket's minimum hardware requirements and aren't recommended in production environments. If you don't provision sufficient resources for your workload, Bitbucket is likely to exhibit slow response times, display a Bitbucket Server is reaching resource limits banner, or fail to start altogether. 

Recommended EC2 and EBS instance sizes

The following table lists the recommended EC2 and EBS configurations for operating a Bitbucket Server (standalone) or Bitbucket Data Center (clustered) instance under typical workloads.

To deploy an instance architected for both scale and resilience, we recommend deploying Bitbucket Data Center. The AWS Quick Start and associated CloudFormation template provides a set of recommended defaults, node size options and scaling parameters.

On this page

Bitbucket Server

Active UsersEC2 instance typeEBS OptimizedEBS Volume typeIOPS

0 – 250

c3.largeNoGeneral Purpose (SSD)N/A
250 – 500c3.xlargeYesGeneral Purpose (SSD)N/A
5001000c3.2xlargeYesProvisioned IOPS500 – 1000

Bitbucket Data Center (cluster nodes)

Active UsersEC2 instance typeRecommended number of nodes

0 – 250

c3.large1-2*
250 – 500c3.xlarge1-2*
500 – 1000c3.2xlarge2
1000 – massive scalec3.4xlarge+3+

* For high-availability, we recommend deploying 2 or more cluster nodes as a minimum.

Bitbucket Data Center (shared file server)

These recommendations assume a single EC2 instance with attached EBS volume acting as a shared NFS server for the cluster.

Active UsersEC2 instance typeEBS Volume typeIOPS

0 – 250

m4.largeGeneral Purpose (SSD)N/A
250 – 500m4.xlargeGeneral Purpose (SSD)N/A
500 – 1000m4.2xlargeProvisioned IOPS500 – 1000
1000 – massive scalem4.4xlarge+Provisioned IOPS1000+

(error) The Amazon Elastic File System (EFS) is not supported for Bitbucket's shared home directory at this time.

 

See Amazon EC2 instance typesAmazon EBS–Optimized Instances, and Amazon EBS Volume Types for more information.

Notes

In Bitbucket instances with high hosting workload, I/O performance is often the limiting factor. It's recommended that you pay particular attention to EBS volume options, especially the following:

  • The size of an EBS volume also influences I/O performance. Larger EBS volumes generally have a larger slice of the available bandwidth and I/O operations per second (IOPS). A minimum of 100 GiB is recommended in production environments.
  • The IOPS that can be sustained by General Purpose (SSD) volumes is limited by Amazon's I/O credits. If you exhaust your I/O credit balance, your IOPS will be limited to the baseline level. You should consider using a larger General Purpose (SSD) volume or switching to a Provisioned IOPS (SSD) volume. See Amazon EBS Volume Types for more information.
  • New EBS volumes in particular have reduced performance the first time each block is accessed. See Pre-Warming Amazon EBS Volumes for more information.

The above recommendations are based on a typical workload with the specified number of active users. The resource requirements of an actual Bitbucket instance may vary markedly with a number of factors, including:

  • The number of continuous integration servers cloning or fetching from Bitbucket Server: Bitbucket Server will use more resources if you have many build servers set to clone or fetch frequently from Bitbucket Server
  • Whether continuous integration servers are using push mode notifications or polling repositories regularly to watch for updates
  • Whether continuous integration servers are set to do full clones or shallow clones
  • Whether the majority of traffic to Bitbucket Server is over HTTP, HTTPS, or SSH, and the encryption ciphers used
  • The number and size of repositories: Bitbucket Server will use more resources when you work on many very large repositories
  • The activity of your users: Bitbucket Server will use more resources if your users are actively using the Bitbucket Server web interface to browse, clone and push, and manipulate pull requests
  • The number of open pull requests: Bitbucket Server will use more resources when there are many open pull requests, especially if they all target the same branch in a large, busy repository.

See Scaling Bitbucket Server and Scaling Bitbucket Server for Continuous Integration performance for more detailed information on Bitbucket Server resource requirements.

Other supported instance sizes

The following Amazon EC2 instances also meet or exceed Bitbucket Server's minimum hardware requirements. These instances provide different balances of CPU, memory, and I/O performance, and can cater for workloads that are more CPU-, memory-, or I/O-intensive than the typical. 

ModelvCPUMem (GiB)Instance Store (GB)

EBS
optimizations
available

Dedicated EBS
Throughput
(Mbps)
c3.large23.752 x 16 SSD--
c3.xlarge47.52 x 40 SSDYes-
c3.2xlarge8152 x 80 SSDYes-
c3.4xlarge16302 x 160 SSDYes-
c3.8xlarge32602 x 320 SSD--
c4.large23.75-Yes500
c4.xlarge47.5-Yes750
c4.2xlarge815-Yes1,000
c4.4xlarge1630-Yes2,000
c4.8xlarge3660-Yes4,000
i2.xlarge430.51 x 800 SSDYes-
i2.2xlarge8612 x 800 SSDYes-
i2.4xlarge161224 x 800 SSDYes-
i2.8xlarge322448 x 800 SSD--
m3.large27.51 x 32 SSD--
m3.xlarge4152 x 40 SSDYes-
m3.2xlarge8302 x 80 SSDYes-
m4.large28-Yes450
m4.xlarge416-Yes750
m4.2xlarge832-Yes1,000
m4.4xlarge1664-Yes2,000
m4.10xlarge40160-Yes4,000
m4.16xlarge64256-Yes10,000
r3.large215.251 x 32 SSD--
r3.xlarge430.51 x 80 SSDYes-
r3.2xlarge8611 x 160 SSDYes-
r3.4xlarge161221 x 320 SSDYes-
r3.8xlarge322442 x 320 SSD--
x1.32xlarge1281,9522 x 1,920 SSDYes10,000

In all AWS instance types, Bitbucket Server only supports "large" and higher instances. "Micro", "small", and "medium" sized instances do not meet Bitbucket's minimum hardware requirements and aren't recommended in production environments. 

Bitbucket does not support D2 instancesBurstable Performance (T2) Instances, or Previous Generation Instances

In any instance type with available Instance Store device(s), a Bitbucket instance launched from the Bitbucket AMI will configure one Instance Store to contain Bitbucket Server's temporary files and caches. Instance Store can be faster than an EBS volume but the data doesn't persist if the instance is stopped or rebooted. Use of Instance Store can improve performance and reduce the load on EBS volumes. See Amazon EC2 Instance Store for more information. 

Advanced: Monitoring Bitbucket to tune instance sizing

This section is for advanced users who wish to monitor the resource consumption of their instance and use this information to guide instance sizing. If performance at scale is a concern, we recommend deploying Bitbucket Data Center with elastic scaling, which alleviates the need to worry about how a single node can accomodate fluctuating or growing load. See the AWS Quick Start guide for Bitbucket Data Center for more details.

The above recommendations provide guidance for typical workloads. The resource consumption of every Bitbucket Server instance will vary with the mix of workload. The most reliable way to determine if your Bitbucket Server instance is under- or over-provisioned in AWS is to monitor its resource usage regularly with Amazon CloudWatch. This provides statistics on the actual amount of CPU, I/O, and network resources consumed by your Bitbucket Server instance. 

The following simple example BASH script uses

to gather CPU, I/O, and network statistics and display them in a simple chart that can be used to guide your instance sizing decisions. 

Click here to expand...
#!/bin/bash
# Example AWS CloudWatch monitoring script
# Usage:
#   (1) Install gnuplot and jq (minimum version 1.4)
#   (2) Install AWS CLI (http://docs.aws.amazon.com/cli/latest/userguide/installing.html) and configure it with
#       credentials allowing cloudwatch get-metric-statistics
#   (3) Replace "xxxxxxx" in volume_ids and instance_ids below with the ID's of your real instance
#   (4) Run this script

export start_time=$(date -v-14d +%Y-%m-%dT%H:%M:%S)
export end_time=$(date +%Y-%m-%dT%H:%M:%S)
export period=1800
export volume_ids="vol-xxxxxxxx"    # REPLACE THIS WITH THE VOLUME ID OF YOUR REAL EBS VOLUME
export instance_ids="i-xxxxxxxx"    # REPLACE THIS WITH THE INSTANCE ID OF YOUR REAL EC2 INSTANCE

# Build lists of metrics and datafiles that we're interested in
ebs_metrics=""
ec2_metrics=""
cpu_datafiles=""
iops_datafiles=""
queue_datafiles=""
net_datafiles=""
for volume_id in ${volume_ids}; do
  for metric in VolumeWriteOps VolumeReadOps; do
    ebs_metrics="${ebs_metrics} ${metric}"
    iops_datafiles="${iops_datafiles} ${volume_id}-${metric}"
  done
done
for volume_id in ${volume_ids}; do
  for metric in VolumeQueueLength; do
    ebs_metrics="${ebs_metrics} ${metric}"
    queue_datafiles="${queue_datafiles} ${volume_id}-${metric}"
  done
done
for instance_id in ${instance_ids}; do
  for metric in DiskWriteOps DiskReadOps; do
    ec2_metrics="${ec2_metrics} ${metric}"
    iops_datafiles="${iops_datafiles} ${instance_id}-${metric}"
  done
done
for instance_id in ${instance_ids}; do
  for metric in CPUUtilization; do
    ec2_metrics="${ec2_metrics} ${metric}"
    cpu_datafiles="${cpu_datafiles} ${instance_id}-${metric}"
  done
done
for instance_id in ${instance_ids}; do
  for metric in NetworkIn NetworkOut; do
    ec2_metrics="${ec2_metrics} ${metric}"
    net_datafiles="${net_datafiles} ${instance_id}-${metric}"
  done
done

# Gather the metrics using AWS CLI
for volume_id in ${volume_ids}; do
  for metric in ${ebs_metrics}; do
    aws cloudwatch get-metric-statistics --metric-name ${metric} \
                                         --start-time ${start_time} \
                                         --end-time ${end_time} \
                                         --period ${period} \
                                         --namespace AWS/EBS \
                                         --statistics Sum \
                                         --dimensions Name=VolumeId,Value=${volume_id} | \
      jq -r '.Datapoints | sort_by(.Timestamp) | map(.Timestamp + " " + (.Sum | tostring)) | join("\n")' >${volume_id}-${metric}.data
  done
done

for metric in ${ec2_metrics}; do
  for instance_id in ${instance_ids}; do
    aws cloudwatch get-metric-statistics --metric-name ${metric} \
                                         --start-time ${start_time} \
                                         --end-time ${end_time} \
                                         --period ${period} \
                                         --namespace AWS/EC2 \
                                         --statistics Sum \
                                         --dimensions Name=InstanceId,Value=${instance_id} | \
      jq -r '.Datapoints | sort_by(.Timestamp) | map(.Timestamp + " " + (.Sum | tostring)) | join("\n")' >${instance_id}-${metric}.data
  done
done

cat >aws-monitor.gnuplot <<EOF
set term pngcairo font "Arial,30" size 1600,900
set title "IOPS usage"
set datafile separator whitespace
set xdata time
set timefmt "%Y-%m-%dT%H:%M:%SZ"
set grid
set ylabel "IOPS"
set xrange ["${start_time}Z":"${end_time}Z"]
set xtics "${start_time}Z",86400*2 format "%d-%b"
set output "aws-monitor-iops.png"
plot \\
EOF
for datafile in ${iops_datafiles}; do
  echo "  \"${datafile}.data\" using 1:(\$2/${period}) with lines title \"${datafile}\", \\" >>aws-monitor.gnuplot
done

cat >>aws-monitor.gnuplot <<EOF

set term pngcairo font "Arial,30" size 1600,900
set title "IO Queue Length"
set datafile separator whitespace
set xdata time
set timefmt "%Y-%m-%dT%H:%M:%SZ"
set grid
set ylabel "Queue Length"
set xrange ["${start_time}Z":"${end_time}Z"]
set xtics "${start_time}Z",86400*2 format "%d-%b"
set output "aws-monitor-queue.png"
plot \\
EOF
for datafile in ${queue_datafiles}; do
  echo "  \"${datafile}.data\" using 1:2 with lines title \"${datafile}\", \\" >>aws-monitor.gnuplot
done

cat >>aws-monitor.gnuplot <<EOF

set term pngcairo font "Arial,30" size 1600,900
set title "CPU Utilization"
set datafile separator whitespace
set xdata time
set timefmt "%Y-%m-%dT%H:%M:%SZ"
set grid
set ylabel "%"
set xrange ["${start_time}Z":"${end_time}Z"]
set xtics "${start_time}Z",86400*2 format "%d-%b"
set output "aws-monitor-cpu.png"
plot \\
EOF
for datafile in $cpu_datafiles; do
  echo "  \"${datafile}.data\" using 1:2 with lines title \"${datafile}\", \\" >>aws-monitor.gnuplot
done

cat >>aws-monitor.gnuplot <<EOF

set term pngcairo font "Arial,30" size 1600,900
set title "Network traffic"
set datafile separator whitespace
set xdata time
set timefmt "%Y-%m-%dT%H:%M:%SZ"
set grid
set ylabel "MBytes/s"
set xrange ["${start_time}Z":"${end_time}Z"]
set xtics "${start_time}Z",86400*2 format "%d-%b"
set output "aws-monitor-net.png"
plot \\
EOF
for datafile in $net_datafiles; do
  echo "  \"${datafile}.data\" using 1:(\$2/${period}/1000000) with lines title \"${datafile}\", \\" >>aws-monitor.gnuplot
done

gnuplot <aws-monitor.gnuplot

When run on a typical Bitbucket Server instance, this script produces charts such as the following:

You can use the information in charts such as this to decide whether CPU, network, or I/O resources are over- or under-provisioned in your instance. 

If your instance is frequently saturating the maximum available CPU (taking into account the number of cores in your instance size), then this may indicate you need an EC2 instance with a larger CPU count. (Note that the CPU utilization reported by Amazon CloudWatch for smaller EC2 instance sizes may be influenced to some extent by the "noisy neighbor" phenomenon, if other tenants of the Amazon environment consume CPU cycles from the same physical hardware that your instance is running on.) 

If your instance is frequently exceeding the IOPS available to your EBS volume and/or is frequently queuing I/O requests, then this may indicate you need to upgrade to an EBS optimized instance and/or increase the Provisioned IOPS on your EBS volume. See EBS Volume Types for more information. 

If your instance is frequently limited by network traffic, then this may indicate you need to choose an EC2 instance with a larger available slice of network bandwidth. 

Last modified on Oct 11, 2016

Was this helpful?

Yes
No
Provide feedback about this article
Powered by Confluence and Scroll Viewport.