Bitbucket DIY Backup

On this page

Still need help?

The Atlassian Community is here for you.

Ask the community

This article explains use of the Bitbucket DIY Backup scripts for use with Bitbucket Server and Data Center 4.x+. If you are running an earlier version of this product, formerly known as Stash, please see Using Stash (3.11) DIY Backup.

The Bitbucket DIY Backup allows you to:

  • significantly reduce the downtime needed to create a consistent backup
  • use the vendor-specific database backup tool appropriate to your back end database, for example:
    • pg_dump if your back end database is PostgreSQL
    • sqlcmd with an appropriate command for differential backup, if your back end database is MS SQL Server

  • use the optimal file system backup tool for your Bitbucket Data Center home directory, for example:
    • an LVM snapshot logical volume if your Bitbucket Data Center home directory uses LVM
    • a SAN-based backup if your Bitbucket Data Center home directory uses a Storage Area Network
    • rsync, if available
  • take backups of Bitbucket Data Center and Bitbucket Mesh instances without having to bring nodes down manually.

On this page:

Download the worked example scripts from Bitbucket:

The key to reducing downtime is the use of optimal, vendor-specific database and file system backup tools. Bitbucket DIY Backup does require you to write some code in a language of your choice to perform the required backup steps, using the REST API available for Bitbucket Server and Data Center 4.0.

DIY Backup supports Windows and Linux platforms, and Bitbucket version 4.0 and higher. DIY Backup supports Bitbucket Data Center and Bitbucket Mesh instances equally any DIY Backup solution that works on one should work on the other without modification.

For information about other backup strategies for Bitbucket Data Center, see Data recovery and backups. That page also discusses the tight coupling between the Bitbucket Data Center file system on disk and the database that the application uses.

Please note that the examples on this page are provided as guidance for developing a DIY Backup solution. As such, the third-party tools described are for example only – you will need to choose the tools that are appropriate to your own specific installation of Bitbucket Data Center.

Consult the vendor documentation for the third-party tools you choose – unfortunately, Atlassian can not provide support for those tools.

This page:

  • Describes a complete DIY Backup solution for a PostgreSQL database and local filesystem, using bash shell scripts.
  • Provides background information about how the Bitbucket Data Center REST API can be used for DIY Backups.

You can use this solution directly if your Bitbucket Data Center instance has the same or similar configuration, or use this as a starting point to develop your own DIY Backup solution tailored to your hardware configuration. 

How it works

When you use DIY Backup, you have complete control over the backup steps, and can implement any custom processes you like in the language of your choice. For example, you can use your database's incremental or fast snapshot tools and/or your file server's specific tools as part of a DIY Backup. 

The DIY Backup does the following:

  1. Prepares the Bitbucket Data Center instance for backup. This happens before Bitbucket Data Center is locked, so we want to do as much processing as possible here in order to minimize downtime later. For example, we can take an initial snapshot using incremental database and filesystem utilities. These do not have to be 100% consistent as Bitbucket Data Center is still running and modifying the database and filesystem. But taking the initial snapshot now may reduce the amount of work done later (while the application is locked), especially if the amount of data modified between backups is large. The steps include:
    • Taking an initial backup of the database (if it supports progressive/differential backups).
    • Doing an initial rsync of the home folder to the backup folder.
  2. Initiates the backup, which will:
    • Lock the Bitbucket Data Center instance.
    • Drain and latch the connections to the database and the filesystem.
    • Wait for the drain/latch step to complete.
  3. Once the instance is ready for backup we can start with the actual DIY Backup. This will include steps to:
    • Make a fully consistent backup of the database, using pg_dump.
    • Make a fully consistent backup of the filesystem, using rsync.

      If you're using Bitbucket Mesh for storing your repositories, you will need to run the DIY backup scripts on each of them at this stage

  4. Notify the Bitbucket Data Center instance once the backup process finishes and unlock it.
  5. Archive all files created during the backup into one big archive.

A user will get an error message if they try to access the web interface, or use the hosting services, when the application is in maintenance mode.

As an indication of the unavailability time that can be expected, in Atlassian's internal use we have seen downtimes of less than a minute.

What is backed up

The Bitbucket DIY Backup backs up the following data:

  • the database the instance is connected to (either the internal or external database)
  • managed Git repositories
  • the Bitbucket Data Center logs
  • installed plugins and their data

DIY Backups using Bash scripts

This section presents a complete DIY Backup solution that uses the following tools:

  • bash - for scripting
  • jq - an open source command line JSON processor for parsing the REST responses from Bitbucket Data Center
  • pg_dump (or sqlcmd) - for backing up a PostgreSQL database
  • rsync - for backing up the filesystem
  • tar - for making a backup archive

This approach (with small modifications) can be used for running DIY Backups on:

  • Linux and Unix
  • macOS
  • Windows with cygwin.

Bash scripts

You can download the example scripts from Bitbucket or simply clone the repository.


Running the Bash script

Once you have downloaded the Bash scripts, you need to create one file:

For example, here's how you might configure  bitbucket.diy-backup.vars.sh if:

  • your Bitbucket Data Center server is called bitbucket.example.com, uses port 7990, and has its home directory in /bitbucket-home
  • you want to generate the backup in /bitbucket-backup, and store your .tar.gz backups in /bitbucket-backup-archives,
  • you have a System Administrator in Bitbucket with the username "admin" and password "admin", and you run Bitbucket (and the backup scripts) as the OS user "atlbitbucket"

bitbucket.diy-backup.vars.sh

#!/bin/bash

CURL_OPTIONS="-L -s -f"

INSTANCE_NAME=bitbucket

BITBUCKET_URL=http://bitbucket.example.com:7990
BITBUCKET_HOME=/bitbucket-home/
BITBUCKET_UID=atlbitbucket

BITBUCKET_GID=atlbitbucket

BACKUP_HOME_TYPE=rsync
BACKUP_DATABASE_TYPE=postgresql
BACKUP_ARCHIVE_TYPE=tar

BITBUCKET_BACKUP_USER=admin
BITBUCKET_BACKUP_PASS=admin
BITBUCKET_BACKUP_EXCLUDE_REPOS=()


BITBUCKET_DB=bitbucket
POSTGRES_HOST=localhost
POSTGRES_USERNAME=dbuser
export PGPASSWORD=dbpass
POSTGRES_PORT=5432

# Make use of PostgreSQL 9.3+ options if available
psql_version="$(psql --version | awk '{print $3}')"

psql_majorminor="$(printf "%d%03d" $(echo "${psql_version}" | tr "." "\n" | head -n 2))"
if [[ ${psql_majorminor} -ge 9003 ]]; then
 PG_PARALLEL="-j 5"
 PG_SNAPSHOT_OPT="--no-synchronized-snapshots"
fi

BITBUCKET_BACKUP_ROOT=/bitbucket-backup
BITBUCKET_BACKUP_DB=${BITBUCKET_BACKUP_ROOT}/bitbucket-db/

BITBUCKET_BACKUP_HOME=${BITBUCKET_BACKUP_ROOT}/bitbucket-home/

BITBUCKET_BACKUP_ARCHIVE_ROOT=/bitbucket-backup-archives

# Used by the scripts for verbose logging. If not true only errors will be shown.
BITBUCKET_VERBOSE_BACKUP=TRUE


HIPCHAT_URL=https://api.hipchat.com
HIPCHAT_ROOM=
HIPCHAT_TOKEN=

KEEP_BACKUPS=0

The supplied  bitbucket.diy-backup.vars.sh is written to use PostgreSQL, rsync, and tar by default.  But if you want to use different tools, you can also customize the top section of this file:

Example usage:

# Strategy for backing up the Bitbucket home directory:
# - amazon-ebs - Amazon EBS snapshots of the volume containing the home directory
# - rsync - "rsync" of the home directory contents to a temporary location. NOTE: This can NOT be used
# with BACKUP_ZERO_DOWNTIME=true.
BACKUP_HOME_TYPE=rsync


# Strategy for backing up the database:
# - amazon-rds - Amazon RDS snapshots
# - mysql - MySQL using "mysqldump" to backup and "mysql" to restore
# - postgresql - PostgreSQL using "pg_dump" to backup and "pg_restore" to restore
# - postgresql93-fslevel - PostgreSQL 9.3 with data directory located in the file system volume as home directory (so
# that it will be included implicitly in the home volume snapshot)
BACKUP_DATABASE_TYPE=postgresql


# Strategy for backing up Elasticsearch:
# - <leave blank> - No separate snapshot and restore of Elasticsearch state (default).
# - s3 - Amazon S3 bucket - requires the Elasticsearch Cloud plugin to be installed.
# - fs - Shared filesystem - requires all data and master nodes to mount a shared file system to the same mount point.
BACKUP_ELASTICSEARCH_TYPE=

You also need to create two directories for DIY Backup to work:

  1. ${BITBUCKET_BACKUP_ROOT} is a working directory (/bitbucket-backup in our example) where copies of Bitbucket Data Center home directory and database dump are built during the DIY Backup process. 
  2. ${BITBUCKET_BACKUP_ARCHIVE_ROOT} is the directory (/bitbucket-backup-archives in our example) where the final backup archives are saved. 

The Bash scripts may be run on any host, provided it has:

  • read/write access to the above ${BITBUCKET_BACKUP_ROOT}  and ${BITBUCKET_BACKUP_ARCHIVE_ROOT}   directories,
  • read access to the ${BITBUCKET_HOME} directory,
  • read access to the database, and
  • network access to run curl commands on the Bitbucket Data Center server. 

It doesn't matter whether the filesystem access is direct or over NFS, or whether the network access is direct to a node or to a load balancer / reverse proxy. 

Once your bitbucket.diy-backup.vars.sh is correctly configured, run the backup in a terminal window:

$ ./bitbucket.diy-backup.sh 

The first time you run the backup, rsync will do most of the work since the /bitbucket-backup working directory is initially empty. This is normal. Fortunately, this script performs one rsync before locking Bitbucket Data Center, followed by a second rsync while Bitbucket Data Center is locked to minimize downtime. 

On second and subsequent backup runs,  /bitbucket-backup is already populated so the backup process should be faster. The output you can expect to see looks something like this:

$ ./bitbucket.diy-backup.sh
[http://localhost:7990/bitbucket]  INFO: Prepared backup of DB bitbucket in /bitbucket-backup/bitbucket-db/
building file list ... done.
sent 4.17M bytes  received 484 bytes  2.78M bytes/sec
total size is 121.12M  speedup is 29.06
[http://localhost:7990/bitbucket]  INFO: Prepared backup of /bitbucket-home to /bitbucket-backup/bitbucket-home/
[http://localhost:7990/bitbucket]  INFO: locked with '7187ae1824ce1ede38a8e7de4bccf58d9a8e1a7a'
[http://localhost:7990/bitbucket]  INFO: backup started with '82c73f89e790b27fef3032e81c7071388ae4e371'
[http://localhost:7990/bitbucket]  INFO: Waiting for DRAINED state....... done
[http://localhost:7990/bitbucket]  INFO: db state 'DRAINED'
[http://localhost:7990/bitbucket]  INFO: scm state 'DRAINED'
[http://localhost:7990/bitbucket]  INFO: Performed backup of DB bitbucket in /bitbucket-backup/bitbucket-db/
[http://localhost:7990/bitbucket]  INFO: Backup progress updated to 50
building file list ... done.
sent 4.87M bytes  received 484 bytes  3.25M bytes/sec
total size is 121.82M  speedup is 24.99
[http://localhost:7990/bitbucket]  INFO: Performed backup of /bitbucket-home to /bitbucket-backup/bitbucket-home/
[http://localhost:7990/bitbucket]  INFO: Backup progress updated to 100
[http://localhost:7990/bitbucket]  INFO: Bitbucket instance unlocked
[http://localhost:7990/bitbucket]  INFO: Archiving /bitbucket-backup into /bitbucket-backup-archives/bitbucket-20150917-082818-498.tar.gz
[http://localhost:7990/bitbucket]  INFO: Archived  /bitbucket-backup into /bitbucket-backup-archives/bitbucket-20150917-082818-498.tar.gz

Restoring a DIY Backup

When restoring Bitbucket Data Center, you must run the bitbucket.diy-restore.sh script on the machine that Bitbucket Data Center should be restored to. In order to ensure accidental restores do not delete existing data, you should never restore into an existing home directory.

The new database should be configured following the instructions in Connect Bitbucket to an external database and its sub-page that corresponds to your database type. 

To see the available backups in your ${BITBUCKET_BACKUP_ARCHIVE_ROOT} directory, just type:

$ ./bitbucket.diy-restore.sh

You should see output similar to this:

$ ./bitbucket.diy-restore.sh
Usage: ./bitbucket.diy-restore.sh <backup-file-name>.tar.gz
Available backups:
bitbucket-20150917-082818-498.tar.gz  bitbucket-20150918-083745-001.tar.gz

To restore a backup, run bitbucket.diy-restore.sh with the file name as the argument:

$ ./bitbucket.diy-restore.sh bitbucket-20150917-082818-498

You should see output like this:

$ ./bitbucket.diy-restore.sh bitbucket-20150917-082818-498.tar.gz
[http://localhost:7990/bitbucket]  INFO: Extracted /bitbucket-backup-archives/bitbucket-20150917-082818-498.tar.gz into /tmp/bitbucket.diy-restore.dQsbzU
[http://localhost:7990/bitbucket]  INFO: Performed restore of /tmp/bitbucket.diy-restore.dQsbzU/bitbucket-db to DB bitbucket2
[http://localhost:7990/bitbucket]  INFO: Performed restore of /tmp/bitbucket.diy-restore.dQsbzU/bitbucket-home to /bitbucket-home2

Canceling the backup

You can cancel the running backup operation if necessary.

To cancel the backup:

  1. Copy the cancel token echoed in the terminal (or the Command Prompt on Windows). Look for the line "backup started with token"

    $ ./bitbucket.diy-backup.sh
    [http://localhost:7990/bitbucket]  INFO: Prepared backup of DB bitbucket in /bitbucket-backup/bitbucket-db/
    building file list ... done.
    sent 4.17M bytes  received 484 bytes  2.78M bytes/sec
    total size is 121.12M  speedup is 29.06
    [http://localhost:7990/bitbucket]  INFO: Prepared backup of /bitbucket-home to /bitbucket-backup/bitbucket-home/
    [http://localhost:7990/bitbucket]  INFO: locked with '7187ae1824ce1ede38a8e7de4bccf58d9a8e1a7a'
    [http://localhost:7990/bitbucket]  INFO: backup started with '82c73f89e790b27fef3032e81c7071388ae4e371'
    [http://localhost:7990/bitbucket]  INFO: Waiting for DRAINED state....... done
    [http://localhost:7990/bitbucket]  INFO: db state 'DRAINED'
    [http://localhost:7990/bitbucket]  INFO: scm state 'DRAINED'

    E.g. use "82c73f89e790b27fef3032e81c7071388ae4e371"

  2. Go to the Bitbucket Data Center interface in your browser. Bitbucket Data Center will display this screen:

  3. Click Cancel backup, and enter the cancel token:

  4. Click Cancel backup.

Note that Bitbucket Data Center will still be locked in maintenance mode. Repeat these steps using the "locked with" token (e.g. "7187ae1824ce1ede38a8e7de4bccf58d9a8e1a7a") to exit maintenance mode as well, and unlock Bitbucket Data Center.

Advanced writing your own DIY Backup using the REST APIs

This section is optional and provides background information about how you might use the Bitbucket Data Center REST APIs if you need to rewrite the DIY Backup scripts described above in your preferred language or to customize them heavily.

Note that this discussion shows curl commands in Bash, however you can use any language.

The following steps are involved:

Preparation

Before you lock Bitbucket Data Center, you can perform any preparation you like. It makes sense to perform as much processing as possible before you lock the application, to minimize downtime later. For example, you could perform an rsync:

rsync -avh --delete --delete-excluded --exclude=/caches/ --exclude=/data/db.* --exclude=/export/ --exclude=/log/ --exclude=/plugins/.*/ --exclude=/tmp --exclude=/.lock ${BITBUCKET_HOME} ${BITBUCKET_BACKUP_HOME}

Lock the Bitbucket Data Center instance

The next step in making a backup of a Bitbucket Data Center instance is to lock the instance for maintenance. This can be done using a POST request to the /mvc/maintenance/lock REST point (where BITBUCKET_URL points to the Bitbucket Data Center instance, BITBUCKET_BACKUP_USER is a Bitbucket Data Center user with backup permissions, and BITBUCKET_BACKUP_PASS is this user's password).

REQUEST
curl -s \
	 -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
	 -X POST \
	 -H "Content-type: application/json" \
     "${BITBUCKET_URL}/mvc/maintenance/lock"
RESPONSE
{
	"unlockToken":"0476adeb6cde3a41aa0cc19fb394779191f5d306",
	"owner": {
		"displayName":"admin",
		"name":"admin"
	}
}

If successful, the Bitbucket Data Center instance will respond with a 202 and will return a response JSON similar to the one above. The  unlockToken should be used in all subsequent requests where  $BITBUCKET_LOCK_TOKEN  is required. This token can also be used to manually unlock the instance.

Start the backup process

Next, all connections to both the database and the filesystem must be drained and latched. Your code must handle backing up of both the filesystem and the database.

At this point, you should make a POST request to /mvc/admin/backups. Notice that the curl call includes the ?external=true parameter:

REQUEST
curl -s \
	 -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
	 -X POST \
	 -H "X-Atlassian-Maintenance-Token: ${BITBUCKET_LOCK_TOKEN}" \
	 -H "Accept: application/json" \
	 -H "Content-type: application/json" \
	 "${BITBUCKET_URL}/mvc/admin/backups?external=true"
RESPONSE
{
	"id":"d2e15c3c2da282b0990e8efb30b4bffbcbf09e04",
	"progress": {
		"message":"Closing connections to the current database",
		"percentage":5
	},
	"state":"RUNNING",
	"type":"BACKUP",
	"cancelToken":"d2e15c3c2da282b0990e8efb30b4bffbcbf09e04"
}

If successful the instance will respond with 202 and a response JSON similar to the one above will be returned. The  cancelToken  can be used to manually cancel the back up process.


Wait for the instance to complete preparation.

Part of the back up process includes draining and latching the connections to the database and the filesystem. Before continuing with the back up we have to wait for the instance to report that this has been done. To get details on the current status we make a GET request to the /mvc/maintenance REST point.

REQUEST
curl -s \
	 -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
	 -X GET \
	 -H "X-Atlassian-Maintenance-Token: ${BITBUCKET_LOCK_TOKEN}" \
	 -H "Accept: application/json" \
	 -H "Content-type: application/json" \
	 "${BITBUCKET_URL}/mvc/maintenance" 
RESPONSE
 {
	"task":{
		"id":"0bb6b2ed52a6a12322e515e88c5d515d6b6fa95e",
		"progress":{
			"message":"Backing up Bitbucket home",
			"percentage":10
		},
		"state":"RUNNING",
		"type":"BACKUP"
	},
	"db-state":"DRAINED",
	"scm-state":"DRAINED"
}

This causes the Bitbucket Data Center instance to report its current state. We have to wait for both  db-state  and  scm-state  to have a status of  DRAINED  before continuing with the backup.  

Perform the actual backup

At this point we are ready to create the actual backup of the filesystem. For example, you could use rsync again:

rsync -avh --delete --delete-excluded --exclude=/caches/ --exclude=/data/db.* --exclude=/export/ --exclude=/log/ --exclude=/plugins/.*/ --exclude=/tmp --exclude=/.lock ${BITBUCKET_HOME} ${BITBUCKET_BACKUP_HOME}

The rsync options shown here are for example only, but indicate how you can include only the required files in the backup process and exclude others. Consult the documentation for rsync, or the tool of your choice, for a more detailed description. 

When creating the database backup you could use your vendor-specific database backup tool, for example pg_dump if you use PostgreSQL:

pg_dump -Fd ${BITBUCKET_DB} -j 5 --no-synchronized-snapshots -f ${BITBUCKET_BACKUP_DB}


While performing these operations, good practice is to update the instance with progress on the backup so that it's visible in the UI. This can be done by issuing a POST request to /mvc/admin/backups/progress/client with the token and the percentage completed as parameters:

REQUEST
curl -s \
	 -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
	 -X POST \
	 -H "Accept: application/json" \
	 -H "Content-type: application/json" \
	 "${BITBUCKET_URL}/mvc/admin/backups/progress/client?token=${BITBUCKET_LOCK_TOKEN}&percentage=${BITBUCKET_BACKUP_PERCENTAGE}"

Bitbucket Data Center will respond to this request with an empty 202 if everything is OK. 

When displaying progress to users, Bitbucket Data Center divides the 100 percent progress into 90 percent user DIY Backup, and 10 percent application preparation. This means, for example, if your script sends percentage=0, Bitbucket Data Center may display up to 10 percent progress for its own share of the backup work. 

(Optional) Run the backup scripts for each Bitbucket Mesh node that's connected to your instance

If you’re using Bitbucket Mesh to store your repositories, you will need to repeat the above process by setting the appropriate options and running the backup script on each Mesh node individually.

Inform the Bitbucket Data Center instance that the backup is complete

Once we've finished the backup process we must report to the Bitbucket Data Center instance that progress has reached 100 percent. This is done using a similar request to the progress request. We issue a POST request to /mvc/admin/backups/progress/client with the token and 100 as the percentage:

REQUEST
curl -s \
	 -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
	 -X POST \
	 -H "Accept: application/json" \
	 -H "Content-type: application/json" \
	 "${BITBUCKET_URL}/mvc/admin/backups/progress/client?token=${BITBUCKET_LOCK_TOKEN}&percentage=100"

Bitbucket Data Center will respond with an empty 202 if everything is OK. The back up process is considered completed once the percentage is 100. This will unlatch the database and the filesystem for this Bitbucket Data Center instance.  

Unlock the Bitbucket Data Center instance 

The final step we need to do in the back up process is to unlock the instance. This is done with a DELETE request to the /mvc/maintenance/lock REST point:

REQUEST
curl -s \
	 -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
	 -X DELETE \
	 -H "Accept: application/json" \
	 -H "Content-type: application/json" \
	 "${BITBUCKET_URL}/mvc/maintenance/lock?token=${BITBUCKET_LOCK_TOKEN}"

The Bitbucket Data Center instance will respond to this request with an empty 202 if everything is OK, and will unlock access.

Last modified on Sep 24, 2024

Was this helpful?

Yes
No
Provide feedback about this article
Powered by Confluence and Scroll Viewport.