Bitbucket DIY Backup
This article explains how to use the Bitbucket DIY Backup scripts with Bitbucket Server and Data Center 4.x+. If you are running an earlier version of this product, formerly known as Stash, please see Using Stash (3.11) DIY Backup.
The Bitbucket DIY Backup allows you to:
- significantly reduce the downtime needed to create a consistent backup
- use the vendor-specific database backup tool appropriate to your back end database, for example:
  - pg_dump, if your back end database is PostgreSQL
  - sqlcmd with an appropriate command for differential backup, if your back end database is MS SQL Server
- use the optimal file system backup tool for your Bitbucket Data Center home directory, for example:
  - an LVM snapshot logical volume if your Bitbucket Data Center home directory uses LVM
  - a SAN-based backup if your Bitbucket Data Center home directory uses a Storage Area Network
  - rsync, if available
- take backups of Bitbucket Data Center and Bitbucket Mesh instances without having to bring nodes down manually.
Download the worked example scripts from Bitbucket:
The key to reducing downtime is the use of optimal, vendor-specific database and file system backup tools. Bitbucket DIY Backup does require you to write some code in a language of your choice to perform the required backup steps, using the REST API available for Bitbucket Server and Data Center 4.0.
DIY Backup supports Windows and Linux platforms, and Bitbucket version 4.0 and higher. DIY Backup supports Bitbucket Data Center and Bitbucket Mesh instances equally – any DIY Backup solution that works on one should work on the other without modification.
For information about other backup strategies for Bitbucket Data Center, see Data recovery and backups. That page also discusses the tight coupling between the Bitbucket Data Center file system on disk and the database that the application uses.
Please note that the examples on this page are provided as guidance for developing a DIY Backup solution. As such, the third-party tools described are for example only – you will need to choose the tools that are appropriate to your own specific installation of Bitbucket Data Center.
Consult the vendor documentation for the third-party tools you choose – unfortunately, Atlassian cannot provide support for those tools.
This page:
- Describes a complete DIY Backup solution for a PostgreSQL database and local filesystem, using bash shell scripts.
- Provides background information about how the Bitbucket Data Center REST API can be used for DIY Backups.
You can use this solution directly if your Bitbucket Data Center instance has the same or similar configuration, or use this as a starting point to develop your own DIY Backup solution tailored to your hardware configuration.
How it works
When you use DIY Backup, you have complete control over the backup steps, and can implement any custom processes you like in the language of your choice. For example, you can use your database's incremental or fast snapshot tools and/or your file server's specific tools as part of a DIY Backup.
The DIY Backup does the following:
- Prepares the Bitbucket Data Center instance for backup. This happens before Bitbucket Data Center is locked, so we want to do as much processing as possible here in order to minimize downtime later. For example, we can take an initial snapshot using incremental database and filesystem utilities. These do not have to be 100% consistent as Bitbucket Data Center is still running and modifying the database and filesystem. But taking the initial snapshot now may reduce the amount of work done later (while the application is locked), especially if the amount of data modified between backups is large. The steps include:
  - Taking an initial backup of the database (if it supports progressive/differential backups).
  - Doing an initial rsync of the home folder to the backup folder.
- Initiates the backup, which will:
  - Lock the Bitbucket Data Center instance.
  - Drain and latch the connections to the database and the filesystem.
  - Wait for the drain/latch step to complete.
- Once the instance is ready for backup we can start with the actual DIY Backup. This will include steps to:
  - Make a fully consistent backup of the database, using pg_dump.
  - Make a fully consistent backup of the filesystem, using rsync.
  - If you're using Bitbucket Mesh for storing your repositories, run the DIY Backup scripts on each Mesh node at this stage.
- Notify the Bitbucket Data Center instance once the backup process finishes and unlock it.
- Archive all files created during the backup into one big archive.
A user will get an error message if they try to access the web interface, or use the hosting services, when the application is in maintenance mode.
As an indication of the unavailability time that can be expected, in Atlassian's internal use we have seen downtimes of less than a minute.
What is backed up
The Bitbucket DIY Backup backs up the following data:
- the database the instance is connected to (either the internal or external database)
- managed Git repositories
- the Bitbucket Data Center logs
- installed plugins and their data
DIY Backups using Bash scripts
This section presents a complete DIY Backup solution that uses the following tools:
- bash - for scripting
- jq - an open source command line JSON processor for parsing the REST responses from Bitbucket Data Center
- pg_dump (or sqlcmd) - for backing up a PostgreSQL (or MS SQL Server) database
- rsync - for backing up the filesystem
- tar - for making a backup archive
This approach (with small modifications) can be used for running DIY Backups on:
- Linux and Unix
- macOS
- Windows with cygwin.
Bash scripts
You can download the example scripts from Bitbucket or simply clone the repository.
Running the Bash script
Once you have downloaded the Bash scripts, you need to create one file:
- bitbucket.diy-backup.vars.sh (you can copy bitbucket.diy-backup.vars.sh.example to start)
For example, here's how you might configure bitbucket.diy-backup.vars.sh if:
- your Bitbucket Data Center server is called bitbucket.example.com, uses port 7990, and has its home directory in /bitbucket-home,
- you want to generate the backup in /bitbucket-backup, and store your .tar.gz backups in /bitbucket-backup-archives,
- you have a System Administrator in Bitbucket with the username "admin" and password "admin", and you run Bitbucket (and the backup scripts) as the OS user "atlbitbucket".
bitbucket.diy-backup.vars.sh
#!/bin/bash
CURL_OPTIONS="-L -s -f"
INSTANCE_NAME=bitbucket
BITBUCKET_URL=http://bitbucket.example.com:7990
BITBUCKET_HOME=/bitbucket-home/
BITBUCKET_UID=atlbitbucket
BITBUCKET_GID=atlbitbucket
BACKUP_HOME_TYPE=rsync
BACKUP_DATABASE_TYPE=postgresql
BACKUP_ARCHIVE_TYPE=tar
BITBUCKET_BACKUP_USER=admin
BITBUCKET_BACKUP_PASS=admin
BITBUCKET_BACKUP_EXCLUDE_REPOS=()
BITBUCKET_DB=bitbucket
POSTGRES_HOST=localhost
POSTGRES_USERNAME=dbuser
export PGPASSWORD=dbpass
POSTGRES_PORT=5432
# Make use of PostgreSQL 9.3+ options if available
psql_version="$(psql --version | awk '{print $3}')"
psql_majorminor="$(printf "%d%03d" $(echo "${psql_version}" | tr "." "\n" | head -n 2))"
if [[ ${psql_majorminor} -ge 9003 ]]; then
    PG_PARALLEL="-j 5"
    PG_SNAPSHOT_OPT="--no-synchronized-snapshots"
fi
BITBUCKET_BACKUP_ROOT=/bitbucket-backup
BITBUCKET_BACKUP_DB=${BITBUCKET_BACKUP_ROOT}/bitbucket-db/
BITBUCKET_BACKUP_HOME=${BITBUCKET_BACKUP_ROOT}/bitbucket-home/
BITBUCKET_BACKUP_ARCHIVE_ROOT=/bitbucket-backup-archives
# Used by the scripts for verbose logging. If not true only errors will be shown.
BITBUCKET_VERBOSE_BACKUP=TRUE
HIPCHAT_URL=https://api.hipchat.com
HIPCHAT_ROOM=
HIPCHAT_TOKEN=
KEEP_BACKUPS=0
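The version check in the file above packs the major and minor version numbers into a single integer so that they can be compared with -ge. The following sketch isolates that arithmetic. The function name version_to_majorminor is hypothetical and is not part of the example scripts; it assumes a plain dotted version string such as "9.3.4".

```shell
#!/bin/sh
# Hypothetical helper (not part of the DIY Backup scripts): encode
# "MAJOR.MINOR[.PATCH]" as MAJOR * 1000 + MINOR, for example 9.3.4 -> 9003.
version_to_majorminor() (
    version=$1
    major=${version%%.*}
    case "$version" in
        *.*) rest=${version#*.}; minor=${rest%%.*} ;;
        *)   minor=0 ;;   # no minor component given
    esac
    printf '%d%03d\n' "$major" "$minor"
)

# Example: enable the 9.3+ options only when the encoded version is high enough.
if [ "$(version_to_majorminor "9.3.4")" -ge 9003 ]; then
    echo "parallel dump options available"
fi
```

Encoding the version this way keeps the comparison to a single integer test, so 9.2 (9002) fails the check while 9.3 (9003) and 14.2 (14002) pass.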
The supplied bitbucket.diy-backup.vars.sh is written to use PostgreSQL, rsync, and tar by default. If you want to use different tools, you can customize the top section of this file:
Example usage:
# Strategy for backing up the Bitbucket home directory:
# - amazon-ebs - Amazon EBS snapshots of the volume containing the home directory
# - rsync - "rsync" of the home directory contents to a temporary location. NOTE: This can NOT be used
# with BACKUP_ZERO_DOWNTIME=true.
BACKUP_HOME_TYPE=rsync
# Strategy for backing up the database:
# - amazon-rds - Amazon RDS snapshots
# - mysql - MySQL using "mysqldump" to backup and "mysql" to restore
# - postgresql - PostgreSQL using "pg_dump" to backup and "pg_restore" to restore
# - postgresql93-fslevel - PostgreSQL 9.3 with data directory located in the file system volume as home directory (so
# that it will be included implicitly in the home volume snapshot)
BACKUP_DATABASE_TYPE=postgresql
# Strategy for backing up Elasticsearch:
# - <leave blank> - No separate snapshot and restore of Elasticsearch state (default).
# - s3 - Amazon S3 bucket - requires the Elasticsearch Cloud plugin to be installed.
# - fs - Shared filesystem - requires all data and master nodes to mount a shared file system to the same mount point.
BACKUP_ELASTICSEARCH_TYPE=
You also need to create two directories for DIY Backup to work:
- ${BITBUCKET_BACKUP_ROOT} is a working directory (/bitbucket-backup in our example) where copies of the Bitbucket Data Center home directory and database dump are built during the DIY Backup process.
- ${BITBUCKET_BACKUP_ARCHIVE_ROOT} is the directory (/bitbucket-backup-archives in our example) where the final backup archives are saved.
The Bash scripts may be run on any host, provided it has:
- read/write access to the above ${BITBUCKET_BACKUP_ROOT} and ${BITBUCKET_BACKUP_ARCHIVE_ROOT} directories,
- read access to the ${BITBUCKET_HOME} directory,
- read access to the database, and
- network access to run curl commands on the Bitbucket Data Center server.
It doesn't matter whether the filesystem access is direct or over NFS, or whether the network access is direct to a node or to a load balancer / reverse proxy.
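The filesystem requirements above can be checked before the first run. The following is a minimal sketch, not part of the example scripts: the function name check_backup_paths and its messages are hypothetical, and it covers only the directory checks, not database or network access.

```shell
#!/bin/sh
# Hypothetical preflight check; not part of the DIY Backup scripts.
# Arguments: <home dir> <backup working dir> <archive dir>
check_backup_paths() (
    home=$1
    backup_root=$2
    archive_root=$3
    status=0
    # Create the two directories DIY Backup needs if they are missing.
    mkdir -p "$backup_root" "$archive_root" || exit 1
    for dir in "$backup_root" "$archive_root"; do
        if [ ! -r "$dir" ] || [ ! -w "$dir" ]; then
            echo "ERROR: need read/write access to $dir" >&2
            status=1
        fi
    done
    if [ ! -d "$home" ] || [ ! -r "$home" ]; then
        echo "ERROR: need read access to $home" >&2
        status=1
    fi
    exit "$status"
)

# Usage, with the variables from bitbucket.diy-backup.vars.sh loaded:
#   check_backup_paths "${BITBUCKET_HOME}" "${BITBUCKET_BACKUP_ROOT}" \
#       "${BITBUCKET_BACKUP_ARCHIVE_ROOT}" || exit 1
```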
Once your bitbucket.diy-backup.vars.sh is correctly configured, run the backup in a terminal window:
$ ./bitbucket.diy-backup.sh
The first time you run the backup, rsync will do most of the work since the /bitbucket-backup working directory is initially empty. This is normal. Fortunately, this script performs one rsync before locking Bitbucket Data Center, followed by a second rsync while Bitbucket Data Center is locked to minimize downtime.
On second and subsequent backup runs, /bitbucket-backup is already populated so the backup process should be faster. The output you can expect to see looks something like this:
$ ./bitbucket.diy-backup.sh
[http://localhost:7990/bitbucket] INFO: Prepared backup of DB bitbucket in /bitbucket-backup/bitbucket-db/
building file list ... done.
sent 4.17M bytes received 484 bytes 2.78M bytes/sec
total size is 121.12M speedup is 29.06
[http://localhost:7990/bitbucket] INFO: Prepared backup of /bitbucket-home to /bitbucket-backup/bitbucket-home/
[http://localhost:7990/bitbucket] INFO: locked with '7187ae1824ce1ede38a8e7de4bccf58d9a8e1a7a'
[http://localhost:7990/bitbucket] INFO: backup started with '82c73f89e790b27fef3032e81c7071388ae4e371'
[http://localhost:7990/bitbucket] INFO: Waiting for DRAINED state....... done
[http://localhost:7990/bitbucket] INFO: db state 'DRAINED'
[http://localhost:7990/bitbucket] INFO: scm state 'DRAINED'
[http://localhost:7990/bitbucket] INFO: Performed backup of DB bitbucket in /bitbucket-backup/bitbucket-db/
[http://localhost:7990/bitbucket] INFO: Backup progress updated to 50
building file list ... done.
sent 4.87M bytes received 484 bytes 3.25M bytes/sec
total size is 121.82M speedup is 24.99
[http://localhost:7990/bitbucket] INFO: Performed backup of /bitbucket-home to /bitbucket-backup/bitbucket-home/
[http://localhost:7990/bitbucket] INFO: Backup progress updated to 100
[http://localhost:7990/bitbucket] INFO: Bitbucket instance unlocked
[http://localhost:7990/bitbucket] INFO: Archiving /bitbucket-backup into /bitbucket-backup-archives/bitbucket-20150917-082818-498.tar.gz
[http://localhost:7990/bitbucket] INFO: Archived /bitbucket-backup into /bitbucket-backup-archives/bitbucket-20150917-082818-498.tar.gz
Restoring a DIY Backup
When restoring Bitbucket Data Center, you must run the bitbucket.diy-restore.sh script on the machine that Bitbucket Data Center should be restored to. To ensure that an accidental restore does not delete existing data, never restore into an existing home directory.
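The home directory rule can be enforced with a small guard that runs before the restore. This is a sketch only: is_safe_restore_target is a hypothetical name and is not part of bitbucket.diy-restore.sh, and treating an empty directory as acceptable is an assumption made here for convenience.

```shell
#!/bin/sh
# Hypothetical guard; not part of bitbucket.diy-restore.sh.
# Succeeds only if the target does not exist or is an empty directory.
is_safe_restore_target() (
    target=$1
    [ ! -e "$target" ] && exit 0
    [ -d "$target" ] || exit 1
    # Any entry, including dotfiles, makes the directory non-empty.
    [ -z "$(ls -A "$target")" ]
)

# Usage:
#   is_safe_restore_target "${BITBUCKET_HOME}" || {
#       echo "Refusing to restore into existing data" >&2
#       exit 1
#   }
```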
The new database should be configured following the instructions in Connect Bitbucket to an external database and its sub-page that corresponds to your database type.
To see the available backups in your ${BITBUCKET_BACKUP_ARCHIVE_ROOT} directory, just type:
$ ./bitbucket.diy-restore.sh
You should see output similar to this:
$ ./bitbucket.diy-restore.sh
Usage: ./bitbucket.diy-restore.sh <backup-file-name>.tar.gz
Available backups:
bitbucket-20150917-082818-498.tar.gz bitbucket-20150918-083745-001.tar.gz
To restore a backup, run bitbucket.diy-restore.sh with the file name as the argument:
$ ./bitbucket.diy-restore.sh bitbucket-20150917-082818-498.tar.gz
You should see output like this:
$ ./bitbucket.diy-restore.sh bitbucket-20150917-082818-498.tar.gz
[http://localhost:7990/bitbucket] INFO: Extracted /bitbucket-backup-archives/bitbucket-20150917-082818-498.tar.gz into /tmp/bitbucket.diy-restore.dQsbzU
[http://localhost:7990/bitbucket] INFO: Performed restore of /tmp/bitbucket.diy-restore.dQsbzU/bitbucket-db to DB bitbucket2
[http://localhost:7990/bitbucket] INFO: Performed restore of /tmp/bitbucket.diy-restore.dQsbzU/bitbucket-home to /bitbucket-home2
Canceling the backup
You can cancel the running backup operation if necessary.
To cancel the backup:
- Copy the cancel token echoed in the terminal (or the Command Prompt on Windows). Look for the line "backup started with token":
$ ./bitbucket.diy-backup.sh
[http://localhost:7990/bitbucket] INFO: Prepared backup of DB bitbucket in /bitbucket-backup/bitbucket-db/
building file list ... done.
sent 4.17M bytes received 484 bytes 2.78M bytes/sec
total size is 121.12M speedup is 29.06
[http://localhost:7990/bitbucket] INFO: Prepared backup of /bitbucket-home to /bitbucket-backup/bitbucket-home/
[http://localhost:7990/bitbucket] INFO: locked with '7187ae1824ce1ede38a8e7de4bccf58d9a8e1a7a'
[http://localhost:7990/bitbucket] INFO: backup started with '82c73f89e790b27fef3032e81c7071388ae4e371'
[http://localhost:7990/bitbucket] INFO: Waiting for DRAINED state....... done
[http://localhost:7990/bitbucket] INFO: db state 'DRAINED'
[http://localhost:7990/bitbucket] INFO: scm state 'DRAINED'
E.g. use "82c73f89e790b27fef3032e81c7071388ae4e371"
- Go to the Bitbucket Data Center interface in your browser. Bitbucket Data Center will display this screen:
- Click Cancel backup, and enter the cancel token:
- Click Cancel backup.
Note that Bitbucket Data Center will still be locked in maintenance mode. Repeat these steps using the "locked with" token (e.g. "7187ae1824ce1ede38a8e7de4bccf58d9a8e1a7a") to exit maintenance mode as well, and unlock Bitbucket Data Center.
Advanced – writing your own DIY Backup using the REST APIs
This section is optional and provides background information about how you might use the Bitbucket Data Center REST APIs if you need to rewrite the DIY Backup scripts described above in your preferred language or to customize them heavily.
Note that this discussion shows curl commands in Bash; however, you can use any language.
The following steps are involved:
Preparation
Before you lock Bitbucket Data Center, you can perform any preparation you like. It makes sense to perform as much processing as possible before you lock the application, to minimize downtime later. For example, you could perform an rsync:
rsync -avh --delete --delete-excluded --exclude=/caches/ --exclude=/data/db.* --exclude=/export/ --exclude=/log/ --exclude=/plugins/.*/ --exclude=/tmp --exclude=/.lock ${BITBUCKET_HOME} ${BITBUCKET_BACKUP_HOME}
Lock the Bitbucket Data Center instance
The next step in making a backup of a Bitbucket Data Center instance is to lock the instance for maintenance. This can be done using a POST request to the /mvc/maintenance/lock REST endpoint (where BITBUCKET_URL points to the Bitbucket Data Center instance, BITBUCKET_BACKUP_USER is a Bitbucket Data Center user with backup permissions, and BITBUCKET_BACKUP_PASS is this user's password).
curl -s \
    -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
    -X POST \
    -H "Content-type: application/json" \
    "${BITBUCKET_URL}/mvc/maintenance/lock"
{
    "unlockToken":"0476adeb6cde3a41aa0cc19fb394779191f5d306",
    "owner": {
        "displayName":"admin",
        "name":"admin"
    }
}
If successful, the Bitbucket Data Center instance will respond with a 202 and will return a response JSON similar to the one above. The unlockToken should be used in all subsequent requests where $BITBUCKET_LOCK_TOKEN is required. This token can also be used to manually unlock the instance.
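A script needs to capture the unlockToken from that response before it can continue. The following is a minimal sketch using jq, which is listed among the tools used by the Bash solution above. The function name extract_unlock_token is hypothetical and is not taken from the example scripts.

```shell
#!/bin/sh
# Hypothetical helper: print the unlockToken from a lock response, or fail
# if the response is not valid JSON or the field is missing. Requires jq.
extract_unlock_token() (
    response=$1
    token=$(printf '%s' "$response" | jq -r '.unlockToken // empty') || exit 1
    [ -n "$token" ] || exit 1
    printf '%s\n' "$token"
)

# Usage (assumes the body of the lock request was captured in a variable):
#   response=$(curl -s -u "${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS}" \
#       -X POST -H "Content-type: application/json" \
#       "${BITBUCKET_URL}/mvc/maintenance/lock")
#   BITBUCKET_LOCK_TOKEN=$(extract_unlock_token "$response") || exit 1
```

Failing early when the token is missing avoids sending later requests with an empty X-Atlassian-Maintenance-Token header.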
Start the backup process
Next, all connections to both the database and the filesystem must be drained and latched. Your code must handle backing up of both the filesystem and the database.
At this point, you should make a POST request to /mvc/admin/backups. Notice that the curl call includes the ?external=true parameter:
curl -s \
    -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
    -X POST \
    -H "X-Atlassian-Maintenance-Token: ${BITBUCKET_LOCK_TOKEN}" \
    -H "Accept: application/json" \
    -H "Content-type: application/json" \
    "${BITBUCKET_URL}/mvc/admin/backups?external=true"
{
    "id":"d2e15c3c2da282b0990e8efb30b4bffbcbf09e04",
    "progress": {
        "message":"Closing connections to the current database",
        "percentage":5
    },
    "state":"RUNNING",
    "type":"BACKUP",
    "cancelToken":"d2e15c3c2da282b0990e8efb30b4bffbcbf09e04"
}
If successful, the instance will respond with 202 and a response JSON similar to the one above. The cancelToken can be used to manually cancel the backup process.
Wait for the instance to complete preparation
Part of the backup process includes draining and latching the connections to the database and the filesystem. Before continuing with the backup we have to wait for the instance to report that this has been done. To get details on the current status we make a GET request to the /mvc/maintenance REST endpoint.
curl -s \
    -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
    -X GET \
    -H "X-Atlassian-Maintenance-Token: ${BITBUCKET_LOCK_TOKEN}" \
    -H "Accept: application/json" \
    -H "Content-type: application/json" \
    "${BITBUCKET_URL}/mvc/maintenance"
{
    "task":{
        "id":"0bb6b2ed52a6a12322e515e88c5d515d6b6fa95e",
        "progress":{
            "message":"Backing up Bitbucket home",
            "percentage":10
        },
        "state":"RUNNING",
        "type":"BACKUP"
    },
    "db-state":"DRAINED",
    "scm-state":"DRAINED"
}
This causes the Bitbucket Data Center instance to report its current state. We have to wait for both db-state and scm-state to have a status of DRAINED before continuing with the backup.
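Waiting for both states usually means polling the status request in a loop. The sketch below shows one way to do that with jq. The function names, the retry count, and the delay are all hypothetical choices, and the command that fetches the status is passed in by the caller rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helpers; names, retry count and delay are illustrative only.

# Succeeds only if both db-state and scm-state are DRAINED. Requires jq.
is_drained() (
    printf '%s' "$1" | jq -e \
        '.["db-state"] == "DRAINED" and .["scm-state"] == "DRAINED"' >/dev/null
)

# Usage: wait_for_drained <max attempts> <delay seconds> <status command...>
# Runs the status command until it reports DRAINED or the attempts run out.
wait_for_drained() (
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        status=$("$@") && is_drained "$status" && exit 0
        sleep "$delay"
        attempt=$((attempt + 1))
    done
    echo "ERROR: instance did not reach DRAINED state" >&2
    exit 1
)

# Example, wrapping the GET request shown above in a function:
#   get_maintenance_status() {
#       curl -s -u "${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS}" -X GET \
#           -H "X-Atlassian-Maintenance-Token: ${BITBUCKET_LOCK_TOKEN}" \
#           -H "Accept: application/json" "${BITBUCKET_URL}/mvc/maintenance"
#   }
#   wait_for_drained 60 5 get_maintenance_status || exit 1
```

Bounding the number of attempts means a stuck drain surfaces as an error instead of leaving the instance locked indefinitely.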
Perform the actual backup
At this point we are ready to create the actual backup of the filesystem. For example, you could use rsync again:
rsync -avh --delete --delete-excluded --exclude=/caches/ --exclude=/data/db.* --exclude=/export/ --exclude=/log/ --exclude=/plugins/.*/ --exclude=/tmp --exclude=/.lock ${BITBUCKET_HOME} ${BITBUCKET_BACKUP_HOME}
The rsync options shown here are for example only, but indicate how you can include only the required files in the backup process and exclude others. Consult the documentation for rsync, or the tool of your choice, for a more detailed description.
When creating the database backup you could use your vendor-specific database backup tool, for example pg_dump if you use PostgreSQL:
pg_dump -Fd ${BITBUCKET_DB} -j 5 --no-synchronized-snapshots -f ${BITBUCKET_BACKUP_DB}
While performing these operations, good practice is to update the instance with progress on the backup so that it's visible in the UI. This can be done by issuing a POST request to /mvc/admin/backups/progress/client with the token and the percentage completed as parameters:
curl -s \
    -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
    -X POST \
    -H "Accept: application/json" \
    -H "Content-type: application/json" \
    "${BITBUCKET_URL}/mvc/admin/backups/progress/client?token=${BITBUCKET_LOCK_TOKEN}&percentage=${BITBUCKET_BACKUP_PERCENTAGE}"
Bitbucket Data Center will respond to this request with an empty 202 if everything is OK.
When displaying progress to users, Bitbucket Data Center divides the 100 percent progress into 90 percent user DIY Backup, and 10 percent application preparation. This means, for example, if your script sends percentage=0, Bitbucket Data Center may display up to 10 percent progress for its own share of the backup work.
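The split described above can be illustrated with a little arithmetic. The sketch below assumes a simple linear mapping, in which the application's own 10 percent is already complete and the value sent by the script fills the remaining 90 percent. That mapping is an assumption made for illustration, and the function name displayed_progress is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: estimate the percentage shown in the UI for a value
# sent by the script, assuming a linear mapping of 0-100 onto 10-100.
displayed_progress() (
    sent=$1
    # Clamp the input to the valid 0-100 range.
    [ "$sent" -lt 0 ] && sent=0
    [ "$sent" -gt 100 ] && sent=100
    echo $((10 + sent * 90 / 100))
)

displayed_progress 0     # prints 10
displayed_progress 50    # prints 55
displayed_progress 100   # prints 100
```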
(Optional) Run the backup scripts for each Bitbucket Mesh node that's connected to your instance
If you’re using Bitbucket Mesh to store your repositories, you will need to repeat the above process by setting the appropriate options and running the backup script on each Mesh node individually.
Inform the Bitbucket Data Center instance that the backup is complete
Once we've finished the backup process we must report to the Bitbucket Data Center instance that progress has reached 100 percent. This is done using a similar request to the progress request. We issue a POST request to /mvc/admin/backups/progress/client with the token and 100 as the percentage:
curl -s \
    -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
    -X POST \
    -H "Accept: application/json" \
    -H "Content-type: application/json" \
    "${BITBUCKET_URL}/mvc/admin/backups/progress/client?token=${BITBUCKET_LOCK_TOKEN}&percentage=100"
Bitbucket Data Center will respond with an empty 202 if everything is OK. The backup process is considered complete once the percentage is 100. This will unlatch the database and the filesystem for this Bitbucket Data Center instance.
Unlock the Bitbucket Data Center instance
The final step in the backup process is to unlock the instance. This is done with a DELETE request to the /mvc/maintenance/lock REST endpoint:
curl -s \
    -u ${BITBUCKET_BACKUP_USER}:${BITBUCKET_BACKUP_PASS} \
    -X DELETE \
    -H "Accept: application/json" \
    -H "Content-type: application/json" \
    "${BITBUCKET_URL}/mvc/maintenance/lock?token=${BITBUCKET_LOCK_TOKEN}"
The Bitbucket Data Center instance will respond to this request with an empty 202 if everything is OK, and will unlock access.
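The steps in this section can be tied together in a single wrapper. The following is a sketch under stated assumptions, not the structure of the example scripts: every function it calls (lock_instance, start_backup, wait_for_drained, backup_database, backup_home, report_progress, unlock_instance) is a hypothetical placeholder for the curl, pg_dump, and rsync commands shown above. It uses an EXIT trap so that the unlock request is attempted even when a step fails. It does not cancel a backup task that is still running; see Canceling the backup for that case.

```shell
#!/bin/sh
# Hypothetical wrapper; the step functions are placeholders that you would
# implement with the commands shown in this section.
run_diy_backup() (
    locked=0
    cleanup() {
        # Always try to release the maintenance lock, even after a failure.
        if [ "$locked" -eq 1 ]; then unlock_instance; fi
    }
    trap cleanup EXIT

    lock_instance || exit 1
    locked=1
    start_backup || exit 1
    wait_for_drained || exit 1
    backup_database || exit 1
    report_progress 50 || exit 1
    backup_home || exit 1
    report_progress 100 || exit 1
)
```

Running the steps in a subshell keeps the trap and the locked flag local to one backup run, so calling run_diy_backup does not alter traps set by the surrounding script.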