Running integrity checks in Bitbucket Data Center

This feature is available only to customers with an active Bitbucket Data Center license.

This page describes how to run integrity checks in a Bitbucket Data Center instance, for example, after restoring from backups.

About integrity checks

Bitbucket Data Center allows you to perform an integrity check that scans for potential inconsistencies between the database and home directory, and resolves them if necessary so that your pull request and repository state are completely consistent with each other. You can perform an integrity check in any situation where you suspect your database and home directory may contain inconsistencies, for example, after restoring from a backup.

Running integrity checks

To run integrity checks, add this line to your ${BITBUCKET_HOME}/shared/bitbucket.properties:

disaster.recovery=true

Then start Bitbucket. You can start Bitbucket on all cluster nodes if you wish.

After starting, Bitbucket runs integrity checks on one cluster node only. Integrity checks may take several minutes to complete, but they run in the background. While integrity checks are running, users can still log in, interact with the system, and perform hosting operations on repositories.

Disabling integrity checks

After you have restored Bitbucket, integrity checks have run, and you have resumed normal operation, turn off the disaster.recovery property in your bitbucket.properties file so integrity checks won't run unnecessarily the next time your instance is restarted.

disaster.recovery=false
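The two property changes above can also be scripted. The following is a minimal sketch, not an Atlassian-provided tool: the helper name set_property is hypothetical, and it assumes the file is UTF-8 text that uses only simple key=value lines (no continuation lines or key: value syntax).

```python
from pathlib import Path


def set_property(path, key, value):
    """Set key=value in a simple .properties file, replacing any existing entry.

    Assumption: entries use the plain "key=value" form, one per line.
    """
    path = Path(path)
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    new_line = f"{key}={value}"
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave comments and lines without "=" untouched.
        if stripped.startswith(("#", "!")) or "=" not in stripped:
            continue
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        # The key was not present, so append it at the end of the file.
        lines.append(new_line)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Calling set_property(props_path, "disaster.recovery", "true") before starting Bitbucket, and set_property(props_path, "disaster.recovery", "false") once normal operation has resumed, mirrors the two manual steps. Here props_path stands for your own ${BITBUCKET_HOME}/shared/bitbucket.properties.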

What integrity checks look for

Integrity checks (which run when disaster.recovery is set to true) scan your instance for inconsistencies between the database and home directory that can occur when snapshots of your database and file system were taken at slightly different times.

Why integrity checks are needed

When Bitbucket is running, it is constantly modifying its database and home directory, but under almost all circumstances the two data sources will be consistent with each other (even if the UI is slow to catch up).

However, when database and home directory snapshots are taken independently, and updates that affect both the database and home directory happen between the two snapshots, integrity checks may find inconsistencies. This could happen, for example, if a pull request is merged between snapshots. When snapshots of your database and home directory are taken close enough together, the chance of inconsistencies arising is small.

What integrity checks cannot detect

Inconsistencies in Git: It's important to note that integrity checks only detect inconsistencies between your database and home directory, not internal inconsistencies within the repositories themselves.

If you suspect repositories in your Bitbucket instance have become corrupted in some other way, you may need to manually run git fsck to diagnose and restore individual repositories. See Recommended action plan if a repository becomes corrupted on a Bitbucket Server for more information, or contact Atlassian Support.
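As a rough illustration of running git fsck across several repositories, the sketch below loops over a directory of repository folders. It assumes git is on the PATH and that you pass in the directory that holds your repository folders; the function names fsck_command and fsck_all are hypothetical, and the location of that directory is not specified on this page.

```python
import subprocess
from pathlib import Path


def fsck_command(repo_dir):
    """Build the git fsck invocation for one repository directory."""
    return ["git", f"--git-dir={repo_dir}", "fsck"]


def fsck_all(repositories_root):
    """Run git fsck on each subdirectory and return {directory name: exit code}.

    A non-zero exit code means git fsck reported a problem for that repository.
    """
    results = {}
    for repo_dir in sorted(Path(repositories_root).iterdir()):
        if not repo_dir.is_dir():
            continue
        proc = subprocess.run(fsck_command(repo_dir), capture_output=True, text=True)
        results[repo_dir.name] = proc.returncode
    return results
```

Any repository with a non-zero exit code is a candidate for the manual diagnosis described above.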

Information not in your database/home directory when the backup was taken: The Integrity Checker can detect mismatches between the state of a repository or pull request in the database and file system and make adjustments to restore integrity, but it cannot reconstruct information not in your database or home directory when the backup was taken.

This means that if your backups are taken hourly, your users may lose up to an hour's worth of work when you restore from your latest backup. In addition, if your latest database and file system snapshots were taken a minute apart, changes made to pull requests during that minute may be lost and cannot be reconstructed by the Integrity Checker.

The best way to ensure inconsistencies don't occur in your backups is to ensure your file system and database snapshots are taken as close together in time as possible, or use the "point-in-time recovery" feature of your database vendor to restore the database to when the file system snapshot was taken.
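One way to keep an eye on this in a backup script is to compare the two snapshot timestamps. The sketch below is illustrative only: the function names and the 30-second default threshold are assumptions, not values recommended by Atlassian.

```python
from datetime import datetime, timedelta


def snapshot_gap(db_snapshot_time, fs_snapshot_time):
    """Return the absolute time between the database and file system snapshots."""
    return abs(db_snapshot_time - fs_snapshot_time)


def gap_is_acceptable(db_snapshot_time, fs_snapshot_time, max_gap=timedelta(seconds=30)):
    """True when the snapshots were taken within max_gap of each other.

    The default threshold is an arbitrary placeholder; choose one that suits
    how busy your instance is.
    """
    return snapshot_gap(db_snapshot_time, fs_snapshot_time) <= max_gap
```

A backup job could record both timestamps and raise an alert whenever gap_is_acceptable returns False.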

Feedback from the integrity check process

A warning (warning) or information (info) banner is displayed to system administrators to indicate the state or outcome of the integrity check process. This banner can be in one of four states:

  • Integrity checks are running, no inconsistencies have been found (info)
  • Integrity checks are running, at least one inconsistency has been found (warning)
  • Integrity checks complete, no inconsistencies found (info)
  • Integrity checks complete, inconsistencies found (warning)
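The four states above come down to two yes/no questions: whether the checks are still running, and whether any inconsistency has been found. The sketch below models that mapping. It is not Bitbucket's implementation, the function name banner_state is hypothetical, and the message strings are simplified.

```python
def banner_state(running, inconsistencies_found):
    """Map check progress and findings to a (severity, message) pair."""
    progress = "running" if running else "complete"
    if inconsistencies_found:
        # Any inconsistency raises the banner to a warning, finished or not.
        return ("warning", f"Integrity checks {progress}, inconsistencies found")
    return ("info", f"Integrity checks {progress}, no inconsistencies found")
```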

When an integrity check finds an inconsistency

If an integrity check finds an inconsistency between the database and home directory, it automatically performs adjustments to restore integrity between the two. For example, suppose that during your backup process someone merges a pull request after the database snapshot but before the file system snapshot, and that backup is later restored. In this case, the integrity checks will find that the pull request is in an inconsistent state and adjust the pull request in the database to match the actual state on disk. When this happens, the adjustment is shown in the Activity tab of the pull request and is attributed to the Integrity Checker service user. Any activity on your pull requests performed by the Integrity Checker also generates the usual notifications.

Figure: An example of an adjustment made by the Integrity Checker on the Activity tab.

The Integrity Checker will write a message to the application log whenever it encounters an inconsistency. Filtering the atlassian-bitbucket.log for DefaultIntegrityCheckReporter will return all relevant log entries.  
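The filtering step can be done with any text-search tool. As a minimal sketch in Python, assuming the log is readable as UTF-8 text (the function name integrity_log_entries is hypothetical):

```python
def integrity_log_entries(log_path, marker="DefaultIntegrityCheckReporter"):
    """Yield the lines of the log file that mention the integrity check reporter."""
    # errors="replace" keeps the scan going if the log contains stray bytes.
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            if marker in line:
                yield line.rstrip("\n")
```

Passing the path of your atlassian-bitbucket.log to integrity_log_entries yields only the matching lines.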

You should read the Integrity Checker log entries to understand why the inconsistency occurred. Inconsistency error messages will read:

The repository PROJ/repo[1] exists but the directory /repositories/1 is missing. To restore integrity, an empty repository directory was created.

or

PROJ/repo[1]: Pull request #1 is marked merged but the merge commit could not be found on the target ref. Trying to restore integrity by reopening

or

PROJ/repo[1]: Pull request #1 could not be reopened, declining instead. (Reason: REASON)

 

Where REASON can be one of

  • an open pull request with the same to and from refs already exists
  • unexpected missing commit
  • fromRef could not be resolved
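To summarize many such entries, the three message forms above can be matched with regular expressions. The patterns below are derived only from the example messages on this page; the exact wording in your log may differ, so treat this as a sketch. The names PATTERNS and classify are hypothetical.

```python
import re

# Patterns based solely on the example messages shown above.
PATTERNS = {
    "missing_directory": re.compile(
        r"The repository (?P<repo>\S+) exists but the directory (?P<dir>\S+) is missing"
    ),
    "reopening": re.compile(
        r"(?P<repo>\S+): Pull request #(?P<pr>\d+) is marked merged "
        r"but the merge commit could not be found"
    ),
    "declined": re.compile(
        r"(?P<repo>\S+): Pull request #(?P<pr>\d+) could not be reopened, "
        r"declining instead\. \(Reason: (?P<reason>.*?)\)"
    ),
}


def classify(line):
    """Return (kind, fields) for a recognized inconsistency message, else None."""
    for kind, pattern in PATTERNS.items():
        match = pattern.search(line)
        if match:
            return kind, match.groupdict()
    return None
```

Counting the results by kind, or grouping the declined entries by reason, gives a quick overview of what the Integrity Checker adjusted.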

 

If you find many inconsistencies spanning a large range of time, this may indicate that your database and home directory snapshots were taken further apart in time than you intended. To prevent these inconsistencies from arising, test your disaster recovery plan regularly, and ensure that your backup and restore processes capture database and home directory snapshots as close together in time as possible.

Pull requests updated by rescoping

The standard Bitbucket server rescoping process will normalize a large number of pull request inconsistencies. In these cases, the integrity check reporter does not log a message to the application log but instead sends a notification to all pull request collaborators.

Example scenarios

Here are a few example scenarios that the 'Integrity Checker' can detect and resolve:

Integrity check | Filesystem state | Database state | Result
Recently merged pull requests | Pull request is merged | Pull request is marked as 'open' | 'Integrity Checker' will mark the pull request as remotely merged. Note: only the merge activity will be attributed to the 'Integrity Checker' user; the merge commit will remain authored by the original merger.
Recently merged pull requests | Pull request is not merged | Pull request is marked as 'merged' | 'Integrity Checker' will re-open the pull request.
Repository creation | Repository #2 does not exist | Repository #2 exists | An empty repository will be created on the filesystem.
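The scenarios in the table above amount to a small decision rule: the on-disk state is treated as the truth for pull requests, and the database is treated as the truth for whether a repository exists. The sketch below models that rule for illustration only. It is not Bitbucket's code, the function names are hypothetical, and it leaves out the fallback of declining a pull request that cannot be reopened.

```python
def resolve_pull_request(merged_on_disk, db_state):
    """Return the adjustment for a pull request, or None when the states agree.

    db_state is the state recorded in the database: "open" or "merged".
    """
    if merged_on_disk and db_state == "open":
        return "mark as remotely merged"
    if not merged_on_disk and db_state == "merged":
        return "re-open"
    return None


def resolve_repository(exists_on_disk, exists_in_db):
    """Return the adjustment for a repository, or None when nothing is needed."""
    if exists_in_db and not exists_on_disk:
        return "create empty repository on the filesystem"
    return None
```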
Last modified on Oct 13, 2020
