Bitbucket: Right to erasure

Under Article 17 of the GDPR, individuals have the right to have personal data erased. This is also known as the ‘right to be forgotten’. The right is not absolute and only applies in certain circumstances. Whether or not you are required to honor an individual's request to have personal data deleted will vary on a case-by-case basis, and is a determination you should always make with the assistance of legal counsel. Once you have determined you have an obligation to delete personal data, we have provided the following instructions on how to do so within certain Atlassian products.

Personal data stored within the product can be divided into one of two areas: 1) account-level personal data; and 2) free-form text. Account-level personal data are data fields that exist within the product for the sole purpose of identifying an individual throughout the product. Examples of account-level personal data include the user's display name, profile picture or avatar, and email address. These data elements are generally visible from the user's profile and are used throughout the product to point back to the user's profile when the user is @mentioned or tagged in certain spaces or content. Deleting account-level personal data elements will automatically remove those data elements wherever they appear throughout the product and in the database (subject to some limitations discussed below).

If you have included personal data in free-form text, either typed into content spaces or as a custom field label, you will need to use the product's global search feature to surface this personal data and delete it on a case-by-case basis.

Description

These workarounds will help you remove a specific user's personal data in Bitbucket Server. The SQL queries in this guide are written for PostgreSQL, but can be easily adapted to your system's database.

Where a workaround involves running SQL scripts directly on your database, we strongly recommend you do this first on a non-production, staging or test environment before running the scripts on your production database. We also strongly recommend you take a backup of your data before making any modifications via SQL scripts.

Version compatibility

All workarounds are compatible with Bitbucket Server 4.0 and later.

Workaround

Deleting a user and removing their personal data

Step 1 - Delete the user

A deleted user can no longer log in or make changes, but their data (SSH keys, GPG keys, access tokens, etc.) is preserved for 7 days. After 7 days, the data is deleted automatically.

Complete this step before moving on, because the following steps depend on it. If you attempt the following steps without first deleting the user and they fail, you will need to restore your system from your backup and start the full process again.

To delete a user, use one of the following methods, depending on whether your Bitbucket Server instance is using an internal or external user directory.

Internal User Directory
  1. Delete the user from within Bitbucket Server User Management Settings.
External User Directory
  1. Delete the user or replace the user's email address and display name in the External Directory with a non-identifying data element.
  2. Perform a manual resync of the external directory in Bitbucket Server.

Step 2 - Find the user's ID

Find the user_id of the user being deleted, and keep it handy for use in subsequent steps.

__username__ is the login name of the user whose personal data you wish to remove.

SELECT 
   user_id 
FROM 
   sta_normal_user
WHERE
   name='__username__';
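
If you script this lookup rather than typing the query by hand, a parameterized query avoids quoting problems with unusual login names. The sketch below is illustrative only: it uses Python's built-in sqlite3 module and a stand-in sta_normal_user table so that it runs anywhere. Against a real PostgreSQL database you would use a PostgreSQL driver, whose placeholder syntax may differ. The function name find_user_id and the sample data are hypothetical.

```python
import sqlite3
from typing import Optional


def find_user_id(conn: sqlite3.Connection, username: str) -> Optional[int]:
    """Return the user_id for a login name, or None if there is no such user."""
    row = conn.execute(
        "SELECT user_id FROM sta_normal_user WHERE name = ?",
        (username,),
    ).fetchone()
    return row[0] if row else None


# Stand-in data for demonstration only; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sta_normal_user (user_id INTEGER, name TEXT, slug TEXT)")
conn.execute("INSERT INTO sta_normal_user VALUES (42, 'jsmith', 'jsmith')")

user_id = find_user_id(conn, "jsmith")
```

Getting None back is a useful early warning that the login name was mistyped, before any of the later update steps are run.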

Step 3 - Remove the user and their personal project

Change the username and personal project to the user's ID. The queries below must be run in the order given.

__user_id__ is the user_id value obtained in step 2.
__username__ is the login name of the user whose personal data you wish to remove.

Note the tilde (~) in the second query.

UPDATE 
   sta_normal_user 
SET 
   name='__user_id__',
   slug='__user_id__'
WHERE 
   user_id = __user_id__;

UPDATE 
  project
SET 
   name='~__user_id__',
   project_key='~__user_id__'
WHERE 
   name = '~__username__';
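
Because the two updates must run in order and a partial failure means restoring from backup, one option is to run them inside a single transaction so that either both apply or neither does. The following is a minimal sketch of that idea, not a supported procedure: it uses Python's built-in sqlite3 module with stand-in tables, and the function name anonymize_user and the sample rows are hypothetical.

```python
import sqlite3


def anonymize_user(conn: sqlite3.Connection, user_id: int, username: str) -> None:
    """Apply both step 3 updates, in order, inside a single transaction."""
    anon = str(user_id)
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE sta_normal_user SET name = ?, slug = ? WHERE user_id = ?",
            (anon, anon, user_id),
        )
        # Personal project names start with a tilde (~).
        conn.execute(
            "UPDATE project SET name = ?, project_key = ? WHERE name = ?",
            ("~" + anon, "~" + anon, "~" + username),
        )


# Stand-in tables for demonstration only; the real schema has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sta_normal_user (user_id INTEGER, name TEXT, slug TEXT)")
conn.execute("CREATE TABLE project (name TEXT, project_key TEXT)")
conn.execute("INSERT INTO sta_normal_user VALUES (42, 'jsmith', 'jsmith')")
conn.execute("INSERT INTO project VALUES ('~jsmith', '~JSMITH')")
conn.commit()

anonymize_user(conn, 42, "jsmith")
user_row = conn.execute("SELECT name, slug FROM sta_normal_user").fetchone()
project_row = conn.execute("SELECT name, project_key FROM project").fetchone()
```

In PostgreSQL itself, the equivalent is to wrap the two UPDATE statements in BEGIN and COMMIT.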

Step 4 - Remove user mentions

Replace user mentions with the new, anonymized username.

__user_id__ is the user_id value obtained in step 2.
__username__ is the login name of the user whose personal data you wish to remove.

For Bitbucket Server 4.0.0 to 5.3.7, replace user mentions in the sta_comment table.

UPDATE
    sta_comment
SET
    comment_text=REPLACE(comment_text, '@__username__', '@__user_id__')
WHERE
    comment_text like '%@__username__%';

For Bitbucket Server 5.0.0 and above, replace user mentions in the bb_comment table. Please note, in Bitbucket Server 5.0.0 - 5.3.7, user mentions will be in both tables.

UPDATE
    bb_comment
SET
    comment_text=REPLACE(comment_text, '@__username__', '@__user_id__')
WHERE
    comment_text like '%@__username__%';

Replace user mentions in pull request descriptions.

UPDATE
    sta_pull_request
SET
    description=REPLACE(description, '@__username__', '@__user_id__')
WHERE
    description like '%@__username__%';
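
Note that REPLACE matches substrings, so replacing '@bob' would also rewrite the start of a longer mention such as '@bobby'. If usernames in your instance can be prefixes of one another, you may want to review the matching rows before updating them. The sketch below shows one boundary-aware alternative in Python. It assumes usernames consist only of letters, digits, '.', '_' and '-', which is an assumption and not something this guide states. The function name anonymize_mentions is hypothetical.

```python
import re


def anonymize_mentions(text: str, username: str, user_id: int) -> str:
    """Replace @username with @user_id, leaving longer usernames untouched."""
    # The negative lookahead rejects a match when the next character could
    # continue a username. A '.' only counts if a letter or digit follows it,
    # so a mention at the end of a sentence is still replaced.
    pattern = re.compile(
        "@" + re.escape(username) + r"(?![A-Za-z0-9_-]|\.[A-Za-z0-9])"
    )
    return pattern.sub("@" + str(user_id), text)


result = anonymize_mentions("cc @bob, @bobby and @bob.", "bob", 42)
# result == "cc @42, @bobby and @42."
```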

Step 5 - Delete the user's OAuth tokens

The user's OAuth tokens allow Bitbucket Server to perform actions on behalf of the user in other systems (like Jira or Bamboo).

__username__ is the login name of the user whose personal data you wish to remove.

DELETE FROM 
    plugin_setting
WHERE
    key_name like 'com.atlassian.oauth.serviceprovider.ServiceProviderTokenStore%'
AND key_value like '%__username__%';

Step 6 - Remove the user's dismissed dialogs

Update records of the user's dismissed dialogs.

__user_id__ is the user_id value obtained in step 2.
__username__ is the login name of the user whose personal data you wish to remove.

UPDATE
    plugin_setting
SET
    key_name=REPLACE(key_name, '__username__', '__user_id__')
WHERE
    key_name like 'chaperone:%:__username__';

Step 7 - Remove the user's avatar images

If the user uploaded any personal avatar photos, they will be at <Bitbucket home directory>/shared/data/avatars/users/__user_id__/ (where __user_id__ is the user_id value obtained in step 2).

Delete any files in that directory.
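
If you prefer to script this step, the following is a minimal Python sketch. It builds the path described above from a Bitbucket home directory you supply and deletes the regular files it finds there. The function name delete_user_avatars is hypothetical, and the demonstration runs against a throwaway temporary directory rather than a real Bitbucket home.

```python
import tempfile
from pathlib import Path
from typing import List


def delete_user_avatars(bitbucket_home: Path, user_id: int) -> List[Path]:
    """Delete every regular file in the user's avatar directory.

    Returns the paths that were deleted.
    """
    avatar_dir = bitbucket_home / "shared" / "data" / "avatars" / "users" / str(user_id)
    deleted: List[Path] = []
    if not avatar_dir.is_dir():
        return deleted  # nothing to delete
    for entry in sorted(avatar_dir.iterdir()):
        if entry.is_file():
            entry.unlink()
            deleted.append(entry)
    return deleted


# Demonstration against a throwaway directory, not a real Bitbucket home.
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    avatar_dir = home / "shared" / "data" / "avatars" / "users" / "42"
    avatar_dir.mkdir(parents=True)
    (avatar_dir / "original.png").write_bytes(b"fake image data")
    removed = delete_user_avatars(home, 42)
    remaining = list(avatar_dir.iterdir())
```

Returning the list of deleted paths gives you a record of what was removed, which may be useful when documenting how the erasure request was handled.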

Step 8 - Restart the server

Restart the server to flush all caches and force the UI to update.


Limitations

Data in Git repositories

This guide does not discuss removing a user's personal data from a Git repository. The personal data stored within a Git repository is essential for auditing and providing a chain of license contribution and authorship.

Each Git repository holds the user's display name and email address against every change they made to the repository. Bitbucket Server will display this information and use a hash of the user's email address to look up an avatar photo on the third-party Gravatar site.

If you are considering anonymizing a user's data in a Git repository, you can achieve this by following the Git documentation on git filter-branch.

Anonymizing data in this way could have a significant impact on development teams, their tools and continuous integration systems. As a result, we strongly recommend against purging the user from Git history.

If you do attempt to rewrite history (using git filter-branch or another method), old commits will still be retained in two circumstances:

  1. A forked repository will never remove the old commits in case a fork requires them.
  2. If a commit was commented on in a pull request or commit view, the commit will not be removed in order to show the comment in its original context.

Audit, access and application logs

Bitbucket Server logs information for auditing and diagnostic purposes. All these logs may contain the user's username, display name and email address. The access and application logs may also contain the user's IP address.


Audit logs
  Description: A list of all audited events
  Purpose: Identify authorized and unauthorized changes, or suspicious activity over a period of time (see Audit logging in Bitbucket Server and How to read the Bitbucket Server Log Formats)
  Location: <Bitbucket home directory>/log/audit/

Audit table
  Description: A truncated list of recent audit events
  Location: Database

Access logs
  Description: A list of who accessed the system and what they accessed
  Location: <Bitbucket home directory>/log/atlassian-bitbucket-access.log
            <Bitbucket home directory>/log/atlassian-bitbucket-access-YYYY-MM-DD.log

Application logs
  Description: A list of errors, warnings and diagnostic information
  Purpose: May be required to diagnose problems with Bitbucket Server (see Bitbucket Server debug logging and Configure Bitbucket Server Logging)
  Location: <Bitbucket home directory>/log/atlassian-bitbucket.log
            <Bitbucket home directory>/log/atlassian-bitbucket-YYYY-MM-DD.log

You can safely delete any audit logs, access logs or application logs that you no longer need for auditing/diagnostic purposes.
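
To help decide which log files are relevant to a particular request, you could first list the files that mention the user. The following Python sketch searches a log directory recursively for a given string. It assumes the log files end in .log, which matches the access and application log names above but is only an assumption for the audit log directory. The function name logs_mentioning and the sample file contents are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List


def logs_mentioning(log_dir: Path, needle: str) -> List[Path]:
    """Return the *.log files under log_dir (recursively) that contain needle."""
    hits: List[Path] = []
    for path in sorted(log_dir.rglob("*.log")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a file has odd bytes in it.
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            if any(needle in line for line in handle):
                hits.append(path)
    return hits


# Demonstration against a throwaway directory with made-up log lines.
with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp) / "log"
    (log_dir / "audit").mkdir(parents=True)
    (log_dir / "atlassian-bitbucket-access.log").write_text("GET /projects by jsmith\n")
    (log_dir / "atlassian-bitbucket.log").write_text("INFO starting up\n")
    matching = [p.name for p in logs_mentioning(log_dir, "jsmith")]
```

The same search can be repeated for the user's display name, email address and IP address, since the logs may contain any of these.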

If you wish to delete the list of recent audit events generated by a user, use the following queries:

__user_id__ below is the user_id value obtained in step 2.

DELETE FROM
	"AO_BD73C3_PROJECT_AUDIT"
WHERE 
	"USER" = __user_id__;

DELETE FROM
	"AO_BD73C3_REPOSITORY_AUDIT"
WHERE 
	"USER" = __user_id__;

Webhook logging

If you're running Bitbucket Server 5.4.0 or above, Bitbucket Server keeps a record of triggered webhooks in the database (see Troubleshooting webhooks). Webhooks sent as the result of a user action will contain that user's username. If you no longer need these entries for auditing purposes, you can delete them.

__username__ below is the login name of the user whose personal data you wish to remove.

DELETE FROM
    "AO_371AEF_HIST_INVOCATION"
WHERE
    "REQUEST_BODY" like '%"name":"__username__"%';

Search data

Search data is essential for Bitbucket's repository and code search features. The user's display name, username and avatar are stored in the search data in order to allow searching their personal project.

When the system is restarted, it will reindex all projects and remove the deleted user's data. If you want to force this process to happen before a restart, follow the steps in Resolve Elasticsearch 404 error by rebuilding the index. 

Browser cache

The user's avatar image is cached aggressively by the browser, and such caching is not completely under the control of Bitbucket Server. The browser may periodically refresh or purge its cache, so the avatars will eventually disappear, but the exact timing depends on the browser and the user's local configuration.

Additional notes

There may be limitations based on your product version.

Note that the GDPR workaround above has been optimized for the latest version of this product. If you are running a legacy version of the product, the workaround may be less effective. Please consider upgrading to the latest product version to get the most from the workarounds in this article.

Third-party add-ons may store personal data in their own database tables or on the filesystem.

The above article in support of your GDPR compliance efforts applies only to personal data stored within the Atlassian server and data center products. To the extent you have installed third-party add-ons within your server or data center environment, you will need to contact that third-party add-on provider to understand what personal data from your server or data center environment they may access, transfer or otherwise process and how they will support your GDPR compliance efforts.

If you are a server or data center customer, Atlassian does not access, store, or otherwise process the personal data you choose to store within the products. For information about personal data Atlassian processes, see our Privacy Policy.

Last modified on May 28, 2018
