Bamboo: Right to erasure

Under Article 17 of the GDPR, individuals have the right to have personal data erased. This is also known as the ‘right to be forgotten’. The right is not absolute and only applies in certain circumstances. Whether or not you are required to honor an individual's request to have personal data deleted will vary on a case-by-case basis, and is a determination you should always make with the assistance of legal counsel. Once you have determined that you have an obligation to delete personal data, you can use the following instructions to do so within certain Atlassian products.

Personal data stored within the product falls into one of two areas: 1) account-level personal data; and 2) free-form text. Account-level personal data are data fields that exist within the product for the sole purpose of identifying an individual throughout the product. Examples of account-level personal data include the user's display name, profile picture or avatar, and email address. These data elements are generally visible from the user's profile and are used throughout the product to point back to the user's profile when the user is @mentioned or tagged in certain spaces or content. Deleting account-level personal data elements automatically removes them wherever they appear throughout the product and in the database (subject to some limitations discussed below).

If you have included personal data in free-form text, either typed into content spaces or as a custom field label, you will need to use the product's global search feature to surface this personal data and delete it on a case-by-case basis.

Locating and Accessing Personal Data in Bamboo

Personal Data (PD) is stored in Bamboo in one of four ways:

  • Structured PD: data in user profiles
  • Unstructured PD: data associated with Bamboo builds, results, deployment projects, environments, versions - free text
  • Filesystem PD on the server: other data stored on a server (build result logs, artifacts, audit logs, global entities, configuration etc.)
  • Filesystem PD on the agent: other data stored on the agent (build result logs, caches, artifacts)

Structured PD

User profiles contain specific PD elements used to represent users in the Bamboo system.

This data is mainly used in:

User profiles hold the following PD elements: 

User profile data | Description
Full name | Text used to represent a user in the Bamboo interface. All links to the user profile use this text. It often holds PD such as first name and surname.
User name / login | Text that identifies a person during login. It is used internally in the database to correlate additional data with a user profile. It can also be visible in some REST and page URLs.
Email | Email address associated with the user account. Accessible on the user profile.
IM | IM address associated with the user's IM account. Accessible on the user profile.

Unstructured PD

PD could also be stored in free-form text data fields. Because these fields allow any content, topic or label, they may or may not contain PD, depending on the instance configuration.

Domain objects (Plans, Results, Deployment projects, Releases) and associated entities can hold any type of information, as they can contain many free text values.

Global entities (project descriptions, variables, repositories, shared credentials, other configuration etc.) can hold free text values.

Incidental PD

Various processes that run within or alongside Bamboo may store PD incidental to their functions. Below is a list of processes that may store PD incidentally.

Filesystem

Lucene index

To speed up searching, Bamboo uses the Lucene library (search index). This index duplicates some information from the DB and stores it on the filesystem. When SQL queries are executed directly against the DB, there's a risk that stale data will remain in the Lucene index (e.g. authors in the build results index, or project/plan names and descriptions in the quick search index). To refresh the Lucene index, you need to perform a reindex. See https://confluence.atlassian.com/display/BAMBOO/Reindexing+data.

Lucene indexes are located in the ${bamboo_home}/index directory.

If reindexing is not possible, selected documents can be located and deleted using the Luke tool: https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/luke/lukeall-3.5.0.jar

Artifacts

The location of artifacts depends on the artifact handler that was used for the plan result (or the global artifact handler if none was set for a specific plan).

The most popular artifact handler is the Bamboo Remote Handler - artifacts are stored on the Bamboo server in the ${bamboo_home}/artifacts directory.

Another popular artifact handler is the Amazon S3 Handler - artifacts are stored on Amazon S3 servers, and the location is configured in the Bamboo administration panel.

To read more about artifact handlers and their configuration, see: https://confluence.atlassian.com/display/BAMBOOSERVERM/Artifact+handlers

Server Logs

Name | Location | PD details
Bamboo server logs | ${bamboo_home}/log/*, ${bamboo_install}/logs/catalina.out | Can contain arbitrary data (hard to tell because logging can be extensive)
Bamboo build logs | ${bamboo_home}/xml-data/builds/JOB_KEY/* | Information specific to all builds, can contain arbitrary data
Analytics logs | ${bamboo_home}/analytics-logs/* | Generally should not contain PD
Access logs | ${bamboo_install}/logs/access_log.* | Can contain the username, IP address and URL of accessed resources
Tomcat logs | ${bamboo_install}/logs/* | Might contain some PD
Other server logs | ${bamboo_home}/log/*, ${bamboo_install}/logs/* | Might contain some PD


To read more about logging in Bamboo, see https://confluence.atlassian.com/bamboo/logging-in-bamboo-289277239.html
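As a rough illustration of how the locations in the table above could be checked for a particular value, the following Python sketch walks a set of directories and reports every line that contains a search term, ignoring case. The function name `find_pd_in_files` and the example installation paths are assumptions made for illustration only; substitute the real ${bamboo_home} and ${bamboo_install} directories of your instance.

```python
import os
from typing import Iterator, List, Tuple


def find_pd_in_files(roots: List[str], term: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line_number, line) for every line containing term, ignoring case."""
    needle = term.lower()
    for root in roots:
        # os.walk silently yields nothing for directories that do not exist
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    # errors="replace" keeps the scan going on binary or oddly encoded files
                    with open(path, "r", encoding="utf-8", errors="replace") as handle:
                        for number, line in enumerate(handle, start=1):
                            if needle in line.lower():
                                yield path, number, line.rstrip("\n")
                except OSError:
                    # Unreadable file (permissions, broken link): skip it
                    continue


if __name__ == "__main__":
    # Hypothetical locations; replace with your real ${bamboo_home} and ${bamboo_install}
    bamboo_home = "/var/atlassian/application-data/bamboo"
    bamboo_install = "/opt/atlassian/bamboo"
    roots = [os.path.join(bamboo_home, "log"), os.path.join(bamboo_install, "logs")]
    for path, number, line in find_pd_in_files(roots, "johnsmith"):
        print(f"{path}:{number}: {line}")
```

A scan like this only locates candidate lines; whether a hit is actually PD, and how the file should be handled, still has to be decided case by case.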

Memory

Bamboo caches 

To speed up certain actions, Bamboo keeps some data from the DB in internal in-memory caches so that it can serve that data without making DB calls. Users cannot access the cached data directly.

We recommend updating the DB with manual SQL queries only while the Bamboo server is stopped; otherwise the cached data may differ from the data in the DB, which can lead to data inconsistency.
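One way to guard a maintenance script against running while Bamboo is up is to check whether anything is still listening on Bamboo's HTTP port before touching the database. The sketch below is only an illustration under stated assumptions: the helper name `is_listening` is hypothetical, and port 8085 is Bamboo's default HTTP port, which your instance may have changed. A closed port is a hint, not proof, that the server process has exited.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False


if __name__ == "__main__":
    # Assumption: Bamboo runs locally on its default port 8085
    if is_listening("localhost", 8085):
        print("Bamboo still appears to be running; stop it before executing SQL updates.")
    else:
        print("Nothing is listening on port 8085.")
```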

Agents

Remote agent

All remote agent activity is recorded in the atlassian-bamboo-agent.log file stored on the agent machine in the running directory of the agent. The running directory can be viewed in the remote agent's system properties in the Bamboo Paths section. These logs can contain arbitrary data, but in general they do not contain PD used by Bamboo.

When the agent is performing builds, it stores data in ${bamboo_agent_home}/xml-data/build-dir/JOB_KEY/*. The default name of the Bamboo agent home directory is bamboo-agent-home, and its location depends on your operating system. To read more about it, see the Bamboo agent home directory section here: https://confluence.atlassian.com/bamboo/locating-important-directories-and-files-289277247.html

Elastic agent

All elastic agent activity is logged inside the elastic instance where the elastic agent runs. By default, it's stored in two files: atlassian-bamboo.log and bamboo-elastic-agent.out, but it depends very much on elastic image configuration. It will also depend on the operating system of the elastic agent.

Build data on the elastic agent is stored in the same way as on a remote agent.

To read more about elastic agent logs, see here: https://confluence.atlassian.com/bamboo/viewing-an-elastic-instance-289277134.html.

External storage

Backups 

It's up to you to define the purpose and retention policy for backup files. Bamboo only generates the backup for the end user to use. See more: https://confluence.atlassian.com/bamboo/exporting-data-for-backup-289277255.html.

Deleting and/or Modifying PD in Bamboo

Once you've identified where PD may be stored in your Bamboo instance, this section describes how to delete or modify that PD.

Workaround

Follow best practices for Change Management - test and validate these settings in a Test/Development and Staging environment before rolling out any changes to a Production environment. You must test and validate these changes to ensure that they function well within your infrastructure before placing them in production.

Deleting or modifying PD

Deleting and modifying user PD is virtually the same process. This is because we do not recommend deleting an entire user account from Bamboo. User accounts are an integral part of the Bamboo data structure and are critical for maintaining the data consistency of the system.

Rather than deleting the data, we recommend modifying PD elements in the account to display elements that do not identify the user. For example, replacing the username johnsmith with deleteduser1. This way the system will be able to properly function while allowing you to remove profile-level PD that otherwise could identify the user. You can also use this process if you are simply looking to modify a user's PD - for example, if nicholassmith is actually nicksmith.

Modifying user PD

Modifying user PD has to be performed in several steps, depending on where the data is stored.

To modify user data:

  1. Handle PD in "structured" data fields
    1. (UI) Modify data in the user profile - this step depends on the type of directory that Bamboo uses for managing users.
    2. (SQL) Optionally, modify "username" - only if "username" contains PD (SQL update statements have to be executed against a stopped Bamboo instance).
  2. Handle PD in "free-form text" data fields
    1. (SQL) Handle PD in other entries (SQL update statements have to be executed against a stopped Bamboo instance).
  3. After change actions (only if SQL update statements were executed)
    1. Reindex Bamboo. See Reindexing data.

Handle PD in "structured" data fields

Modify PD in user profile - external user directory

If you're using Bamboo 6.5...
  1. Disable or delete a user in the external directory
  2. Restart Bamboo to refresh external user directory cache.
  3. Perform steps from "Modify PD in user profile - Internal User Directory"

If you're using Bamboo 6.6 or later...
  1. Disable or delete a user in the external directory
  2. Restart Bamboo to refresh external user directory cache.
  3. Perform steps from "Modify PD in user profile - Internal User Directory"
  4. Go to Bamboo Administration > Overview > User directories.
  5. Next to the user directory in which you've made a change, click Synchronise


Modify PD in user profile - internal User Directory

Anonymize user data in Bamboo

Before you begin

You must have global administrator permission to be able to manage users in Bamboo applications.

  1. Select User Management.
  2. Find the user in the user list using the filter form at the top of the page.
  3. Access the user details.
  4. Record the data from the "User details" section for later use (e.g. in additional SQL queries on the database):
    1. Username
    2. Full name
    3. Email
  5. Modify the username, full name and email, uncheck 'Active', and save the changes. (https://confluence.atlassian.com/bamboo/managing-users-289277208.html)


Modifying username (Optional - only when username contains PD) 

Changing the username could break third-party plugins that reference it.

Changing username (only when username is PD)

Always back up your data before performing any modifications to the database. If possible, test any alter, insert, update, or delete SQL commands on a staging server first.

  1. Generate a new, anonymized username, e.g. anonymized10001. Don't use any hashing function that depends on the original username!
  2. Modify the provided SQL file - replace <OLD_VALUE> with the old username and <NEW_VALUE> with the new username.
  3. Stop the Bamboo instance (this step is required because Bamboo caches a lot of data, and updating the database directly while Bamboo is still running could cause data loss/inconsistencies).
  4. Execute the script from the SQL file. For each table, execute the "select" script and decide whether the change is acceptable, then execute the "update" script. If the change is not acceptable, you will need to modify the SQL script.
  5. Start the Bamboo instance to make sure that the modified DB values are properly loaded into the caches.
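The first two steps above can be sketched in Python as follows. This is a minimal illustration, not the provided SQL file or an official tool: the helper names, the `example_table` template and the eight-digit suffix are all assumptions. The username is drawn from a random source so that it cannot be traced back to the original value, and single quotes are doubled before the values are substituted into the SQL text (standard SQL escaping; your database may need additional handling).

```python
import secrets


def new_anonymized_username(prefix: str = "anonymized") -> str:
    """Return a random username that is not derived from the original one."""
    # secrets.randbelow uses the OS random source; nothing here depends on the old name
    return f"{prefix}{secrets.randbelow(10**8):08d}"


def sql_literal(value: str) -> str:
    """Escape a value for use inside a single-quoted SQL string literal."""
    return value.replace("'", "''")


def fill_placeholders(sql_template: str, old_value: str, new_value: str) -> str:
    """Substitute the <OLD_VALUE> and <NEW_VALUE> placeholders in the SQL text."""
    return (
        sql_template
        .replace("<OLD_VALUE>", sql_literal(old_value))
        .replace("<NEW_VALUE>", sql_literal(new_value))
    )


if __name__ == "__main__":
    # Hypothetical one-line template; the real provided SQL file is not reproduced here
    template = (
        "UPDATE example_table SET user_name = '<NEW_VALUE>' "
        "WHERE user_name = '<OLD_VALUE>';"
    )
    print(fill_placeholders(template, "johnsmith", new_anonymized_username()))
```

Keep a record of which anonymized name was used so that the same value can be applied consistently in every table, but store that record separately from the Bamboo data, since it links the new name to the person.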

Handle PD in "free-form text" data fields

Dealing with free-form text fields

Always back up your data before performing any modifications to the database. If possible, test any alter, insert, update, or delete SQL commands on a staging server first.

  1. Modify the provided SQL file - replace <OLD_VALUE> with the PD that you are searching for and <NEW_VALUE> with the new PD value.
  2. Stop the Bamboo instance (this step is required because Bamboo caches a lot of data, and updating the database directly while Bamboo is still running could cause data loss/inconsistencies).
  3. Execute the script from the SQL file manually, one table at a time. For each table, execute the "select" script and decide whether the change is acceptable, then execute the "update" script. If the change is not acceptable, you will need to update the record manually. We recommend editing the data in Bamboo itself where possible.
  4. Start the Bamboo instance to make sure that the modified DB values are properly loaded into the caches.
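The "select first, then update" routine in step 3 can be illustrated with a small self-contained Python sketch. It runs against an in-memory SQLite database as a stand-in, not against a real Bamboo schema: the table `example_plan`, its columns and the function `preview_then_update` are hypothetical, and the real scripts for your database will differ.

```python
import sqlite3


def preview_then_update(conn, old_value, new_value, approve):
    """Show matching rows first; update only if approve(rows) returns True."""
    # Note: % and _ inside old_value would act as LIKE wildcards
    pattern = f"%{old_value}%"
    rows = conn.execute(
        "SELECT id, description FROM example_plan WHERE description LIKE ?",
        (pattern,),
    ).fetchall()
    if not rows or not approve(rows):
        return 0  # nothing matched, or the reviewer rejected the change
    cursor = conn.execute(
        "UPDATE example_plan SET description = REPLACE(description, ?, ?) "
        "WHERE description LIKE ?",
        (old_value, new_value, pattern),
    )
    conn.commit()
    return cursor.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE example_plan (id INTEGER PRIMARY KEY, description TEXT)")
    conn.executemany(
        "INSERT INTO example_plan (description) VALUES (?)",
        [("Owner: johnsmith",), ("Maintained by alice",)],
    )

    def review(rows):
        for row in rows:
            print("would change:", row)
        return True  # in practice, a person decides after inspecting the rows

    print("rows updated:", preview_then_update(conn, "johnsmith", "anonymized10001", review))
```

Separating the preview from the update keeps a human decision between finding a pattern match and changing data, which is the point of inspecting each table before running its "update" script.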

After change actions (if SQL update statements were executed)

If SQL update statements were executed, you will have to reindex Bamboo.

  • Reindex Bamboo - a Lucene reindex is required because some data is stored in and read from the Lucene index, and after the DB has been updated the index could contain stale data. See Reindexing data: https://confluence.atlassian.com/display/BAMBOO/Reindexing+data

Version Compatibility

All workarounds are compatible with Bamboo 6.5 and later.

Limitations

  • SQL statements use pattern matching, so they require manual inspection before each update.

  • MySQL versions earlier than 8.0 don't have the REGEXP_REPLACE function (or any other function that works in a similar manner), so we are able to find matching records ignoring case, but we are not able to generate SQL that updates values in a case-insensitive way. Manual inspection/update is needed.

  • Microsoft SQL Server does not support regular expressions to the same extent as the other supported databases, so records are matched using the LIKE operator, which can also match longer strings that merely contain the value. Updates use the "replace" function, which in conjunction with a case-insensitive collation replaces every occurrence regardless of case with the exact replacement text, e.g. replace("and TEST second as test third", "test", "tESt") = "and tESt second as tESt third".
  • Data could be stored inside third-party plugins and not be discovered/altered/deleted by querying the DB (plugin tables are not scanned for PD).
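To make the matching behaviour described in these limitations concrete, here is a small Python sketch that imitates it outside the database. It is an analogy only, not the SQL the scripts run: `replace_ignoring_case` mimics a replace under a case-insensitive collation, and `substring_matches` mimics a LIKE '%term%' filter; both names are made up for this example.

```python
import re


def replace_ignoring_case(text: str, old: str, new: str) -> str:
    """Replace every case-insensitive occurrence of old with the literal text new."""
    # re.escape treats old as plain text; the lambda keeps new from being
    # interpreted as a regex replacement template
    return re.sub(re.escape(old), lambda _match: new, text, flags=re.IGNORECASE)


def substring_matches(values, term):
    """Mimic a LIKE '%term%' filter under a case-insensitive collation."""
    return [value for value in values if term.lower() in value.lower()]


if __name__ == "__main__":
    # Every variant of "test" is replaced by the exact replacement text
    print(replace_ignoring_case("and TEST second as test third", "test", "tESt"))
    # The substring filter also picks up longer words that merely contain the term
    print(substring_matches(["test plan", "Contest results", "release"], "test"))
```

The second call shows why manual inspection matters: a search for "test" also returns "Contest results", which a blind update would alter.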

Additional notes

There may be limitations based on your product version.

Note that the GDPR workaround described above has been optimized for the latest version of this product. If you are running a legacy version of the product, the efficacy of the workaround may be limited. Please consider upgrading to the latest product version to get the most out of the workarounds available under this article.

Third-party add-ons may store personal data in their own database tables or on the filesystem.

The above article in support of your GDPR compliance efforts applies only to personal data stored within the Atlassian server and data center products. To the extent you have installed third-party add-ons within your server or data center environment, you will need to contact that third-party add-on provider to understand what personal data from your server or data center environment they may access, transfer or otherwise process and how they will support your GDPR compliance efforts.

If you are a server or data center customer, Atlassian does not access, store, or otherwise process the personal data you choose to store within the products. For information about personal data Atlassian processes, see our Privacy Policy.


Last modified on Jun 29, 2018
