Confluence: Right to erasure

Introduction

Under Article 17 of the GDPR, individuals have the right to have personal data erased. This is also known as the ‘right to be forgotten’. The right is not absolute and only applies in certain circumstances. Whether or not you are required to honor an individual's request to have personal data deleted will vary on a case-by-case basis, and is a determination you should always make with the assistance of legal counsel. Once you have determined that you have an obligation to delete personal data, you can follow the instructions below to do so within certain Atlassian products.

Personal data stored within the product falls into one of two areas: 1) account-level personal data; and 2) free-form text. Account-level personal data are data fields that exist within the product for the sole purpose of identifying an individual throughout the product. Examples of account-level personal data include the user's display name, profile picture or avatar, and email address. These data elements are generally visible from the user's profile and are used throughout the product to point back to the user's profile when the user is @mentioned or tagged in certain spaces or content. Deleting account-level personal data elements will automatically remove them wherever they appear in the product and in the database (subject to some limitations discussed below).

If you have included personal data in free-form text, either typed into content spaces or as a custom field label, you will need to use the product's global search feature to surface this personal data and delete it on a case-by-case basis.

Version Compatibility

Confluence 5.10 and higher.

Description

Personal data for a specific user can be spread across multiple components of Confluence. We've compiled workarounds for you, to ensure that you can remove personal data for a specific user. 

Workarounds

User account-level personal data

We've documented all areas of a user's account-level personal data on Confluence Server: Right of access by the data subject. The account-level personal data may be used inside pages via a Confluence macro, can be searched, and is part of the user mentions feature.

Removing the user's account-level personal data (including avatar, display name, and any profile information) prevents that data from being searched. The user's name will no longer be shown as an author of content, and will no longer be available to be mentioned.

To delete a user's account-level personal data, use one of the following methods, depending on whether your Confluence instance is using an internal, delegated or external user directory.

Before attempting any of the workarounds below, please ensure that a backup of your instance is created first. If possible, test the workaround on a staging environment before attempting in your production environment.

Step 1 - Disabling or Removing the User

Internal user directory
  1. Disable the user by following the instructions at Delete or disable users.
  2. After the user is disabled, follow one of the methods in Step 2 below – auto generate the SQL query via a script, or manually create the SQL query.
External user directory - Connector
  1. Delete the user from the External Directory, and perform a resync by following the instructions at Synchronizing data from external directories
  2. After the user is deleted, follow one of the methods in Step 2 below – auto generate the SQL query via a script, or manually create the SQL query.
External user directory - Delegated
  1. Delete the user from the delegated External Directory by following the instructions at Connecting to an Internal Directory with LDAP Authentication
  2. Disable the user by following the instructions at Delete or disable users
  3. After the user is disabled, follow one of the methods in Step 2 below – auto generate the SQL query via a script, or manually create the SQL query.

Step 2 - Running the SQL workaround 

Python script to generate SQL queries per user
  1. Download or clone this repository: https://bitbucket.org/atlassian/gdpr/overview. There are some installation prerequisites before running the script, which are documented in the README file inside the repository.
  2. Run the script, passing the username (the login name of the user) with the -u option. If the username contains spaces, quote it.

    python3 parser4confluence.py -u '<USERNAME>' -f metadata/confluence_db.json -d oracle|postgresql|mysql|mssql
  3. The above script will generate multiple SQL files under the folder confluence_db_queries/<database-name>/

    01_insert_journalentry.sql
    02_delete_OS_PROPERTYENTRY.sql
    03_delete_BODYCONTENT.sql
    04_delete_CONTENTPROPERTIES.sql
    05_delete_IMAGEDETAILS.sql
    06_delete_CONTENT.sql
    07_delete_NOTIFICATIONS.sql
    09_delete_CONTENT.sql
    10_delete_LIKES.sql
    11_delete_CONTENT.sql
    12_delete_cwd_membership.sql
    13_delete_cwd_user_attribute.sql
    14_delete_cwd_user.sql
    15_update_user_mapping.sql
  4. Execute the SQL queries on your database, in the same order as the filenames.
  5. Warning: If autocommit is not enabled, commit your changes so that they persist in the database.
  6. Flush all caches to force the UI to update by following the instructions at Cache statistics.
  7. Flush the content index queue by going to the Content Indexing administration, and selecting Queue Contents > Flush Queue.
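
The ordering requirement in step 4 can be sketched as follows. This is a minimal illustration, not part of the Atlassian script: the helper names `ordered_sql_files` and `apply_in_order` are hypothetical, and the `execute` callback stands in for whatever database client you use. It assumes every generated file starts with a numeric prefix followed by an underscore, as in the listing above.

```python
import re
from pathlib import Path
from typing import Callable, Iterable, List


def ordered_sql_files(directory: str) -> List[Path]:
    """Return the .sql files in `directory`, sorted by numeric filename prefix."""
    def prefix(path: Path) -> int:
        match = re.match(r"(\d+)_", path.name)
        if match is None:
            # Refuse to guess an order for files that do not follow the pattern.
            raise ValueError(f"no numeric prefix in {path.name}")
        return int(match.group(1))

    return sorted(Path(directory).glob("*.sql"), key=prefix)


def apply_in_order(files: Iterable[Path], execute: Callable[[str], None]) -> None:
    """Read each file and pass its SQL text to `execute`, one file at a time."""
    for path in files:
        execute(path.read_text(encoding="utf-8"))
```

With a real database client, `execute` would run the SQL text against your Confluence database. If autocommit is off, the commit still has to be issued after the last file.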
Manually construct SQL queries per user
  1. Go to this directory https://bitbucket.org/atlassian/gdpr/src/HEAD/confluence_db_queries/?at=master and download the pre-populated SQL scripts for your respective database.
  2. Open the SQL scripts in your preferred text editor.
  3. Replace the placeholder __username__ with the required username.
  4. Run the SQL queries on the database, in the same order as the filenames.
  5. Warning: If autocommit is not enabled, commit your changes so that they persist in the database.
  6. Flush all caches to force the UI to update by following the instructions at Cache statistics.
  7. Flush the content index queue by going to the Content Indexing administration, and selecting Queue Contents > Flush Queue.
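
If you prefer not to edit each file by hand, the placeholder replacement in step 3 can be sketched as below. This is only an illustration: the function name `fill_username` and the directory arguments are hypothetical, and the quote-doubling assumes that the `__username__` placeholder sits inside a single-quoted SQL string literal in the downloaded scripts. Check the scripts for your database before relying on it.

```python
from pathlib import Path
from typing import List

PLACEHOLDER = "__username__"


def fill_username(template_dir: str, output_dir: str, username: str) -> List[Path]:
    """Copy every .sql template into output_dir with the placeholder replaced."""
    # Assumption: the placeholder appears inside a single-quoted SQL literal,
    # so embedded single quotes are doubled to keep that literal valid.
    safe_name = username.replace("'", "''")
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for template in sorted(Path(template_dir).glob("*.sql")):
        text = template.read_text(encoding="utf-8")
        target = out / template.name
        target.write_text(text.replace(PLACEHOLDER, safe_name), encoding="utf-8")
        written.append(target)
    return written
```

The original templates are left untouched, so the same download can be reused for the next erasure request.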

Step 3 - Patching the Collaborative Editor

For Confluence instances that have enabled Collaborative Editing (Confluence 6.0, or later), you will need to download the appropriate version of the patched Collaborative Editor Plugin for your version of Confluence. See the table below:

Confluence Version                     Download Link
5.10.x or earlier                      Not applicable - no collaborative editing
6.0.x                                  confluence-collaborative-editor-plugin-1.3.24.jar
6.1.x to 6.2.3                         confluence-collaborative-editor-plugin-1.4.18.jar
6.2.4 to 6.3.x, 6.4.x, 6.5.x, 6.6.x
6.7.x, 6.8.x
6.9.x

  1. After downloading the add-on jar, install it by going to Cog menu > Add-ons.
  2. Choose Upload add-on, then upload the add-on jar.

This may take several minutes, during which time the Collaborative Editing feature may not be available. This patched version of the Collaborative Editor will replace the bundled version that came with your installation of Confluence.

Uninstalling the collaborative editor plugin

If you wish to uninstall the above patched version of the collaborative editor plugin and restore the bundled version, you must:

  1. Disable the collaborative editing feature (see Administering Collaborative Editing for a guide)
  2. Disable the plugin called 'Synchrony Interop Bootstrap Plugin': go to the Manage Addons link in the administration console, search for the plugin, and click Disable
  3. Uninstall the patched version of the collaborative editor plugin
  4. Restart Confluence (which will restore the bundled version of the collaborative editor plugin)
  5. Re-enable the 'Synchrony Interop Bootstrap Plugin'
  6. Re-enable the collaborative editing feature

If these steps are not performed when uninstalling the patched version of the collaborative editor plugin, then the collaborative editor plugin may not re-enable correctly when restarting.

Known Limitations and Issues

Mentions in Pages, Drafts and Comments

Any existing mentions of the deleted user will become "Unknown User (xxxxxxxxxx)" (where xxxxxxxxxx is the user key stored in the database). In certain parts of Confluence (for example, the Activity Stream macro), the deleted user will be displayed as "Anonymous" rather than "Unknown User".

Any mentions that exist in an unpublished draft (see Drafts) at the time the workaround is deployed will remain as they were before the workaround (showing the display name). Those mentions will turn into "Unknown User (xxxxxxxxxx)" when the draft is published. In Confluence version 6.6.x or earlier, the mention may instead be changed into a link to the current page, whose text is the '~' (tilde) character followed by the old username of the deleted user, rather than 'Unknown User'.

There is currently no method to automatically force a publish of all drafts by the administrator, or see a list of all unpublished drafts. Each user can view their own unpublished drafts in their recently worked on list. They can also go to their user menu, and click on the draft item. An administrator may choose to turn off collaborative editing, which will remove the unpublished drafts. However, this method may cause data loss of the unpublished drafts, and is not advised unless all deleted user mentions must be removed at all costs (regardless of any content loss).

If a deleted user is mentioned on a page after performing the workarounds above, there may be an error when publishing the page. Refreshing the page, or removing the mention should fix the problem. However, if the Collaborative Editing feature is turned off, the user mention will need to be deleted from the page to fix the problem.

Similarly, if the deleted user is mentioned when adding or editing a comment, the save will fail with an error. The workaround is to remove the mention of the deleted user in the comment before saving. This issue affects both inline comments, as well as image attachment comments.

Clearing the browser local storage will remove the ability to mention the deleted user (see the Browser Cache section below for details), thus preventing the publishing problem from occurring for both pages and comments.

Workbox Notifications

Workbox Notifications (see Workbox notifications) will continue to have the display name of the user that created the notification. They are automatically cleared by Confluence regularly, and old entries will be removed after 28 days. These jobs are run once per day, starting from the time the server starts up.


Filesystem

Search Index

Search indexes are files stored in the Confluence home directory (under /index), and they are essential for the search features. These indexes store some information that is also stored in the database, for example, the display name, email address, and username.

The 'User account-level personal data' workarounds above will trigger an update to the index, which will remove the user's personal data from the index. The administrator may also opt to rebuild the index by following the instructions at How to Rebuild the content indexes from scratch on Confluence Server. Rebuilding the index should not be necessary if all the steps in the 'User account-level personal data' section are followed.

Location of Access Logs

Access logging may be enabled in Confluence by following the instructions at Audit Confluence using the Tomcat valve component and How to enable user access logging. If access logging is enabled, the username of the user accessing a page, as well as some URLs, will remain in the access log files. These logs are not accessible directly via Confluence, but may be accessed by an administrator.

The following is a non-exhaustive list of personal data that could be in the log files:

  • IP address
  • Username/Display Name
  • Email address

The location of the access logs is <confluence install>/logs/conf_access_log.log (if created by using the instructions at Audit Confluence using the Tomcat valve component). The location of the Confluence logs is at <confluence home>/logs/atlassian-confluence.log. Please note that logging parameters (such as the log file location, and contents) may be different due to configuration options, and/or third-party add-ons. Administrators are advised to check the contents of the logs, and remove them if required.
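
When checking the contents of the logs, a small script can help locate the lines worth reviewing. The sketch below is illustrative only: the function name `find_matching_lines` is hypothetical, and the search terms (username, display name, email address, IP address) are whatever values you know for the user. It does a plain case-insensitive substring match and does not modify the file.

```python
from typing import Iterable, List, Tuple


def find_matching_lines(log_path: str, terms: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs containing any term, ignoring case."""
    lowered = [term.lower() for term in terms if term]
    hits = []
    # errors="replace" keeps the scan going if the log holds undecodable bytes.
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if any(term in line.lower() for term in lowered):
                hits.append((number, line.rstrip("\n")))
    return hits
```

A substring match can produce false positives (for example, a short username that is part of another word), so treat the result as a list of candidates to review rather than a definitive report.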

Personal Spaces

Personal spaces are spaces created by a user, where the space key is their username. If any content is created in Confluence that links to this personal space, the link will contain, as part of its URL, the username of the user to which the personal space belongs. Deleting the user from Confluence does not remove their personal space or change the URLs. However, moving the pages being linked to will update those URLs.

If the administrator wants the content in the personal space to be preserved, but still requires the deletion of the username from the URLs, then a workaround is to first move the content to a new space, then remove the personal space.

Moving pages to a new Space

Create a new space, then follow the instructions at Move and reorder pages. This should update all URLs linking to those moved pages, which should eliminate the username in those URLs from appearing in the future.

Deleting a Personal Space

An administrator can also delete the personal space by following the instructions at Delete a space. An administrator may need to grant themselves delete permission on the space, by following the instructions at Assign space permissions.

Alternatively, use the REST API for space removal, described in the Confluence REST API documentation.
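
As a rough sketch of the REST route, the snippet below builds (but does not send) a DELETE request for a space. It rests on assumptions you should verify against the REST API documentation for your Confluence version: that the space resource lives at /rest/api/space/{spaceKey}, and that basic authentication with an administrator account is enabled. The function name, base URL, and credentials are placeholders, and the space key is passed through exactly as you supply it.

```python
import base64
import urllib.parse
import urllib.request


def build_delete_space_request(base_url: str, space_key: str,
                               admin_user: str, admin_password: str) -> urllib.request.Request:
    """Build, without sending, a DELETE request for a single space."""
    # Assumption: the space-removal resource is /rest/api/space/{spaceKey}.
    quoted_key = urllib.parse.quote(space_key, safe="")
    url = f"{base_url.rstrip('/')}/rest/api/space/{quoted_key}"
    # Assumption: the instance accepts HTTP basic authentication.
    credentials = f"{admin_user}:{admin_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url, method="DELETE", headers={"Authorization": f"Basic {token}"}
    )
```

The request can then be sent with `urllib.request.urlopen(request)`. Building it separately makes it easy to inspect the target URL before anything is deleted.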

Limitations

The URLs that contain the username must be created using the "link to page" feature in Confluence. If those links are added as direct web links, then they will not be automatically updated.

Please read Links for details on links to specific types of content. 

Free-form textual personal data in the database

Other potential sources of personal data which could be stored in Confluence's database include:

  • Free-form text in Pages, Blogs, Comments, and other custom content that may be added by third party add-ons from the Atlassian Marketplace.
  • Audit logs contain information about configuration changes made to Confluence. These audit logs store the username of the user who performed the change.
  • Free-form text in customizations made to Confluence, such as custom site headers and footers, site title, and custom user macros and templates.

Free form text personal data in content (pages, blogs, comments)

For free-form text in pages, blogs, comments, and other content, the search feature should be used to identify sources of personal data that a user requests to be deleted. Read Confluence search fields for a list of fields and syntax that can be used to locate any personal data.
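
For administrators who would rather script the search than use the search box, one alternative is the REST content search. This is a different technique from the search fields mentioned above, and the sketch rests on assumptions to verify for your version: that content search is exposed at /rest/api/content/search, and that it accepts a CQL clause of the form `text ~ "..."`. The function name and base URL are hypothetical, and the backslash escaping of quotes is likewise an assumption.

```python
import urllib.parse


def build_text_search_url(base_url: str, phrase: str, limit: int = 25) -> str:
    """Build a content-search URL that looks for `phrase` in content text."""
    # Assumption: double quotes and backslashes in the phrase are escaped
    # with a backslash inside the quoted CQL value.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    cql = f'text ~ "{escaped}"'
    # Assumption: the search resource is /rest/api/content/search?cql=...
    query = urllib.parse.urlencode({"cql": cql, "limit": limit})
    return f"{base_url.rstrip('/')}/rest/api/content/search?{query}"
```

The function only builds the URL; fetching it requires an authenticated session, and the results still need to be reviewed and edited by hand.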

When a page or comment is found to contain personal data which needs to be scrubbed, the administrator will need to edit the page and remove it. Confluence, however, stores historical versions of pages, which may also need to be deleted manually by clicking on the delete link in the page history. If bulk removal is required, please follow the instructions at How to remove all previous versions of a page manually in the database using SQL commands.

Audit Logs

Audit logs can have a retention period defined (by default, of 3 years). This can be changed, and Confluence will automatically remove older entries. Please read Audit logs for information about administering them.

Free form text personal data in Confluence customizations and configuration

There are various free form text fields in which an administrator may be able to add personal data as part of configuring Confluence. The administrator is advised to look through any customizations made, and check that no deleted user's personal data remains.

Here is a non-exhaustive list:

Location                                        Please read
Site theme and other layout customizations      Changing the look and feel of Confluence
Administrator contact page                      Configuring the administrator contact page
Custom site header and footer via Custom HTML   Styling Confluence with CSS
Site Title                                      Changing the site title
User Macros                                     Writing user macros
Page Templates and Blueprints                   Administering site templates
Shortcut Links                                  Configuring shortcut links
PDF export customizations                       Customize exports to PDF
Email template customizations                   Customizing email templates
Interface text customizations                   Modifying Confluence interface text

Browser Cache

The browser may cache a large amount of data for performance reasons. Confluence doesn't completely control the browser's behaviour regarding such caching. Therefore, there are some cases where removal of personal data automatically from the browser's cache is not feasible for Confluence, and must be done in each individual browser client.

For Chrome, local storage may be cleared by navigating to chrome://settings/siteData, finding the Confluence website URL, clicking on the 'Local Storage' section, and then clicking remove all. For Firefox, this can be done in Settings. However, with browsers constantly evolving, these instructions may change. Please see the browser vendor's documentation for clearing local storage for the most up-to-date instructions.

Mentions

Browser local storage is used to cache a list of recently used mentions for the logged-in user, and is never persisted on the Confluence server database. However, being cached in the browser means that any users who have previously mentioned a user who has since been deleted would continue to see that user in their mentions list. This list can be cleared from the browser directly by following the browser's documentation on clearing local storage. This list also continuously updates as the logged-in user makes new mentions of other users, and the deleted users will eventually stop showing up as they are no longer mentioned.

The mentions list may also contain a URL of the avatar image for the deleted user. This URL is defunct after running the SQL workaround, but the browser may be caching the image located at this URL. Therefore, some users may still continue to see the avatar when the mentions feature is used. The avatar should expire from the browser's cache after some time, but the exact timing may differ based on the browser's configuration.

Avatars

Avatars are images uploaded to Confluence by users into their profile, and they may be displayed when Confluence is listing users (such as in a mentions list). The avatars should be deleted from the Confluence database after running the workaround SQL, but these images are also cached by the browser, and as such, are not completely under the control of Confluence. The browser may periodically refresh or purge its cache, so the avatars ought to eventually disappear, but the exact timing depends on the browser's configuration.

Additional notes

There may be limitations based on your product version.

Note that the GDPR workaround described above has been optimized for the latest version of this product. If you are running a legacy version of the product, the efficacy of the workaround may be limited. Please consider upgrading to the latest product version to get the most out of the workarounds in this article.

Third-party add-ons may store personal data in their own database tables or on the filesystem.

The above article in support of your GDPR compliance efforts applies only to personal data stored within the Atlassian server and data center products. To the extent you have installed third-party add-ons within your server or data center environment, you will need to contact that third-party add-on provider to understand what personal data from your server or data center environment they may access, transfer or otherwise process and how they will support your GDPR compliance efforts.

If you are a server or data center customer, Atlassian does not access, store, or otherwise process the personal data you choose to store within the products. For information about personal data Atlassian processes, see our Privacy Policy.

Last modified on May 11, 2018
