_confluence_right_to_erasure

Version Compatibility

Confluence 5.10 and higher.

Description

Personal data for a specific user can be spread across multiple components of Confluence.

User account-level personal data

We've documented all areas of a user's account-level personal data on Confluence: Right of access by the data subject. The account-level personal data may be used inside pages via a Confluence macro, can be searched, and is part of the user mentions feature.

After the account-level personal data is removed, the user's avatar, display name, and other profile information can no longer be searched, the user's name is no longer shown as an author of content, and the user can no longer be mentioned.

How you remove account-level personal data depends on your Confluence version:

Filesystem

Search Index

If the search platform is Lucene, the search indexes are saved as files in the Confluence home directory under /index. In the case of OpenSearch as the search platform, the search indexes are stored within OpenSearch.

These indexes contain information, such as the display name, email address, and username, which are also stored in the database. They are essential for search features.

In Confluence 6.13 and later, after a user account is deleted, all content the user had created, contributed to, or been @mentioned on is automatically reindexed. 

In Confluence 6.12 and earlier, the 'user account-level personal data' SQL workaround will trigger an update to the index, which will remove the user's personal data from the index. The administrator may also opt to rebuild the index by following the instructions at How to Rebuild the Content Indexes From Scratch on Confluence Server. Rebuilding the index should not be necessary, if all the steps in the 'user account-level personal data' SQL workaround section are followed.

Location of Access Logs

Access logging is enabled by default from Confluence 7.11, and may have been manually enabled in earlier versions by following the instructions at Audit Confluence using the Tomcat valve component and How to enable user access logging. If access logging is enabled, the username of the user accessing a page, as well as some URLs, will remain in the access log files. These logs are not accessible directly via Confluence, but may be accessed by an administrator.

A non-exhaustive list of personal data that could be in the log files:

  • IP address
  • Username/Display Name
  • Email address

The location of the access logs is <confluence install>/logs/conf_access_log.log (from Confluence 7.11 or if created by using the instructions at Audit Confluence using the Tomcat valve component). Access log data can also be written to the application logs at <confluence home>/logs/atlassian-confluence.log. 

Please note that logging parameters (such as the log file location, and contents) may be different due to configuration options, and/or third-party add-ons. Administrators are advised to check the contents of the logs, and remove them if required.
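As a starting point for checking the logs, the following minimal Python sketch scans plain-text log files in a directory for a username, display name, or email address. The function name and the `*.log*` file pattern are illustrative assumptions, not part of Confluence; compressed or rotated archives would need separate handling.

```python
from pathlib import Path
from typing import Iterable, Iterator, Tuple


def find_personal_data(log_dir: Path, terms: Iterable[str]) -> Iterator[Tuple[Path, int, str]]:
    """Yield (file, line number, matched term) for each log line containing a term.

    Matching is a case-insensitive substring test. Only plain-text files whose
    names match "*.log*" directly inside log_dir are read (an assumption; adjust
    the pattern to your logging configuration).
    """
    needles = [t.lower() for t in terms if t]
    for path in sorted(log_dir.glob("*.log*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a file contains odd bytes.
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            for line_no, line in enumerate(handle, start=1):
                lowered = line.lower()
                for needle in needles:
                    if needle in lowered:
                        yield path, line_no, needle
```

For example, `list(find_personal_data(Path("<confluence install>/logs"), ["jsmith", "jsmith@example.com"]))` would list candidate lines for review. The sketch only reports matches; deciding whether to remove or redact a log file remains a manual step for the administrator.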

Location of data pipeline export files

From Confluence 7.12, system administrators can use the data pipeline feature to export current state data from the Confluence database for analysis in an external business intelligence tool. The data is exported in CSV format, and stored in the <confluence home>/data-pipeline/ directory. This directory can be found in the local home directory for non-clustered installations, or in the shared home directory when Confluence is running in a cluster.

A non-exhaustive list of personal data that could be in the export files:

  • Username / display name
  • Email address
  • Free text including space titles, page titles, raw page content, and raw comment content.

Administrators are advised to check the contents of these exports and remove data if required.
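One way to check the exports is sketched below: a small Python function that counts, per CSV file, the rows in which any cell contains one of the given terms. The function name is hypothetical, and the sketch assumes the export files are UTF-8 CSV files somewhere under the data pipeline directory.

```python
import csv
from pathlib import Path
from typing import Dict, Iterable

# Raw page content can be large; raise the csv module's per-field limit
# (this is a process-wide setting).
csv.field_size_limit(2**31 - 1)


def count_rows_with_terms(export_dir: Path, terms: Iterable[str]) -> Dict[str, int]:
    """Return {relative csv path: number of rows in which any cell contains a term}.

    Matching is a case-insensitive substring test. Files without matches are omitted.
    """
    needles = [t.lower() for t in terms if t]
    counts: Dict[str, int] = {}
    for path in sorted(export_dir.rglob("*.csv")):
        with path.open("r", encoding="utf-8", errors="replace", newline="") as handle:
            hits = sum(
                1
                for row in csv.reader(handle)
                if any(n in cell.lower() for cell in row for n in needles)
            )
        if hits:
            counts[str(path.relative_to(export_dir))] = hits
    return counts
```

Using the `csv` module rather than a line-by-line scan means that multi-line cells, such as raw page content, are treated as a single row.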

Personal Spaces

Personal spaces are spaces created by a user, where the space key is their username. If any content is created in Confluence that links to this personal space, the link will contain, as part of its URL, the username of the user to which the personal space belongs. Deleting the user from Confluence does not remove their personal space or change the URLs. However, moving the pages being linked to, will update those URLs. 

If the administrator wants to preserve the content in the personal space, but still needs the username removed from the URLs, they should first move the content to a new space, then remove the personal space.

Moving pages to a new Space

Create a new space, then follow the instructions at Move and Reorder Pages. This should update all URLs linking to the moved pages, so the username should no longer appear in those URLs.

Deleting a Personal Space

An administrator can also delete the personal space by following the instructions at Delete a Space. An administrator may need to grant themselves delete permission on the space, by following the instructions at Assign space permissions.

Alternatively, use the REST API for space removal at Confluence REST API documentation
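As a rough illustration of the REST approach, the Python sketch below builds (but does not send) a DELETE request for a space. It assumes the `DELETE /rest/api/space/{spaceKey}` resource, basic authentication, and a personal space key of the form `~username`; check the REST API documentation for your Confluence version before relying on any of these. The function name and the example host are hypothetical.

```python
import base64
import urllib.request
from urllib.parse import quote


def build_delete_space_request(
    base_url: str, space_key: str, user: str, password: str
) -> urllib.request.Request:
    """Build a DELETE request for a space without sending it.

    Assumption: the instance exposes DELETE /rest/api/space/{spaceKey} and
    accepts basic authentication for an account with permission to delete the space.
    """
    # Keep "~" unescaped, since personal space keys usually start with it.
    url = f"{base_url.rstrip('/')}/rest/api/space/{quote(space_key, safe='~')}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url, method="DELETE", headers={"Authorization": f"Basic {token}"}
    )
```

The request could then be sent with `urllib.request.urlopen(request)`. Because deleting a space cannot be undone from the UI, it is worth printing `request.full_url` and confirming the space key before sending.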

Limitations

The URLs that contain the username must be created using the "link to page" feature in Confluence. If those links are added as direct web links, then they will not be automatically updated.

Please read Links for details on links to specific types of content. 

Free-form textual personal data in the database

Other potential sources of personal data which could be stored in Confluence's database include:

  • Free-form text in Pages, Blogs, Comments, and other custom content that may be added by third party add-ons from the Atlassian Marketplace.
  • Free-form text in mentions, where the user's name has been overtyped.
  • Audit logs contain information about configuration changes made to Confluence. These audit logs store the username of the user who performed the change.
  • Free-form text in customizations made to Confluence, such as custom site headers and footers, site title, and custom user macros and templates.

Free-form text personal data in content (pages, blogs, comments)

For free-form text in pages, blogs, comments, and other content, the search feature should be used to identify sources of personal data that a user requests to be deleted. Read Confluence Search Fields for a list of fields and syntax that can be used to locate any personal data.
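If you prefer to script the search rather than use the search UI, one option is the REST content search with a CQL query. The sketch below only builds the search URL; it assumes the `/rest/api/content/search` resource and the CQL `text ~ "..."` operator are available in your version, and it is an alternative to, not a description of, the search fields referenced above. The function name is hypothetical.

```python
from typing import Iterable
from urllib.parse import urlencode


def build_text_search_url(base_url: str, terms: Iterable[str], limit: int = 25) -> str:
    """Build a content-search URL whose CQL matches any of the given terms.

    Assumption: the instance supports GET /rest/api/content/search?cql=...
    Backslashes and double quotes in each term are escaped for the CQL string.
    """
    clauses = []
    for term in terms:
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'text ~ "{escaped}"')
    cql = " OR ".join(clauses)
    query = urlencode({"cql": cql, "limit": limit})
    return f"{base_url.rstrip('/')}/rest/api/content/search?{query}"
```

The resulting URL can be opened by an authenticated administrator, or fetched by a script, to list candidate pages, blog posts, and comments for manual review.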

When a page or comment is found to contain personal data which needs to be scrubbed, the administrator will need to edit the page and remove it. Confluence, however, stores historical versions of pages, which may also need to be deleted manually by clicking on the delete link in the page history. Please follow the instructions at How to Remove all Previous Versions of a Page Manually in the Database Using SQL Commands, if bulk removal is required.

Free-form text personal data in mentions

If the mention name as it displays on the page is changed, for example to include just the first name or a preferred name, this is treated as free-form text. The free-form text will still display, even after the user account has been deleted. 

Audit Logs

Audit logs can have a retention period defined (by default, 3 years). This can be changed, and Confluence will automatically remove older entries. Please read Auditing in Confluence for information about administering them.

Free-form text personal data in Confluence customizations and configuration

There are various free form text fields in which an administrator may be able to add personal data as part of configuring Confluence. The administrator is advised to look through any customizations made, and check that no deleted user's personal data remains.

Here is a non-exhaustive list:

Location | Please read
Site theme and other layout customizations | Changing the Look and Feel of Confluence
Administrator contact page | Configuring the Administrator Contact Page
Custom site header and footer via Custom HTML | Styling Confluence with CSS
Site Title | Changing the Site Title
User Macros | Writing User Macros
Page Templates and Blueprints | Administering Site Templates
Shortcut Links | Configuring Shortcut Links
PDF export customizations | Customize Exports to PDF
Email template customizations | Customizing Email Templates
Interface text customizations | Modify Confluence Interface Text

Synchrony data

If you have collaborative editing enabled, every keystroke in the editor is stored by Synchrony in the Confluence database. This means that any references to a person's full name, user name, or other personal information typed in the editor will remain in the Synchrony tables in the database, separately to where the page or comment content is stored. This data remains in the relevant Synchrony tables, even after the pages or comments themselves have been deleted.

In Confluence 7.0 and later, two scheduled jobs are available to remove Synchrony data:

  • The Synchrony data eviction (soft) job evicts all Synchrony data for any pages / blog posts that have not been modified in the last 3 days, and do not have an active editor session. This job runs every 10 minutes by default. This job helps keep your database tables small. 
  • The Synchrony data eviction (hard) job evicts all Synchrony data for any pages / blog posts that are 15 days or older, regardless of whether they've been modified more recently. This job is disabled by default, but can be scheduled to run on a regular basis. This job ensures there is no Synchrony data older than 15 days in your database. 

See How to remove Synchrony data for more information. 
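The difference between the two jobs can be restated as a pair of simple rules. The Python sketch below is only an illustration of the criteria described above, not Confluence's implementation. The `ContentState` fields are hypothetical, and treating "15 days or older" as time since creation is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

SOFT_IDLE = timedelta(days=3)   # soft job: not modified in the last 3 days
HARD_AGE = timedelta(days=15)   # hard job: content that is 15 days or older


@dataclass
class ContentState:
    """Hypothetical view of a page or blog post, for illustration only."""
    created: datetime
    last_modified: datetime
    has_active_editor_session: bool


def soft_evictable(content: ContentState, now: datetime) -> bool:
    """Soft rule: idle for more than 3 days and nobody is currently editing."""
    idle_long_enough = (now - content.last_modified) > SOFT_IDLE
    return idle_long_enough and not content.has_active_editor_session


def hard_evictable(content: ContentState, now: datetime) -> bool:
    """Hard rule: 15 days or older, regardless of recent modification."""
    return (now - content.created) >= HARD_AGE
```

Under these rules, a page created 20 days ago that is being edited right now would be skipped by the soft job but still covered by the hard job, which is why the hard job is the one that ensures no Synchrony data older than 15 days remains.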

In Confluence 6.x versions, there is a workaround to remove this data (see the Synchrony data workaround later on this page).

Browser Cache

The browser may cache a large amount of data for performance reasons. Confluence doesn't completely control the browser's behaviour regarding such caching. Therefore, there are some cases where removal of personal data automatically from the browser's cache is not feasible for Confluence, and must be done in each individual browser client.

For Chrome, local storage may be cleared by navigating to chrome://settings/siteData, finding the Confluence website URL, clicking the 'Local Storage' section, and then clicking remove all. For Firefox, this can be done in Settings, by following these instructions. However, with browsers constantly evolving, these instructions may change. Please see the browser vendor's documentation for clearing local storage for the most up-to-date instructions.

Mentions

Browser local storage is used to cache a list of recently used mentions for the logged-in user; this list is never persisted in the Confluence server database. However, because it is cached in the browser, any user who previously mentioned a user who has since been deleted will continue to see that user in their mentions list. This list can be cleared from the browser directly by following the browser's documentation on clearing local storage. The list also updates continuously as the logged-in user makes new mentions of other users, so deleted users will eventually stop showing up as they are no longer mentioned.

The mentions list may also contain a URL of the avatar image for the deleted user. This URL is defunct after deleting the user or running the SQL workaround, but the browser may be caching the image located at this URL. Therefore, some users may still continue to see the avatar when the mentions feature is used. The avatar should expire from the browser's cache after some time, but the exact timing may differ based on the browser's configuration.

Avatars

Avatars are images uploaded to Confluence by users into their profile, and they may be displayed when Confluence is listing users (such as in a mentions list). 

In Confluence 6.13 and later, when you delete a user account, their avatar is deleted.

In Confluence 6.12 and earlier, when you use the 'user account-level personal data' SQL workaround, the user's avatar is deleted.

In all versions of Confluence these images are also cached by the browser, and as such, are not completely under the control of Confluence. The browser may periodically refresh or purge its cache, so the avatars ought to disappear eventually, but the exact timing depends on the browser's configuration.

Workarounds

User account-level personal data

This workaround only applies to Confluence 6.12 or earlier

To delete a user's account-level personal data, use one of the following methods, depending on whether your Confluence instance is using an internal, delegated or external user directory.

Before attempting any of the workarounds below, please ensure that a backup of your instance is created first. If possible, test the workaround on a staging environment before attempting in your production environment.

Step 1 - Disabling or Removing the User

Internal user directory
  1. Disable the user by following the instructions at Delete or Disable Users.
  2. After the user is disabled, follow one of the methods in Step 2 below – auto generate the SQL query via a script, or manually create the SQL query.
External user directory - Connector
  1. Delete the user from the External Directory, and perform a resync by following the instructions at Synchronizing data from external directories
  2. After the user is deleted, follow one of the methods in Step 2 below – auto generate the SQL query via a script, or manually create the SQL query.
External user directory - Delegated
  1. Delete the user from the delegated External Directory by following the instructions at Connecting to an Internal Directory with LDAP Authentication
  2. Disable the user by following the instructions at Delete or Disable Users
  3. After the user is disabled, follow one of the methods in Step 2 below – auto generate the SQL query via a script, or manually create the SQL query.

Step 2 - Running the SQL workaround 

Some parts of the process, such as removing account-level personal data from the search index, can have a performance impact on your site. You may want to run these scripts at a time that would have the least impact on your users. We found it took approximately 30 minutes to run the scripts on a site with about 10 million pages.

Python script to generate SQL queries per user
  1. Download or clone this repository: https://bitbucket.org/atlassian/gdpr/overview. The script has some installation prerequisites, which are documented in the README file inside the repository.
  2. Run the script, passing the username (the login name of the user) with the -u option and exactly one database type with the -d option. If the username contains spaces, quote it.

    python3 parser4confluence.py -u '<USERNAME>' -f metadata/confluence_db.json -d <oracle|postgresql|mysql|mssql>
  3. The above script will generate multiple SQL files under the folder confluence_db_queries/<database-name>/

    01_insert_journalentry.sql
    02_delete_OS_PROPERTYENTRY.sql
    03_delete_BODYCONTENT.sql
    04_delete_CONTENTPROPERTIES.sql
    05_delete_IMAGEDETAILS.sql
    06_delete_CONTENT.sql
    07_delete_NOTIFICATIONS.sql
    09_delete_CONTENT.sql
    10_delete_LIKES.sql
    11_delete_CONTENT.sql
    12_delete_cwd_membership.sql
    13_delete_cwd_user_attribute.sql
    14_delete_cwd_user.sql
    15_update_user_mapping.sql
  4. Execute the SQL queries on your database, in the same order as the filenames.
  5. (warning) If you don't have autocommit enabled, make sure to commit your changes to persist on the database.
  6. Flush all caches to force the UI to update by following the instructions at Cache Statistics.
  7. Flush the content index queue by going to the Content Indexing administration, and selecting Queue Contents > Flush Queue.
Manually construct SQL queries per user
  1. Go to this directory https://bitbucket.org/atlassian/gdpr/src/HEAD/confluence_db_queries/?at=master and download the pre-populated SQL scripts for your respective database.
  2. Open the SQL scripts in your preferred text editor.
  3. Replace the placeholder __username__ in the scripts with the required username.
  4. Run the SQL queries on the database, in the same order as the filenames.
  5. (warning) If you don't have autocommit enabled, make sure to commit your changes to persist on the database.
  6. Flush all caches to force the UI to update by following the instructions at Cache Statistics.
  7. Flush the content index queue by going to the Content Indexing administration, and selecting Queue Contents > Flush Queue.
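The manual substitution in steps 2 to 4 can also be scripted. The following Python sketch reads the downloaded SQL files, replaces the `__username__` placeholder, and returns the scripts in filename order so they can be reviewed before being run. It assumes the placeholder sits inside single-quoted SQL string literals (so a single quote in the username is doubled); verify this against the scripts for your database. The function name is hypothetical, and the sketch does not connect to or modify any database.

```python
from pathlib import Path
from typing import List, Tuple

PLACEHOLDER = "__username__"


def render_sql_scripts(script_dir: Path, username: str) -> List[Tuple[str, str]]:
    """Return [(file name, SQL text)] with the placeholder replaced, sorted by file name.

    Assumption: the placeholder appears inside single-quoted SQL literals,
    so a single quote in the username is escaped by doubling it.
    """
    escaped = username.replace("'", "''")
    rendered = []
    # Sorting by name preserves the numeric prefixes (01_, 02_, ...) that
    # define the required execution order.
    for path in sorted(script_dir.glob("*.sql"), key=lambda p: p.name):
        sql = path.read_text(encoding="utf-8")
        rendered.append((path.name, sql.replace(PLACEHOLDER, escaped)))
    return rendered
```

Keeping the rendering separate from execution makes it easy to inspect the final SQL and to run it with your usual database client, committing the changes afterwards if autocommit is not enabled.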

Step 3 - Patching the Collaborative Editor

For Confluence instances running Confluence 6.0.x to Confluence 6.9.x  that have enabled Collaborative Editing, you will need to download the appropriate version of the patched Collaborative Editor Plugin for your version of Confluence. See the table below:

Confluence Version | Download Link
5.10.x or earlier | Not applicable - no collaborative editing
6.0.x | confluence-collaborative-editor-plugin-1.3.24.jar
6.1.x to 6.2.3 | confluence-collaborative-editor-plugin-1.4.18.jar
6.2.4 to 6.3.x, 6.4.x, 6.5.x, 6.6.x |
6.7.x, 6.8.x |
6.9.x |
6.10.x and later | Patching the collaborative editor is not required.
  1. After downloading the add-on jar, install it by going to Cog menu > Add-ons.
  2. Choose Upload add-on, then upload the add-on jar.

This may take several minutes, during which time the Collaborative Editing feature may not be available. This patched version of the Collaborative Editor will replace the bundled version that came with your installation of Confluence.

Uninstalling the collaborative editor plugin

If you wish to uninstall the above patched version of the collaborative editor plugin and restore the bundled version, you must:

  1. Disable the collaborative editing feature (see Administering Collaborative Editing for a guide). Note that unpublished changes may be lost. Consider asking people to publish their pages first.
  2. Disable the plugin called 'Synchrony Interop Bootstrap Plugin' by going to the Manage Addons link in the administration console, searching for the plugin, and clicking disable.
  3. Uninstall the patched version of the collaborative editor plugin
  4. Restart Confluence (which will restore the bundled version of the collaborative editor plugin)
  5. Re-enable the 'Synchrony Interop Bootstrap Plugin'
  6. Re-enable the collaborative editing feature

If these steps are not performed when uninstalling the patched version of the collaborative editor plugin, then the collaborative editor plugin may not re-enable correctly when restarting.

Known Limitations and Issues

Mentions in Pages, Drafts and Comments

Any existing mentions of the deleted user will become "Unknown User (xxxxxxxxxx)", where xxxxxxxxxx is the user key stored in the database. In certain parts of Confluence (for example, the Activity Stream macro), the deleted user will be displayed as "Anonymous" rather than "Unknown User".

Any mentions that exist in an unpublished draft (see Drafts) at the time the workaround is deployed will remain as they were before the workaround (showing the display name). Those mentions will turn into "Unknown User (xxxxxxxxxx)"  when the draft is published. In Confluence version 6.6.x or earlier, the mention may be changed into a link to the current page, with the text being the old username of the deleted user, appended to the '~' (tilde) character, rather than 'Unknown User'.

There is currently no method to automatically force a publish of all drafts by the administrator, or see a list of all unpublished drafts. Each user can view their own unpublished drafts in their recently worked on list. They can also go to their user menu, and click on the draft item. An administrator may choose to turn off collaborative editing, which will remove the unpublished drafts. However, this method may cause data loss of the unpublished drafts, and is not advised unless all deleted user mentions must be removed at all costs (regardless of any content loss).

If a deleted user is mentioned on a page after performing the workarounds above, there may be an error when publishing the page. Refreshing the page, or removing the mention should fix the problem. However, if the Collaborative Editing feature is turned off, the user mention will need to be deleted from the page to fix the problem.

Similarly, if the deleted user is mentioned when adding or editing a comment, the save will fail with an error. The workaround is to remove the mention of the deleted user in the comment before saving. This issue affects both inline comments, as well as image attachment comments.

Clearing the browser local storage will remove the ability to mention the deleted user (see the Browser Cache section on this page for details), thus preventing the publishing problem from occurring for both pages and comments.

Workbox Notifications

Workbox Notifications (see Workbox Notifications) will continue to have the display name of the user that created the notification. They are automatically cleared by Confluence regularly, and old entries will be removed after 28 days. These jobs are run once per day, starting from the time the server starts up.

Synchrony data

This workaround only applies to Confluence 6.x versions.  

If you have collaborative editing enabled, every keystroke in the editor is stored by Synchrony in the Confluence database. This means that any references to a person's full name, user name, or other personal information typed in the editor will remain in the Synchrony tables in the database, separately to where the page or comment content is stored. 

A workaround is to truncate the relevant Synchrony tables in the database. See How to reduce the size of Synchrony tables to find out how to truncate these tables. 

Known issues

There are a few known issues that you should be aware of, where personally identifiable information remains after running the SQL workaround scripts or deleting the user in the UI.

CONFSERVER-55963

6.11.0 and earlier

After running the SQL workaround scripts, the deleted user's full name and username may remain in the AO_9412A1_AOTASK and AO_9412A1_AONOTIFICATION database tables. It is not visible in the UI. 

From Confluence 6.11.1 this residual data is automatically cleaned up by a scheduled job, which runs once a day, and when Confluence starts up. 

CONFSERVER-55951

6.11.0 and earlier

After running the SQL workaround scripts, the deleted user's username may remain in the AO_187CCC_SIDEBAR_LINK  database table. It is not visible in the UI. 

This issue was resolved in Confluence 6.11.1.  An upgrade task will automatically remove this data when you upgrade to 6.11.1.

CONFSERVER-55952

6.12.x and earlier

After running the SQL workaround scripts, the deleted user's username may remain in the bandana database table. It is not visible in the UI. 

A fix for this issue will be available in a future Confluence release.

CONFSERVER-55755

All versions

The deleted user's full name may remain in the AO_950DC3_TC_EVENTS database table. It is not visible in the UI, and only applies if you have the Team Calendars for Confluence add-on. 

A fix for this issue will be available in a future Confluence release.

CONFSERVER-56354


6.12.x and earlier

After running the SQL workaround scripts, the deleted user's username may still be present in the search index, and may appear in search results in the UI.

A fix for this issue is available in the latest version of the SQL workaround scripts.

CONFSERVER-57553

6.13.0 and later

After deleting a user, their username may still appear in search results if they have been mentioned in a task, but are not the assignee (they were not the first person mentioned).

See the issue for more details and a workaround.

CONFSERVER-57554

6.13.0 and later

After deleting a user, their username may appear in the All Updates tab of the dashboard, if they have mentioned themselves in a comment. Their username will also remain in the Lucene index, but will not be returned in search results.

See the issue for more details and a workaround.

CONFSERVER-59042

6.15 and later

After renaming a user whose original full name included umlauts, their original full name still appears in mentions on pages and blog posts.
Last modified on Jul 25, 2024
