Disk space hotspots and cleanup best practices in Bamboo

Platform Notice: Server and Data Center Only - This article only applies to Atlassian products on the server and data center platforms.

 

Summary

This guide helps you find the areas of your Bamboo system that consume the most disk space.

It covers:

  • Determining which files are taking up the space, and whether the growth is unusual or the result of normal usage.
  • Best practices for cleanups, and what can be done to reduce the current size or slow future growth.
  • Tips for narrowing down which plans use the most space, best practices for build expiry and branch expiry, and how to find plans that override the global expiry.

Diagnosis

Once you detect abnormal disk usage on your Bamboo server, check with your development team whether anything unusual is happening that may be generating excessive builds, such as a specific task, a sprint, a spike or an urgent piece of work. Such activity can trigger many builds, put extra load on your CI/CD infrastructure and lead to unexpected growth.

Filesystem usage

First, list the directories and their sizes. This shows the space occupied on a per-plan basis.

Artifacts

Find large artifacts

This command lists the disk space used by the artifacts of each plan. See the next section for how to match an artifact with its respective build.

Find large artifacts
# BAMBOO_HOME=/var/atlassian/application-data/bamboo
# du -h ${BAMBOO_HOME}/artifacts/ --max-depth 1 | sort -rh
60K	/var/atlassian/application-data/bamboo/artifacts
32K	/var/atlassian/application-data/bamboo/artifacts/plan-15106049
12K	/var/atlassian/application-data/bamboo/artifacts/globalStorage
4.0K	/var/atlassian/application-data/bamboo/artifacts/plan-6062084
0	/var/atlassian/application-data/bamboo/artifacts/tmp
0	/var/atlassian/application-data/bamboo/artifacts/plan-1638401
(...)

More information:  How do I know what Bamboo plan is storing artifacts in which directory on disk?
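
The directory names under artifacts/ (for example plan-15106049) appear to correspond to the storage_tag column used in the queries below, so the largest directories can be turned into a lookup query. This is a minimal sketch under that assumption; the function name storage_tags_to_sql is hypothetical, and the generated statement should be reviewed before it is run against the database.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads directory paths (one per line) on stdin and
# prints a SELECT that looks up the plan key for each storage tag.
# Assumption: the directory name equals build.storage_tag (e.g. plan-15106049).
storage_tags_to_sql() {
  awk -F/ '
    # Keep only the last path component when it looks like "plan-<digits>".
    # \047 is the octal escape for a single quote.
    $NF ~ /^plan-[0-9]+$/ { tags = tags (tags ? ", " : "") "\047" $NF "\047" }
    END {
      if (tags != "")
        print "SELECT full_key, storage_tag FROM build WHERE storage_tag IN (" tags ");"
    }'
}
```

For example, the five largest plan directories could be fed in with: du -sk "${BAMBOO_HOME}"/artifacts/plan-* | sort -rn | head -n 5 | cut -f2 | storage_tags_to_sql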

Run a SQL query to link each artifact with its respective builds

This query will list all artifacts that are still valid, along with their sizes, Project/Plan, Location and Build dates. This is really useful if you are looking to remove a specific Build that is occupying a lot of space. The query below works on PostgreSQL.

Link each artifact with their respective builds
SELECT build.build_type,
       build.full_key,
       artifact.build_number,
       artifact.chain_artifact                             AS SHARED,
       artifact.artifact_size                              AS SIZE,
       Concat('<bamboo-home>', '/artifacts/', storage_tag) AS Location,
       buildresultsummary.build_date
FROM   build
       JOIN artifact
         ON build.full_key = artifact.plan_key
       JOIN brs_artifact_link
         ON artifact.artifact_id = brs_artifact_link.artifact_id
       JOIN buildresultsummary
         ON buildresultsummary.buildresultsummary_id =
            brs_artifact_link.producerjobresult_id
WHERE  globally_stored = false
       AND artifact.link_type LIKE '%BambooRemoteArtifactHandler'
ORDER  BY plan_key,
          build_number,
          shared; 
 build_type | full_key | build_number | shared | size |              location                 |       build_date        
------------+----------+--------------+--------+------+---------------------------------------+-------------------------
 CHAIN      | BAM-TOM  |            1 | t      | 4606 | <bamboo-home>/artifacts/plan-15106049 | 2021-06-17 12:24:22.883
 CHAIN      | BAM-TOM  |            3 | t      | 4606 | <bamboo-home>/artifacts/plan-15106049 | 2021-06-17 12:25:55.841
 CHAIN      | BAM-TOM  |            4 | t      | 4606 | <bamboo-home>/artifacts/plan-15106049 | 2021-06-17 12:29:48.19
 CHAIN      | BAM-TOM  |            5 | t      | 4606 | <bamboo-home>/artifacts/plan-15106049 | 2021-06-17 12:30:52.179
 CHAIN      | MSP-BA   |           12 | t      |   66 | <bamboo-home>/artifacts/plan-6062084  | 2021-05-31 14:23:49.514
(5 rows) 

Find cumulative values for each Default branch and any Plan-branches

Find cumulative values for each Default branch and any Plan-branches
SELECT build.build_type,
       build.full_key,
       artifact.chain_artifact                             AS SHARED,
       Sum(artifact.artifact_size)                         AS SIZE,
       Concat('<bamboo-home>', '/artifacts/', storage_tag) AS Location
FROM   build
       JOIN artifact
         ON build.full_key = artifact.plan_key
       JOIN brs_artifact_link
         ON artifact.artifact_id = brs_artifact_link.artifact_id
       JOIN buildresultsummary
         ON buildresultsummary.buildresultsummary_id =
            brs_artifact_link.producerjobresult_id
WHERE  globally_stored = false
       AND artifact.link_type LIKE '%BambooRemoteArtifactHandler'
GROUP  BY artifact.chain_artifact,
          build.build_type,
          build.full_key,
          build.storage_tag,
          artifact.plan_key
ORDER  BY full_key,
          size DESC,
          build_type;
  build_type  | full_key | shared |   size   |              location                  
--------------+----------+--------+----------+--------------------------------------
 CHAIN        | BAM-TOM  | t      |    18424 | <bamboo-home>/artifacts/plan-15106049
 CHAIN        | MSP-BA   | t      |      264 | <bamboo-home>/artifacts/plan-6062084
 CHAIN_BRANCH | MSP-BA4  | t      |  5242946 | <bamboo-home>/artifacts/plan-20054025
 CHAIN_BRANCH | MSP-BA5  | t      | 10485826 | <bamboo-home>/artifacts/plan-20054026
 CHAIN_BRANCH | MSP-BA6  | t      | 31457346 | <bamboo-home>/artifacts/plan-20054027
 CHAIN_BRANCH | MSP-BA7  | t      | 26214466 | <bamboo-home>/artifacts/plan-20054028
(6 rows)

If you need specific totals for the Default plans or for the Plan branches, use the following queries:

Totals for Default plan builds
SELECT build.build_type,
       build.full_key,
       artifact.chain_artifact                             AS SHARED,
       Sum(artifact.artifact_size)                         AS SIZE,
       Concat('<bamboo-home>', '/artifacts/', storage_tag) AS Location
FROM   build
       JOIN artifact
         ON build.full_key = artifact.plan_key
       JOIN brs_artifact_link
         ON artifact.artifact_id = brs_artifact_link.artifact_id
       JOIN buildresultsummary
         ON buildresultsummary.buildresultsummary_id =
            brs_artifact_link.producerjobresult_id
WHERE  globally_stored = false
       AND artifact.link_type LIKE '%BambooRemoteArtifactHandler'
       AND build.master_id IS NULL
GROUP  BY artifact.chain_artifact,
          build.build_type,
          build.full_key,
          build.storage_tag,
          artifact.plan_key
ORDER  BY full_key,
          size DESC,
          build_type;
 build_type | full_key | shared | size  |              location                  
------------+----------+--------+-------+--------------------------------------
 CHAIN      | BAM-TOM  | t      | 18424 | <bamboo-home>/artifacts/plan-15106049
 CHAIN      | MSP-BA   | t      |   264 | <bamboo-home>/artifacts/plan-6062084
(2 rows)

Totals for Branch plan builds
SELECT BM.build_type,
       BM.full_key,
       artifact.chain_artifact     AS SHARED,
       Sum(artifact.artifact_size) AS SIZE
FROM   build
       JOIN build BM
         ON build.master_id = BM.build_id
       JOIN artifact
         ON build.full_key = artifact.plan_key
       JOIN brs_artifact_link
         ON artifact.artifact_id = brs_artifact_link.artifact_id
       JOIN buildresultsummary
         ON buildresultsummary.buildresultsummary_id =
            brs_artifact_link.producerjobresult_id
WHERE  globally_stored = false
       AND artifact.link_type LIKE '%BambooRemoteArtifactHandler'
GROUP  BY artifact.chain_artifact,
          BM.build_type,
          BM.full_key,
          BM.storage_tag
ORDER  BY full_key,
          size DESC,
          build_type;
 build_type | full_key | shared |   size   
------------+----------+--------+----------
 CHAIN      | MSP-BA   | t      | 73400584
(1 row)
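
The size values returned by the queries above look like raw byte counts (73400584 would be roughly 70 MiB); treat that as an assumption rather than a documented fact about the schema. Under that assumption, a small helper can make the numbers easier to read. The name human_size is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: converts a byte count into a human-readable value.
# Assumption: artifact_size is stored in bytes.
human_size() {
  awk -v bytes="$1" 'BEGIN {
    split("B KiB MiB GiB TiB", unit, " ")
    i = 1
    # Divide by 1024 until the value fits the current unit.
    while (bytes >= 1024 && i < 5) { bytes /= 1024; i++ }
    printf "%.1f %s\n", bytes, unit[i]
  }'
}
```

For example, human_size 73400584 converts the branch total shown above.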

Check for artifacts in Global Storage

Globally stored artifacts are located in a folder within <bamboo-home>/artifacts/globalStorage. Artifacts in this location belong to build results that have expired (i.e. were cleaned up or removed), and they are kept only while a deployment still references them. This way, deployments of older versions will not fail if they are triggered.

Find largest artifacts in Global Storage
# BAMBOO_HOME=/var/atlassian/application-data/bamboo
# du -h ${BAMBOO_HOME}/artifacts/globalStorage --max-depth 1 | sort -rh
12K	/var/atlassian/application-data/bamboo/artifacts/globalStorage
4.0K	/var/atlassian/application-data/bamboo/artifacts/globalStorage/8617988
4.0K	/var/atlassian/application-data/bamboo/artifacts/globalStorage/6520840
4.0K	/var/atlassian/application-data/bamboo/artifacts/globalStorage/6520836

Check the following documentation for specific queries on how to identify and locate Global Storage artifacts:

Build results and logs

Depending on your workload, build jobs may generate large volumes of logs that consume a significant share of your disk space. Even if a build plan does not use much disk space for artifacts, it may still use a lot for its logs.

Find largest build logs
# BAMBOO_HOME=/var/atlassian/application-data/bamboo
# find ${BAMBOO_HOME}/xml-data/builds/*/download-data/build_logs -maxdepth 1 ! -name "." -type d -print0 | xargs -0 -n1000 du -h | sort -rh
704K	/var/atlassian/application-data/bamboo/xml-data/builds/plan-11927555-JOB1/download-data/build_logs
68K	/var/atlassian/application-data/bamboo/xml-data/builds/15335425-14712834/download-data/build_logs
32K	/var/atlassian/application-data/bamboo/xml-data/builds/plan-6062084-JOB1/download-data/build_logs
20K	/var/atlassian/application-data/bamboo/xml-data/builds/plan-15106049-JOB1/download-data/build_logs
16K	/var/atlassian/application-data/bamboo/xml-data/builds/plan-688129-JOB1/download-data/build_logs
16K	/var/atlassian/application-data/bamboo/xml-data/builds/plan-1638401-JOB1/download-data/build_logs
16K	/var/atlassian/application-data/bamboo/xml-data/builds/8388609-8552449/download-data/build_logs
8.0K	/var/atlassian/application-data/bamboo/xml-data/builds/plan-11927557-JOB1/download-data/build_logs
8.0K	/var/atlassian/application-data/bamboo/xml-data/builds/plan-10649603-JOB1/download-data/build_logs
8.0K	/var/atlassian/application-data/bamboo/xml-data/builds/6225921-6389761/download-data/build_logs
4.0K	/var/atlassian/application-data/bamboo/xml-data/builds/plan-11927555/download-data/build_logs
4.0K	/var/atlassian/application-data/bamboo/xml-data/builds/plan-10649603/download-data/build_logs
4.0K	/var/atlassian/application-data/bamboo/xml-data/builds/plan-10649601-RUN/download-data/build_logs
0	/var/atlassian/application-data/bamboo/xml-data/builds/plan-6062084/download-data/build_logs
0	/var/atlassian/application-data/bamboo/xml-data/builds/plan-1638401-JSJ/download-data/build_logs
0	/var/atlassian/application-data/bamboo/xml-data/builds/plan-1638401-GPFV/download-data/build_logs
0	/var/atlassian/application-data/bamboo/xml-data/builds/plan-1638401/download-data/build_logs
0	/var/atlassian/application-data/bamboo/xml-data/builds/plan-1638401-CHEC/download-data/build_logs
0	/var/atlassian/application-data/bamboo/xml-data/builds/plan-1638401-BUIL/download-data/build_logs 
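
The listing above reports job directories separately (for example plan-11927555-JOB1 and plan-11927555). To see a single figure per plan, the sizes can be folded together. This is a sketch that assumes job directories share the plan-<id> prefix of their plan; directories that do not follow that pattern are reported under their own name. The function name sum_logs_per_plan is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "du -sk" output (size in KB, a tab, then the path)
# on stdin and prints the total KB per plan directory, largest first.
sum_logs_per_plan() {
  awk -F'\t' '
    {
      # Path layout assumed: .../xml-data/builds/<dir>/download-data/build_logs
      n = split($2, part, "/")
      dir = part[n - 2]
      # Fold job directories such as plan-11927555-JOB1 into plan-11927555.
      if (match(dir, /^plan-[0-9]+/)) dir = substr(dir, RSTART, RLENGTH)
      total[dir] += $1
    }
    END { for (d in total) printf "%d\t%s\n", total[d], d }
  ' | sort -rn
}
```

It could be used as: du -sk "${BAMBOO_HOME}"/xml-data/builds/*/download-data/build_logs | sum_logs_per_plan | head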

Once you understand which Plans are good candidates for a cleanup, you can adjust the expiry settings of those individual Plans or make your Global expiry settings more aggressive.

More information about important Bamboo directories:

Count build results

The volume of build results stored in the Bamboo database has a significant impact on system performance.

Run this SQL statement to find the number of Build results per Plan. You can then use these numbers to plan a more aggressive individual Plan expiry.

Count build results
SELECT CB.full_key,
       Count(DISTINCT CSR.chainresult_id) AS chainresults
FROM   build B
       JOIN buildresultsummary BRS
         ON B.full_key = BRS.build_key
       JOIN chain_stage_result CSR
         ON BRS.chain_result = CSR.chainresult_id
       JOIN chain_stage CS
         ON B.stage_id = CS.stage_id
       JOIN build CB
         ON CS.build_id = CB.build_id
GROUP  BY CB.full_key
ORDER  BY Count(DISTINCT CSR.chainresult_id) DESC;
  full_key   | chainresults 
-------------+--------------
 MY-PROJ1    |           201
 BAM-TOM     |           113
 LARD-MOM    |           102
 MSP-BA      |            83
 MFP-MVFP    |            62
 BAM-FOO     |            52
 BAM-BOO     |            44
 DRA-MAIN    |            20
 PRJ-PLANKEY |            12
(9 rows)
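
When the list is long, it can help to keep only the plans above a chosen number of results. The sketch below assumes the query output was exported as unaligned, tuples-only text (for instance with psql -At, which separates fields with "|"); the function name plans_over_threshold and the threshold value are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the plan keys whose result count is at or above
# the given minimum.
# $1: minimum number of results; stdin: lines of "<plan key>|<count>".
plans_over_threshold() {
  awk -F'|' -v min="$1" '$2 + 0 >= min + 0 { print $1 }'
}
```

For example, piping the rows above through plans_over_threshold 100 would keep MY-PROJ1, BAM-TOM and LARD-MOM.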

Look for Plan branches

Plan branches are linked to a repository and will be generated once a new Branch is created.

It is important to understand if those Plan branches can be cleaned up after use.

Show plan branches
SELECT build_id,
       build_type,
       created_date,
       updated_date,
       full_key
FROM   build
WHERE  marked_for_deletion IS NOT NULL
       AND build_type = 'CHAIN_BRANCH'
ORDER  BY updated_date ASC;

Plan branches without any active expiry

Plan branches need to have their expiry settings explicitly set. If plan branch expiry is not configured, the Plan branch stays in Bamboo even after the Branch is deleted from the remote repository, and you have to delete it manually.

The following SQL statement helps you to locate the Plan branches without any expiry settings.

Show plan branches without expiry settings
SELECT build.full_key,
       build_definition.build_id,
       build.created_date,
       build.updated_date
FROM   build
       JOIN build_definition
         ON build.build_id = build_definition.build_id
WHERE  build.marked_for_deletion IS NOT NULL
       AND build_definition.xml_definition_data LIKE
           '%<branchRemovalCleanUpEnabled>false</branchRemovalCleanUpEnabled>%'
       AND build_type = 'CHAIN'
ORDER  BY updated_date ASC; 


Solution

Once you have identified which Plans are the top consumers, you can either delete their results one by one or act preventatively by configuring Global and Plan-based expiry settings.

Global expiry

Before using Global expiry, it is important to understand a few technical aspects of it.

When the data is erased

Once you configure the global expiry criteria (and their exceptions) and activate them, the cleanup process starts only in one of two ways:

  • Manually: when you click "Run now" in the "Expiry" menu in Settings
  • Scheduled: when the "next scheduled run" under the Removal schedule is reached

If you are concerned that the cleanup might start immediately, you can set the Removal schedule to a date far in the future as a precaution.

What data is erased

You have the following choices in terms of data that will be cleaned up:

  • Complete build & deployment results, build & release artifacts and all logs (Excludes historical deployment records)
  • Build and release artifacts only
  • Build and deployment result logs only

Bamboo expiry cleans up the logs and result files of the affected builds, located in xml-data/builds/ within your <bamboo-home>.

Please note that expiry does not clean the working directories for agents and plans within xml-data/build-dir. These must be managed by the job itself, either with a Clean working directory task or with Plan Configuration >> Job >> Other >> Clean working directory after each build.
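
Because expiry leaves xml-data/build-dir untouched, it can be useful to see which working directories have not changed for a while. The sketch below only reports candidates and deletes nothing. It relies on the directory's own modification time, which is a rough heuristic because changes deeper in the tree do not update it. The function name stale_build_dirs and the 30-day figure in the example are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: lists first-level directories under a build-dir that
# have not been modified for more than N days, with their size in KB,
# largest first. It never removes anything.
# $1: path to the build-dir, $2: age in days.
stale_build_dirs() {
  find "$1" -mindepth 1 -maxdepth 1 -type d -mtime +"$2" -exec du -sk {} + | sort -rn
}
```

It could be used as: stale_build_dirs "${BAMBOO_HOME}/xml-data/build-dir" 30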

Check Locating important directories and files to understand how Bamboo stores each type of data.

Cleaning criteria

You can configure the global expiry retention criteria. Bamboo will keep the results as long as they meet the configured criteria.

Individual plan expiry, branch expiry

You can also have per-plan expiry rules that will override the global expiry settings that affect all plans in Bamboo. If you disable build expiry for a plan, that plan's build result data will never be automatically deleted from your Bamboo server. You can select the build result data that will be kept for a plan and for how long this data will be kept (e.g. for reporting purposes) before Bamboo automatically deletes it.

It is also important to understand that your top "Plan" results need to be combined with your "Plan branches" results. If you have lots of branches coming from your linked repositories, you may have to check for their sizes and consider them as a whole. Every artifact generated in a Default/master branch will also be contained in its Plan-branches.

More information here:

You can also delete individual build results for a plan as a one-off, manually. You can use this method if you want to have control over past build results that you wish to delete.

Plan branches expiry

You can also enable Plan-branches expiry, which deletes branches from Bamboo once they are removed from your repository. This keeps the number of branches in Bamboo down. Follow the path below for each Plan where you want to enable it:

  • Plan Configuration -> Branches -> Delete plan branch (choose criteria)

Configuring a Plan branch cleanup:

Override global plan expiry report

To get a list of Plans that are overriding your Global expiry settings you can simply go to:

  • Bamboo Administration -> Plans -> Expiry -> See plans with custom expiry settings (under Expiry overrides)

Alternatively, if using the Bamboo UI is too slow or you have too many results to analyse, you can use a REST API call or SQL SELECT statement for that:

REST API show custom plan expiry
$ curl -vvv --user "$BAMBOO_ADMIN:$PASSWORD" http://localhost:8085/rest/api/latest/admin/expiry/custom/plan | jq .
{
  "self": "http://localhost:8085/rest/api/latest/admin/expiry/custom/plan?start=0&limit=25",
  "start": 0,
  "limit": 25,
  "results": [
    {
      "planName": "BAM - BOO",
      "planKey": "BAM-BOO",
      "configLink": {
        "href": "http://localhost:8085/chain/admin/config/editChainMiscellaneous.action?buildKey=BAM-BOO",
        "rel": "edit"
      },
      "expiryConfig": {
        "expiryTypeNothing": false,
        "expiryTypeResult": true,
        "expiryTypeArtifact": true,
        "expiryBuildLog": true,
        "duration": 0,
        "period": "days",
        "labelsList": "dontexpire",
        "buildsToKeep": 0,
        "maximumBuildsToKeep": 3
      }
    },
(...)

SQL show custom plan expiry
SELECT b.full_key ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/enabled/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
              ELSE 'no'
       end AS is_overwriting_expiry ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeNothing/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
              ELSE 'no'
       end AS do_not_expire_anything ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeResult/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
              ELSE 'no'
       end AS is_expiring_result ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeResult/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
              ELSE
                     CASE
                            WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeBuildLog/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
                            ELSE 'no'
                     end
       end AS is_expiring_build_log ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeResult/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
              ELSE
                     CASE
                            WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeArtifact/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
                            ELSE 'no'
                     end
       end AS is_expiring_artifacts ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/duration/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) IS NOT NULL THEN cast((xpath('//custom/buildExpiryConfig/duration/text()',cast(bd.xml_definition_data AS xml)))[1] AS text)
       end AS expire_after_days ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/buildsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) IS NOT NULL THEN cast((xpath('//custom/buildExpiryConfig/buildsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text)
       end AS minimum_builds_to_keep ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/labelsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) IS NOT NULL THEN cast((xpath('//custom/buildExpiryConfig/labelsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text)
       end AS labels_to_keep
FROM   build_definition bd
JOIN   build b
ON     (
              bd.build_id = b.build_id)
WHERE  b.build_type = 'CHAIN'
-- parentheses keep the build_type filter applied to both conditions
AND    (
              cast((xpath('//custom/buildExpiryConfig/enabled/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true'
       OR     cast((xpath('//custom/buildExpiryConfig/expiryTypeNothing/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'false');

Please note, this may not work due to invalid XML that can be stored as a result of:

If the query above fails, you can use the simplified query below, although it does not present the data as cleanly:

SQL show custom plan expiry - simplified
SELECT B.full_key,
       BD.*
FROM   build_definition BD
       JOIN build B
         ON BD.build_id = B.build_id
WHERE  B.build_type = 'CHAIN'
       AND BD.xml_definition_data LIKE
           '%<buildExpiryConfig>%<enabled>true</enabled>%</buildExpiryConfig>%'; 

Last modified on Mar 31, 2022
