How to find the size of a repository hosted on Bitbucket Server and Data Center

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

    

Summary

This article describes how to check the size of a repository hosted on Bitbucket Server/Data Center.

Environment

Bitbucket Server/Data Center.

Solution

You can check the repository size through the UI, the database, or the filesystem:

UI

In the repository, go to Settings > Repository Details and click "Retrieve Size Details".

The size displayed in the repository settings is approximate rather than exact, because calculating the exact size can be extremely time-consuming, depending on the number of files in the repository. As a result, the figures shown for very large repositories may appear inconsistent.

Database

The repository size information is stored in the bb_repo_size table in the database. The query below returns the size of every repository in a single query. It is written for PostgreSQL and may need to be modified for other database engines.

Note

The database schema can change at any time without prior communication. Please make sure you take this into account prior to adopting this method.

The below query is provided on a best-effort basis, and Atlassian Support is unable to directly support this query or any customizations made to it to achieve your team's business needs.

DB query:
-- Get project and repository details (name, id, slug) together with the repository size

select 
  r.id as repo_id, 
  r.slug as repo_slug, 
  r.name as repo_name, 
  r.project_id as project_id, 
  p.name as project_name, 
  s.total as repo_size 
from 
  repository r 
  join bb_repo_size s on r.id = s.repo_id 
  join project p on p.id = r.project_id 
  order by repo_size desc


Filesystem

You can retrieve the size of the repositories by checking the contents of the $BITBUCKET_HOME/shared/data/repositories directory. One way to do this is to run the following command:

du -s $BITBUCKET_HOME/shared/data/repositories/* | sort -rn


The output is sorted with the largest repositories first. You can add a head command to limit the results to the 10 largest repositories:

du -s $BITBUCKET_HOME/shared/data/repositories/* | sort -rn | head -10
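
The block size that du -s reports in can vary between platforms, so a wrapper that forces 1 KiB units with -k makes the numbers easier to compare. The following is a minimal sketch rather than an official tool: the function name largest_dirs and its arguments are hypothetical, and it assumes each repository lives in its own subdirectory of the path you pass in.

```shell
# Hypothetical helper: list the N largest immediate subdirectories of a directory.
# Sizes are printed in KiB (du -k), largest first.
largest_dirs() {
    base_dir=$1
    limit=${2:-10}
    # du -sk prints "<size-in-KiB><TAB><path>" for each subdirectory.
    du -sk "$base_dir"/*/ 2>/dev/null | sort -rn | head -n "$limit"
}

# Example (path taken from the article):
# largest_dirs "$BITBUCKET_HOME/shared/data/repositories" 10
```

The quoting around the directory argument keeps the command working when the home directory path contains spaces.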


To identify a repository, navigate into the directory in question and read the "project" and "repository" fields in the "repository-config" file. Alternatively, match the directory back to the repository name by checking the results of /rest/api/1.0/repos (looking for the matching ID field). On Windows, you can navigate to the repository path and check the Size value in the directory's properties.
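
Reading the "repository-config" file can be scripted. The sketch below assumes the "project" and "repository" fields are stored as "key = value" lines, which may not hold for every Bitbucket version. The function name repo_identity is hypothetical.

```shell
# Hypothetical helper: print the "project" and "repository" values found in the
# repository-config file of a repository directory.
# Assumption: the fields are stored as "key = value" lines.
repo_identity() {
    config_file=$1/repository-config
    if [ ! -f "$config_file" ]; then
        echo "no repository-config in $1" >&2
        return 1
    fi
    # Keep only the two fields of interest and normalise them to "key=value".
    sed -n -E 's/^[[:space:]]*(project|repository)[[:space:]]*=[[:space:]]*(.*)$/\1=\2/p' "$config_file"
}

# Example: repo_identity "$BITBUCKET_HOME/shared/data/repositories/<directory>"
```

Run it against each directory reported by the du command to map the largest directories back to a project and repository.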

An important point to consider is the difference between the repository size on the Bitbucket server and the size of the cloned repository on a client machine. The server holds additional objects and files that a client does not need.

A git clone typically asks only for the repository's branches and tags (refs/heads/ and refs/tags/). That does not necessarily include every object in the repository on the server. The server will have some objects pending garbage collection, which contributes to the difference in size. In addition, if the repository has pull requests, a clone is guaranteed not to include every object. The server-side repository includes objects, such as pull request auto-merges, that it needs in order to function but that a client will never request. As a result, the cloned repository is almost always smaller than the server-side repository.
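
To see this difference in practice, you can measure the object store of a local clone and compare it with the figure obtained on the server. The sketch below uses git count-objects -v, which reports the "size" (loose objects) and "size-pack" (packfiles) fields in KiB. The function name object_store_kib is hypothetical, and the result covers only Git objects, not other files in the directory.

```shell
# Hypothetical helper: print the size of a Git repository's object store in KiB
# (loose objects plus packfiles), as reported by git itself.
object_store_kib() {
    git -C "$1" count-objects -v |
        awk -F': ' '$1 == "size" || $1 == "size-pack" { total += $2 } END { print total + 0 }'
}

# Example: object_store_kib /path/to/local/clone
```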


API (not recommended due to performance issues)

An API to determine the repository size exists; however, it is not publicly available due to potential performance concerns. You can refer to the related feature request, BSERV-4988.

Last modified on Sep 9, 2024
