How to get summary statistics from Bitbucket Data Center access logs

Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.

Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Summary

This page explains how to get summary statistics from Bitbucket Data Center access logs.

Sometimes we need summary statistics derived from Bitbucket's access log. The script given here reads raw access logs from its standard input, so we can use cat to concatenate several access log files and pipe them to the script, as shown in the example. The script reports:

Total number of requests processed.
Total number of bytes read from the clients and written to them.
Number of requests per request result code, including details on HTTP and SSH codes.
Per-user statistics, including number of requests, bytes read, and bytes written. Since there are access log entries not associated with any user, they are shown as "NO_USER".

Environment

Bitbucket Data Center 8.9.9, but also applicable to other versions.

Solution

The script given below can be used for summary analysis of Bitbucket access logs. You can modify it to better suit your needs; modify the cat argument to include only access logs you actually want to analyze.
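Besides narrowing the cat argument, the input can also be filtered by date before it reaches the script. The following is a minimal sketch, not part of the original script: filter_by_date is a hypothetical helper name, and it assumes that the timestamp is the fifth pipe-separated field and starts with a yyyy-mm-dd date. Check a few lines of your own access log before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: pass through only the access log lines whose
# timestamp starts with the given prefix (for example "2025-03").
# Assumption: the timestamp is field 5 and begins with yyyy-mm-dd.
filter_by_date() {
    awk -F'|' -v prefix="$1" '
        {
            ts = $5
            sub(/^ +/, "", ts)                 # drop padding after the separator
            if (index(ts, prefix) == 1) print  # keep lines starting with the prefix
        }
    '
}

# Usage: place the filter between cat and the summary script, e.g.
# cat atlassian-bitbucket-access*.log | filter_by_date "2025-03" | awk '...'
```

Because the filter prints matching lines unchanged, the summary script receives the same format it expects from cat.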

Please don't run it directly on production servers, since loading and parsing large access log files puts additional load on them.

The script below is provided as-is and only as an example; Atlassian can't guarantee its correct functionality.

cat atlassian-bitbucket-access*.log | \
    awk '
        BEGIN {
            FS = "|";
        }

        # Only lines whose request ID (field 3) contains "o@" or "o*" are counted.
        $3 ~ /o[@*]/ {
            # total bytes read and written
            bytes_read += $9;
            bytes_written += $10;
            # result codes, keyed as protocol:code
            str = sprintf("%s:%s", $2, $8);
            gsub(/ /, "", str);
            res_codes[str] ++;
            # per-user statistics
            user = $4;
            gsub(/ /, "", user);
            if (user == "-") user = "NO_USER";
            user_count[user]++;
            user_bytes_read[user] += $9;
            user_bytes_written[user] += $10;
            # rows count
            record_count++;
        }

        # Progress indicator on stderr, refreshed after every input line.
        # It runs after the counting block so the last value shown is the final total.
        {
            printf("Requests processed: %d total\r", record_count) > "/dev/stderr";
        }

        END {
            printf("\nTotal bytes read: %d, total bytes written: %d\n", bytes_read, bytes_written);
            printf("\nResult code, Count\n");
            printf("-----\n");
            # sort on res_codes value, descending
            # (PROCINFO["sorted_in"] is a GNU awk feature; other awk
            # implementations print the rows in arbitrary order)
            PROCINFO["sorted_in"] = "@val_num_desc"
            for (rc in res_codes) {
                printf("%s, %s\n", rc, res_codes[rc]);
            }
            printf("\nPer user statistics: user, requests count, bytes read, bytes written\n");
            printf("-----\n");
            # sort on user_count value, descending
            PROCINFO["sorted_in"] = "@val_num_desc"
            for (user in user_count) {
                printf("%s, %d, %d, %d\n", user, user_count[user], user_bytes_read[user], user_bytes_written[user]);
            }
        }
    '

The script will write out summary statistics, for example:

Requests processed: 86777 total
Total bytes read: 8656, total bytes written: 260948064

Result code, Count
-----
https:200, 86671
https:302, 58
ssh:0, 15
https:204, 13
https:401, 6
https:201, 5
https:404, 4
ssh:1, 3
https:202, 1
https:0, 1

Per user statistics: user, requests count, bytes read, bytes written
-----
NO_USER, 84879, 0, 57416924
john.doe, 1388, 7499, 200871299
jane.doe, 510, 1157, 2659841
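The byte totals in the output above are raw counts. If a more readable figure is wanted, a small helper along these lines could convert them. This is only a sketch: human_bytes is a hypothetical name, not part of the script above, and it uses binary units (1 KiB = 1024 bytes).

```shell
#!/bin/sh
# Hypothetical helper: print a byte count using binary units.
human_bytes() {
    awk -v bytes="$1" 'BEGIN {
        split("B KiB MiB GiB TiB", unit, " ")
        i = 1
        # divide by 1024 until the value fits the current unit
        while (bytes >= 1024 && i < 5) { bytes /= 1024; i++ }
        printf("%.1f %s\n", bytes, unit[i])
    }'
}

human_bytes 260948064   # total bytes written from the sample output; prints 248.9 MiB
```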

Updated on March 17, 2025
