Testing NFS disk access speed for Bitbucket Data Center and git operations
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Bitbucket Data Center appears to be experiencing performance issues and is running slowly. Git and UI operations are taking longer than expected.
Important note on testing
The guidelines provided here are a starting point for building your staging environment ahead of testing.
These guidelines are not a substitute for extensive testing prior to deployment using realistic loads that reflect your individual requirements.
It is not practical or possible to provide simple guidelines for capacity planning; this will always require extensive performance testing.
Environments vary greatly in complexity, and many variables outside of NFS can affect performance (e.g. networking, instance types, instance sizes, workload). For this reason, it is essential that you perform extensive testing for your specific environment.
Atlassian support does not provide consulting services on capacity planning. You can contact an Atlassian partner if you require consulting services for capacity planning
Disk access speed is critical to the performance of Bitbucket and Git operations, especially when running a multi-node cluster, where an NFS share is required to store data.
If the application is running slowly, disk speed on the NFS share can be a potential root cause. It's possible to isolate that cause using Bonnie++ to benchmark disk access speed.
Install Bonnie++ using the OS package manager or download it from https://doc.coker.com.au/projects/bonnie/
Bonnie++ is a file system benchmarking tool. It tests two different things: I/O throughput, and creating, reading, and deleting large numbers of small files. The second test resembles what Git does on disk, which makes the results a reasonable indicator of how Git operations will perform.
bonnie++ -d /path/to/remote/nfs/filesystem -r 65536 -u someuser -z 1234 -n 1024
This test must be executed on one of the Bitbucket Data Center application nodes against the shared home directory, which is mounted from the NFS share.
-d: Directory on the remote NFS file system to use for the test
-r: RAM size in megabytes
-u: User to run the test as
-z: Random number seed, to get repeatable tests
-n: Number of files for the file creation test (bonnie++ measures this in multiples of 1024 files)
The file system should have free disk space of at least 3x the RAM size for the test.
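The disk space requirement can be checked before starting the benchmark. The following is a minimal sketch, assuming a Linux node where /proc/meminfo is available; the function names (ram_mb, free_mb, has_enough_space, check_nfs_space) are hypothetical helpers and not part of Bonnie++ or Bitbucket.

```shell
#!/bin/sh
# Hypothetical pre-flight check before running bonnie++ (Linux assumed).

# Total RAM in MB, read from /proc/meminfo (MemTotal is reported in kB).
ram_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Free space in MB on the file system that holds directory $1.
# df -P forces one line per file system; -k reports 1024-byte blocks.
free_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeeds when free space ($2, in MB) is at least 3x RAM ($1, in MB).
has_enough_space() {
    [ "$2" -ge $(( $1 * 3 )) ]
}

# Report whether directory $1 has enough free space for the test.
check_nfs_space() {
    ram=$(ram_mb)
    free=$(free_mb "$1")
    if has_enough_space "$ram" "$free"; then
        echo "OK: ${free} MB free, ${ram} MB RAM"
    else
        echo "Not enough space: ${free} MB free, need $(( ram * 3 )) MB" >&2
        return 1
    fi
}
```

For example, check_nfs_space /path/to/remote/nfs/filesystem could be run first, and the value printed by ram_mb passed to the -r option of the bonnie++ command shown above.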
Interpreting the results
Your shared NFS storage layer is a critical piece of your data center infrastructure. You must make sure you have sufficient I/O performance.
The important values are the Create and Read columns under Random Create in the output below.
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ip-10-217-3-33 128G  1302  99 426371 22 208657 13   648  98 428862 14  9522 100
Latency              6237us     126ms    4654ms    3273ms     307ms    4146us
Version  1.97       ------Sequential Create------ --------Random Create--------
ip-10-217-3-33      -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
               1024   361   1 120910 58  5807   9   366   1 12437  19  3441   4
Latency               239ms    8497us     658ms     190ms     204ms     838ms
Results from the above output (higher is better):
- Random Create - Create: 366 /sec
- Random Create - Read: 12437 /sec
Results lower than the ones in the benchmark above indicate that I/O performance on the NFS is not optimal for Bitbucket and Git operations, leading to performance issues.
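To compare several runs, the two values can be pulled out of the human-readable report automatically. The following is a minimal sketch, assuming the report layout shown above, where the row after the "files" header row holds the file creation figures; extract_random_create is a hypothetical helper name, and the field positions are taken from that layout rather than from any documented Bonnie++ interface.

```shell
#!/bin/sh
# Extract the Random Create "Create" and "Read" rates (files per second)
# from a human-readable bonnie++ report, read from a file or from stdin.
extract_random_create() {
    awk '
        # The header row of the file creation table starts with "files".
        $1 == "files" { want = 1; next }
        # The next row holds: file count, then /sec and %CP pairs for
        # Sequential Create, Read, Delete and Random Create, Read, Delete.
        want && NF >= 13 {
            print "random_create=" $8
            print "random_read=" $10
            exit
        }
    ' "$@"
}
```

For example, if the report was saved to a file, extract_random_create bonnie-report.txt prints one name=value line per metric. Bonnie++ can print "+++++" instead of a number when an operation completes too quickly to measure, so the values should be checked before they are compared numerically.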
Examples of environmental factors that can cause slow disk access are as follows:
- Antivirus software running on the cluster nodes or on the NFS server and scanning the Bitbucket directory structure.
- Network latency.
- A disk defragmentation job may be running.
- Hardware issues such as disk failures.
- File system encryption turned on.
- Automated compression of files controlled by the OS.
- Specific issues with the Java version and OS. This is rare, but a bug or known issue within the JVM may cause it to perform poorly on a specific OS.
- Other applications or operations that are currently using the disk.
- The disk may be nearly full, which on some operating systems can slow disk performance (this has been observed on Solaris).
- File server running out of server processes.
- Not having the recommended NFS mount options.
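The mount options currently in effect can be read from the mount table and compared against the options recommended in the Bitbucket documentation. The following is a minimal sketch, assuming a Linux node where /proc/mounts is available; nfs_mount_options and has_mount_option are hypothetical helper names, and the option names in the usage note are examples only, not a statement of the recommended set.

```shell
#!/bin/sh
# Print the mount options of the NFS file system mounted at $1.
# $2 is the mount table to read (defaults to /proc/mounts, Linux-specific).
nfs_mount_options() {
    awk -v mp="$1" '
        # Fields: device, mount point, file system type, options, dump, pass.
        $2 == mp && $3 ~ /^nfs/ { print $4 }
    ' "${2:-/proc/mounts}"
}

# Succeeds when option $2 appears in the comma-separated option list $1.
has_mount_option() {
    case ",$1," in
        *,"$2",*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, opts=$(nfs_mount_options /path/to/remote/nfs/filesystem) captures the option list for the shared home mount point, and has_mount_option "$opts" noatime then tests for a single option.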
Additional Resources About Disk Benchmark
Here are some additional resources a sysadmin may want to review.
- bonnie++ - Linux man page
- Simple Bonnie++ Example
- Common disk benchmarking mistakes
- How fast are your disks? Find out the open source way, with fio