Monitoring your mirror farm
There are a number of helpful tools and techniques you can use to monitor the health of your mirror farm.
Performance monitoring using JMX metrics
Java Management eXtensions (JMX) is a technology used for monitoring and managing Java applications. JMX can be used to determine the overall health of each mirror node and the mirror farm. The following statistics are most important to monitor:
Hosting tickets on mirror nodes
Mirror hosting tickets on the primary
Incremental sync time on mirror nodes
Snapshot sync time on mirror nodes
As always, it's important to monitor your nodes for disk space, CPU, and memory.
For more information and a complete list of JMX metrics, see Enabling JMX counters for performance monitoring.
Synchronization and consistency
A repo-hash endpoint is provided on both the mirror farm and the primary server. It’s used to check the consistency of a mirror farm and its nodes with respect to the primary. This is the same endpoint that Mirror farm vet uses to repair any inconsistencies that come up, such as those caused by a missed webhook. There are some important considerations to keep in mind when using this endpoint:
The endpoint, rest/mirroring/latest/repo-hashes, is available on both the primary and the mirror nodes. It returns a stream of JSON containing a content and metadata hash for each repository. The content hash is a digest of the Git repository itself, while the metadata hash is a digest of the metadata that Bitbucket holds concerning the repository, such as the repository name.
Content hashes or just metadata hashes are individually requested by calling rest/mirroring/latest/repo-hashes/content or rest/mirroring/latest/repo-hashes/metadata.
This is what the payload looks like:
{
  "projects": [
    {
      "id": 1,
      "public": false,
      "repositories": [
        {
          "id": 1,
          "hashes": {
            "content": "082a2ffa1520447bb6c0072f9f9d850c76f111c0ff9a08cca8838b12b0ccc31a",
            "metadata": "b8fae6cb4704174f8dafae601355279950f921ba55b7620f4bdaa1280e735d14"
          }
        },
        {
          "id": 2,
          "hashes": {
            "content": "0000000000000000000000000000000000000000",
            "metadata": "e80aeaf459a69e7000b9e785eb39640a5d929f7ec4f09512a9ab6fabf4a0c80a"
          }
        }
      ]
    }
  ]
}
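As a rough illustration of consuming this payload, the Python sketch below flattens it into a lookup keyed by project and repository ID. The function name flatten_repo_hashes is made up for this example. It assumes the response has already been fetched and parsed as JSON; how you authenticate against the endpoint depends on your instance and is not shown.

```python
import json


def flatten_repo_hashes(payload):
    """Map (project_id, repo_id) -> dict of hashes for that repository."""
    flat = {}
    for project in payload.get("projects", []):
        for repo in project.get("repositories", []):
            # "hashes" may hold only one key if a single hash type was requested.
            flat[(project["id"], repo["id"])] = dict(repo.get("hashes", {}))
    return flat


# Sample shaped like the payload shown above (hashes shortened for readability).
sample = json.loads("""
{"projects": [{"id": 1, "public": false, "repositories": [
    {"id": 1, "hashes": {"content": "082a", "metadata": "b8fa"}},
    {"id": 2, "hashes": {"content": "0000", "metadata": "e80a"}}]}]}
""")

hashes = flatten_repo_hashes(sample)
print(hashes[(1, 2)]["metadata"])
```

Keying on the project and repository IDs rather than on list position keeps the lookup stable even when two nodes return their entries in a different order.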
Generating content hashes is reasonably fast, but the process needs to run against every repository on the instance, which could take quite some time on larger instances, so we make an optimization. When an upstream is first upgraded to a mirror farm capable version, an “empty” content hash is generated for each repository. This appears as 0000000000000000000000000000000000000000, as can be seen in the content attribute of the second repository above. When the farm vet encounters a repository with a content hash of 0000000000000000000000000000000000000000, it considers that repository up to date.
A mirror will only return entries for the projects or repositories it’s mirroring. While the content returned from a mirror and the primary will be the same, the order of entries could be different. One way to sort the entries consistently for diffing is to use jq:
jq '.projects | sort | .[].repositories |= sort_by(.id)'
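The same comparison can be sketched in Python. The example below is illustrative only, and the names normalize and find_mismatches are made up for it. It sorts projects and repositories by ID, which is a simplification of the jq query above, and then reports repositories whose hashes differ between a primary payload and a mirror payload. Two assumptions are baked in: an all-zero content hash on either side is treated as up to date, and repositories that appear only in one payload are skipped rather than reported.

```python
# Placeholder content hash described above, copied from the sample payload.
EMPTY_CONTENT_HASH = "0000000000000000000000000000000000000000"


def normalize(payload):
    """Return projects sorted by id, each with repositories sorted by id."""
    projects = sorted(payload.get("projects", []), key=lambda p: p["id"])
    return [
        {**p, "repositories": sorted(p.get("repositories", []), key=lambda r: r["id"])}
        for p in projects
    ]


def find_mismatches(primary, mirror):
    """List (project_id, repo_id, kind) for hashes that differ on the mirror."""
    primary_hashes = {
        (p["id"], r["id"]): r["hashes"]
        for p in normalize(primary)
        for r in p["repositories"]
    }
    mismatches = []
    for p in normalize(mirror):
        for r in p["repositories"]:
            expected = primary_hashes.get((p["id"], r["id"]))
            if expected is None:
                continue  # assumption: repos unknown to the primary are skipped
            for kind in ("content", "metadata"):
                ours, theirs = r["hashes"].get(kind), expected.get(kind)
                if kind == "content" and EMPTY_CONTENT_HASH in (ours, theirs):
                    continue  # empty content hash counts as up to date
                if ours != theirs:
                    mismatches.append((p["id"], r["id"], kind))
    return mismatches
```

Iterating over the mirror's entries, not the primary's, reflects the point that a mirror only returns the projects or repositories it is mirroring.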
Webhook
The mirror synchronized webhook can be used to trigger builds as soon as the mirror has finished synchronizing. It’s also useful for monitoring the repositories in your mirror farm. Details of this repository event can be found on the Event payload page.
Monitoring the status of your mirrors
You can configure your load balancer to check the node’s status using the /status endpoint. A response code of 200 is returned if the mirror node is in a SYNCHRONIZED state. If there are no nodes in the SYNCHRONIZED state, a 200 response code will be returned for any mirror that is in one of the following states:
BOOTSTRAPPED
BOOTSTRAPPING
METADATA_SYNCHRONIZED
For customers who want a “strict” status endpoint, we provide the plugin.mirroring.strict.hosting.status configuration property. When it is set to true, the /status endpoint returns a 200 response code only if the mirror is in the SYNCHRONIZED state. Setting up this configuration is outside the scope of this document. Note that at least one mirror node should be accessible from the upstream server.
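The rules above can be summarized as a small decision function. This Python sketch only models the behavior described in this section. It is not Bitbucket's implementation, and the function and parameter names are invented for illustration.

```python
# States that still yield a 200 when no node in the farm is SYNCHRONIZED.
FALLBACK_STATES = {"BOOTSTRAPPED", "BOOTSTRAPPING", "METADATA_SYNCHRONIZED"}


def status_returns_200(node_state, farm_has_synchronized_node, strict=False):
    """Model whether /status would answer 200 for a mirror node.

    strict=True stands for plugin.mirroring.strict.hosting.status=true.
    """
    if node_state == "SYNCHRONIZED":
        return True
    if strict:
        return False
    return not farm_has_synchronized_node and node_state in FALLBACK_STATES


print(status_returns_200("BOOTSTRAPPED", farm_has_synchronized_node=False))  # True
print(status_returns_200("BOOTSTRAPPED", farm_has_synchronized_node=True))   # False
print(status_returns_200("BOOTSTRAPPED", False, strict=True))                # False
```

The sketch does not say which response code is returned in the non-200 cases, because this section does not specify it.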
The table below displays each state and its description:
State | Description |
---|---|
STARTING | A Bitbucket application is starting. |
STOPPING | A Bitbucket application is stopping. |
BOOTSTRAPPING | The mirror component is started. |
BOOTSTRAPPED | The mirror has joined the cluster. If this is the first time the mirror farm has been connected to a primary, this is the state the application will wait in until it has been authorized. |
METADATA_SYNCHRONIZED | Project or repository metadata has been synchronized from the primary and Git repositories have started synchronization. |
SYNCHRONIZED | The mirror farm has synchronized all Git repositories from the primary. If new projects or repositories are added to the mirror farm, this state will not change. It indicates that the initial set of projects or repositories that were configured at startup time have been synchronized. |
ERROR | There was an error starting the application node. |
When performing a GET operation against the /status endpoint, the returned data is JSON with two properties, state and nodeCount. For example:
{"state":"SYNCHRONIZED","nodeCount":"4"}
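A monitoring script could poll this endpoint along the following lines. This is a minimal Python sketch: parse_status and check_mirror are hypothetical names, the base URL is whatever address your mirror is served from, and it assumes the endpoint is reachable without authentication from wherever the script runs. Because nodeCount arrives as a string in the example above, it is converted to an integer.

```python
import json
import urllib.error
import urllib.request


def parse_status(body):
    """Extract (state, node_count) from a /status response body."""
    data = json.loads(body)
    return data["state"], int(data["nodeCount"])


def check_mirror(base_url, timeout=5):
    """Return (http_code, state, node_count); state is None if unparseable."""
    url = base_url.rstrip("/") + "/status"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            code, body = resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Non-2xx responses still carry a code and may carry a JSON body.
        code, body = err.code, err.read().decode("utf-8", errors="replace")
    try:
        state, count = parse_status(body)
    except (ValueError, KeyError, TypeError):
        state, count = None, None
    return code, state, count


print(parse_status('{"state":"SYNCHRONIZED","nodeCount":"4"}'))
```

Connection failures (urllib.error.URLError) are deliberately left to propagate, so the caller can tell an unreachable node apart from one that answers with a non-200 code.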