Git manifest

Introduction

FishEye and Crucible 3.4.0 and later store information about the Git manifest. What is the manifest? The manifest for a commit is the list of the latest commit to affect each path in a repository. Maintaining this information speeds up a number of operations in FishEye indexing that could, in some cases, be very slow in previous FishEye/Crucible releases. The affected operations include new branch creation and tag application.

Git ls-tree

Git provides the ls-tree command to give the content manifest for a commit. For a commit, ls-tree gives the content hash for each path. It does not give any information about the commit that created that content. Prior to 3.4.0, FishEye used ls-tree to fetch the content manifest and then derived the commit manifest from it. In many cases there is a 1:1 relation between a content hash and the responsible commit at a given path in the repository. For those cases, determining the commit is very quick.
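To make the distinction between a content manifest and a commit manifest concrete, here is a minimal Python sketch that parses the text printed by "git ls-tree -r <commit>". Each output line has the form "<mode> <type> <object hash>", followed by a tab and the path. The names parse_ls_tree, TreeEntry and SAMPLE are hypothetical, the second hash in the sample is made up for illustration, and this is not how FishEye itself reads the output.

```python
from typing import Dict, NamedTuple


class TreeEntry(NamedTuple):
    mode: str
    obj_type: str
    content_hash: str


def parse_ls_tree(output: str) -> Dict[str, TreeEntry]:
    """Map each path to its entry from `git ls-tree -r <commit>` output.

    Assumes the default output format (no -l or -z) and paths that git
    does not need to quote.
    """
    manifest: Dict[str, TreeEntry] = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # Metadata and path are separated by a single tab.
        meta, path = line.split("\t", 1)
        mode, obj_type, content_hash = meta.split(" ")
        manifest[path] = TreeEntry(mode, obj_type, content_hash)
    return manifest


# Illustrative sample output; the second hash is invented for this example.
SAMPLE = (
    "100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391\tREADME\n"
    "100644 blob 8ab686eafeb1f44702738c8b0f24f2567c36da6d\tsrc/main.c\n"
)

content_manifest = parse_ls_tree(SAMPLE)
```

The resulting dictionary holds a content hash per path, but nothing in it identifies the commit that introduced that content. That missing piece is what the commit manifest supplies.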

In other cases, however, the same content hash can be associated with multiple commits. In these cases FishEye had to search the history to find which commit was the relevant one. This search, performed per path, could become relatively slow. If you have seen long-running ls-tree commands in previous versions of FishEye, it is because these searches occur in the context of the ls-tree execution, not because of the time that ls-tree itself takes to execute.
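The ambiguity described above can be illustrated with a small Python sketch. It assumes a hypothetical toy history (the HISTORY dictionary) that is strictly linear, so merges are ignored, and a hypothetical function responsible_commit. It only shows why a per-path history walk is needed and is not FishEye's actual search.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy history: commit id -> (parent id or None, {path: content hash}).
HISTORY: Dict[str, Tuple[Optional[str], Dict[str, str]]] = {
    "c1": (None, {"a.txt": "h1"}),
    "c2": ("c1", {"a.txt": "h2"}),
    "c3": ("c2", {"a.txt": "h1"}),  # restores the content first seen in c1
    "c4": ("c3", {"a.txt": "h1", "b.txt": "h9"}),
}


def responsible_commit(history, commit: str, path: str) -> Optional[str]:
    """Walk back through parents to the commit that last changed `path`."""
    if path not in history[commit][1]:
        return None
    target = history[commit][1][path]
    current = commit
    while True:
        parent = history[current][0]
        # Stop when there is no parent or the parent held different content.
        if parent is None or history[parent][1].get(path) != target:
            return current
        current = parent


latest_for_a = responsible_commit(HISTORY, "c4", "a.txt")
latest_for_b = responsible_commit(HISTORY, "c4", "b.txt")
```

Here the content hash h1 for a.txt appears in both c1 and c3, so the hash alone cannot identify the responsible commit. The walk from c4 stops at c3, because its parent c2 held different content. Repeating such a walk for every path is the cost that made the old approach slow.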

Manifest storage

The solution in FishEye from version 3.4.0 is to avoid executing ls-tree in most situations and to maintain the manifest for each commit in FishEye's index. Normally a single commit changes only a very small number of files in a repository, so FishEye 3.4.0 uses a combination of full and delta manifests to store manifest information. A delta manifest records only the files that change in a commit, whereas a full manifest records every path.

FishEye lets users control the maximum number of delta manifests that it writes before it writes a fresh full manifest. This lets users trade off the storage and performance characteristics of manifest storage using the FishEye system property fisheye.manifest.maxdepth. The default value for this property is currently 3, as this gives a good balance between storage requirements and indexing performance.
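The interplay of full and delta manifests can be sketched with a toy Python model. The ManifestStore class, its methods and its record layout are all hypothetical, and FishEye's real index format is not described here. The sketch only assumes that the depth limit counts consecutive delta manifests since the last full one, in the spirit of fisheye.manifest.maxdepth.

```python
from typing import Dict, Iterable, Optional


class ManifestStore:
    """Toy store mixing full and delta manifests (not FishEye's real format)."""

    def __init__(self, max_depth: int = 3) -> None:
        self.max_depth = max_depth
        # commit -> (kind, base commit or None, entries, depth)
        self.records: Dict[str, tuple] = {}

    def write(self, commit: str, parent: Optional[str],
              changed: Iterable[str], removed: Iterable[str] = ()) -> None:
        base = self.records.get(parent) if parent is not None else None
        if base is None or base[3] >= self.max_depth:
            # Depth limit reached (or no parent): write a fresh full manifest.
            full = self.resolve(parent) if base is not None else {}
            full.update({path: commit for path in changed})
            for path in removed:
                full.pop(path, None)
            self.records[commit] = ("full", None, full, 0)
        else:
            # Record only what this commit touched; None marks a removal.
            delta = {path: commit for path in changed}
            delta.update({path: None for path in removed})
            self.records[commit] = ("delta", parent, delta, base[3] + 1)

    def resolve(self, commit: Optional[str]) -> Dict[str, str]:
        """Rebuild path -> last-changing commit by replaying the delta chain."""
        chain = []
        while commit is not None:
            _, base, entries, _ = self.records[commit]
            chain.append(entries)
            commit = base
        manifest: Dict[str, str] = {}
        for entries in reversed(chain):
            for path, last_commit in entries.items():
                if last_commit is None:
                    manifest.pop(path, None)
                else:
                    manifest[path] = last_commit
        return manifest


store = ManifestStore(max_depth=2)
store.write("c1", None, ["a", "b"])
store.write("c2", "c1", ["a"])
store.write("c3", "c2", ["c"])
store.write("c4", "c3", ["b"])
kinds = [store.records[c][0] for c in ("c1", "c2", "c3", "c4")]
```

In this model a larger max_depth means fewer full manifests and less storage, but a longer chain to replay in resolve. Because each record keeps its own depth and base, changing max_depth later affects only manifests written afterwards.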

An example repository showing the cache size and indexing times at various manifest depths, with FishEye 3.4, is shown below:

When FishEye 3.3 was used with this example repository, the initial indexing time was 9 hours and the cache size was 155M. The speedup and space requirements will vary for each repository and depend on a number of its characteristics, including the number of active branches, the number of tags and the number of files in the repository.

Changing manifest depth

The manifest depth can be changed by changing the fisheye.manifest.maxdepth system property and restarting FishEye. The new manifest depth value only affects manifests that are written after the FishEye restart. There is no need to reindex the repository, as the manifests written with the old depth value continue to operate correctly.

Manifest upgrade

When deploying FishEye 3.4.0 or later, the manifest information must be created before the new manifest code can work correctly. Various system properties allow users to decide how the upgrade is performed. These are detailed in the Upgrade guide.
