How Fisheye uses memory
This page describes how Fisheye and Crucible use the Java heap when running.
The heap is used for all Fisheye and Crucible operations. The description here, however, focuses on the per-repository aspects of memory usage. Other uses of the heap, such as SQL database query caches and Crucible review data, are not covered here.
Per-repository heap components
For an active repository, the major components of memory usage for that repository are:
- A database cache. This is an in-memory cache used by Fisheye's on-disk database.
- A set of string tables. This is an application level cache of string data used by Fisheye. It includes information such as changeset comments, file paths, file names, etc.
- Transient SCM data. This is data fetched from the underlying SCM for the repository. It can include data needed to create Fisheye's database and index. It would also include data fetched from the SCM to satisfy UI requests, such as file revision content. While it is transient, it can still be a significant component of heap usage.
- Transient indexing data. Fisheye uses Lucene to index repository metadata and content. Building the Lucene documents and index management uses heap memory.
Every active Fisheye repository requires an active database connection. When there is a large number of repositories, there may not be sufficient heap available to support having all repository database connections open at once. To support large numbers of repositories running at once, Fisheye will transparently "passivate" the repository by closing its database connection and freeing the memory used by both the database cache and the repository's string tables. The repository will be activated, that is, reopen its database connection, when needed. This activation can be triggered by either a UI request or an indexing operation. As a repository activates, Fisheye may need to passivate another repository to keep the number of active repositories at a manageable level.
Prior to Fisheye 3.5, repositories were selected for passivation using a least recently used (LRU) algorithm. This makes sense for UI activity but may not have been optimal for indexing activity, where the regular polling behavior means the least recently accessed repository is most likely to be the next repository accessed. In Fisheye 3.5, the UI and indexing needs are balanced by choosing a repository in the middle of the LRU list of repositories.
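The difference between the two selection policies can be illustrated with a minimal sketch. This is not Fisheye's implementation: the `ActiveRepositories` class, its method names, and the exact choice of "middle" (the element at index `len // 2`) are assumptions made for illustration only.

```python
from collections import OrderedDict


class ActiveRepositories:
    """Hypothetical tracker of active repositories.

    Keys are kept in order from least to most recently used.
    """

    def __init__(self, max_active, policy="middle"):
        self.max_active = max_active
        self.policy = policy          # "lru" (pre-3.5 style) or "middle" (3.5 style)
        self._active = OrderedDict()  # repository name -> None

    def touch(self, name):
        """Use a repository, activating it if needed.

        Returns the name of the repository passivated to make room, or None.
        """
        if name in self._active:
            self._active.move_to_end(name)  # now the most recently used
            return None
        victim = None
        if len(self._active) >= self.max_active:
            victim = self._pick_victim()
            del self._active[victim]        # "passivate" the chosen repository
        self._active[name] = None
        return victim

    def _pick_victim(self):
        names = list(self._active)
        if self.policy == "lru":
            return names[0]                 # least recently used
        return names[len(names) // 2]       # assumed meaning of "middle of the LRU list"

    def active(self):
        return list(self._active)
```

With a polling loop that visits repositories in a fixed order, the `"lru"` policy always passivates the repository that is about to be polled next, while the `"middle"` policy leaves both the oldest and the newest entries active.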
Database memory allocation
By default, Fisheye allocates one third of the maximum Java heap size to the repository on-disk database caches. Note that this is for the on-disk B-Tree database and not the SQL relational database, which is managed separately. For example, when running with a 1 GB heap, Fisheye will allocate up to 341 MB for database caches.
In Fisheye 3.5, the passivation behavior has been changed to take advantage of larger heaps. This includes:
- Raising the minimum per-repository cache size to 5 MB (was previously 1 MB).
- Making the maximum cache size 20 MB.
- Calculating the maximum number of active repositories based on the available heap.
|Heap size||Cache allocation||Repository cache size||Max active repositories|
|1 GB||341 MB||5 MB||68|
|2 GB||682 MB||5 MB||137|
|4 GB||1.37 GB||5 MB||273|
|8 GB||2.7 GB||5 MB||546|
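The figures in the table can be reproduced with a few lines of arithmetic. The sketch below assumes 1 GB = 1024 MB and rounding to the nearest whole repository. Both are inferred from the table values rather than stated by Fisheye, and the function names are hypothetical.

```python
MIN_REPO_CACHE_MB = 5  # minimum per-repository cache size in Fisheye 3.5


def cache_allocation_mb(heap_mb):
    """One third of the maximum heap goes to the on-disk database caches."""
    return heap_mb / 3


def max_active_repositories(heap_mb):
    """Active repositories that fit when each uses the minimum cache size.

    Rounding to the nearest integer is an assumption inferred from the table.
    """
    return round(cache_allocation_mb(heap_mb) / MIN_REPO_CACHE_MB)


for heap_gb in (1, 2, 4, 8):
    heap_mb = heap_gb * 1024
    print(heap_gb, "GB:", int(cache_allocation_mb(heap_mb)), "MB cache,",
          max_active_repositories(heap_mb), "active repositories")
```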
In addition, in Fisheye 3.5, the level of Java heap garbage collection activity is monitored and additional repositories will be passivated as the time spent in garbage collection increases. This makes the Fisheye instance more reactive to increasing UI load.
Prior to Fisheye 3.5, the available database cache allocation was shared by the active repositories. If there were 10 repositories in a 1 GB heap, each would be given a 34 MB cache allocation. Also, by default there was a maximum of 50 active repositories before Fisheye would begin to passivate repositories. While the 50 repository limit was configurable, it was not automatically changed for Fisheye instances with larger heaps. Rather, the increased cache available was used to give the 50 active repositories a larger database cache allocation.
|Heap size||Cache allocation||Repository cache size||Max active repositories|
|1 GB||341 MB||6.8 MB||50|
|2 GB||682 MB||13.7 MB||50|
|4 GB||1.37 GB||27 MB||50|
|8 GB||2.7 GB||55 MB||50|
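For comparison, the pre-3.5 behavior described above amounts to dividing the same cache allocation among the active repositories, capped at the default limit of 50. The sketch below is illustrative only: the function name is hypothetical and 1 GB is assumed to be 1024 MB.

```python
DEFAULT_MAX_ACTIVE = 50  # default pre-3.5 limit on active repositories


def legacy_repo_cache_mb(heap_mb, repositories, max_active=DEFAULT_MAX_ACTIVE):
    """Per-repository cache size when the allocation is shared (pre-3.5).

    At most max_active repositories are active at once, so the allocation
    is never divided among more than that number.
    """
    if repositories < 1:
        raise ValueError("at least one repository is required")
    sharing = min(repositories, max_active)
    return heap_mb / 3 / sharing


print(round(legacy_repo_cache_mb(1024, 10)))     # 10 repositories in a 1 GB heap
print(round(legacy_repo_cache_mb(8192, 200)))    # capped at 50 active repositories
```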
The string table component of repository memory usage is currently implemented as a weakly referenced cache. This means that as heap memory is used, the JVM garbage collector can remove elements from the cache. The cache has no fixed size limit; its size is bounded only by garbage collection activity in the JVM.
The unbounded size of the StringTable can lead to it appearing in heap dumps as a large consumer of heap. This is normal and the string table will decrease in size as the garbage collector (GC) removes entries. In Fisheye 3.5, by passivating in response to GC activity, the string tables are flushed more actively when required. This reduces GC activity in the instance as a whole.
Finally, in Fisheye 3.5 some string tables that were large in size but little accessed are no longer cached and are now taken directly from the on-disk database.
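The idea of a weakly referenced cache can be shown with a small sketch. This is an analogy only, not Fisheye's code: Fisheye runs on the JVM, while the example uses Python's `weakref` module, and the `Entry` and `StringTable` classes and the `load` callback are hypothetical.

```python
import gc
import weakref


class Entry:
    """Hypothetical wrapper; plain str objects cannot be weakly referenced in Python."""

    __slots__ = ("text", "__weakref__")

    def __init__(self, text):
        self.text = text


class StringTable:
    """Cache whose entries survive only while something else still references them."""

    def __init__(self, load):
        self._load = load                            # fetches a value from the backing store
        self._cache = weakref.WeakValueDictionary()  # holds values weakly

    def get(self, key):
        entry = self._cache.get(key)
        if entry is None:
            # Not cached, or already collected: reload from the backing store.
            entry = Entry(self._load(key))
            self._cache[key] = entry
        return entry

    def __len__(self):
        return len(self._cache)


table = StringTable(lambda key: "comment for " + key)
entry = table.get("cs-1")  # loaded and cached while `entry` is referenced
del entry                  # no strong references remain
gc.collect()
print(len(table))          # the collector was free to drop the entry
```

In CPython the entry disappears as soon as the last strong reference goes away. On the JVM, weakly referenced entries are cleared only when the garbage collector runs, which is why a string table can look large in a heap dump.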
Guidelines for heap sizing for a given number of repositories
In some respects, repository passivation is like virtual memory in a modern operating system. It allows a Fisheye instance to support a larger number of repositories than can be active at once in a given-sized Java heap. Repositories are swapped into the active set on demand and swapped out if not in use.
As the number of repositories configured in a Fisheye instance increases beyond the maximum active number supported by the heap, the instance will begin to passivate repositories. Obviously, as a repository database connection needs to be reopened and the in-memory caches need to be repopulated, the onset of passivation comes with a performance impact. As the number of repositories increases, the rate of passivation and activation will increase and the performance impact will also increase. Nevertheless, quite high load factors (ratio of number of repositories to the maximum number of active repositories) can be supported without appreciable performance impact.
Given the cache size limits in Fisheye 3.5, the following are guidelines for a reasonable maximum number of repositories:
|At max repo cache size||At min cache size / no passivation||Load factor 2||Load factor 5||Load factor 10|
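Guideline figures for these columns can be derived from the rules given earlier on this page. The sketch below is an assumption-laden illustration, not an official sizing table: it assumes 1 GB = 1024 MB, rounding to the nearest whole repository, and that each load-factor column is simply the load factor multiplied by the maximum number of active repositories. The function name is hypothetical.

```python
MIN_REPO_CACHE_MB = 5   # minimum per-repository cache size in Fisheye 3.5
MAX_REPO_CACHE_MB = 20  # maximum per-repository cache size in Fisheye 3.5


def repository_guidelines(heap_mb, load_factors=(2, 5, 10)):
    """Estimate repository counts for a heap size (illustrative only)."""
    allocation = heap_mb / 3  # one third of the heap goes to database caches
    at_min_cache = round(allocation / MIN_REPO_CACHE_MB)
    limits = {
        "at_max_cache": round(allocation / MAX_REPO_CACHE_MB),
        "at_min_cache": at_min_cache,  # the most repositories active without passivation
    }
    for factor in load_factors:
        # Load factor = configured repositories / maximum active repositories.
        limits["load_factor_%d" % factor] = factor * at_min_cache
    return limits


for heap_gb in (1, 2, 4, 8):
    print(heap_gb, "GB:", repository_guidelines(heap_gb * 1024))
```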
What load factors are acceptable? That depends a lot on the nature of the repositories in the instance. If all repositories are equally busy and support a large user base, lower load factors are appropriate. If, on the other hand, some proportion of repositories are accessed sporadically, perhaps a historic codebase, the load factor can be higher as these repositories are not going to activate that frequently.
Fisheye 3.5 can, in general, support more repositories with smaller heaps and a lower resulting garbage collection load than previous versions. It also reacts more readily to increasing memory usage, allowing rising UI usage to trigger passivation of inactive repositories.
Configuring application heap size
A Fisheye or Crucible system administrator can configure the Java VM heap size by setting FISHEYE_OPTS. See Environment variables.
Configuring server memory maps
If you have a large number of repositories, we recommend that you increase the default number of maps that Fisheye is allowed to have. See this knowledge base article for more info: JVM crashes after Fisheye Crucible upgrade - Native memory allocation mmap.