Starting Bitbucket Server takes a long time after upgrading to version 4.12 or newer


Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Problem

After upgrading Bitbucket Server or Data Center to version 4.12 or newer, startup takes significantly longer. In a Data Center installation, the issue affects every node attached to the cluster.

During startup, you can see the following errors in the logs:

2019-06-11 12:00:08,114 ERROR [spring-startup]  c.a.s.i.s.g.u.s.SalGitUpgradeManager IncludeSystemConfigTask failed for repository TEST/document[79998]
java.nio.file.NoSuchFileException: /var/atlassian/application-data/bitbucket/shared/data/repositories/79998/tmp-61d593556c512d39-config.lock
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
	at java.nio.channels.FileChannel.open(FileChannel.java:287)
	at java.nio.channels.FileChannel.open(FileChannel.java:335)
	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.openLockWithHardLink(DefaultGitRepositoryLayout.java:293)
	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.withLock(DefaultGitRepositoryLayout.java:160)
	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.editConfig(DefaultGitRepositoryLayout.java:83)
	at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.upgrade(IncludeSystemConfigTask.java:96)
	at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.lambda$parallelUpgrade$1(IncludeSystemConfigTask.java:143)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.lang.Thread.run(Thread.java:748)
	... 1 frame trimmed

Diagnosis

Environment

  • The instance has been upgraded to version 4.12 or newer

  • The issue started happening only after the upgrade

  • No significant load on the database or filesystem is observable during the long startup
  • The issue is still reproducible in UPM Safe Mode
  • Startup is stuck at the "Preparing plugin framework" stage

     

Diagnostic Steps

  1. Enable Debug logging and profiling

  2. Restart the instance
  3. Search atlassian-bitbucket.log for occurrences of:

    SalGitUpgradeManager IncludeSystemConfigTask failed for repository
  4. In atlassian-bitbucket-profiler.log, check how much time the task git: apply IncludeSystemConfigTask consumed
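
When many repositories are affected, the search in step 3 can be scripted. The following is a minimal Python sketch, not an Atlassian tool. It assumes the per-repository error line has the same form as the sample in the Problem section (PROJECT/slug[id]); the function name find_failed_repositories and the sample list are hypothetical.

```python
import re

# Assumed line format, taken from the sample error in this article:
#   ... IncludeSystemConfigTask failed for repository TEST/document[79998]
FAILURE_RE = re.compile(
    r"IncludeSystemConfigTask failed for repository "
    r"(?P<project>[^/\s]+)/(?P<slug>[^\[\s]+)\[(?P<repo_id>\d+)\]"
)


def find_failed_repositories(log_lines):
    """Return sorted, de-duplicated (project, slug, repo_id) tuples."""
    failed = set()
    for line in log_lines:
        match = FAILURE_RE.search(line)
        if match:
            failed.add(
                (match.group("project"), match.group("slug"), int(match.group("repo_id")))
            )
    return sorted(failed)


# In-memory sample built from the log lines quoted in this article.
sample = [
    "2019-06-11 12:00:08,114 ERROR [spring-startup]  c.a.s.i.s.g.u.s.SalGitUpgradeManager "
    "IncludeSystemConfigTask failed for repository TEST/document[79998]",
    "c.a.sal.core.upgrade.PluginUpgrader Upgrade failed: "
    "IncludeSystemConfigTask failed for one or more repositories",
]
for project, slug, repo_id in find_failed_repositories(sample):
    print(f"{project}/{slug} (id {repo_id})")
```

To run it against a real log, pass an open file object for atlassian-bitbucket.log instead of the sample list. The number of tuples returned is the number of affected repositories, which is the figure the Resolution section uses to choose between its two options.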

Cause

In version 4.12 we introduced IncludeSystemConfigTask, which rewrites the config files of all repositories to add its own settings for a shared config file and to add a repository config file for each repository. We also introduced additional filesystem locks to provide the isolation required to prevent concurrent changes to individual repository settings. In other words, while Bitbucket Server has a config.lock file in place, if someone were to use git config to edit the configuration as well, Git would reject their edit.

To implement this locking mechanism, version 4.12 added a new upgrade task that performs the following actions:

  1. Query all the repositories from the database (git: apply IncludeSystemConfigTask)
  2. Create the tmp-<some_hash>-config.lock file in each repository as a hard link
  3. If the creation fails, log the exception at ERROR level and reschedule the task for all repositories at the next restart.
  4. Retry at each restart until the task finishes successfully for all repositories.

This logic causes a problem when the list of repositories stored in the database differs from the repositories actually present on the filesystem. In that case, Bitbucket fails to create the lock file because the repository path does not exist on the filesystem, and the task is marked as failed:

c.a.sal.core.upgrade.PluginUpgrader Upgrade failed: IncludeSystemConfigTask failed for one or more repositories
java.lang.RuntimeException: IncludeSystemConfigTask failed for one or more repositories

The task is then rescheduled for the next restart. Every node restart therefore runs the task again, and whenever it cannot create a lock it is scheduled to run again at the following restart. As a result, every node has increased startup times, and the delay grows as more repositories are added to the system (more repositories for IncludeSystemConfigTask to check).
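
The mismatch between the database and the filesystem can be checked directly. The sketch below is illustrative only. It assumes each repository lives in a directory named after its numeric ID under <shared home>/data/repositories, as the path in the sample error suggests, and that the list of repository IDs has already been obtained elsewhere (for example from the log errors). The function name find_missing_repository_dirs is hypothetical.

```python
from pathlib import Path


def find_missing_repository_dirs(repo_ids, repositories_root):
    """Return the sorted repository IDs that have no directory on disk.

    repo_ids:          iterable of numeric repository IDs (assumed input).
    repositories_root: directory expected to hold one sub-directory per ID,
                       e.g. <shared home>/data/repositories (assumed layout).
    """
    root = Path(repositories_root)
    return sorted(
        repo_id for repo_id in set(repo_ids) if not (root / str(repo_id)).is_dir()
    )
```

For example, find_missing_repository_dirs([79998], "/var/atlassian/application-data/bitbucket/shared/data/repositories") returns [79998] when that repository's directory is absent. Any ID it returns is one for which the lock file cannot be created.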

Resolution

There are two resolutions available:

  1. The root cause is the inconsistency between the database and the filesystem, so in Data Center installations the issue can be resolved with the Bitbucket Integrity checker.
    Warning: The integrity check can take a very long time on an instance with a significant number of repositories. Use the Integrity checker as a resolution only if the errors reported in the logs affect more than 50 repositories.
  2. For Bitbucket Server installations, or Bitbucket Data Center installations with fewer than 50 affected repositories, follow these steps:
    1. Recreate (delete and create again) all the impacted repositories via the UI
    2. Restart the instance
    3. Verify that atlassian-bitbucket.log contains no errors from IncludeSystemConfigTask, for example:

      Errors showing the tasks failed:

      2019-01-11 12:53:08,114 ERROR [spring-startup]  c.a.s.i.s.g.u.s.SalGitUpgradeManager IncludeSystemConfigTask failed for repository TEST/document[79998]
      java.nio.file.NoSuchFileException: /var/atlassian/application-data/bitbucket/shared/data/repositories/79904/tmp-61d593556c518d39-config.lock
      	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
      	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
      	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
      	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
      	at java.nio.channels.FileChannel.open(FileChannel.java:287)
      	at java.nio.channels.FileChannel.open(FileChannel.java:335)
      	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.openLockWithHardLink(DefaultGitRepositoryLayout.java:293)
      	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.withLock(DefaultGitRepositoryLayout.java:160)
      	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.editConfig(DefaultGitRepositoryLayout.java:83)
      	at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.upgrade(IncludeSystemConfigTask.java:96)
      	at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.lambda$parallelUpgrade$1(IncludeSystemConfigTask.java:143)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.lang.Thread.run(Thread.java:748)
      	... 1 frame trimmed
      2019-01-11 12:53:08,126 ERROR [spring-startup]  c.a.sal.core.upgrade.PluginUpgrader Upgrade failed: IncludeSystemConfigTask failed for one or more repositories
      java.lang.RuntimeException: IncludeSystemConfigTask failed for one or more repositories
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SalUpgradeTask.perform(SalGitUpgradeManager.java:366)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SalUpgradeTask.perform(SalGitUpgradeManager.java:317)
      	at com.atlassian.stash.internal.user.DefaultEscalatedSecurityContext.call(DefaultEscalatedSecurityContext.java:58)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$DelegatingUpgradeTask.apply(SalGitUpgradeManager.java:264)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SalUpgradeTask.doUpgrade(SalGitUpgradeManager.java:325)
      	at com.atlassian.sal.core.upgrade.PluginUpgrader.doUpgrade(PluginUpgrader.java:72)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalPluginUpgrader.apply(SalPluginUpgrader.java:27)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SynchronousUpgrader.doInTransaction(SalGitUpgradeManager.java:382)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SynchronousUpgrader.doInTransaction(SalGitUpgradeManager.java:373)
      	at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager.start(SalGitUpgradeManager.java:133)
      	at org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:173)
      	at org.springframework.context.support.DefaultLifecycleProcessor.access$200(DefaultLifecycleProcessor.java:50)
      	at org.springframework.context.support.DefaultLifecycleProcessor$LifecycleGroup.start(DefaultLifecycleProcessor.java:350)
      	at org.springframework.context.support.DefaultLifecycleProcessor.startBeans(DefaultLifecycleProcessor.java:149)
      	at org.springframework.context.support.DefaultLifecycleProcessor.onRefresh(DefaultLifecycleProcessor.java:112)
      	at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:880)
      	at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:546)
      	at javax.servlet.GenericServlet.init(GenericServlet.java:158)
      	at java.lang.Thread.run(Thread.java:748)
      	... 8 frames trimmed


      Messages for successful task completion:

      c.a.s.i.s.g.u.IncludeSystemConfigTask Executor service has shutdown gracefully
      
      c.a.sal.core.upgrade.PluginUpgrader Upgraded plugin com.atlassian.bitbucket.server.bitbucket-git to version 8 - Updates all repositories to include system-config for common configuration 
      



    4. Perform another restart to confirm that the issue is resolved.
    5. If you do not see any errors but the instance still takes a long time to start up, contact Atlassian Support and attach the log files.
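
The log verification in step 3 can be approximated with a short script. This is a sketch under stated assumptions, not an official check: it looks only for the failure and success messages quoted in this article, and it expects to be given only the log lines from the most recent startup. The function name include_system_config_status and its return values are hypothetical.

```python
# Message fragments copied from the log excerpts in this article.
FAILURE_MARKERS = (
    "IncludeSystemConfigTask failed for repository",
    "IncludeSystemConfigTask failed for one or more repositories",
)
SUCCESS_MARKERS = (
    "IncludeSystemConfigTask Executor service has shutdown gracefully",
    "Upgraded plugin com.atlassian.bitbucket.server.bitbucket-git to version 8",
)


def include_system_config_status(log_lines):
    """Classify one startup's log lines as 'failed', 'succeeded' or 'unknown'.

    Any failure message wins over a success message; 'unknown' means that
    neither kind of message was found in the given lines.
    """
    status = "unknown"
    for line in log_lines:
        if any(marker in line for marker in FAILURE_MARKERS):
            return "failed"
        if any(marker in line for marker in SUCCESS_MARKERS):
            status = "succeeded"
    return status
```

A result of "unknown" combined with a slow startup corresponds to step 5: no errors are visible, so the logs should go to Atlassian Support.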


Description: Slow startup issue troubleshooting after the upgrade.
Product: Bitbucket Server

Last modified on Sep 24, 2020
