Some Bitbucket nodes are taking longer to start while others start almost instantaneously
Platform Notice: Data Center Only - This article only applies to Atlassian products on the data center platform.
Bitbucket was stuck on the start-up page and the logs did not progress:
Bitbucket DC 7.14.1
The first node managed to start after ~40 minutes. The second node showed the same behavior for ~50 minutes, while the third node started almost instantly, without the delay the first two nodes experienced.
Logs Node 1:
Checking the logs from the first node, the application took ~40 minutes to resume logging after the "Started SSH server successfully" entry:
2021-05-21 13:06:03,680 INFO [spring-startup] c.a.b.internal.ssh.server.SshServer Starting SSH server on port 7999...
2021-05-21 13:06:03,804 INFO [spring-startup] c.a.b.internal.ssh.server.SshServer Started SSH server successfully.
2021-05-21 13:45:23,431 INFO [spring-startup] c.a.b.i.s.c.j.c.HealthCheckRunner New health check registered: SearchIndexCheck
2021-05-21 13:45:23,432 INFO [spring-startup] c.a.b.i.s.c.c.DefaultClusterJobManager Registering job for ElasticsearchSynchronizeJob
The access logs also started recording data at 13:45, the time the node effectively finished starting.
Logs Node 2:
A gap of ~46 minutes, similar to the one seen on the first node:
2021-05-21 12:19:08,709 INFO [spring-startup] c.a.b.internal.ssh.server.SshServer Started SSH server successfully.
2021-05-21 13:05:11,047 INFO [hz.hazelcast.event-3] c.a.s.i.c.HazelcastClusterService Node '/10.10.10.15:5701' was ADDED to the cluster. Updated cluster: [/10.10.10.16:5701 master this uuid='e7287c8c-ce03-4383-848e-3a76d34d9781' vm-id='s357a960-ce4f-4321-bafd-42c4e535d172'], [/10.10.10.15:5701 uuid='5366655d-099d-48b1-b13f-e5bce56a70cd' vm-id='b4ec4tcl-1c30-4faf-9f42-5b4eefe6fab9']
2021-05-21 13:10:52,982 ERROR [active-objects-init-compatibility-tenant-0] net.java.ao.sql Exception executing SQL update <CREATE INDEX "index_ao_c77861_aud96775159" ON "AO_C77861_AUDIT_ENTITY"("RESOURCE_ID_5","RESOURCE_TYPE_5","ENTITY_TIMESTAMP")>
org.postgresql.util.PSQLException: ERROR: could not extend file "base/11000/9006030": No space left on device
  Hint: Check free disk space.
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2553)
	...
It is important to collect the logs from all nodes, since not every node's log may point to a lack of disk space.
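The startup gap can be spotted with a simple grep over each node's application log. A minimal sketch, assuming the default BITBUCKET_HOME location and log file name; adjust the path for your installation:

```shell
#!/bin/sh
# Print the "Started SSH server" entry and the log line that follows it.
# A long timestamp gap between the two lines indicates a delayed startup.
# The log path below is the Bitbucket default; adjust for your installation.
LOG="${BITBUCKET_HOME:-/var/atlassian/application-data/bitbucket}/log/atlassian-bitbucket.log"
if [ -f "$LOG" ]; then
    grep -A 1 "Started SSH server successfully" "$LOG"
fi
```

Running this on each node makes it easy to compare the delay between the SSH-server start entry and the next log line across the cluster.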
The root cause of the error is that the disk space allocated for the database, or the disk quota for the user owning the database, has been exhausted. As a result, additional data (such as rescoping for an update of a pull request) cannot be written to the database tables. The lack of database disk space is the reason behind the slow start.
Increase or adjust the database disk space accordingly. Check the free disk space, or the disk quota for the user associated with the Bitbucket database, at the time the nodes were started.
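The check can be done from the command line on the database host. A minimal sketch; the database name ("bitbucket") is an assumption, so substitute your own values:

```shell
#!/bin/sh
# Show free space per mounted filesystem; look for the volume that holds
# the database cluster (e.g. the PostgreSQL data directory).
df -h

# On the database host, as a user that can query the Bitbucket database
# (the database name "bitbucket" is an assumption), the database size can
# be checked with:
#   psql -c "SELECT pg_size_pretty(pg_database_size('bitbucket'));"
```

If the volume is full, either extend it or free space, then restart the affected nodes and confirm they start without the delay.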