Bitbucket Data Center clustering does not work on Docker default network
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
Bitbucket Data Center nodes running on Docker may fail to form a cluster even when the Hazelcast group name and password are set correctly.
Environment
- Each Bitbucket Data Center node runs on its own Docker host.
Diagnosis
When the second Bitbucket Data Center node tries to join the cluster, it fails with the following messages in $BITBUCKET_HOME/log/atlassian-bitbucket.log:
2024-09-06 06:57:38,707 WARN [hz.hazelcast.cached.thread-3] c.a.s.i.c.DefaultClusterJoinManager CONNECT(172.17.0.2:40857 -> 10.229.129.63:5701): Node authentication failed: Cluster authentication failed. Please make sure all members share the same value for 'hazelcast.group.name' and 'hazelcast.group.password' in bitbucket.properties.
2024-09-06 06:57:39,849 WARN [hz.hazelcast.cached.thread-2] c.a.s.i.c.DefaultClusterJoinManager CONNECT(172.17.0.2:39181 -> 10.229.129.63:5701): Node authentication failed: Cluster authentication failed. Please make sure all members share the same value for 'hazelcast.group.name' and 'hazelcast.group.password' in bitbucket.properties.
The first node logs the following messages in $BITBUCKET_HOME/log/atlassian-bitbucket.log:
2024-09-06 06:57:38,687 WARN [hz.hazelcast.cached.thread-2] c.a.s.i.c.DefaultClusterJoinManager ACCEPT(172.17.0.2:5701 <- 10.229.31.19:40857): Node authentication failed: Cluster authentication failed. Please make sure all members share the same value for 'hazelcast.group.name' and 'hazelcast.group.password' in bitbucket.properties.
2024-09-06 06:57:38,694 WARN [hz.hazelcast.cached.thread-2] c.h.i.server.tcp.TcpServerAcceptor [10.229.129.63]:5701 [bb-cluster] [5.2.5] com.atlassian.stash.internal.cluster.NodeConnectionException: Cluster authentication failed. Please make sure all members share the same value for 'hazelcast.group.name' and 'hazelcast.group.password' in bitbucket.properties.
com.atlassian.stash.internal.cluster.NodeConnectionException: Cluster authentication failed. Please make sure all members share the same value for 'hazelcast.group.name' and 'hazelcast.group.password' in bitbucket.properties.
at com.atlassian.stash.internal.cluster.DefaultClusterJoinManager.accept(DefaultClusterJoinManager.java:103)
at com.atlassian.stash.internal.hazelcast.ClusterJoinSocketInterceptor.onAccept(ClusterJoinSocketInterceptor.java:49)
at com.hazelcast.internal.server.tcp.TcpServerContext.interceptSocket(TcpServerContext.java:241)
at com.hazelcast.internal.server.tcp.TcpServerAcceptor$AcceptorIOThread.newConnection0(TcpServerAcceptor.java:308)
at com.hazelcast.internal.server.tcp.TcpServerAcceptor$AcceptorIOThread.lambda$newConnection$0(TcpServerAcceptor.java:300)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at com.hazelcast.internal.util.executor.CompletableFutureTask.run(CompletableFutureTask.java:64)
at com.hazelcast.internal.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:217)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.lang.Thread.run(Thread.java:840)
at com.hazelcast.internal.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76)
at com.hazelcast.internal.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:102)
... 1 frame trimmed
2024-09-06 06:57:39,661 WARN [hz.hazelcast.cached.thread-2] c.a.s.i.c.DefaultClusterJoinManager ACCEPT(172.17.0.2:5701 <- 10.229.31.19:39181): Node authentication failed: Cluster authentication failed. Please make sure all members share the same value for 'hazelcast.group.name' and 'hazelcast.group.password' in bitbucket.properties.
2024-09-06 06:57:39,661 WARN [hz.hazelcast.cached.thread-2] c.h.i.server.tcp.TcpServerAcceptor [10.229.129.63]:5701 [bb-cluster] [5.2.5] com.atlassian.stash.internal.cluster.NodeConnectionException: Cluster authentication failed. Please make sure all members share the same value for 'hazelcast.group.name' and 'hazelcast.group.password' in bitbucket.properties.
com.atlassian.stash.internal.cluster.NodeConnectionException: Cluster authentication failed. Please make sure all members share the same value for 'hazelcast.group.name' and 'hazelcast.group.password' in bitbucket.properties.
at com.atlassian.stash.internal.cluster.DefaultClusterJoinManager.accept(DefaultClusterJoinManager.java:103)
at com.atlassian.stash.internal.hazelcast.ClusterJoinSocketInterceptor.onAccept(ClusterJoinSocketInterceptor.java:49)
at com.hazelcast.internal.server.tcp.TcpServerContext.interceptSocket(TcpServerContext.java:241)
at com.hazelcast.internal.server.tcp.TcpServerAcceptor$AcceptorIOThread.newConnection0(TcpServerAcceptor.java:308)
at com.hazelcast.internal.server.tcp.TcpServerAcceptor$AcceptorIOThread.lambda$newConnection$0(TcpServerAcceptor.java:300)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at com.hazelcast.internal.util.executor.CompletableFutureTask.run(CompletableFutureTask.java:64)
at com.hazelcast.internal.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:217)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.lang.Thread.run(Thread.java:840)
at com.hazelcast.internal.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76)
at com.hazelcast.internal.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:102)
... 1 frame trimmed
Explanation
When the cluster is formed with Hazelcast authentication enabled and the containers use the default Docker network (a bridge type network), the Docker container IP address (172.17.0.2 in the logs above) is evaluated instead of the Docker host IP address. This IP mismatch causes the cluster validation to fail and prevents the nodes from forming a cluster.
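To confirm that this is the failure you are seeing, it can help to check whether the local address in the CONNECT/ACCEPT log lines is a Docker bridge address. The following is a minimal sketch, not an Atlassian tool. The function name `bridge_addresses` is hypothetical, and the subnet 172.17.0.0/16 is an assumption based on Docker's usual default bridge range, which may differ if your Docker daemon has been reconfigured.

```python
import ipaddress
import re

# Assumption: the default Docker bridge network uses 172.17.0.0/16.
# Adjust this if your Docker daemon is configured with a different range.
DEFAULT_BRIDGE = ipaddress.ip_network("172.17.0.0/16")

# Matches the endpoint pair in lines such as
#   CONNECT(172.17.0.2:40857 -> 10.229.129.63:5701)
#   ACCEPT(172.17.0.2:5701 <- 10.229.31.19:40857)
# In both forms the first address is the node's own (local) address.
ENDPOINT_RE = re.compile(
    r"(CONNECT|ACCEPT)\((\d+\.\d+\.\d+\.\d+):\d+ (?:->|<-) (\d+\.\d+\.\d+\.\d+):\d+\)"
)


def bridge_addresses(log_lines):
    """Return local addresses from failed join attempts that fall inside
    the assumed default bridge subnet."""
    found = set()
    for line in log_lines:
        if "Node authentication failed" not in line:
            continue
        match = ENDPOINT_RE.search(line)
        if match is None:
            continue
        local = ipaddress.ip_address(match.group(2))
        if local in DEFAULT_BRIDGE:
            found.add(str(local))
    return found


sample = [
    "2024-09-06 06:57:38,707 WARN [hz.hazelcast.cached.thread-3] "
    "c.a.s.i.c.DefaultClusterJoinManager CONNECT(172.17.0.2:40857 -> "
    "10.229.129.63:5701): Node authentication failed: Cluster authentication failed.",
    "2024-09-06 06:57:38,687 WARN [hz.hazelcast.cached.thread-2] "
    "c.a.s.i.c.DefaultClusterJoinManager ACCEPT(172.17.0.2:5701 <- "
    "10.229.31.19:40857): Node authentication failed: Cluster authentication failed.",
]
print(bridge_addresses(sample))
```

If the returned set is not empty, the nodes are reporting container addresses rather than host addresses, which matches the mismatch described above.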
Workaround
Use one of the following options:
- Use the host type network instead of the bridge type network for the Bitbucket containers. Refer to Docker - Networking overview for more details. With host networking, the Docker container shares the network of the Docker host, so Hazelcast requests originate from the Docker host IP address.
- Use a multi-host Docker network architecture, such as Docker Swarm. This allows Docker containers on multiple hosts to run within the same Docker network.
- Consider moving to a Kubernetes environment, which has built-in Kubernetes Hazelcast discovery; see the Atlassian DC Helm Charts.
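Before and after applying a workaround, you may want to check which network mode a Bitbucket container is using. The sketch below wraps `docker inspect` with the `{{.HostConfig.NetworkMode}}` format template. The helper names are hypothetical. Treating both `default` and `bridge` as the default bridge network is an assumption, because the reported value can vary between Docker versions.

```python
import subprocess

# Assumption: containers started without an explicit network report one of
# these values for HostConfig.NetworkMode, depending on the Docker version.
DEFAULT_BRIDGE_MODES = {"default", "bridge"}


def uses_default_bridge(network_mode):
    """Return True if the reported network mode is the default bridge network."""
    return network_mode.strip() in DEFAULT_BRIDGE_MODES


def container_network_mode(container):
    """Ask the local Docker CLI for a container's network mode.

    Requires the docker CLI on PATH and permission to talk to the daemon.
    """
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.HostConfig.NetworkMode}}", container],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Example usage, where "bitbucket" is a placeholder container name:
#   mode = container_network_mode("bitbucket")
#   if uses_default_bridge(mode):
#       print("Container is on the default bridge network:", mode)
```

A container on the host network is expected to report `host`, and a container attached to a user-defined or overlay network reports that network's name.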