Confluence will not start due to fatal error in Confluence cluster
Problem
An error message appears when accessing Confluence:
Fatal error in Confluence cluster: Database is being updated by an instance which is not part of the current cluster.
You should check network connections between cluster nodes, especially multicast traffic.
Background
Confluence has a CLUSTERSAFETY table in its database. This table exists even in non-clustered environments. Every 30 seconds, Confluence reads the value in this table and compares it with the value it holds in memory. If the two values differ, this error appears and Confluence cannot proceed. This is the cluster safety mechanism.
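The check described above can be illustrated with a small simulation. This is only a sketch, not Confluence's actual implementation: the `Instance` class, the `safety_value` key, and the use of a dict in place of the database are all hypothetical, and the step where an instance writes a fresh value after each successful check is an assumption made so that two instances sharing one database can be shown to collide.

```python
import uuid


class ClusterPanic(Exception):
    """Raised when the stored value differs from the one held in memory."""


class Instance:
    def __init__(self, name, db):
        self.name = name
        self.db = db             # a dict standing in for the shared database
        self.in_memory = None    # value this instance last wrote

    def run_safety_job(self):
        stored = self.db.get("safety_value")  # hypothetical key name
        # Compare the database value with the in-memory value.
        if self.in_memory is not None and stored != self.in_memory:
            raise ClusterPanic(
                f"{self.name}: database is being updated by another instance"
            )
        # Assumption: after a successful check, store a fresh value.
        self.in_memory = uuid.uuid4().hex
        self.db["safety_value"] = self.in_memory


shared_db = {}
prod = Instance("prod", shared_db)
staging = Instance("staging", shared_db)  # a clone pointing at the same database

prod.run_safety_job()      # prod writes its value
staging.run_safety_job()   # staging overwrites it with its own value
try:
    prod.run_safety_job()  # prod sees a value it did not write
    panicked = False
except ClusterPanic:
    panicked = True
```

In this sketch a single instance running alone never panics, which matches the point that the table and the check exist even when no cluster is configured.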
Causes
Though it appears to be a cluster-related problem, this error also occurs in non-clustered environments. Several different issues can cause the same error message:
- Multiple instances of Confluence are deployed and connect to the same database. This often happens when a production environment is cloned, or a staging environment is started, without changing the hibernate.connection.url, which still points to the original database.
- Confluence is using a duplicate Server ID that already exists in another environment. This often happens when a production environment is cloned, or a staging environment is started, without creating a new Server ID.
- A performance issue (usually invasive Garbage Collection) has prevented the cluster safety job from running on schedule. In Confluence 3.0.1 and 3.0.2, the problem is exacerbated and occurs much more frequently. See Cluster panics (Non Clustered Confluence 2.10.4, 3.0.1 and 3.0.2).
- Communication between the nodes in a cluster has been severed.
- Confluence is using a read-only DB.
- In a single-node (non-clustered) deployment, there are two records in the clustersafety table.
- Confluence is connected to a MySQL database that is configured as a master server in a replication setup.
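For the first cause in the list above, one quick check is to compare the hibernate.connection.url of the two deployments. The following is a minimal sketch under stated assumptions: the XML layout is modeled on a typical Confluence configuration file and may differ in your version, and the function names, host name, and database names are hypothetical.

```python
import xml.etree.ElementTree as ET


def connection_url(cfg_xml):
    """Return the hibernate.connection.url value from a config document, or None."""
    root = ET.fromstring(cfg_xml)
    node = root.find(".//property[@name='hibernate.connection.url']")
    if node is None or not node.text:
        return None
    return node.text.strip()


def share_database(cfg_a, cfg_b):
    """True when both configs define the same connection URL."""
    url_a = connection_url(cfg_a)
    url_b = connection_url(cfg_b)
    return url_a is not None and url_a == url_b


def make_cfg(url):
    # Assumed layout: <property name="..."> elements under <properties>.
    return (
        "<confluence-configuration><properties>"
        f'<property name="hibernate.connection.url">{url}</property>'
        "</properties></confluence-configuration>"
    )


# Hypothetical URLs for illustration only.
prod = make_cfg("jdbc:postgresql://db.example.internal:5432/confluence")
staging_clone = make_cfg("jdbc:postgresql://db.example.internal:5432/confluence")
staging_fixed = make_cfg("jdbc:postgresql://db.example.internal:5432/confluence_staging")

clone_collides = share_database(prod, staging_clone)
fixed_collides = share_database(prod, staging_fixed)
```

A plain string comparison is deliberately strict here; two URLs that differ only in host alias or query parameters could still point to the same database, so a mismatch is not proof that the deployments are isolated.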
Diagnosis and Resolutions
- If the problem occurs shortly after startup in a single-node (non-clustered) deployment, see Cluster Panic due to Multiple Deployments.
- If the problem occurs shortly after startup in a single-node (non-clustered) deployment and it wasn't caused by multiple deployments, check whether the clustersafety table (in the database) has more than one record. If so, delete one of them.
- If you are using 3.0.1 or 3.0.2, you are likely experiencing a bug. See Cluster panics (Non Clustered Confluence 2.10.4, 3.0.1 and 3.0.2).
- If the problem happens spontaneously during production usage in a single-node (non-clustered) deployment, see Cluster Panic due to Performance Problems.
- If the problem happens in a multi-node cluster (Confluence 5.4 and earlier), see Cluster Panic due to Multicast Traffic Communication Problem.
- If the problem happens in Confluence Data Center 5.6 or later (clustered), see Recovering from a Data Center cluster split-brain.
- If Confluence is connected to a MySQL database that is configured as a replication master, follow the resolution method in this guide.
- If Confluence is using a duplicate Server ID as a result of a cloned environment, see How to change the server ID of Confluence.
- If the above documents don't identify the problem, check Data Center Troubleshooting.
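The record-count check on the clustersafety table mentioned in the list above can be sketched as follows. This is an illustration only: it uses an in-memory SQLite database as a stand-in for the real Confluence database, the column names in the stand-in table are hypothetical, and `count_safety_rows` is not part of any Confluence tooling. Against a real deployment you would use the driver for your own database and take a backup before deleting anything.

```python
import sqlite3


def count_safety_rows(conn):
    """Return the number of records in the clustersafety table."""
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM clustersafety")
    return cur.fetchone()[0]


# Stand-in database; the column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clustersafety (id INTEGER PRIMARY KEY, safety_value INTEGER)"
)
conn.executemany(
    "INSERT INTO clustersafety (id, safety_value) VALUES (?, ?)",
    [(1, 111), (2, 222)],
)

rows = count_safety_rows(conn)
needs_cleanup = rows > 1  # more than one record in a single-node deployment
```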