Upgrade a Confluence cluster through the API without downtime

This document provides guidance on how to initiate and finalize a rolling upgrade through API calls. This upgrade method is suitable for admins with the skills and automation tools to orchestrate maintenance tasks (like upgrades).

For an overview of rolling upgrades (including planning and preparation information), see Upgrade Confluence without downtime.

API reference

The entire rolling upgrade process is governed by the following API:

http://<host>:<port>/rest/zdu/cluster/zdu/

This API has the following calls:

/zdu	Get an overview of the cluster's status.
/zdu/start	Enable upgrade mode.
/zdu/state	Get the status of the cluster.
/zdu/nodes/{nodeId}	Get an overview of a node's status, including the number of running tasks.
/zdu/cancel	Disable upgrade mode. You can only use this call if the upgrade progress is not MIXED.
/zdu/approve	Once all nodes are upgraded, finalize the rolling upgrade. This will automatically disable upgrade mode.

For detailed information about each API call, see Confluence REST API Documentation.
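As a minimal sketch, the calls listed above can be turned into full URLs with a small helper. The function names `zdu_url` and `node_url`, and the example host and port, are illustrative assumptions and not part of the Confluence API.

```python
from urllib.parse import quote

# Base path taken from the API reference above.
ZDU_BASE_PATH = "/rest/zdu/cluster/zdu"


def zdu_url(host: str, port: int, call: str = "") -> str:
    """Build the URL for one call, e.g. call="state" or call="start".

    An empty call returns the cluster overview URL.
    """
    base = f"http://{host}:{port}{ZDU_BASE_PATH}"
    call = call.strip("/")
    return f"{base}/{call}" if call else base


def node_url(host: str, port: int, node_id: str) -> str:
    """Build the URL for a single node, percent-encoding the node ID."""
    return zdu_url(host, port, f"nodes/{quote(node_id, safe='')}")
```

For example, `zdu_url("confluence.example.com", 8090, "state")` builds the cluster status URL for a hypothetical host.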

Initiating a rolling upgrade

To initiate a rolling upgrade, enable upgrade mode first. To do this, use:

http://<host>:<port>/rest/zdu/cluster/zdu/start

Upgrade mode allows your cluster to temporarily accept nodes running different Confluence versions. This lets you upgrade a node and let it rejoin the cluster (along with the other non-upgraded nodes). Both upgraded and non-upgraded active nodes work together to keep Confluence available to all users. You can disable upgrade mode as long as you haven’t upgraded any nodes yet.
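The following sketch shows one way to prepare the call that enables upgrade mode, using only the Python standard library. It assumes the call is a `POST` with an empty body and that an `Authorization` header carries admin credentials; this page does not state the HTTP method or the authentication scheme, so check the Confluence REST API Documentation before relying on either. The names `build_start_request` and `send` are illustrative.

```python
import urllib.request


def build_start_request(host: str, port: int, auth_header: str) -> urllib.request.Request:
    """Prepare (but do not send) the call that enables upgrade mode."""
    url = f"http://{host}:{port}/rest/zdu/cluster/zdu/start"
    return urllib.request.Request(
        url,
        data=b"",       # assumption: the call takes no request body
        method="POST",  # assumption: state-changing calls use POST
        headers={"Authorization": auth_header, "Accept": "application/json"},
    )


def send(request: urllib.request.Request, timeout_s: float = 30.0) -> str:
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.read().decode("utf-8")
```

Separating request construction from sending makes it possible to log or review the exact call before it changes the cluster's state.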

Upgrading each node individually

Before you upgrade a node, you'll need to gracefully shut down Confluence on it. To do this, run the stop script corresponding to your operating system and configuration. Learn more about graceful Confluence shutdowns.

For example, if you installed Confluence as a service on Linux, run the following command:

$ sudo /etc/init.d/confluence stop

After upgrading Confluence on the node, wait for it to transition to an Active status before upgrading another node.
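An automation script can enforce this wait with a simple polling loop. The sketch below is one possible approach: `get_state` is a hypothetical callable that you supply, returning the node's current status as a string (for example, by querying the node status call), and the timeout and interval defaults are arbitrary.

```python
import time
from typing import Callable


def wait_until_active(
    get_state: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the node reports ACTIVE.

    Returns True once the node is ACTIVE and False if the timeout expires.
    Raises RuntimeError if the node reports ERROR, since waiting will not help.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == "ACTIVE":
            return True
        if state == "ERROR":
            raise RuntimeError("Node reported ERROR; troubleshoot before continuing")
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.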

Node statuses

To get the status of a node, use:

http://<host>:<port>/rest/zdu/cluster/zdu/nodes/<nodeId>

ACTIVE	Confluence is connected to the cluster and running with no errors.
STARTING	Confluence is still loading, and should transition to Active once finished.
TERMINATING	Confluence was gracefully shut down, and should transition to Offline once finished.
OFFLINE	Confluence is not responding on the node. This node will be removed from the cluster completely if it is still offline after Upgrade mode is disabled.
ERROR	Something went wrong with Confluence on the node.
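The statuses above can be modeled as an enumeration so that a script rejects unexpected values early. This sketch assumes the node call returns a JSON object whose status is in a field named `state`; the response shape is not described on this page, so treat the field name as an assumption and adjust it to the actual payload.

```python
import json
from enum import Enum


class NodeState(Enum):
    ACTIVE = "ACTIVE"
    STARTING = "STARTING"
    TERMINATING = "TERMINATING"
    OFFLINE = "OFFLINE"
    ERROR = "ERROR"


def parse_node_state(body: str, field: str = "state") -> NodeState:
    """Extract the node status from a JSON response body.

    Raises KeyError if the field is missing and ValueError if the
    status is not one of the documented values.
    """
    payload = json.loads(body)
    return NodeState(payload[field])


def is_transitional(state: NodeState) -> bool:
    """STARTING and TERMINATING should resolve on their own, so keep polling."""
    return state in (NodeState.STARTING, NodeState.TERMINATING)
```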

Cluster statuses

To get the status of the cluster, use:

http://<host>:<port>/rest/zdu/cluster/zdu/state

STABLE	You can turn on Upgrade mode now.
READY_TO_UPGRADE	Upgrade mode is enabled, but no nodes have been upgraded yet. You can start upgrading your first node now.
MIXED	At least one node is upgraded, but you haven't finished upgrading all nodes yet. Your cluster has nodes running different Confluence versions. You need to upgrade all nodes to the same bug fix version to transition to the next status (READY_TO_RUN_UPGRADE_TASKS).
READY_TO_RUN_UPGRADE_TASKS	All nodes have been upgraded. You can now finalize the rolling upgrade: `http://<host>:<port>/rest/zdu/cluster/zdu/approve`
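An orchestration script might map each cluster status to the next step, as in the sketch below. The action labels (`start`, `upgrade-node`, `approve`) are illustrative names for this example only. The `can_cancel` rule is a conservative reading of this page, which says upgrade mode can be disabled as long as no nodes have been upgraded yet.

```python
# Next step for each documented cluster status.
NEXT_ACTION = {
    "STABLE": "start",                        # enable upgrade mode
    "READY_TO_UPGRADE": "upgrade-node",       # upgrade the first node
    "MIXED": "upgrade-node",                  # keep upgrading remaining nodes
    "READY_TO_RUN_UPGRADE_TASKS": "approve",  # finalize the rolling upgrade
}


def next_step(cluster_state: str) -> str:
    """Return the next action for a cluster status, or fail on an unknown one."""
    try:
        return NEXT_ACTION[cluster_state]
    except KeyError:
        raise ValueError(f"Unknown cluster status: {cluster_state!r}") from None


def can_cancel(cluster_state: str) -> bool:
    """Conservative check: upgrade mode is on and no node has been upgraded."""
    return cluster_state == "READY_TO_UPGRADE"
```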

Enable and disable Upgrade mode

How you roll back depends on the upgrade stage you have reached. See Roll back a rolling upgrade for more information.

Mixed status with Upgrade mode disabled

If a node is in an Error state with Upgrade mode disabled, you can't enable Upgrade mode. Fix the problem or remove the node from the cluster to enable Upgrade mode.

Troubleshooting

Node errors during rolling upgrade

If a node’s status transitions to Error, it means something went wrong during the upgrade. You can’t finish the rolling upgrade if any node has an Error status. However, you can still disable Upgrade mode as long as the cluster status is still Ready to upgrade.

There are several ways to address this:

Shut down Confluence gracefully on the node. This should disconnect the node from the cluster, allowing the node to transition to an Offline status.
If you can’t shut down Confluence gracefully, shut down the node altogether.

Once all active nodes are upgraded and no nodes are in Error, you can finalize the rolling upgrade. You can investigate the problematic node afterwards and reconnect it to the cluster once you address the error.
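Before calling the approve endpoint, a script can check both conditions described above. This is a sketch: `can_finalize` is a hypothetical helper, and `node_states` is assumed to be a mapping of node IDs to status strings that you have already collected from the node status call.

```python
from typing import Dict, List, Tuple


def can_finalize(cluster_state: str, node_states: Dict[str, str]) -> Tuple[bool, List[str]]:
    """Check whether the rolling upgrade can be finalized.

    Returns (ok, errored_nodes). Offline nodes do not block finalizing;
    nodes in ERROR do.
    """
    errored = sorted(node for node, state in node_states.items() if state == "ERROR")
    ok = cluster_state == "READY_TO_RUN_UPGRADE_TASKS" and not errored
    return ok, errored
```

Returning the list of errored nodes lets the caller report exactly which nodes need to be shut down or disconnected first.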

Disconnecting a node from the cluster through the load balancer

If a node error prevents you from gracefully shutting down Confluence, try disconnecting it from the cluster through the load balancer. The following table provides guidance on how to do so for popular load balancers.

NGINX	NGINX defines groups of cluster nodes through the upstream directive. To prevent the load balancer from connecting to a node, delete the node's entry from its corresponding upstream group. Learn more about the upstream directive in the ngx_http_upstream_module module.
HAProxy	With HAProxy, you can disable all traffic to the node by putting it in a `maint` state: `set server <node IP or hostname> state maint` Learn more about forcing a server's administrative state.
Apache	You can disable a node (or "worker") by setting its activation member attribute to disabled. Learn more about advanced load balancer worker properties in Apache.
Azure Application Gateway	We provide a deployment template for Confluence Data Center on Azure; this template uses the Azure Application Gateway as its load balancer. The Azure Application Gateway defines each node as a target within a backend pool. Use the Edit backend pool interface to remove your node's corresponding entry. Learn more about adding (and removing) targets from a backend pool.

Traffic is disproportionately distributed during or after upgrade

Some load balancers might use strategies that send a disproportionate amount of active users to a newly-upgraded node. When this happens, the node might become overloaded, slowing down Confluence for all users logged in to the node.

To address this, you can temporarily disconnect the node from the cluster. This forces the load balancer to redistribute active users among the other available nodes. Afterwards, you can add the node back to the cluster.

Node won't start up

If a node is Offline or Starting for too long, you may have to troubleshoot Confluence on the node directly. See Confluence Startup Problems Troubleshooting for related information.
