Upgrade a Confluence cluster on AWS without downtime

This document provides step-by-step instructions on performing a rolling upgrade on an AWS deployment orchestrated through CloudFormation. In particular, these instructions are suitable for Confluence Data Center deployments based on our AWS Quick Starts.

For an overview of rolling upgrades (including planning and preparation information), see Upgrade Confluence without downtime.

Step 1: Enable upgrade mode

You need System Administrator global permissions to do this. 

To enable upgrade mode:

  1. Go to Administration > General Configuration > Rolling upgrades.
  2. Select the Upgrade mode toggle (1). 

Screenshot: The Rolling upgrades screen.

The cluster overview can help you choose which node to upgrade first. The Tasks running (2) column shows how many long-running tasks are running on each node, and the Active users column shows how many users are logged in to it. When choosing which node to upgrade first, start with the nodes that have the fewest running tasks and active users.

Upgrade mode allows your cluster to temporarily accept nodes running different Confluence versions. This lets you upgrade a node and have it rejoin the cluster alongside the other, non-upgraded nodes. Both upgraded and non-upgraded active nodes work together to keep Confluence available to all users. You can disable upgrade mode as long as you haven't upgraded any nodes yet.

Step 2: Find all the current application nodes in your stack

In AWS, note the Instance IDs of all running application nodes in your stack. These are all the application nodes running your current version. You'll need these IDs for a later step.

  1. In the AWS console, go to Services > CloudFormation. Select your deployment’s stack to view its Stack Details.

  2. Expand the Resources drop-down. Look for the ClusterNodeGroup and click its Physical ID. This will take you to a page showing the Auto Scaling Group details of your application nodes.
  3. In the Auto Scaling Group details, click on the Instances tab. Note all of the Instance IDs listed there; you'll be terminating them at a later step.
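If you prefer to gather these Instance IDs from the command line, here's a minimal boto3 sketch. It assumes the application node group uses the logical ID ClusterNodeGroup (as in the Quick Start template); the region and stack name shown are placeholders.

  import boto3

  REGION = "us-east-1"          # placeholder: your stack's region
  STACK_NAME = "my-confluence"  # placeholder: your stack's name

  cfn = boto3.client("cloudformation", region_name=REGION)
  autoscaling = boto3.client("autoscaling", region_name=REGION)

  # Find the Auto Scaling Group behind the ClusterNodeGroup resource.
  resources = cfn.describe_stack_resources(StackName=STACK_NAME)["StackResources"]
  asg_name = next(r["PhysicalResourceId"] for r in resources
                  if r["LogicalResourceId"] == "ClusterNodeGroup")

  # List the Instance IDs currently in that group.
  group = autoscaling.describe_auto_scaling_groups(
      AutoScalingGroupNames=[asg_name])["AutoScalingGroups"][0]
  for instance in group["Instances"]:
      print(instance["InstanceId"], instance["LifecycleState"])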

Step 3: Update your CloudFormation template

Your deployment uses a CloudFormation template that defines each component of your environment. In this case, upgrading Confluence means updating the version of Confluence used in the template. During the upgrade, we highly recommend that you also temporarily add a node to your cluster.

  1. In the AWS console, go to Services > CloudFormation. Select your deployment’s stack to view its Stack Details.
  2. In the Stack Details screen, click Update Stack.
  3. From the Select Template screen, select Use current template and click Next.
  4. Set the Version parameter to the version you’re updating to. Since this is a rolling upgrade, you can only set this to a later bug fix version. 
  5. Add an extra node to your cluster. This will help ensure that your cluster won't have a shortage of nodes for user traffic. To do this, increase the value of the following parameters by 1:
    • Maximum number of cluster nodes
    • Minimum number of cluster nodes
  6. Select Next. Click through the next pages, and then apply the change using the Update button.
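If you manage your stack from the command line instead of the console, the same update can be scripted. The sketch below is a minimal boto3 example; the parameter keys (ConfluenceVersion, ClusterNodeMax, ClusterNodeMin), stack name, region, and version are assumptions — check your template's Parameters section for the exact names your deployment uses.

  import boto3

  REGION = "us-east-1"          # placeholder: your stack's region
  STACK_NAME = "my-confluence"  # placeholder: your stack's name
  NEW_VERSION = "7.19.2"        # placeholder: the bug fix version you're upgrading to

  cfn = boto3.client("cloudformation", region_name=REGION)

  # Reuse the current template, change only the version and cluster size
  # parameters, and keep every other parameter at its previous value.
  # The parameter keys below are assumptions -- confirm them in your template.
  params = []
  for p in cfn.describe_stacks(StackName=STACK_NAME)["Stacks"][0]["Parameters"]:
      key = p["ParameterKey"]
      if key == "ConfluenceVersion":
          params.append({"ParameterKey": key, "ParameterValue": NEW_VERSION})
      elif key in ("ClusterNodeMax", "ClusterNodeMin"):
          params.append({"ParameterKey": key,
                         "ParameterValue": str(int(p["ParameterValue"]) + 1)})
      else:
          params.append({"ParameterKey": key, "UsePreviousValue": True})

  cfn.update_stack(
      StackName=STACK_NAME,
      UsePreviousTemplate=True,
      Parameters=params,
      Capabilities=["CAPABILITY_IAM"],  # acknowledge IAM resources, if your stack creates them
  )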

After updating the stack, you will have one extra node already running the new Confluence version. With Upgrade mode enabled, that node will be allowed to join the cluster and start work. Your other nodes won't be upgraded yet.

As soon as the first upgraded node joins the cluster, your cluster status will transition to Mixed. This means that you won’t be able to disable Upgrade mode until all nodes are running the same version.

Once the new upgraded node is running and in an Active state, check the application logs for that node and log in to Confluence on that node to make sure everything is working. It's still possible to roll back the upgrade at this point, so we recommend taking some time to test.

Once you've tested the first node, you can start upgrading another node. To do that, gracefully shut down Confluence on the node and then terminate it – AWS will replace the node with a new one running the updated Confluence version.

Step 4: Upgrade another node

Start with the least busy node

We recommend starting with the node that has the fewest running tasks and active users. You'll find both in the Cluster overview section of the Rolling upgrades page.

In Step 2, you noted the Instance ID of each node in your cluster. Gracefully shut down Confluence on the node you want to upgrade, then terminate it. To do this:

  1. In the AWS console, go to Services > EC2. From there, click Running Instances.
  2. Select the instance matching the node where you gracefully shut down Confluence.
  3. From the Actions drop-down, select Instance State > Terminate.
  4. Click through to terminate the instance. 
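If you'd rather terminate the instance from the command line, here's a minimal boto3 sketch (the region and Instance ID are placeholders — use the ID you noted in Step 2):

  import boto3

  REGION = "us-east-1"                  # placeholder: your stack's region
  INSTANCE_ID = "i-0123456789abcdef0"   # placeholder: the node you shut down

  ec2 = boto3.client("ec2", region_name=REGION)

  # Terminate just this one instance; the Auto Scaling Group will launch a
  # replacement running the updated Confluence version.
  ec2.terminate_instances(InstanceIds=[INSTANCE_ID])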

Each time you terminate a node, AWS will automatically replace it. The replacement will be running the new version of Confluence. Once the new node's status is Active, you can move on to upgrading another node.

Step 5: Upgrade all other nodes individually

At this point, your cluster should have two nodes running the new version of Confluence. You can now upgrade the other nodes. To do so, simply repeat the previous step on another node. As always, we recommend that you upgrade the node with the fewest running tasks each time.

If your deployment uses standalone Synchrony, you may need to update the version used by each Synchrony node as well. To do this, terminate each Synchrony node, one at a time, after you've upgraded all application nodes to the new version.

Step 6: Finalize the upgrade

The steps to finalize your upgrade will differ slightly depending on whether you're upgrading to a bugfix version or to the next feature version, which may require upgrade tasks to be run. You should do this as soon as possible, as some tasks are put on hold while your cluster is in upgrade mode.

Finalize upgrade to a bugfix version

To finalize the upgrade:

  1. Wait for the cluster status to change to Ready to finalize. This won't happen until all nodes are active and running the same upgraded version.
  2. Select the Finalize upgrade button.
  3. Wait for confirmation that the upgrade is complete. The cluster status will change to Stable.

Your upgrade is now complete. 

Finalize upgrade to a feature version

To finalize the upgrade:

  1. Wait for the cluster status to change to Ready to run upgrade tasks. This won't happen until all nodes are active and running the same upgraded version.
  2. Select the Run upgrade tasks and finalize upgrade button. 
  3. One node will start running upgrade tasks. Tail the logs on this node if you want to monitor the process. 
  4. Wait for confirmation that the upgrade is complete. The cluster status will change to Stable.

Your upgrade is now complete. 

Screenshot: One cluster node running upgrade tasks for the whole cluster. 

More about upgrade tasks

Upgrade tasks make any required changes to your database and file system, for example changing the database schema or the way index files are stored in the local home directories. 

There are a few things you should know about upgrade tasks:

  • One cluster node will run the upgrade tasks on the database and other nodes. If there's a problem, logs will be written to the application log on this node. 
  • The status of other nodes in the cluster may change to Running upgrade tasks momentarily to indicate that an upgrade task is making a change to the file system on that node. The node actually running the upgrade tasks does not change. 
  • Depending on the size or complexity of your data, some upgrade tasks can take several hours to complete. We generally include a warning in the upgrade notes for the particular version if an upgrade task is likely to take a significant amount of time. 
  • It's not necessary to direct traffic away from the node running upgrade tasks, but if you know the upgrade tasks are likely to be significant, you may want to do this to avoid any performance impact. 

Step 7: Scale down your cluster

In Step 3, we temporarily added a node to the cluster to help ensure it would have enough capacity to handle normal user traffic while each node was being upgraded. After finalizing the upgrade, you can remove that node:

  1. In the AWS console, go to Services > CloudFormation. Select your deployment’s stack to view its Stack Details.
  2. In the Stack Details screen, click Update Stack.
  3. From the Select Template screen, select Use current template and click Next.
  4. Decrease the value of the following parameters by 1:
    • Maximum number of cluster nodes
    • Minimum number of cluster nodes
  5. Select Next. Click through the next pages, and then apply the change using the Update button.
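As with Step 3, this can be scripted if you prefer. A minimal boto3 sketch, again assuming the parameter keys ClusterNodeMax and ClusterNodeMin (confirm the exact names in your template):

  import boto3

  REGION = "us-east-1"          # placeholder: your stack's region
  STACK_NAME = "my-confluence"  # placeholder: your stack's name

  cfn = boto3.client("cloudformation", region_name=REGION)

  # Decrease the cluster size parameters by 1; keep everything else unchanged.
  params = []
  for p in cfn.describe_stacks(StackName=STACK_NAME)["Stacks"][0]["Parameters"]:
      key = p["ParameterKey"]
      if key in ("ClusterNodeMax", "ClusterNodeMin"):
          params.append({"ParameterKey": key,
                         "ParameterValue": str(int(p["ParameterValue"]) - 1)})
      else:
          params.append({"ParameterKey": key, "UsePreviousValue": True})

  cfn.update_stack(StackName=STACK_NAME, UsePreviousTemplate=True,
                   Parameters=params, Capabilities=["CAPABILITY_IAM"])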

You can now remove one node from your cluster without AWS replacing it. To do this:

  • Choose the node with the fewest running tasks. 
  • Shut down Confluence gracefully on the node.  
  • Terminate the node. 

Refer to Step 4 for detailed instructions.

Troubleshooting

Disconnect a node from the cluster through the load balancer

If an error prevents you from terminating a node, try disconnecting the node from the cluster through the load balancer. In the AWS Application Load Balancer, each node is registered as a target – so to disconnect a node, you'll have to de-register it. For more information on how to do this, see Target groups for your Application Load Balancers and Registered targets.
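If you want to script the de-registration, here's a minimal boto3 sketch (the target group ARN and Instance ID are placeholders — you can find the target group under EC2 > Target Groups in the AWS console):

  import boto3

  REGION = "us-east-1"                                    # placeholder: your stack's region
  TARGET_GROUP_ARN = "arn:aws:elasticloadbalancing:..."   # placeholder: your target group's ARN
  INSTANCE_ID = "i-0123456789abcdef0"                     # placeholder: the node to disconnect

  elbv2 = boto3.client("elbv2", region_name=REGION)

  # De-register the node so the load balancer stops routing traffic to it.
  elbv2.deregister_targets(TargetGroupArn=TARGET_GROUP_ARN,
                           Targets=[{"Id": INSTANCE_ID}])

  # To re-connect the node later, register it again:
  # elbv2.register_targets(TargetGroupArn=TARGET_GROUP_ARN,
  #                        Targets=[{"Id": INSTANCE_ID}])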

Traffic is disproportionately distributed during or after upgrade

Some load balancers might use strategies that send a disproportionate amount of active users to a newly-upgraded node. When this happens, the node might become overloaded, slowing down Confluence for all users logged in to the node.

To address this, you can also temporarily disconnect the node from the cluster. This will force the load balancer to redistribute active users across all other available nodes. Afterwards, you can add the node back to the cluster.

Node errors during rolling upgrade

If a node’s status transitions to Error, it means something went wrong during the upgrade. You can’t finish the rolling upgrade if any node has an Error status. However, you can still disable Upgrade mode as long as the cluster status is still Ready to upgrade.

There are several ways to address this:

  • Shut down Confluence gracefully on the node. This should disconnect the node from the cluster, allowing the node to transition to an Offline status.

  • If you can’t shut down Confluence gracefully, shut down the node altogether.

Once all active nodes are upgraded and no nodes are in Error, you can finalize the rolling upgrade. You can investigate the problem node afterwards and re-connect it to the cluster once you've addressed the error.

Roll back to the original version

How you roll back depends on the upgrade stage you have reached. See Roll back a rolling upgrade for more information. 

Mixed status with Upgrade mode disabled

If a node is in an Error state with Upgrade mode disabled, you can't enable Upgrade mode. Fix the problem or remove the node from the cluster to enable Upgrade mode.

Node won't start up 

If a node is Offline or Starting for too long, you may have to troubleshoot Confluence on the node directly. See Confluence Startup Problems Troubleshooting for related information.

