Running Confluence Data Center in AWS

Confluence Data Center is an excellent fit for the Amazon Web Services (AWS) environment. Not only does AWS allow you to scale your clustered deployment elastically by resizing and quickly launching additional nodes, it also provides a number of managed services that work out of the box with Confluence Data Center instances and handle all their configuration and maintenance automatically.

Interested in learning more about the benefits of Data Center? Check out our overview of Confluence Data Center.


Deploying Confluence Data Center in a cluster using the AWS Quick Start

The simplest way to deploy your entire Data Center cluster in AWS is by using the Quick Start. The Quick Start launches, configures, and runs the AWS compute, network, storage, and other services required to deploy a specific workload on AWS, using AWS best practices for security and availability.

The Quick Start provides two deployment options, each with its own template. The first option deploys the Atlassian Standard Infrastructure (ASI) and then provisions Confluence Data Center into this ASI. The second option only provisions Confluence Data Center on an existing ASI.

Atlassian Standard Infrastructure (ASI)

The ASI is a virtual private cloud (VPC) that contains the components required by all Atlassian Data Center applications. For more information, see Atlassian Standard Infrastructure (ASI) on AWS.


Here's an overview of the Confluence Data Center Quick Start's architecture:

The deployment consists of the following components:

  • Instances/nodes: One or more Amazon Elastic Compute Cloud (EC2) instances as cluster nodes, running Confluence.
  • Load balancer: An Application Load Balancer (ALB), which works both as load balancer and SSL-terminating reverse proxy.
  • Amazon EFS host: A shared file system for storing artifacts in a common location, accessible to multiple Confluence nodes. The Quick Start architecture implements the shared file system using the highly available Amazon Elastic File System (Amazon EFS) service.
  • Database: Your choice of shared database instance – Amazon RDS or Amazon Aurora.
  • Amazon CloudWatch: Basic monitoring and centralized logging through Amazon's native CloudWatch service.

For more information on the architecture, components, and deployment process, see our Quick Start Guide.

Confluence will use the Java Runtime Environment (JRE) that is bundled with Confluence (/opt/atlassian/confluence/jre/), and not the JRE that is installed on the EC2 instances (/usr/lib/jvm/jre/).


Advanced customizations

To get you up and running as quickly as possible, the Quick Start doesn't allow the same level of customization as a manual installation. You can, however, further customize your deployment through the variables in the Ansible playbooks we use.

All of our AWS Quick Starts use Ansible playbooks to configure specific components of your deployment. These playbooks are available publicly on this repository:

https://bitbucket.org/atlassian/dc-deployments-automation

You can override these configurations by using Ansible variables. Refer to the repository’s README file for more information.

Deploying the Quick Start from your own S3 bucket (recommended)

The fastest way to launch the Quick Start is directly from its AWS S3 bucket. However, when you do, any updates we make to the Quick Start templates will propagate directly to your deployment. These updates sometimes involve adding or removing parameters from the templates. This could introduce unexpected (and possibly breaking) changes to your deployment.

For production environments, we recommend that you copy the Quick Start templates into your own S3 bucket. Then, launch them directly from there. Doing this gives you control over when to propagate Quick Start updates to your deployment.

  1. Clone the Quick Start templates (including all of their submodules) to your local machine. From the command line, run:

    git clone --recurse-submodules https://github.com/aws-quickstart/quickstart-atlassian-confluence.git

  2. (Optional) The Quick Start templates repository uses the directory structure required by the Quick Start interface. If needed (for example, to minimize storage costs), you can remove all files except the following:


    quickstart-atlassian-confluence
    ├─ submodules
    │  └─ quickstart-atlassian-services
    │     └─ templates
    │        └── quickstart-vpc-for-atlassian-services.yaml
    └─ templates
       ├── quickstart-confluence-master-with-vpc.template.yaml
       └── quickstart-confluence-master.template.yaml

  3. Install and set up the AWS Command Line Interface. This tool will allow you to create an S3 bucket and upload content to it.

  4. Create an S3 bucket in your region:

    aws s3 mb s3://<bucket-name> --region <AWS_REGION>

At this point, you can now upload the Quick Start templates to your own S3 bucket. Before you do, you'll have to choose which Quick Start template you’ll be using:

    • quickstart-confluence-master-with-vpc.template.yaml: use this for deploying into a new ASI (end-to-end deployment).

    • quickstart-confluence-master.template.yaml: use this for deploying into an existing ASI.

  5. In the template you’ve chosen, the QSS3BucketName default value is set to aws-quickstart. Replace this default with the name of the bucket you created earlier.
  6. Go into the parent directory of your local clone of the Quick Start templates. From there, upload all the files in your local clone to your S3 bucket:

    aws s3 cp quickstart-atlassian-confluence s3://<bucket-name> --recursive --acl public-read

  7. Once you’ve uploaded everything, you’re ready to deploy your production stack from your S3 bucket. Go to CloudFormation → Create Stack. When specifying a template, paste in the Object URL of the Quick Start template you’ll be using.
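The Object URL you paste in the last step follows S3's virtual-hosted-style format. As a minimal sketch, the hypothetical helper below assembles that URL from a bucket name, region, and object key. The bucket, region, and key shown are placeholders, and the real key depends on where the upload placed the template inside your bucket.

```shell
# Hypothetical helper: print the virtual-hosted-style Object URL for an S3 object.
# $1 = bucket name, $2 = AWS region, $3 = object key (path inside the bucket)
template_object_url() {
  printf 'https://%s.s3.%s.amazonaws.com/%s\n' "$1" "$2" "$3"
}

# Placeholder values; substitute your own bucket, region, and template key.
template_object_url my-quickstart-bucket us-east-1 \
  templates/quickstart-confluence-master.template.yaml
```

Comparing the printed URL with the Object URL shown for the template in the S3 console is a quick way to catch a wrong region or key before you create the stack.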

Amazon Aurora database for high availability

The Quick Start also allows you to deploy Confluence Data Center with an Amazon Aurora clustered database (instead of RDS). 

This cluster will be PostgreSQL-compatible, featuring a primary database writer that replicates to two database readers. You can also set up the writer and readers in separate Availability Zones for better resiliency.

If the writer fails, Aurora automatically promotes one of the readers to take its place. For more information, see Amazon Aurora Features: PostgreSQL-Compatible Edition.

If you want to set up an existing Confluence Data Center instance with Amazon Aurora, you’ll need to perform some extra steps. See Configuring Confluence Data Center to work with Amazon Aurora for detailed instructions.

Synchrony setup

If you have a Confluence Data Center license, two methods are available for running Synchrony:

  • Managed by Confluence (recommended)
    Confluence will automatically launch a Synchrony process on the same node, and manage it for you. No manual setup is required. 
  • Standalone Synchrony cluster (managed by you)
    You deploy and manage Synchrony standalone in its own cluster with as many nodes as you need. Significant setup is required. 

If you want simple setup and maintenance, we recommend allowing Confluence to manage Synchrony for you. If you want full control, or if it is essential that the editor is highly available, then managing Synchrony in its own cluster may be the right solution for your organization.

By default, the Quick Start will configure Synchrony to be managed by Confluence. However, you can use the Quick Start to configure standalone Synchrony. When you do, the Quick Start creates an Auto Scaling group containing one or more Amazon EC2 instances as cluster nodes, running Synchrony. 

For more information about Synchrony configuration, see Possible Confluence and Synchrony Configurations.

Managed mode is only available in 6.12 and later

If you plan to deploy a Confluence Data Center version earlier than 6.12, you can only use Standalone mode. In the Quick Start, this means you should set your Collaborative editing mode to synchrony-separate-nodes.

Amazon CloudWatch for basic monitoring and centralized logging

The Quick Start also installs and configures Amazon CloudWatch to monitor each node in your deployment. This will allow you to monitor each node's CPU, disk, and network activity – all from a pre-configured CloudWatch dashboard. By default, Amazon CloudWatch will also collect and store logs from each node into a single, central source. This centralized logging allows you to search and analyze your deployment's log data more easily and effectively. See Analyzing Log Data with CloudWatch Logs Insights and Search Log Data Using Filter Patterns for more information.

Amazon CloudWatch provides basic logging and monitoring, but also costs extra. To help reduce the cost of your deployment, you can disable logging or turn off Amazon CloudWatch integration altogether.


To download your log data (for example, to archive it or analyze it outside of AWS), you’ll have to export it first to S3. From there, you can download it. See Exporting Log Data to Amazon S3 for details.
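The export can also be started from the AWS CLI with `aws logs create-export-task`, whose `--from` and `--to` arguments are timestamps in milliseconds since the Unix epoch. The sketch below only prints the command for a window covering the last N days, so you can review it before running it. The function name, log group, and bucket are hypothetical, and the destination bucket must already allow CloudWatch Logs to write to it.

```shell
# Hypothetical helper: print an export command covering the last N days.
# $1 = log group name, $2 = destination S3 bucket, $3 = number of days
# $4 = optional "now" in epoch seconds (defaults to the current time)
export_task_cmd() {
  now_s="${4:-$(date +%s)}"
  # CloudWatch Logs expects milliseconds since the Unix epoch.
  to_ms=$(( now_s * 1000 ))
  from_ms=$(( (now_s - $3 * 86400) * 1000 ))
  printf 'aws logs create-export-task --log-group-name %s --from %s --to %s --destination %s\n' \
    "$1" "$from_ms" "$to_ms" "$2"
}

# Placeholder names; prints the command for the last 7 days of logs.
export_task_cmd my-confluence-logs my-log-archive-bucket 7
```

Once the export task finishes, the log files can be downloaded from the bucket, for example with `aws s3 cp --recursive`.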

Auto Scaling groups

This Quick Start uses Auto Scaling groups, but only to statically control the number of its cluster nodes. We don't recommend that you use Auto Scaling to dynamically scale the size of your cluster. Adding an application node to the cluster usually takes more than 20 minutes, which isn't fast enough to address sudden load spikes.

If you can identify any periods of high and low load, you can schedule the application node cluster to scale accordingly. See Scheduled Scaling for Amazon EC2 Auto Scaling for more information. 
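As a rough illustration of scheduled scaling, the sketch below prints `aws autoscaling put-scheduled-update-group-action` commands that set the minimum, maximum, and desired size of a group to the same value on a recurring schedule. The helper name, Auto Scaling group name, action names, and node counts are placeholders, not values taken from the Quick Start.

```shell
# Hypothetical helper: print a scheduled-scaling command for an Auto Scaling group.
# $1 = Auto Scaling group name, $2 = scheduled action name,
# $3 = cron-style recurrence (UTC unless a time zone is also set), $4 = node count
scheduled_scale_cmd() {
  printf 'aws autoscaling put-scheduled-update-group-action --auto-scaling-group-name %s --scheduled-action-name %s --recurrence "%s" --min-size %s --max-size %s --desired-capacity %s\n' \
    "$1" "$2" "$3" "$4" "$4" "$4"
}

# Placeholder schedule: 4 nodes from 08:00 UTC on weekdays, 2 nodes from 20:00 UTC.
scheduled_scale_cmd my-confluence-asg weekday-scale-up "0 8 * * 1-5" 4
scheduled_scale_cmd my-confluence-asg weekday-scale-down "0 20 * * 1-5" 2
```

Because a new application node usually needs more than 20 minutes to join the cluster, it makes sense to schedule the scale-up well ahead of the expected load.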

To study trends in your organization's load, you'll need to monitor the performance of your deployment. Refer to Confluence Data Center sample deployment and monitoring strategy for tips on how to do so. 

EC2 sizing recommendations

For Large or XLarge deployments, check out our AWS infrastructure recommendations for application, Synchrony, and database sizing advice. For smaller deployments, you can use instances that meet Confluence's system requirements.  Smaller instance types (micro, small, medium) are generally not adequate for running Confluence.

Supported AWS regions

Not all regions offer the services required to run Confluence.  You'll need to choose a region that supports Amazon Elastic File System (EFS). You can currently deploy Confluence using the Quick Start in the following regions:  

  • Americas
    • Northern Virginia
    • Ohio
    • Oregon
    • Northern California
    • Montreal
  • Europe/Middle East/Africa
    • Ireland
    • Frankfurt
    • London
    • Paris
  • Asia Pacific
    • Singapore
    • Tokyo
    • Sydney
    • Seoul
    • Mumbai


The services offered in each region change from time to time. If your preferred region isn't on this list, check the Regional Product Services table in the AWS documentation to see if it already supports EFS. 

If you are deploying Confluence 6.3.1 or earlier

There is an additional dependency for Confluence versions earlier than 6.3.2. Synchrony (which is required for collaborative editing) uses a third party library to interact with the Amazon API, and the correct endpoints are not available in all regions. This means you can't run Synchrony in the following regions:

  • US East (Ohio)
  • EU (London) ¹
  • Asia Pacific (Mumbai) ¹
  • Asia Pacific (Seoul) ¹
  • Canada (Central) ¹

¹ At the time of writing, these regions did not yet support EFS, so they also couldn't be used to run Confluence.


Internal domain name routing with Route 53 Private Hosted Zones

Even if your Confluence site is hosted on AWS, you can still link its DNS with an internal, on-premises DNS server (if you have one). You can do this through Amazon Route 53, creating a link between the public DNS and internal DNS. This will make it easier to access your infrastructure resources (database, shared home, and the like) through friendly domain names. You can make those domain names accessible externally or internally, depending on your DNS preferences.

Step 1: Create a new hosted zone

Create a Private hosted zone in Services > Route 53. The Domain Name is your preferred domain. For the VPC, use the existing Atlassian Standard Infrastructure.
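The same hosted zone can be created from the AWS CLI. This is a minimal sketch that prints an `aws route53 create-hosted-zone` command; passing `--vpc` is what makes the zone private. The helper name, domain, region, VPC ID, and caller reference are placeholders for your own values.

```shell
# Hypothetical helper: print a command that creates a private hosted zone
# associated with a VPC.
# $1 = domain name, $2 = VPC region, $3 = VPC ID,
# $4 = caller reference (any string that is unique per request)
private_zone_cmd() {
  printf 'aws route53 create-hosted-zone --name %s --vpc VPCRegion=%s,VPCId=%s --caller-reference %s\n' \
    "$1" "$2" "$3" "$4"
}

# Placeholder values; use the VPC ID of your Atlassian Standard Infrastructure.
private_zone_cmd my.hostedzone.com us-east-1 vpc-0123456789abcdef0 confluence-zone-001
```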

Step 2: Configure your stack to use the hosted zone

Use your deployment’s Quick Start template to point your stack to the hosted zone from Step 1. If you’re setting up Confluence for the first time, follow the Quick Start template as below:

  1. Under DNS (Optional), enter the name of your hosted zone in the Route 53 Hosted Zone field.

  2. Enter your preferred sub-domain in the Sub-domain for Hosted Zone field. If you leave it blank, we'll use your stack name as the sub-domain.

  3. Follow the prompts to deploy the stack.

If you already have an existing Confluence site, you can also configure your stack through the Quick Start template. To access this template:

  1. Go to Services > CloudFormation in the AWS console.

  2. Select the stack, and click Update Stack.

  3. Under DNS (Optional), enter the name of your hosted zone in the Route 53 Hosted Zone field.

  4. Enter your preferred sub-domain in the Sub-domain for Hosted Zone field. If you leave it blank, we'll use your stack name as the sub-domain.

  5. Follow the prompts to update the stack.

In either case, AWS will generate URLs and Route 53 records for the load balancer, EFS, and database. For example, if your hosted zone is my.hostedzone.com and your stack is named mystack, you can access the database through the URL mystack.db.my.hostedzone.com.
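To illustrate the naming pattern in the example above, the sketch below builds the database record name from a stack name and hosted zone. Only the database pattern (`<stack name>.db.<hosted zone>`) comes from this page; the helper name is hypothetical.

```shell
# Hypothetical helper: build the database record name using the pattern
# <stack name>.db.<hosted zone> from the example above.
# $1 = stack name, $2 = hosted zone
db_hostname() {
  printf '%s.db.%s\n' "$1" "$2"
}

db_hostname mystack my.hostedzone.com   # prints mystack.db.my.hostedzone.com

# From a host inside the VPC, you could then check that the record resolves:
#   dig +short "$(db_hostname mystack my.hostedzone.com)"
```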

Step 3: Link your DNS server to the Confluence site’s VPC

If you use a DNS server outside of AWS, then you need to link it to your deployment’s VPC (in this case, the Atlassian Standard Infrastructure). This means your DNS server should use Route 53 to resolve all queries to the hosted zone’s preferred domain (in Step 1).

For instructions on how to set this up, see Resolving DNS Queries Between VPCs and Your Network.

If you want to deploy an internal-facing Confluence site using your own DNS server, you can use Amazon Route 53 to create a link between the public DNS and internal DNS.

  1. In Route 53, create a Private hosted zone. For the VPC, you can use the existing Atlassian Services VPC. The domain name is your preferred domain.
  2. If you've already set up Confluence, go to Services > CloudFormation in the AWS console, select the stack, and click Update Stack. (If you're setting up Confluence for the first time, follow the Quick Start template as below). 
  3. Under Other Parameters, enter the name of your hosted zone in the Route 53 Hosted Zone field. 
  4. Enter your preferred sub-domain or leave the Sub-domain for Hosted Zone field blank and we'll use your stack name as the sub-domain.
  5. Follow the prompts to update the stack. We'll then generate the load balancer and EFS URLs, and create a record in Route 53 for each.
  6. In Confluence, go to Administration > General Configuration and update the Confluence base URL to your Route 53 domain.
  7. Set up DNS resolution between your on-premises network and the VPC with the private hosted zone. You can do this with:
    1. an Active Directory (either Amazon Directory Service or Microsoft Active Directory)
    2. a DNS forwarder on EC2 using bind9 or Unbound.
  8. Finally, terminate and re-provision each Confluence and Synchrony node to pick up the changes.

For related information on configuring Confluence's base URL, see Configuring the Server Base URL.


Scaling up and down

To increase or decrease the number of Confluence or Synchrony cluster nodes:

  1. Sign in to the AWS Management Console, use the region selector in the navigation bar to choose the AWS Region for your deployment, and open the AWS CloudFormation console at https://console.aws.amazon.com/cloudformation/.
  2. Click the Stack name of your deployment. This will display your deployment's Stack info. From there, click Update.
  3. On the Select Template page, leave Use current template selected, and then choose Next.
  4. On the Specify Details page, go to the Cluster nodes section of Parameters. From there, set your desired number of application nodes in the following parameters:
    1. Minimum number of cluster nodes
    2. Maximum number of cluster nodes
  5.  Click through to update the stack.
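The same change can be scripted with `aws cloudformation update-stack`. The sketch below prints a command that sets both node-count parameters to the same value. The parameter keys `ClusterNodeMin` and `ClusterNodeMax` are assumptions, so confirm them in the Parameters section of your template. A real invocation would also need `ParameterKey=<key>,UsePreviousValue=true` for every other stack parameter, plus any `--capabilities` values your stack requires.

```shell
# Hypothetical helper: print an update-stack command that pins the cluster size.
# $1 = stack name, $2 = desired number of nodes
# ASSUMPTION: ClusterNodeMin and ClusterNodeMax are the parameter keys behind
# "Minimum number of cluster nodes" and "Maximum number of cluster nodes".
resize_cluster_cmd() {
  printf 'aws cloudformation update-stack --stack-name %s --use-previous-template --parameters ParameterKey=ClusterNodeMin,ParameterValue=%s ParameterKey=ClusterNodeMax,ParameterValue=%s\n' \
    "$1" "$2" "$2"
}

# Placeholder stack name; prints the command for a 4-node cluster.
resize_cluster_cmd my-confluence-stack 4
```

Keeping the minimum and maximum equal matches the console steps above, where Auto Scaling only holds the cluster at a fixed size.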

Disabled Auto Scaling

Since your cluster has the same minimum and maximum number of nodes, Auto Scaling is effectively disabled.

Setting different values for the minimum and maximum number of cluster nodes enables Auto Scaling. This dynamically scales the size of your cluster based on system load.

However, we recommend that you keep Auto Scaling disabled. At present, Auto Scaling can't effectively address sudden spikes in your deployment's system load. This means that you'll have to manually re-scale your cluster depending on the load.

Vertical vs. horizontal scaling

Adding new cluster nodes, especially automatically in response to load spikes, is a great way to increase the capacity of a cluster temporarily. However, beyond a certain point, adding very large numbers of cluster nodes brings diminishing returns. In general, increasing the size of each node (that is, "vertical" scaling) can handle a greater sustained load than increasing the number of nodes (that is, "horizontal" scaling), especially if the nodes themselves are small. See Infrastructure recommendations for enterprise Confluence instances on AWS for more details.

See the AWS documentation for more information on Auto Scaling groups.

Connecting to your nodes over SSH

You can perform node-level configuration or maintenance tasks on your deployment via SSH. To do this, you'll need your SSH private key file (the PEM file you specified for the Key Name parameter). Remember, this key can access all nodes in your deployment, so keep this key in a safe place.

To help restrict access to the deployment, our Quick Start deploys a Bastion host. To connect to your deployment over SSH, you'll need to access the Bastion host first. This host acts as your "jump box" to any instance in your deployment's internal subnets. That is, you SSH first to the Bastion host, and from there to any instance in your deployment. 

The Bastion host's public IP is the BastionPubIp output of your deployment's ATL-BastionStack stack. This stack is nested in your deployment's Atlassian Standard Infrastructure (ASI). To access the Bastion host, use ec2-user as the user name, for example:

ssh -i keyfile.pem ec2-user@<BastionPubIp>

The ec2-user has sudo access. SSH access as root is not allowed.
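The two hops can be combined into one command with OpenSSH's `ProxyCommand` option, which tunnels the connection to the node through the Bastion host. The sketch below prints such a command. The helper name and IP addresses are placeholders, and using ec2-user on the cluster node is an assumption based on the Bastion host's user name.

```shell
# Hypothetical helper: print an ssh command that reaches a node via the Bastion host.
# $1 = private key file, $2 = Bastion public IP (BastionPubIp), $3 = node private IP
# ASSUMPTION: the cluster node also accepts the ec2-user login.
jump_ssh_cmd() {
  # %%h and %%p print as %h and %p, which ssh replaces with the target host and port.
  printf 'ssh -i %s -o ProxyCommand="ssh -i %s -W %%h:%%p ec2-user@%s" ec2-user@%s\n' \
    "$1" "$1" "$2" "$3"
}

# Placeholder addresses; substitute your own outputs and node IP.
jump_ssh_cmd keyfile.pem 203.0.113.10 10.0.1.25
```

With this form, the private key stays on your machine and is never copied to the Bastion host.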


Alternatively, you can also access instances in your deployment directly through the Session Manager of the AWS Systems Manager. This will allow you to skip the Bastion host entirely.

Upgrading

Consider upgrading to an Atlassian Enterprise release (if you're not on one already). Enterprise releases get fixes for critical bugs and security issues throughout their two-year support window. This gives you the option to keep a slower upgrade cadence without sacrificing security or stability. Enterprise releases are suitable for companies that can't keep up with the frequency at which we ship feature releases.

Here's some useful advice for upgrading your deployment:

  1. Before upgrading to a later version of Confluence Data Center, check if your apps are compatible with that version. Update your apps if needed. For more information about managing apps, see Using the Universal Plugin Manager.
  2. If you need to keep Confluence Data Center running during your upgrade, we recommend using read-only mode for site maintenance. Your users will be able to view pages, but not create or change them.
  3. We strongly recommend that you perform the upgrade first in a staging environment before upgrading your production instance. Create a staging environment for upgrading Confluence provides helpful tips on doing so.

When the time comes to upgrade your deployment, perform the following steps:

Step 1: Terminate all running Confluence Data Center application nodes

Set the number of application nodes used by the Confluence Data Center stack to 0. Then, update the stack.

If your deployment uses standalone Synchrony, scale the number of Synchrony nodes to 0 at the same time.

Detailed instructions:


  1. In the AWS console, go to Services > CloudFormation. Select your deployment’s stack to view its Stack Details.

  2. In the Stack Details screen, click Update Stack.

  3. From the Select Template screen, select Use current template and click Next.

  4. You’ll need to terminate all running nodes. To do that, set the following parameters to 0:

    1. Maximum number of cluster nodes

    2. Minimum number of cluster nodes

  5. Click Next. Click through the next pages, and then apply the change using the Update button.

  6. Once the update is complete, check that all application nodes have been terminated.



Step 2: Update the version used by your Confluence Data Center stack

Set the number of application nodes used by Confluence Data Center to 1. Configure it to use the version you want. Then, update the stack again.

If your deployment uses standalone Synchrony, scale the number of Synchrony nodes to 1 at the same time.

Detailed instructions:

  1. From your deployment’s Stack Details screen, click Update Stack again.

  2. From the Select Template screen, select Use current template and click Next.

  3. Set the Version parameter to the version you’re updating to.

  4. Configure your stack to use one node. To do that, set the following parameters to 1:

    1. Maximum number of cluster nodes

    2. Minimum number of cluster nodes

  5. Click Next. Click through the next pages, and then apply the change using the Update button.

Step 3: Scale up the number of application nodes

You can now scale up your deployment to your original number of application nodes. You can do so for your Synchrony nodes as well, if you have standalone Synchrony. Refer back to Step 1 for instructions on how to re-configure the number of nodes used by your cluster.

Confluence Data Center in AWS currently doesn't allow upgrading an instance without some downtime in between the last cluster node of the old version shutting down and the first cluster node on the new version starting up.  Make sure all existing nodes are terminated before launching new nodes on the new version. 

 

Backing up

We recommend you use the AWS native backup facility, which uses snapshots to back up your Confluence Data Center. For more information, see AWS Backup.

Migrating your existing Confluence site to AWS

After deploying Confluence on AWS, you might want to migrate your old deployment to it. To do so:

  1. Upgrade your existing site to the version you have deployed to AWS (Confluence 6.1 or later).
  2. (Optional) If your old database isn't PostgreSQL, you'll need to migrate it. See Migrating to Another Database for instructions. 
  3. Back up your PostgreSQL database and your existing <shared-home>/attachments directory.
  4. Copy your backup files to /media/atl/confluence/shared-home in your EC2 instance.  
  5. Restore your PostgreSQL database dump to your RDS instance with pg_restore.
    See Importing Data into PostgreSQL on Amazon RDS in Amazon documentation for more information on how to do this.   


Important notes

  • When you create a cluster using the CloudFormation template, the database name is confluence. You must maintain this database name when you restore, or there will be problems when new nodes are provisioned.  You will need to drop the new database and replace it with your backup. 
  • You don't need to copy indexes or anything from your existing local home or installation directories, just the attachments from your existing shared home directory.  
  • If you've modified the <shared-home>/config/cache-settings-overrides.properties file you may want to reapply your changes in your new environment.  
  • The _copy method described in this AWS page, Importing Data into PostgreSQL on Amazon RDS, is not suitable for migrating Confluence.
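The database backup and restore steps above might look like the following sketch, which prints a `pg_dump` command for the old database and a `pg_restore` command targeting the database named confluence on RDS. The helper name, host names, and dump file name are placeholders, and using the same database user on both sides is an assumption.

```shell
# Hypothetical helper: print the dump and restore commands for the migration.
# $1 = old database host, $2 = old database name,
# $3 = RDS endpoint, $4 = database user (assumed to exist on both sides)
migration_cmds() {
  # Custom-format dump (-Fc), a format that pg_restore can read.
  printf 'pg_dump -Fc -h %s -U %s -f confluence.dump %s\n' "$1" "$4" "$2"
  # Restore into the database named "confluence", as the notes above require.
  printf 'pg_restore -h %s -U %s -d confluence confluence.dump\n' "$3" "$4"
}

# Placeholder hosts and names; review the printed commands before running them.
migration_cmds old-db.example.internal confluence_old \
  mystack.db.my.hostedzone.com confluence_user
```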

Last modified on Sep 2, 2019
