Infrastructure recommendations for enterprise-scale Confluence instances

In Confluence Data Center load profiles, we presented simple guidelines for finding out if your instance was Small, Medium, Large, or XLarge. We based these size profiles on different Server and Data Center case studies, covering instances of varying infrastructure sizes and configurations. Knowing your load profile is useful for planning your instance's growth, looking for inflated metrics, or simply keeping your instance at a reasonable size.

As your load grows closer to Large or XLarge sizes, it may be time to consider an infrastructure upgrade. Planning this involves knowing how to deploy your Confluence application and database nodes. However, it's not always clear how to do that effectively – for example, adding more application nodes to a growing Medium-sized instance doesn't always improve performance (in fact, the opposite might happen). 

To help you out, we ran a series of performance tests on typical Large and XLarge Confluence instances. We designed these tests to get useful, data-driven recommendations for your deployment's application and database nodes. These recommendations can help you plan a suitable environment, or even check whether your current instance is adequate for the size of your content and traffic.

Approach

We ran all of our tests in AWS environments. This allowed us to easily define and automate many tests, giving us a large and fairly reliable sample of test results. 

Each part of our test infrastructure is a standard AWS component available to all AWS users. This means you can easily deploy our recommended configurations. You can even use AWS Quick Starts for deploying Confluence Data Center to do so. 

Since we used standard AWS components, you can look up their specifications in the AWS documentation. This lets you find equivalent components and configurations if your organization prefers a different cloud platform or a bespoke clustered solution.

Some things to consider

To gather a large sample of benchmarks for analysis, we designed tests that could be easily set up and replicated. As such, when referencing our benchmarks and recommendations for your infrastructure plans, consider the following:

  • We didn't install apps on our test instances, as our focus was finding the right configurations for the core product. When designing your infrastructure, you need to account for the performance impact of apps you want to install.
  • We used PostgreSQL 9.4.15 with default settings across all our tests. This allowed us to get consistent results with minimal setup and tuning.
  • Our test environment used dedicated AWS infrastructure hosted on the same subnet. This helped minimize network latency.
  • We tested the Large data set on Confluence Data Center 6.14, and the XLarge data set on 6.15. This was a result of release timing: 6.14 was the latest release when we tested Large, and 6.15 was available by the time we tested XLarge. We also tested both data sets against the latest Confluence Enterprise release (namely, 6.13).

Analytics

We enabled Analytics on each test instance to collect usage data. For more information, see Data Collection Policy.

We also tested whether enabling Analytics had a significant impact on performance (it doesn't).

We ran a separate test to find out the performance impact of enabling Analytics. To do this, we simulated production load on two identical Confluence Data Center instances. We loaded both instances with an identical Large-sized data set and ran the same amount of traffic on both. The only difference between them was that we enabled Analytics on one but not the other.

The following chart shows a box plot of key operations with Analytics disabled (Baseline, left) and enabled (Test, right).

Overall, we observed minimal difference in response times.

Disk I/O considerations

Our data set featured a limited number of attachments, resulting in traffic that was mostly write operations (compared to a typical production instance). On average, this traffic produced 1 kbps of reads and 2,500 kbps of writes on our shared home. This roughly equates to an average of 0.15 read IOPS and 200 write IOPS.
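As a quick illustration of how throughput maps to IOPS, here's a minimal sketch. We didn't measure average I/O sizes; the values below are back-derived from the figures above, so treat them as assumptions.

```python
# A rough throughput-to-IOPS conversion. The average I/O sizes below are
# assumptions back-derived from the figures above, not measured values.

def approx_iops(throughput_kbps: float, avg_io_kib: float) -> float:
    """Approximate IOPS from throughput (kilobits/s) and average I/O size (KiB)."""
    kib_per_second = throughput_kbps / 8  # 8 kilobits per kilobyte
    return kib_per_second / avg_io_kib

print(f"reads:  {approx_iops(1, 0.85):.2f} IOPS")     # ~0.15
print(f"writes: {approx_iops(2500, 1.56):.0f} IOPS")  # ~200
```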

While we didn't set out to test disk I/O specifically, these results suggest that our shared home configuration (that is, a single m4.large node running NFS on gp2) was sufficient for our load. The disk load was stable throughout all our tests, and did not stress the NFS server.

Bear in mind, however, that the synthetic traffic we used here was mostly write traffic. This is not representative of typical production instances, which feature a higher proportion of reads.

Methodology

We ran two separate test phases: one for Large, and another for XLarge. Each phase involved testing a specific volume of traffic to the same Confluence data set, but on a freshly provisioned AWS environment. Each environment was exactly like the last, except for the configuration of application and database nodes.

Our objective was to benchmark different configurations for the Large and XLarge size profiles. Specifically, we analyzed how different AWS virtual machine types affected the performance of the instance. 

Benchmark

For all tests in both Large and XLarge size profiles, we used an Apdex of 0.8 as our threshold for acceptable performance. This Apdex assumes that a 1-second response time is our Tolerating threshold, while anything above 4 seconds is our Frustrated threshold.

By comparison, we target an Apdex of 0.7 for our own internal production Confluence Data Center instances (as we discussed in Confluence Data Center sample deployment and monitoring strategy). However, that 0.7 Apdex takes into account the performance impact of apps on those instances. We don't have any apps installed on our test instances, so we adjusted the target Apdex for our tests to 0.8. 


How we compute each test's overall Apdex

Apdex

Apdex (Application Performance Index) is a popular standard used by many companies to report, benchmark, and track application performance. See http://apdex.org/overview.html for more information.


To calculate our overall Apdex, we first assign an Apdex to each user action based on its response time:

Response time        Request Apdex
Less than 1 second   1
1 to 4 seconds       0.5
Over 4 seconds       0


Next, we apply a weight to the Apdex of that action, based on its type. This weight represents how heavily the action's score affects the overall Apdex: 

User action type                                 Weight (%)
confluence.page.view                             84
confluence.blogpost.view                         1
confluence.dashboard.view                        6
confluence.page.create.collaborative.view        2
confluence.page.edit.collaborative.view          5
confluence.page.edit.collaborative.quick-view    2

A large number of Confluence customers provide us with usage statistics, and we based these weights on the top 1,000 instances by traffic volume. Across those instances, page views (confluence.page.view) make up the overwhelming majority of user actions.

This means that each simulated user action gets scored as:

(Apdex) x (weight per page action type) = weighted Apdex

For example, if confluence.page.edit.collaborative.view takes 2 seconds to complete, it gets a weighted Apdex of 0.025 based on:

0.5 x 0.05 = 0.025

Finally, to get the overall Apdex, we add the weighted Apdex of all user actions. 
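To make the calculation concrete, here's a minimal sketch in Python. The weights come from the table above; the response times are made up, and we score one sample per action type (real tests aggregate many samples per type).

```python
# A minimal sketch of the overall Apdex calculation described above.

WEIGHTS = {
    "confluence.page.view": 0.84,
    "confluence.blogpost.view": 0.01,
    "confluence.dashboard.view": 0.06,
    "confluence.page.create.collaborative.view": 0.02,
    "confluence.page.edit.collaborative.view": 0.05,
    "confluence.page.edit.collaborative.quick-view": 0.02,
}

def request_apdex(seconds: float) -> float:
    """1 if Satisfied (<1s), 0.5 if Tolerating (1-4s), 0 if Frustrated (>4s)."""
    if seconds < 1:
        return 1.0
    return 0.5 if seconds <= 4 else 0.0

def overall_apdex(response_times: dict) -> float:
    """Sum the weighted Apdex of each user action."""
    return sum(
        request_apdex(t) * WEIGHTS[action] for action, t in response_times.items()
    )

# Example: the 2-second collaborative edit view contributes 0.5 x 0.05 = 0.025.
score = overall_apdex({
    "confluence.page.view": 0.8,
    "confluence.blogpost.view": 0.9,
    "confluence.dashboard.view": 1.2,
    "confluence.page.create.collaborative.view": 0.7,
    "confluence.page.edit.collaborative.view": 2.0,
    "confluence.page.edit.collaborative.quick-view": 0.6,
})
print(f"overall Apdex: {score:.3f}")  # 0.945
```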

Architecture

We used the same basic architecture to test both Large and XLarge data sets.

We tested each configuration on a freshly-deployed instance of Confluence Data Center on AWS. Every configuration followed the same basic structure:

Confluence application
  • Number of nodes: Variable
  • Virtual machine types: c5.xlarge, c5.2xlarge, c5.4xlarge, c4.8xlarge
  • Notes: For each test, the Confluence application and Synchrony nodes used the same virtual machine type. However, while we used a variable number of nodes for the Confluence application, we always used one node for Synchrony. When testing c5.xlarge (which only has 8GB of RAM), we used 4GB for the JVM heap. For all others, we used 8GB.

Synchrony
  • Number of nodes: 1
  • Virtual machine type: Same as the Confluence application nodes in that test.

Database
  • Number of nodes: 1
  • Virtual machine types: db.m4.xlarge, db.m4.2xlarge, db.m4.4xlarge
  • Notes: We used Amazon RDS PostgreSQL version 9.4.15, with default settings. Each test only featured one database node.

Shared home
  • Number of nodes: 1
  • Virtual machine type: m4.large
  • Notes: Our NFS server used a 200GB General Purpose SSD (gp2) EBS volume for storage, with a baseline of 600 IOPS, burstable to 3,000 IOPS.

Load balancer
  • Number of nodes: 1
  • Type: AWS Application Load Balancer

Each Confluence application and Synchrony node used a 50GB General Purpose SSD (gp2) EBS volume for local storage, with a baseline of 150 IOPS, burstable to 3,000 IOPS.
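The baseline IOPS figures above follow directly from how gp2 volumes work: they earn a baseline of 3 IOPS per GiB of provisioned size, with a minimum of 100 and (for smaller volumes) a burst capacity of 3,000 IOPS. A quick sketch:

```python
# gp2 EBS volumes earn 3 IOPS per GiB of provisioned size, with a minimum
# baseline of 100 IOPS and a cap of 16,000 IOPS.

def gp2_baseline_iops(size_gib: int) -> int:
    return max(100, min(3 * size_gib, 16_000))

print(gp2_baseline_iops(200))  # shared home volume: 600
print(gp2_baseline_iops(50))   # application/Synchrony local volume: 150
```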

Refer to General Purpose Instances and Compute-Optimized Instances (from the AWS documentation on Instance Types) for details on each virtual machine type we tested.

Recommendations for Large-sized instances

The following table shows the data set and traffic we used on our performance tests for Large-size instances:

Load profile metric                 Value
Total Spaces                        6,550
Content (All Versions)              16,000,000
Local Users                         12,300
Traffic (HTTP requests per hour)    498,000

The following sections provide more data set and traffic details.

Standard size for Large

According to Confluence Data Center load profiles, a Large-sized instance falls within the following range for each metric:

  • Total spaces: 2,500 to 5,000
  • Content (all versions): 2.5 million to 10 million
  • Local users: 10,000 to 100,000
  • HTTP calls per hour: 350,000 to 700,000

Test data set and traffic breakdown

The metrics we used for our Large data set are based on Confluence Data Center load profiles, which put the overall load profile of the instance at the upper range of Large. We believe that these metrics represent a majority of real-life, Large-sized Confluence Data Center instances.

Metric (total)                          Components (approximate values)
Total Spaces: 6,550                     Site Spaces: 1,500; Personal Spaces: 5,000
Content (All Versions): 16,000,000      Content (Current Versions): 6,900,000; Comments: 2,000,000
Local Users: 12,300                     Active Users: 565
Traffic (HTTP requests): 498,000/hour   Reads: 1 kbps; Writes: 2,500 kbps


Our data set also uses 9,900 local groups.


Good to know

  • Content (all versions) is the total number of all versions of all pages, blog posts, comments, and files in the instance. It's the total number of rows in the CONTENT table in the Confluence database.

  • Content (current versions) is the number of pages, blog posts, comments, and files in the instance. It doesn't include historical versions of pages, blog posts, or files.

  • Local Users is the number of user accounts from local and remote directories that have been synced to the database. It includes both active and inactive accounts.

  • Active Users is the number of all Local Users logged in during each test. All HTTP requests were issued by a subset of all active users.


We analyzed the benchmarks and configurations from our Large testing phase and came up with the following recommendations:

Recommendation   Application nodes   Database node   Cost per hour 1   Apdex (6.13)   Apdex (6.14)
Performance      c5.4xlarge x 2      db.m4.2xlarge   2.09              0.852          0.874
Stability        c5.2xlarge x 4      db.m4.xlarge    1.72              0.817          0.837
Low cost         c5.2xlarge x 3      db.m4.xlarge    1.38              0.820          0.834


The Performance option offered the best Apdex among all the configurations we tested. It can maintain an Apdex above 0.8 even when reduced to a single node, and the service only goes offline once both nodes are lost.

The Stability and Low cost options offer a good balance between price, fault tolerance, and performance. You'll notice that they both use the same virtual machine types – the Stability option just has an additional application node. The Stability option can afford to lose more nodes before going completely offline, but the Low cost option costs less. Any performance difference between the two is negligible. 

Cost per hour

1 In our recommendations for both Large and XLarge size profiles, we quote a cost per hour for each configuration to help you compare their prices. This cost only covers the nodes used for the Confluence application and database. It doesn't include the cost of other components like Synchrony, the shared home, or the load balancer.

These figures are in USD, and were correct as of January 2019.
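If you want to sanity-check these totals, the arithmetic is simply the application nodes plus the single database node. The per-node prices in this sketch are approximate on-demand rates back-derived from the totals quoted on this page, not official figures; always check current AWS pricing for your region.

```python
# Approximate per-node hourly prices (USD), back-derived from the totals
# quoted on this page; check current AWS pricing for your region.
PRICES = {
    "c5.2xlarge": 0.34,
    "c5.4xlarge": 0.68,
    "db.m4.xlarge": 0.36,
    "db.m4.2xlarge": 0.73,
}

def cost_per_hour(app_type: str, app_nodes: int, db_type: str) -> float:
    """Application nodes plus one database node. Excludes Synchrony,
    shared home, and the load balancer, as noted above."""
    return app_nodes * PRICES[app_type] + PRICES[db_type]

print(f"Performance: {cost_per_hour('c5.4xlarge', 2, 'db.m4.2xlarge'):.2f}")  # 2.09
print(f"Stability:   {cost_per_hour('c5.2xlarge', 4, 'db.m4.xlarge'):.2f}")   # 1.72
print(f"Low cost:    {cost_per_hour('c5.2xlarge', 3, 'db.m4.xlarge'):.2f}")   # 1.38
```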

Large tests: results and analysis

We ran two types of tests to get these recommendations: one to determine optimal configurations for the Confluence application node and another for the database node:

  1. Our first test type sought to find out which AWS virtual machine types to use (and how many) for the application node. For these tests, we used a single db.m4.xlarge node for the database.
  2. Our second test series benchmarked different virtual machine types for the database. Here, we tested different virtual machine types against two application node configurations: two c5.4xlarge and four c5.2xlarge. These application node configurations yielded the highest Apdex from the previous test. In each test from this series, we only used one database node as well.



Application node test results

We first tested different virtual machine types for the application node against a single db.m4.xlarge node for the database in each. We ran these tests on Confluence 6.14, which was the latest version available at the time of our tests.

The following graph shows the actual Apdex benchmarks from each test:

These results show that the best performance came from c5.4xlarge nodes (each featuring 16 vCPUs and 32GB of RAM). You'll need at least two nodes for high availability, and c5.4xlarge produced a higher Apdex than the other types regardless of how many nodes we tested.

Using three or more c5.2xlarge nodes still produced an acceptable Apdex, and beyond four nodes we saw no considerable change in performance. This is worth considering if you want to use more than two nodes for reliability and fault tolerance: the hourly cost 1 of four c5.2xlarge nodes is roughly the same as two c5.4xlarge nodes.

Database nodes test results

The application node test results showed that using two c5.4xlarge nodes or four c5.2xlarge nodes provided acceptable Apdex results. Using this information, we moved on to the next series of tests – testing optimal configurations for the database node. We also ran these tests on Confluence 6.14 to benchmark the best-performing Confluence application node configurations against the following virtual machine types for the database node: 

  • db.m4.large
  • db.m4.xlarge
  • db.m4.2xlarge
  • db.m4.4xlarge

The following graph shows how one of each virtual machine type performed as a database node:

Using db.m4.large for the database and four c5.2xlarge nodes for the application yielded results below our Apdex threshold of 0.8. Throughout all tests, using two c5.4xlarge nodes for the application showed better performance. 

Using the db.m4.2xlarge node for the database provided the best performance on both application node configurations. Interestingly, using db.m4.4xlarge showed a slight regression in performance. This shows that for our Large-sized load, using virtual machine types more powerful than db.m4.2xlarge yielded no performance improvement.

Summary of test results for Confluence 6.14

The benchmarks from our application node tests demonstrate that when it comes to Confluence's performance, vertical scaling works better than horizontal scaling. In other words, you'll get better performance from fewer nodes with more powerful hardware than from more nodes with less powerful hardware.

The following table shows some information on all the configurations that produced an Apdex of 0.8 or higher on Confluence Data Center 6.14: 

Application nodes   Database node   Apdex   Cost per hour 1
c5.4xlarge x 2      db.m4.2xlarge   0.874   2.09
c5.4xlarge x 3      db.m4.xlarge    0.863   2.40
c5.4xlarge x 2      db.m4.xlarge    0.856   1.72
c5.2xlarge x 4      db.m4.2xlarge   0.855   2.09
c5.2xlarge x 4      db.m4.xlarge    0.837   1.72
c5.2xlarge x 3      db.m4.xlarge    0.834   1.38


From this table, we can see which configurations offered the best performance, stability, and lowest cost for 6.14:

Configuration   Application nodes   Database node   Apdex   Cost per hour 1
Performance     c5.4xlarge x 2      db.m4.2xlarge   0.874   2.09
Stability       c5.2xlarge x 4      db.m4.xlarge    0.837   1.72
Low cost        c5.2xlarge x 3      db.m4.xlarge    0.834   1.38

We included the Stability configuration here because it features better fault tolerance than both Performance and Low cost options. The Stability configuration can lose four application nodes before the service goes offline (the Performance and Low cost ones can only lose two and three, respectively). Our tests also show that if the Low cost configuration loses one node, the Apdex dips below 0.8. An admin has more time to handle the loss of nodes on the Stability configuration. 

When choosing a configuration, you might also want to consider its performance when a node goes offline. Our tests show that the Performance option's Apdex remains above 0.8 even when reduced to just one node, and the Stability option stays above 0.8 after losing one node. The Low cost option's Apdex, however, dips below 0.8 if it loses even one node.

Validating results for Confluence 6.13 (Enterprise release)

We also re-tested the Performance, Stability, and Low cost configurations from Confluence Data Center 6.14 to check whether they were still valid for 6.13. The following table shows the results from these tests:

Configuration   Application nodes   Database node   Cost per hour 1   Apdex (6.13)   Apdex (6.14)
Performance     c5.4xlarge x 2      db.m4.2xlarge   2.09              0.852          0.874
Stability       c5.2xlarge x 4      db.m4.xlarge    1.72              0.817          0.837
Low cost        c5.2xlarge x 3      db.m4.xlarge    1.38              0.820          0.834

The Stability and Low cost options show similar performance. This means that, between the two, price is the main trade-off for fault tolerance. As always, the Performance option has the highest price and Apdex of all three recommendations.

Recommendations for XLarge-sized instances

The following table shows the data set and traffic we used on our performance tests for XLarge-size instances:

Load profile metric                 Value
Total Spaces                        10,500
Content (All Versions)              34,900,000
Local Users                         102,000
Traffic (HTTP requests per hour)    1,000,000

The following sections provide more data set and traffic details.

Standard size for XLarge

According to Confluence Data Center load profiles, an XLarge-sized instance falls within the following range for each metric:

  • Total spaces: 5,000 to 50,000
  • Content (all versions): 10 million to 25 million
  • Local users: 100,000 to 250,000
  • HTTP calls per hour: 700,000 to 1,000,000
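Since this page quotes explicit ranges for both the Large and XLarge profiles, a quick way to see where your own metrics land is to check them against those ranges. A minimal sketch (the helper and metric keys are illustrative, not part of any Confluence API):

```python
# Check which size profile a set of metrics falls into, using only the
# Large and XLarge ranges quoted on this page.

RANGES = {
    "Large": {
        "total_spaces": (2_500, 5_000),
        "content_all_versions": (2_500_000, 10_000_000),
        "local_users": (10_000, 100_000),
        "http_calls_per_hour": (350_000, 700_000),
    },
    "XLarge": {
        "total_spaces": (5_000, 50_000),
        "content_all_versions": (10_000_000, 25_000_000),
        "local_users": (100_000, 250_000),
        "http_calls_per_hour": (700_000, 1_000_000),
    },
}

def matching_profiles(metrics: dict) -> list:
    """Return every profile whose ranges contain all of the given metrics."""
    return [
        name
        for name, ranges in RANGES.items()
        if all(lo <= metrics[key] <= hi for key, (lo, hi) in ranges.items())
    ]

# A hypothetical instance that sits comfortably inside the XLarge ranges:
print(matching_profiles({
    "total_spaces": 12_000,
    "content_all_versions": 20_000_000,
    "local_users": 150_000,
    "http_calls_per_hour": 900_000,
}))  # ['XLarge']
```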

Test data set and traffic breakdown

Our XLarge data set puts the overall load of the instance in the lower range of XLarge. We believe that these metrics represent a majority of real-life, XLarge-sized Confluence Data Center instances.


Metric (total)                            Components (approximate values)
Total Spaces: 10,500                      Site Spaces: 5,800; Personal Spaces: 5,000
Content (All Versions): 34,900,000        Content (Current Versions): 26,600,000; Comments: 15,800,000
Local Users: 102,000                      Active Users: 1,200
Traffic (HTTP requests): 1,000,000/hour   Reads: 1 kbps; Writes: 5,500 kbps


Our data set also uses 9,900 local groups. 


Good to know

  • Content (all versions) is the total number of all versions of all pages, blog posts, comments, and files in the instance. It's the total number of rows in the CONTENT table in the Confluence database.

  • Content (current versions) is the number of pages, blog posts, comments, and files in the instance. It doesn't include historical versions of pages, blog posts, or files.

  • Local Users is the number of user accounts from local and remote directories that have been synced to the database. It includes both active and inactive accounts.

  • Active Users is the number of all Local Users logged in during each test. All HTTP requests were issued by a subset of all active users.


We analyzed the benchmarks and configurations from our XLarge testing phase and came up with the following recommendations:

Configuration   Application nodes   Database node   Cost per hour 1   Apdex (6.13)   Apdex (6.15)
Stability       c5.4xlarge x 4      db.m4.2xlarge   3.45              0.810          0.826
Low cost        c5.4xlarge x 3      db.m4.2xlarge   2.77              0.811          0.825

The Stability configuration can maintain acceptable performance (that is, an Apdex above 0.8) even if it loses one application node. At four application nodes, it is also more fault tolerant overall than the Low cost configuration.

The Stability and Low cost configurations are identical except for the number of application nodes, and their Apdex scores don't differ much either. The Low cost configuration is fairly fault tolerant, in that it can afford to lose three nodes before the service goes offline. However, our tests show that if the Low cost configuration loses one node, its Apdex dips below 0.8.

XLarge tests: results and analysis

For our XLarge tests, we first tested different virtual machine types for the application node, each against a single database node. We ran these tests on Confluence 6.15, which was the latest available version at the time.

After checking which configurations produced the best performance, we re-tested them on Confluence 6.13.


Application node test results

We first tested different virtual machine types for the application node against a single db.m4.xlarge node for the database in each. We ran these tests on Confluence 6.15, which was the latest available version at the time.

However, none of those tests resulted in Apdex scores above 0.8. The following graph shows our actual Apdex benchmarks when we re-ran the same tests using db.m4.2xlarge for the database:

Among the virtual machine types we tested for the application node, the only one that performed at an Apdex above 0.8 was c5.4xlarge. It did so at three or four nodes; once we went down to two nodes, the Apdex dipped below 0.8.

Database nodes test results

We also tested different combinations of virtual machine instances for the application node using db.m4.4xlarge for the database node, but did not see any improvement in Apdex.

Practical examples

If you'd like to see practical applications of these recommendations, check out these resources:

Both are Large-sized Confluence Data Center instances hosted on AWS. They also use the virtual machine types from the Low cost configuration, but with four application nodes for even better fault tolerance. In our production Confluence instance, this configuration still gives our users acceptable performance, even with our overall load and installed apps.

We're here to help

Over time, we may change our recommendations depending on new tests, insights, or improvements in Confluence Data Center. Follow this page in case that happens. Contact an Atlassian Technical Account Manager for more guidance on choosing the right configuration for your Data Center instance.

Our Premier Support team performs health checks by meticulously analyzing your application and logs to ensure that your infrastructure configuration is suitable for your Data Center application. If the health check process reveals any performance gaps, Premier Support will recommend possible changes.

