Infrastructure recommendations for enterprise Confluence instances on AWS

Atlassian no longer supports the AWS Quick Start template as a deployment method. You can still use the template, but we won't maintain or update it.

We recommend deploying your Data Center products on a Kubernetes cluster using our Helm charts for a more efficient and robust infrastructure and operational setup. Learn more about deploying on Kubernetes.

AWS now recommends switching from launch configurations, which our AWS Quick Start template uses, to launch templates. We won't make this switch, however, as we've ended our support for the AWS Quick Start template. This means you're no longer able to create launch configurations with this template.

In Confluence Data Center load profiles, we presented simple guidelines for finding out if your instance was Small, Medium, Large, or XLarge. We based these size profiles on different Server and Data Center case studies, covering instances of varying infrastructure sizes and configurations. Knowing your load profile is useful for planning for your company's growth, looking for inflated metrics, or simply reviewing your infrastructure's suitability.

Recommendations are based on older versions of Confluence

The recommendations on this page are based on tests conducted on older versions of Confluence. We tested:

  • XLarge data set on Confluence Data Center 6.13 and 6.15
  • Large data set on Confluence Data Center 6.13 and 6.14
  • Medium data set on Confluence Data Center 6.13

A single node can be adequate for most Small or Medium size deployments, especially if you don't require high availability. 
If you have an existing Server installation, you can still use its infrastructure when you upgrade to Data Center. Many features exclusive to Data Center (like SAML single sign-on, self-protection via rate limiting, and CDN support) don't require clustered infrastructure. You can start using these Data Center features by simply upgrading your Server installation's license.

For more information on whether clustering is right for you, check out Data Center architecture and infrastructure options.

As your load grows closer to Large or XLarge, you should routinely evaluate your infrastructure. Once your environment starts to experience performance or stability issues, consider migrating to a clustered (or cluster-ready) infrastructure. When you do, keep in mind that it may not always be clear how to do that effectively – for example, adding more application nodes to a growing Medium-sized instance doesn't always improve performance (in fact, the opposite might happen).

To help you plan your infrastructure set-up or growth, we ran a series of performance tests on typical Medium, Large, and XLarge instances. We designed these tests to get useful, data-driven recommendations for your clustered deployment's application and database nodes. These recommendations can help you plan a suitable clustered environment, one that is adequate for the size of your projected content and traffic.


Executive summary

Medium

Recommendation | Application nodes | Database node | Cost per hour 1 | Apdex (6.13)
Performance | c5.xlarge x 2 | m4.large | $0.522 | 0.929
Stability | c5.large x 4 | m4.large | $0.522 | 0.905


The Performance option offers the best Apdex among all the configurations we tested. It can maintain an Apdex above 0.9 even when it loses one node.

The Stability option offers better fault tolerance at the same cost, but there is a slight drop in performance.

Confluence performed well in all tests, demonstrating an Apdex above 0.90.

Large

Recommendation | Application nodes | Database node | Cost per hour 1 | Apdex (6.13) | Apdex (6.14)
Performance | c5.4xlarge x 2 | m4.2xlarge | $2.09 | 0.852 | 0.874
Stability | c5.2xlarge x 4 | m4.xlarge | $1.72 | 0.817 | 0.837
Low cost | c5.2xlarge x 3 | m4.xlarge | $1.38 | 0.820 | 0.834


The Performance option offered the best Apdex among all the configurations we tested. It can maintain an Apdex above 0.8 even when it loses one node.

The Stability and Low cost options offer a good balance between price, fault tolerance, and performance. You'll notice that they both use the same virtual machine types – the Stability option just has an additional application node. The Stability option can afford to lose more nodes before going completely offline, but the Low cost option costs less. Any performance difference between the two is negligible. 


XLarge

Configuration | Application nodes | Database node | Cost per hour 1 | Apdex (6.13) | Apdex (6.15)
Stability | c5.4xlarge x 4 | m4.2xlarge | $3.45 | 0.810 | 0.826
Low cost | c5.4xlarge x 3 | m4.2xlarge | $2.77 | 0.811 | 0.825

The Stability configuration can maintain acceptable performance (that is, an Apdex above 0.8) even if it loses one application node. At four application nodes, it is more fault tolerant overall than the Low cost configuration.

Both Stability and Low cost configurations are identical except for the number of nodes. Their Apdex scores don't differ much either. The Low cost configuration is fairly fault tolerant, in that the service stays online until all three of its application nodes are lost. However, our tests show that if the Low cost configuration loses even one node, its Apdex dips below 0.8.

Important note

Performance results depend on many factors, such as third-party apps, data, traffic, and instance types. The performance we achieved may not be reproducible in your environment. Make sure you read through our test methodology to understand the details behind these recommendations.



Approach

We ran all of our tests in AWS environments. This allowed us to easily define and automate many tests, giving us a large and fairly reliable sample of test results. 

Each part of our test infrastructure is a standard AWS component available to all AWS users. This means you can easily deploy our recommended configurations. 

Since we used standard AWS components, you can look up their specifications in the AWS documentation. This lets you find equivalent components and configurations if your organization prefers to use a different cloud platform or a bespoke clustered solution.

Some things to consider

To gather a large sample of benchmarks for analysis, we designed tests that could be easily set up and replicated. As such, when referencing our benchmarks and recommendations for your infrastructure plans, consider the following:

  • We didn't install apps on our test instances, as our focus was finding the right configurations for the core product. When designing your infrastructure, you need to account for the performance impact of apps you want to install.
  • We used PostgreSQL 9.4.15 with default AWS RDS settings across all our tests. This allowed us to get consistent results with minimal setup and tuning.
  • Our test environment used dedicated AWS infrastructure hosted on the same subnet. This helped minimize network latency. 

Analytics

We enabled Analytics on each test instance to collect usage data. For more information, see Data Collection Policy.

We also tested whether Analytics had a significant impact on performance (it doesn't)

We ran a separate test to find out the performance impact of enabling Analytics. To do this, we simulated production load on two identical Confluence Data Center instances. We loaded both instances with an identical Large-sized data set and ran the same amount of traffic on both. The only difference between them was that we enabled Analytics on one but not the other.

The following chart shows the box plot of key operations with Analytics enabled (Baseline - left) and disabled (Test - right).

Overall, we observed minimal difference in response times.

Disk I/O considerations

Our data set featured a limited number of attachments, so our traffic was composed mostly of write operations (compared to a typical production instance). On average, this traffic produced 1 kbps of reads and 2,500 kbps of writes on our shared home. This roughly equates to an average IOPS of 0.15 and 200 for reads and writes, respectively.

While we didn't set out to test disk I/O specifically, these results suggest that our shared home configuration (that is, a single m4.large node running NFS on gp2) was sufficient for our load. The disk load was stable throughout all our tests, and did not stress the NFS server.

Bear in mind, however, that the synthetic traffic we used here was mostly write traffic. This is not representative of typical production instances, which feature a higher proportion of reads.
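To make the relationship between those throughput and IOPS figures concrete, here is a minimal Python sketch of the average I/O size they imply, assuming the kbps figures above denote kilobytes per second (the numbers come from our measurements; the variable names are ours):

    # Average I/O size implied by the shared home figures above,
    # assuming "kbps" denotes kilobytes per second.
    read_kb_per_sec, read_iops = 1, 0.15
    write_kb_per_sec, write_iops = 2_500, 200

    print(read_kb_per_sec / read_iops)    # ~6.7 KB per read operation
    print(write_kb_per_sec / write_iops)  # 12.5 KB per write operation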

Methodology

We ran three separate test phases: one for Medium, one for Large, and another for XLarge. Each phase involved running a specific volume of traffic against the same Confluence data set, but on a freshly provisioned AWS environment. Each environment was exactly like the last, except for the configuration of application and database nodes.

Our objective was to benchmark different configurations for Medium, Large, and XLarge size profiles. Specifically, we analyzed how different AWS virtual machine types affected the performance of the instance.

Benchmark

For all tests in Medium, Large, and XLarge size profiles, we used an Apdex of 0.8 as our threshold for acceptable performance. This Apdex assumes that a 1-second response time is our Tolerating threshold, while anything above 4 seconds is our Frustrated threshold.

By comparison, we target an Apdex of 0.7 for our own internal production Confluence Data Center instances (as we discussed in Confluence Data Center sample deployment and monitoring strategy). However, that 0.7 Apdex takes into account the performance impact of apps on those instances. We don't have any apps installed on our test instances, so we adjusted the target Apdex for our tests to 0.8. 


How we compute each test's overall Apdex

Apdex (Application Performance Index) is a popular standard used by many companies to report, benchmark, and track application performance. See http://apdex.org/overview.html for more information.
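For reference, the standard Apdex formula is Apdex_T = (satisfied samples + tolerating samples / 2) / total samples, where T is the Tolerating threshold (1 second in our tests). The weighted calculation we describe below uses the same satisfied/tolerating/frustrated buckets, but weights each user action type by how often it occurs in real instances.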


To calculate our overall Apdex, we first assign an Apdex to each user action based on its response time:

Response time | Request Apdex
Less than 1 second | 1
1-4 seconds | 0.5
Over 4 seconds | 0


Next, we apply a weight to the Apdex of that action, based on its type. This weight represents how heavily the action's score affects the overall Apdex:

User action type | Weight (%)
confluence.page.view | 84
confluence.blogpost.view | 1
confluence.dashboard.view | 6
confluence.page.create.collaborative.view | 2
confluence.page.edit.collaborative.view | 5
confluence.page.edit.collaborative.quick-view | 2

A large number of Confluence customers provide us with usage statistics, and we based these weights on the top 1,000 instances by traffic volume. In those instances, page views (confluence.page.view) make up the overwhelming majority of user actions.

This means that each simulated user action gets scored as:

(Apdex) x (weight per page action type) = weighted Apdex

For example, if confluence.page.edit.collaborative.view takes 2 seconds to complete, it gets a weighted Apdex of 0.025 based on:

0.5 x 0.05 = 0.025

Finally, to get the overall Apdex, we add the weighted Apdex of all user actions. 
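To illustrate, here is a minimal Python sketch of this calculation using the thresholds and weights above. The function and variable names are ours, and we average each action type's request Apdex before applying its weight so the overall score stays between 0 and 1; treat this as an interpretation of the method described here, not our actual benchmark code:

    # Weights from the user action type table above, as fractions.
    WEIGHTS = {
        "confluence.page.view": 0.84,
        "confluence.blogpost.view": 0.01,
        "confluence.dashboard.view": 0.06,
        "confluence.page.create.collaborative.view": 0.02,
        "confluence.page.edit.collaborative.view": 0.05,
        "confluence.page.edit.collaborative.quick-view": 0.02,
    }

    def request_apdex(response_seconds):
        """Score one request: 1 if under 1s, 0.5 if 1-4s, 0 if over 4s."""
        if response_seconds < 1:
            return 1.0
        if response_seconds <= 4:
            return 0.5
        return 0.0

    def overall_apdex(samples):
        """samples: iterable of (action type, response time in seconds) pairs.
        Averages each action type's request Apdex, applies its weight,
        and sums the weighted scores. Unlisted action types get zero weight."""
        scores = {}
        for action, seconds in samples:
            scores.setdefault(action, []).append(request_apdex(seconds))
        return sum(
            WEIGHTS.get(action, 0.0) * (sum(vals) / len(vals))
            for action, vals in scores.items()
        )

    # A 2-second collaborative edit view contributes 0.5 x 0.05 = 0.025,
    # matching the worked example above.
    print(overall_apdex([("confluence.page.edit.collaborative.view", 2.0)]))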

Architecture

Architecture details

We used the same basic architecture to test Medium, Large, and XLarge data sets:

 

We tested each configuration on a freshly-deployed instance of Confluence Data Center on AWS. Every configuration followed the same basic structure:

Function | Number of nodes | Virtual machine type(s) | Notes
Confluence application | Variable | c5.large (Medium only), c5.xlarge, c5.2xlarge, c5.4xlarge, c4.8xlarge | For each test, nodes for both the Confluence application and Synchrony used the same virtual machine type. However, while we used a variable number of nodes for the Confluence application, we always used one node for Synchrony. When testing c5.xlarge (which has only 8GB of RAM), we used a 4GB JVM heap; on larger types, 8GB. When testing c5.large for Medium instances (which has only 4GB of RAM), we used a 2GB JVM heap.
Synchrony | 1 | (same as Confluence application) | See above.
Database | 1 | m4.xlarge, m4.2xlarge, m4.4xlarge | We used Amazon RDS PostgreSQL version 9.4.15, with default settings. Each test only featured one database node.
Shared home | 1 | m4.large | Our NFS server used a 200GB General Purpose SSD (gp2) EBS volume for storage, with a baseline of 600 IOPS, burstable to 3,000 IOPS.
Load balancer | 1 | AWS Application Load Balancer | N/A

Each Confluence application and Synchrony node used a 50GB General Purpose SSD (gp2) EBS volume for local storage, with a baseline of 150 IOPS, burstable to 3,000 IOPS.

Refer to General Purpose Instances and Compute-Optimized Instances (from the AWS documentation on Instance Types) for details on each virtual machine type we tested.
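The baseline and burst IOPS figures in the table follow from how AWS provisions gp2 volumes: roughly 3 IOPS per provisioned GB (with a 100 IOPS floor and a 16,000 IOPS cap), burstable to 3,000 IOPS for smaller volumes. Here is a quick Python sketch of that arithmetic, based on the standard gp2 formula rather than anything specific to our tests:

    def gp2_baseline_iops(volume_gb):
        """gp2 volumes get 3 IOPS per GB, floored at 100 and capped at 16,000."""
        return min(max(volume_gb * 3, 100), 16_000)

    print(gp2_baseline_iops(200))  # 600 -> the shared home NFS volume above
    print(gp2_baseline_iops(50))   # 150 -> each node's local storage above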

Recommendations for Medium-sized instances

The following table shows the data set and traffic we used on our performance tests for Medium-size instances:

Load profile metric | Value
Total Spaces | 1,700
Site Spaces | 1,600
Content (All Versions) | 1,520,000
Local Users | 9,800
Traffic (HTTP requests per hour) | 180,000

More data set and traffic details

Standard size for Medium

According to Confluence Data Center load profiles, a Medium-sized instance falls within the following range for each metric:

  • Total spaces: 1,000 to 2,500

  • Content (all versions): 500,000 to 2.5 million

  • Local users: 1,000 to 10,000

  • HTTP calls per hour: 70,000 to 350,000

Test data set and traffic breakdown

The metrics we used for our Medium data set are based on Confluence Data Center load profiles, which put the overall load profile of the instance in the middle range of Medium. We believe that these metrics represent a majority of real-life, Medium-sized Confluence Data Center instances.

Metric | Total | Components | Value (approximate)
Total Spaces | 1,700 | Site spaces | 1,600
 | | Personal spaces | 100
Content (all versions) | 1,520,000 | Content (current versions) | 1,490,000
 | | Comments | 30,000
Local Users | 9,800 | Active users | 94
Traffic (HTTP requests) | 180,000 per hour | | ~1,915 HTTP requests per hour, per user


Good to know

  • Content (all versions) is the total number of all versions of all pages, blog posts, comments, and files in the instance. It's the total number of rows in the CONTENT table in the Confluence database.

  • Content (current versions) is the number of pages, blog posts, comments, and files in the instance. It doesn't include historical versions of pages, blog posts, or files.

  • Local Users is the number of user accounts from local and remote directories that have been synced to the database. It includes both active and inactive accounts.

  • Active Users is the number of all Local Users logged in during each test. All HTTP requests were issued by a subset of all active users.

We analyzed the benchmarks and configurations from our Medium testing phase and came up with the following recommendations:

Recommendation | Application nodes | Database node | Apdex (6.13) | Cost per hour 1
Performance | c5.xlarge x 2 | m4.large | 0.929 | $0.522
Stability | c5.large x 4 | m4.large | 0.905 | $0.522

The Performance option offers the best Apdex among all the configurations we tested. It can maintain an Apdex above 0.9 even when it loses one node.

The Stability option offers better fault tolerance at the same cost, but there is a slight drop in performance.

Confluence performed well in all tests, demonstrating an Apdex above 0.90.

Cost per hour

1 In our recommendations for Medium size profiles, we quoted a cost per hour for each configuration. We provide this information to help you compare the price of each configuration. This cost covers only the nodes used for the Confluence application and database. It does not include the cost of other components like Synchrony, the shared home, or the application load balancer.

These figures are in USD for deployments in US-EAST, and were correct as of April 2019.
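If you want to reproduce this comparison for your own region or with current prices, the arithmetic is simply the application node price times the node count, plus the database node price. A minimal sketch (the prices here are placeholders to fill in from AWS's pricing pages, not the April 2019 rates we used):

    # Cost per hour as defined in the footnote above: application nodes plus
    # the database node only. Fill in current on-demand prices for your region.
    ON_DEMAND_USD_PER_HOUR = {
        "c5.xlarge": 0.0,  # placeholder: look up current pricing
        "m4.large": 0.0,   # placeholder: look up current pricing
    }

    def cost_per_hour(app_type, app_count, db_type):
        return (ON_DEMAND_USD_PER_HOUR[app_type] * app_count
                + ON_DEMAND_USD_PER_HOUR[db_type])

    print(cost_per_hour("c5.xlarge", 2, "m4.large"))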

Medium tests: results and analysis

We ran two types of tests to get these recommendations: one to determine optimal configurations for the Confluence application nodes, and another for the database node. The application node tests sought to find out which AWS virtual machine types to use (and how many). For these tests, we used a single m4.xlarge node for the database.


Application node test results

We first tested different virtual machine types for the application node against a single m4.xlarge node for the database in each. We ran these tests on Confluence 6.13, which was the latest Long Term Support release available at the time of our tests.

The following graph shows the Apdex benchmarks at different virtual machine types for the application node against a single m4.xlarge node:


These results show that the best performance came from instance sizes equal to or larger than c5.xlarge (4 CPUs). As shown in the graph, there were no further Apdex improvements from using more powerful hardware; Apdex remained the same for c5.2xlarge (8 CPUs) and c5.4xlarge (16 CPUs).

For this reason, we recommend two c5.xlarge nodes for best performance or four c5.large nodes for better fault tolerance.

Database nodes test results

The application node tests showed that using two c5.xlarge nodes or four c5.large nodes provided acceptable Apdex results. Using this information, we moved on to the next series of tests – testing optimal configurations for the database node. We ran these tests on Confluence 6.13.

We benchmarked the best-performing Confluence application node configurations against the following virtual machine types for the database node:

  • m4.large

  • m4.xlarge

  • m4.2xlarge

  • m4.4xlarge

The following graph shows how one of each virtual machine type performed as a database node:

There were no further performance improvements when we used a larger database node. Thus, for a Medium-sized load, we recommend the m4.large instance type.

Recommendations for Large-sized instances

The following table shows the data set and traffic we used on our performance tests for Large-size instances:

Load profile metric | Value
Total Spaces | 6,550
Content (All Versions) | 16,000,000
Local Users | 12,300
Traffic (HTTP requests per hour) | 498,000
More data set and traffic details

Standard size for Large

According to Confluence Data Center load profiles, a Large-sized instance falls within the following range for each metric:

  • Total spaces: 2,500 to 5,000
  • Content (all versions): 2.5 million to 10 million
  • Local users: 10,000 to 100,000
  • HTTP calls per hour: 350,000 to 700,000


The metrics we used for our Large data set are based on Confluence Data Center load profiles, which put the overall load profile of the instance at the upper range of Large. We believe that these metrics represent a majority of real-life, Large-sized Confluence Data Center instances.

Metric | Total | Components | Value (approximate)
Total Spaces | 6,550 | Site Spaces | 1,500
 | | Personal Spaces | 5,000
Content (all versions) | 16,000,000 | Content (Current Versions) | 6,900,000
 | | Comments | 2,000,000
Local Users | 12,300 | Active Users | 565
Traffic (HTTP requests) | 498,000 per hour | | N/A

Our data set also uses 9,900 local groups.



Good to know

  • Content (all versions) is the total number of all versions of all pages, blog posts, comments, and files in the instance. It's the total number of rows in the CONTENT table in the Confluence database.

  • Content (current versions) is the number of pages, blog posts, comments, and files in the instance. It doesn't include historical versions of pages, blog posts, or files.

  • Local Users is the number of user accounts from local and remote directories that have been synced to the database. It includes both active and inactive accounts.

  • Active Users is the number of all Local Users logged in during each test. All HTTP requests were issued by a subset of all active users.


We analyzed the benchmarks and configurations from our Large testing phase and came up with the following recommendations:

Recommendation | Application nodes | Database node | Cost per hour 1 | Apdex (6.13) | Apdex (6.14)
Performance | c5.4xlarge x 2 | m4.2xlarge | $2.09 | 0.852 | 0.874
Stability | c5.2xlarge x 4 | m4.xlarge | $1.72 | 0.817 | 0.837
Low cost | c5.2xlarge x 3 | m4.xlarge | $1.38 | 0.820 | 0.834


The Performance option offered the best Apdex among all the configurations we tested. It can maintain an Apdex above 0.8 even when it loses one node.

The Stability and Low cost options offer a good balance between price, fault tolerance, and performance. You'll notice that they both use the same virtual machine types – the Stability option just has an additional application node. The Stability option can afford to lose more nodes before going completely offline, but the Low cost option costs less. Any performance difference between the two is negligible. 

Cost per hour

1 In our recommendations for both Large and XLarge size profiles, we quoted a cost per hour for each configuration. We provide this information to help you compare the price of each configuration. This cost covers only the nodes used for the Confluence application and database. It does not include the cost of other components like Synchrony, the shared home, or the application load balancer.

These figures are in USD for deployments in US-EAST, and were correct as of April 2019.

Large tests: results and analysis

We ran two types of tests to get these recommendations: one to determine optimal configurations for the Confluence application node and another for the database node:

  1. Our first test type sought to find out which AWS virtual machine types to use (and how many) for the application node. For these tests, we used a single m4.xlarge node for the database.
  2. Our second test series benchmarked different virtual machine types for the database. Here, we tested different virtual machine types against two application node configurations: two c5.4xlarge nodes and four c5.2xlarge nodes. These application node configurations yielded the highest Apdex in the previous test. In each test from this series, we used only one database node.


More details about these test results

Application node test results

We first tested different virtual machine types for the application node against a single m4.xlarge node for the database in each. We ran these tests on Confluence 6.14, which was the latest version available at the time of our tests.

The following graph shows the actual Apdex benchmarks from each test:

These results show that the best performance came from c5.4xlarge nodes, each of which features 16 CPUs and 32GB of RAM. Their Apdex turned out higher than the other virtual machine types regardless of how many nodes we tested (you'll need at least two nodes for high availability).

Using three or more c5.2xlarge nodes still showed acceptable Apdex. Beyond four nodes, we didn't see any considerable change in performance. This is worth considering if you want to use more than two nodes for reliability and fault tolerance. The hourly cost 1 of four c5.2xlarge nodes is roughly the same as two c5.4xlarge nodes.

Database nodes test results

The application node test results showed that using two c5.4xlarge nodes or four c5.2xlarge nodes provided acceptable Apdex results. Using this information, we moved on to the next series of tests – testing optimal configurations for the database node. We also ran these tests on Confluence 6.14 to benchmark the best-performing Confluence application node configurations against the following virtual machine types for the database node: 

  • m4.large
  • m4.xlarge
  • m4.2xlarge
  • m4.4xlarge

The following graph shows how one of each virtual machine type performed as a database node:

Using m4.large for the database and four c5.2xlarge nodes for the application yielded results below our Apdex threshold of 0.8. Throughout all tests, using two c5.4xlarge nodes for the application showed better performance. 

Using the m4.2xlarge node for the database provided the best performance on both application node configurations. Interestingly, using m4.4xlarge showed a slight regression in performance. This shows that for our Large-sized load, using virtual machine types more powerful than m4.2xlarge yielded no performance improvement.


Summary of test results for Confluence 6.14

The benchmarks from our application node tests demonstrate that when it comes to Confluence's performance, vertical scaling works better than horizontal scaling. That is, you'll get better performance from fewer nodes with more powerful hardware than from more nodes with less powerful hardware.

The following table shows some information on all the configurations that produced an Apdex of 0.8 or higher on Confluence Data Center 6.14: 

Application nodes | Database node | Apdex | Cost per hour 1
c5.4xlarge x 2 | m4.2xlarge | 0.874 | $2.09
c5.4xlarge x 3 | m4.xlarge | 0.863 | $2.40
c5.4xlarge x 2 | m4.xlarge | 0.856 | $1.72
c5.2xlarge x 4 | m4.2xlarge | 0.855 | $2.09
c5.2xlarge x 4 | m4.xlarge | 0.837 | $1.72
c5.2xlarge x 3 | m4.xlarge | 0.834 | $1.38


From this table, we can see which configurations produced the best performance, the best stability, and the lowest cost for 6.14:

Configuration | Application nodes | Database node | Apdex | Cost per hour 1
Performance | c5.4xlarge x 2 | m4.2xlarge | 0.874 | $2.09
Stability | c5.2xlarge x 4 | m4.xlarge | 0.837 | $1.72
Low cost | c5.2xlarge x 3 | m4.xlarge | 0.834 | $1.38

We included the Stability configuration here because it features better fault tolerance than both the Performance and Low cost options. The Stability configuration stays online until all four of its application nodes are lost (the Performance and Low cost configurations go offline after losing two and three nodes, respectively). Our tests also show that if the Low cost configuration loses one node, its Apdex dips below 0.8. An admin has more time to handle the loss of nodes on the Stability configuration.

When choosing a configuration, you might also want to consider its performance when a node goes offline. Our tests show that when the Performance option is reduced to just one node, its Apdex still remains above 0.8. The Stability option can also stay above 0.8 with the loss of one node. The Low cost option's Apdex, however, will dip below 0.8 if it loses even just one node.

Validating results for Confluence 6.13 (Enterprise release)

We also re-tested the Performance, Stability, and Low cost configurations from Confluence Data Center 6.14 to check whether they were still valid for 6.13. The following table shows the results from these tests:

Configuration | Application nodes | Database node | Cost per hour 1 | Apdex (6.13) | Apdex (6.14)
Performance | c5.4xlarge x 2 | m4.2xlarge | $2.09 | 0.852 | 0.874
Stability | c5.2xlarge x 4 | m4.xlarge | $1.72 | 0.817 | 0.837
Low cost | c5.2xlarge x 3 | m4.xlarge | $1.38 | 0.820 | 0.834

The Stability and Low cost options show similar performance. This means that, between the two, price is the main trade-off for fault tolerance. As always, the Performance option has the highest price and Apdex of all three recommendations.

Recommendations for XLarge-sized instances

The following table shows the data set and traffic we used on our performance tests for XLarge-size instances:

Load profile metric | Value
Total Spaces | 10,500
Content (All Versions) | 34,900,000
Local Users | 102,000
Traffic (HTTP requests per hour) | 1,000,000
More data set and traffic details

Standard size for XLarge

According to Confluence Data Center load profiles, an XLarge-sized instance falls within the following range for each metric:

  • Total spaces: 5,000 to 50,000
  • Content (all versions): 10 million to 25 million
  • Local users: 100,000 to 250,000
  • HTTP calls per hour: 700,000 to 1,000,000

Test data set and traffic breakdown

Our XLarge data set puts the overall load of the instance in the lower range of XLarge. We believe that these metrics represent a majority of real-life, XLarge-sized Confluence Data Center instances.

Metric | Total | Components | Value (approximate)
Total Spaces | 10,500 | Site Spaces | 5,800
 | | Personal Spaces | 5,000
Content (All Versions) | 34,900,000 | Content (Current Versions) | 26,600,000
 | | Comments | 15,800,000
Local Users | 102,000 | Active Users | 1,200
Traffic (HTTP requests) | 1,000,000 per hour | Reads (kbps) | 1
 | | Writes (kbps) | 5,500

Our data set also uses 9,900 local groups. 



Good to know

  • Content (all versions) is the total number of all versions of all pages, blog posts, comments, and files in the instance. It's the total number of rows in the CONTENT table in the Confluence database.

  • Content (current versions) is the number of pages, blog posts, comments, and files in the instance. It doesn't include historical versions of pages, blog posts, or files.

  • Local Users is the number of user accounts from local and remote directories that have been synced to the database. It includes both active and inactive accounts.

  • Active Users is the number of all Local Users logged in during each test. All HTTP requests were issued by a subset of all active users.


We analyzed the benchmarks and configurations from our XLarge testing phase and came up with the following recommendations:

Configuration | Application nodes | Database node | Cost per hour 1 | Apdex (6.13) | Apdex (6.15)
Stability | c5.4xlarge x 4 | m4.2xlarge | $3.45 | 0.810 | 0.826
Low cost | c5.4xlarge x 3 | m4.2xlarge | $2.77 | 0.811 | 0.825

The Stability configuration can maintain acceptable performance (that is, an Apdex above 0.8) even if it loses one application node. At four application nodes, it is more fault tolerant overall than the Low cost configuration.

Both Stability and Low cost configurations are identical except for the number of nodes. Their Apdex scores don't differ much either. The Low cost configuration is fairly fault tolerant, in that the service stays online until all three of its application nodes are lost. However, our tests show that if the Low cost configuration loses even one node, its Apdex dips below 0.8.

XLarge tests: results and analysis

For our XLarge tests, we first tested different virtual machine types for the application node against a single m4.xlarge node for the database in each. We ran these tests on Confluence 6.15, which was the latest available version at the time.

After checking which configurations produced the best performance, we re-tested them on Confluence 6.13.

More details about these test results

Application node test results

We first tested different virtual machine types for the application node against a single m4.xlarge node for the database in each. We ran these tests on Confluence 6.15, which was the latest available version at the time.

However, none of those tests resulted in Apdex scores above 0.8. The following graph shows our actual Apdex benchmarks when we ran the same tests using m4.2xlarge for the database:


Among the virtual machine types we tested for the application node, the only one that performed at an Apdex above 0.8 was c5.4xlarge. It did so at three or four nodes; at two nodes, the Apdex dipped below 0.8.

Database nodes test results

We also tested different combinations of virtual machine instances for the application node using m4.4xlarge for the database node, but did not see any improvement in Apdex.

Practical examples

If you'd like to see practical applications of these recommendations, check out these resources:

Both are Large-sized Confluence Data Center instances hosted on AWS. They also use the virtual machine types from the Low cost node configuration. However, we use four application nodes for even better fault tolerance. In our production Confluence instance, this configuration still gives our users acceptable performance, even with our overall load and installed apps.

We're here to help

Over time, we may change our recommendations depending on new tests, insights, or improvements in Confluence Data Center. Follow this page in case that happens. Contact Atlassian Advisory Services for more guidance on choosing the right configuration for your Data Center instance.

Our Premier Support team performs health checks by meticulously analyzing your application and logs to ensure that your infrastructure configuration is suitable for your Data Center application. If the health check process reveals any performance gaps, Premier Support will recommend possible changes.

