Data Center: How can I decide if my company should make the move from Server to Data Center?
Objective
How can I decide if my company should make the move from Server to Data Center?
Environment
Data Center / Server
Procedure
We've designed a white paper for customers interested in making the move from Server to Data Center. For information to help your growing team scale Atlassian applications across your organization, download our tips and best practices PDF here.
For many of our customers, there comes a time in their Atlassian journey when they need more availability and performance than a single server or federated environment can provide. As applications grow across an organization, they become mission-critical to every team’s success. We call this the “tipping point” for moving to an active-active clustered environment that provides high availability and supports performance at scale. Here are some criteria for customers considering the move to Data Center. Note: you may not meet all of these criteria today, but if growth is in your plans, think about preparing now.
Users
Consider how many users access your Atlassian applications each day. Are you at or approaching 500? We’ve found that the tipping point for JIRA Software, Confluence, and Bitbucket customers who need more stability tends to fall between the 500 and 1,000 user marks. In fact, roughly 45% of Data Center customers upgrade to this offering at the 500 or 1,000 user tier. When it comes to JIRA Service Desk, 50% of Data Center customers make the move at the 50 agent tier.
As development teams grow, their repos grow alongside them. For distributed teams, this can mean slower clone times between the main instance and remote teams. To reduce this pain, Bitbucket Data Center offers Smart Mirroring, which makes read-only copies of repos available on a nearby mirror in a remote location. Mirrors can cut clone and fetch times from hours to minutes, letting users get what they need faster.
Performance
For customers on large instances, performance degradation usually happens under high load or at peak times. As more and more users access the application at the same time, response times increase, users get frustrated, and system administrators look for solutions to minimize pain (for users, and themselves). Many global companies experience this when multiple geographic locations come online at the same time. At Atlassian, we experienced this first-hand when our Sydney teams started their day: we had hundreds of concurrent users logging on to a system that already had hundreds online. This usually caused our San Francisco and Austin offices, in addition to our Sydney office, to struggle with slow page load times or brief periods offline.
In addition to concurrent usage, other running jobs, like API calls and queries, can affect performance. Adding these on top of your users’ traffic only exacerbates the problem. Data Center provides the ability to use a load balancer to direct certain types of traffic to certain nodes in your cluster. This allows you to compartmentalize resources so that all of your requests maintain the best performance possible. For example, you could direct all of your API traffic to a specific node (or number of nodes). This way, your normal user traffic is never slowed down by ongoing API jobs.
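The traffic-splitting idea above can be illustrated with a minimal Python sketch. It is not how any particular load balancer is configured; it only models the routing decision. The node names are hypothetical, and the sketch assumes API requests can be recognized by a "/rest/" path prefix, which you would need to confirm for your own applications.

```python
from itertools import cycle

# Hypothetical node names; a real cluster would list its own hosts.
API_NODES = ["node-api-1"]
USER_NODES = ["node-web-1", "node-web-2"]

# Assumption: API requests are identifiable by this path prefix.
API_PREFIX = "/rest/"


class Router:
    """Send API traffic to a dedicated pool and everything else to the
    user pool, rotating round-robin within each pool."""

    def __init__(self, api_nodes, user_nodes):
        self._api = cycle(api_nodes)
        self._user = cycle(user_nodes)

    def route(self, path: str) -> str:
        """Return the node that should serve the request at `path`."""
        pool = self._api if path.startswith(API_PREFIX) else self._user
        return next(pool)


router = Router(API_NODES, USER_NODES)
for request_path in ["/browse/PROJ-1", "/rest/api/2/search", "/secure/Dashboard.jspa"]:
    print(request_path, "->", router.route(request_path))
```

Because the two pools are independent, a burst of API jobs only consumes capacity on the API nodes, which is the compartmentalization described above.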
Downtime
There are typically two primary causes of downtime: application-side and server-side issues. On the application side, issues are often a result of JVM errors, the most common of which is the heap being overloaded. That is, the memory dedicated on the server for running the application gets too full and causes the application to fail. Another common application-side issue is the database connection being overloaded with requests, which also causes the application to fail. Server-side issues can range from planned maintenance to unplanned upgrades/installations to resources like CPU, RAM, or storage on the server being overwhelmed and causing an outage.
Whatever the source of the outage, the result is lost productivity from hundreds or thousands of employees being unable to work. Those costs can quickly add up. How many people in your organization depend on JIRA Software, Bitbucket, Confluence, or JIRA Service Desk to get their jobs done? What does an hour of downtime equate to in lost opportunity cost?
Data Center significantly reduces this risk. If one server in your cluster goes down, the others take on the load. Instead of productivity grinding to a halt until the server is back up and running, traffic is redirected to an active server and business continues as usual.
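To put a rough number on the question of what an hour of downtime costs, a back-of-the-envelope estimate can help. The sketch below is only one simple way to model it, assuming cost scales linearly with the number of affected users, their hourly cost, and the share of their work that is blocked. The function name and every figure in the example are hypothetical placeholders, not Atlassian data.

```python
def downtime_cost(
    affected_users: int,
    hourly_cost_per_user: float,
    hours_down: float,
    productivity_loss: float = 1.0,
) -> float:
    """Estimate lost productivity for one outage.

    productivity_loss is the fraction (0.0 to 1.0) of each user's work
    that is blocked while the application is unavailable.
    """
    if not 0.0 <= productivity_loss <= 1.0:
        raise ValueError("productivity_loss must be between 0.0 and 1.0")
    if affected_users < 0 or hourly_cost_per_user < 0 or hours_down < 0:
        raise ValueError("inputs must be non-negative")
    return affected_users * hourly_cost_per_user * hours_down * productivity_loss


# Hypothetical inputs: 1,000 users at 50.0 per hour, one hour of
# downtime, with half of their work blocked.
estimate = downtime_cost(1000, 50.0, 1.0, productivity_loss=0.5)
print(f"Estimated cost of the outage: {estimate:,.2f}")
```

Substituting your own headcount and cost figures gives a baseline to weigh against the cost of running a clustered deployment.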