Adjusting your code for rate limiting
Whether it’s a script, integration, or app you’re using — if it’s making external REST API requests, it will be affected by rate limiting. Until now, you could send an unlimited number of REST API requests to retrieve data from Confluence, so we’re guessing you haven’t put any restrictions on your code. When admins enable rate limiting in Confluence, there’s a chance your requests will get limited eventually, so we want to help you prepare for that.
Before you begin
To better understand the strategies we’ve described here, it’s good to have some basic knowledge about rate limiting in Confluence. When in doubt, head to Improving instance stability with rate limiting and have a look at the first paragraph.
Strategies
We’ve created a set of strategies you can apply in your code so it works with rate limits. From very specific to more universal, these reference strategies will give you a base, which you can further refine to make an implementation that works best for you.
1. Exponential backoff
This strategy is the most universal and the least complex to implement. It’s not expecting HTTP headers or any information specific to a rate limiting system, so the same code will work for the whole Atlassian suite, and most likely non-Atlassian products, too. The essence of using it is observing whether you’re already limited (wait and retry, until requests go through again) or not (just keep sending requests until you’re limited).
Universal, works with any rate limiting system.
Doesn’t require too much knowledge about limits or a rate limiting system.
High impact on a Confluence instance because of concurrency. We’re assuming most active users will send requests whenever tokens are available. This window will be similar for all users, which causes spikes in load on Confluence. The same applies to threads: most will either be busy at the same time or idle.
Unpredictable. If you need to make a few critical requests, you can’t be sure all of them will be successful.
Summary of this strategy
Here’s the high-level overview of how to adjust your code:
- Active: Make requests until you encounter a 429. Keep concurrency to a minimum to know exactly when you reached your rate limit.
- Timeout: After you receive a 429, start the timeout. Set it to 1 second for starters. It’s a good idea to add some extra wait on top of your chosen timeout, up to 50% of it.
- Retry: After the timeout has passed, make requests again:
  - Success: If you get a 2xx response, go back to the Active step and make more requests.
  - Limited: If you get a 429 response, go back to the Timeout step and double the previous timeout. You can stop once the timeout reaches a certain threshold, like 20 minutes, if that’s enough to make your requests work.
With this strategy, you’ll deplete tokens as quickly as possible, and then make subsequent requests to actively monitor the rate limiting status on the server side. It guarantees you’ll get a 429 if your rate is above the limits.
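The steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `send` is a hypothetical callable that performs one REST request and returns the HTTP status code, and the parameter names (`initial_timeout`, `max_timeout`) are our own. Giving up once the timeout passes the threshold is one possible reading of “stop once you reach a certain threshold”.

```python
import random
import time
from typing import Callable


def request_with_backoff(
    send: Callable[[], int],
    initial_timeout: float = 1.0,
    max_timeout: float = 20 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` until it returns a status other than 429.

    `send` is a placeholder (assumption) for whatever makes your REST call
    and returns the HTTP status code.
    """
    timeout = initial_timeout
    while True:
        status = send()
        if status != 429:
            return status  # not limited: the caller can keep making requests
        if timeout > max_timeout:
            # Assumption: give up once the timeout has grown past the threshold.
            raise RuntimeError("still rate limited after backing off")
        # Wait up to 50% longer than the chosen timeout so that clients
        # limited at the same moment do not all retry together.
        sleep(timeout * (1 + random.uniform(0, 0.5)))
        timeout *= 2  # double the timeout for the next 429
```

Passing `sleep` as a parameter is only there to make the function easy to test; in a real script you would leave the default.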
2. Specific timed backoff
This strategy is a bit more specific, as it uses the retry-after header. We’re considering this header an industry standard and plan to use it across the Atlassian suite, so you can still be sure the same code will work for Bitbucket and Confluence, Data Center and Cloud, etc. This strategy makes sure that you will not be limited because you’ll know exactly how long you need to wait before you’re allowed to make new requests.
Universal, works with any rate limiting system within the Atlassian suite (and other products using retry-after): Bitbucket and Confluence, Server and Cloud, etc.
Doesn’t require too much knowledge about limits or a rate limiting system.
High impact on a Confluence instance because of concurrency. We’re assuming most active users will send requests whenever tokens are available. This window will be similar for all users, which causes spikes in load on Confluence. The same applies to threads: most will either be busy at the same time or idle.
Summary of this strategy
Here’s a high-level overview of how to adjust your code:
- Active: Make requests and observe the retry-after response header, which shows the number of seconds you need to wait to get new tokens. Keep concurrency to a minimum to know exactly when the rate limit kicks in.
  - Success: If the header says 0, you can make more requests right away.
  - Limited: If the header has a number greater than 0, for example 5, you need to wait that number of seconds.
- Timeout: If the header is anything above 0, start the timeout with the number of seconds specified in the header. Consider increasing the timeout by a random fraction, up to 20%.
- Retry: After the timeout specified in the header has passed, go back to the Active step and make more requests.
With this strategy, you’ll deplete tokens as quickly as possible, and then pause until you get new tokens. You should never hit a 429 if your code is the only agent depleting tokens and sending requests synchronously.
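As a rough sketch of this loop, assuming a hypothetical `send(item)` callable that performs one request and returns the status code together with the response headers: the helper names and the 1-second fallback for a 429 that arrives without a usable retry-after value are our assumptions, not part of Confluence.

```python
import random
import time
from typing import Callable, Iterable, List, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def wait_from_headers(headers: Mapping[str, str]) -> float:
    """Return the retry-after value in seconds, or 0 if absent or unparsable."""
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(lowered.get("retry-after", "0")))
    except ValueError:
        return 0.0  # non-numeric values are not handled in this sketch


def send_all(
    items: Iterable[str],
    send: Callable[[str], Response],
    sleep: Callable[[float], None] = time.sleep,
    fallback_wait: float = 1.0,
) -> List[int]:
    """Send one request per item, one at a time, pausing as retry-after says."""
    statuses = []
    for item in items:
        while True:
            status, headers = send(item)
            wait = wait_from_headers(headers)
            if status == 429 and wait == 0:
                wait = fallback_wait  # assumption: avoid retrying in a tight loop
            if wait > 0:
                # Stretch the wait by a random fraction of up to 20%.
                sleep(wait * (1 + random.uniform(0, 0.2)))
            if status != 429:
                break  # request went through; move on to the next item
        statuses.append(status)
    return statuses
```

Requests are sent sequentially on purpose, which matches the advice to keep concurrency to a minimum.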
3. Rate adjustment
This strategy is very specific and expects particular response headers, so it’s most likely to work for Confluence Data Center only. When making requests, you’ll observe headers returned by the server (number of tokens, fill rate, time interval) and adjust your code specifically to the number of tokens you have and can use.
It can have the least performance impact on a Confluence instance if used optimally.
Highly recommended, especially for integrations that require high-volume traffic.
Safe, as you can easily predict that all requests that must go through will in fact go through. It also allows for a great deal of customization.
Very specific, depends on specific headers and rate limiting system.
Summary of this strategy
Here’s a high-level overview of how to adjust your code:
- Active: Make requests and observe all response headers.
- Adjust: With every request, recalculate the rate based on the following headers:
  - x-ratelimit-interval-seconds: The time interval in seconds. You get a batch of new tokens every time interval.
  - x-ratelimit-fillrate: The number of tokens you get every time interval.
  - retry-after: The number of seconds you need to wait for new tokens. Make sure that your rate assumes waiting longer than this value.
- Retry: If you encounter a 429, which shouldn’t happen if you used the headers correctly, you need to further adjust your code so it doesn’t happen again. You can use the retry-after header to make sure that you only make requests when the tokens are available.
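One simple way to turn these headers into a pace is to spread requests evenly over the interval, so that you never use tokens faster than they are refilled. The sketch below is only one possible interpretation: the function name is hypothetical, and taking the larger of the even pace and retry-after is our assumption about how to combine the two.

```python
from typing import Mapping, Optional


def _number(headers: Mapping[str, str], name: str) -> Optional[float]:
    """Read a numeric header case-insensitively; None if missing or invalid."""
    lowered = {key.lower(): value for key, value in headers.items()}
    try:
        return float(lowered[name])
    except (KeyError, ValueError):
        return None


def seconds_between_requests(headers: Mapping[str, str]) -> Optional[float]:
    """Suggest how long to wait before the next request.

    Returns None when the rate limiting headers are not present, for example
    when rate limiting is disabled or the product does not send them.
    """
    interval = _number(headers, "x-ratelimit-interval-seconds")
    fill_rate = _number(headers, "x-ratelimit-fillrate")
    if interval is None or fill_rate is None or fill_rate <= 0:
        return None
    # Even pace: fill_rate tokens arrive every `interval` seconds.
    pace = interval / fill_rate
    # Never wait less than the server asks for in retry-after.
    retry_after = _number(headers, "retry-after") or 0.0
    return max(pace, retry_after)
```

For example, with an interval of 60 seconds and a fill rate of 10, this suggests one request every 6 seconds, and a longer pause whenever retry-after asks for one.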
Customizing your code
Depending on your needs, this strategy helps you to: