Improving instance stability with rate limiting

When automated integrations or scripts send requests to Crowd in huge bursts, it can affect Crowd’s stability, leading to drops in performance or even downtime. With rate limiting, you can control how many external REST API requests automations and users can make and how often they can make them, making sure that your Crowd Data Center instance remains stable.

How rate limiting works

Here are some details about how rate limiting works in Crowd.

Limited requests

Rate limiting targets only external REST API requests, which means that requests made within Crowd aren’t limited in any way. When users move around Crowd, configuring applications and directories, managing users, groups, or memberships, and completing other actions, they won’t be affected by rate limiting, because we consider this a regular user experience that shouldn’t be limited.

Let’s use an example to better illustrate this:

  1. When a user edits an application in Crowd, a number of requests are sent in the background — these requests ask Crowd for application configuration details, such as its type, assigned directories and groups, administrators, and so on. Since this traffic is internal to Crowd, it won’t be limited.

  2. When the same user opens up the terminal on their laptop and sends a request (like the one below) to get an application’s details, it will be rate limited because it’s made outside of Crowd.

curl -u user:password https://localhost:8095/rest/appmanagement/1/application/APP_ID

The rate limiting technique we’ve chosen

Out of the many available techniques for enforcing rate limits, we’ve chosen the token bucket algorithm. It gives users a balance of tokens that can be exchanged for requests. Here’s a summary of how it works:

  • Users are given tokens that are exchanged for requests. One token equals one request.

  • Users get new tokens at a constant rate so they can keep making new requests. This is their Requests allowed and can be, for example, 10 every 1 minute.

    • The refill interval between new tokens is the Time interval (in seconds) divided by the Requests allowed.

  • Tokens are added to a user’s personal bucket until it’s full. This is their Max requests and allows them to adjust the usage of tokens to their own frequency. For example, 20 every 2 minutes instead of 10 every 1 minute, as specified in their usual rate.

  • When a user tries to send more requests than the number of tokens they have, only the requests that can draw tokens from the bucket will be successful. The remaining ones will fail with an HTTP 429 error (too many requests). The user can retry the requests once they get new tokens.
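
The bucket mechanics above can be sketched in a few lines. This is an illustration of the general token bucket technique, not Crowd’s actual implementation; the class and parameter names are our own:

```python
import time

class TokenBucket:
    """Minimal token bucket: `fill_rate` tokens are added per `interval`
    seconds, and the bucket never holds more than `max_tokens`."""

    def __init__(self, fill_rate, interval, max_tokens):
        self.rate = fill_rate / interval   # tokens added per second
        self.capacity = max_tokens
        self.tokens = float(max_tokens)    # start with a full bucket
        self.last = time.monotonic()

    def try_request(self):
        now = time.monotonic()
        # Refill at a constant rate, but never beyond capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True    # request goes through
        return False       # request would get HTTP 429

# Requests allowed: 10 every 1 minute | Max requests: 10
bucket = TokenBucket(fill_rate=10, interval=60, max_tokens=10)
results = [bucket.try_request() for _ in range(12)]
print(results.count(True))  # 10: the burst drains the bucket, the last 2 requests fail
```

Because refills happen at a constant rate rather than in batches, a user who pauses briefly earns back a fraction of a token at a time, which is what makes the “accumulate up to Max requests” behavior possible.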

Integration with other Atlassian products

Crowd works best when used with our other products, like Jira, Confluence, Bitbucket, or Bamboo. Technically, products like these are external to Crowd, so they should be limited. In this case, however, we’re treating them as part of the same user experience and don’t want to enforce any limits on requests coming from or going to these products.

The way it is now:

  • Server / Data Center: Not limited in any way.

  • Cloud: There’s a known issue that applies rate limits to requests coming from/to cloud products. We’re working hard to disable rate limits for cloud products and should make that happen soon. For now, you can use a workaround. For more info, see Removing rate limits for Atlassian cloud products.

Apps from Atlassian Marketplace

The general assumption is that Marketplace apps are installed on a Crowd instance, make internal requests from within Crowd, and shouldn’t be limited. But, as always, it depends on how an app works.

  • Internal: If an app in fact works internally, enhancing the user experience, it won’t be limited. An example of such an app would be a special banner that’s displayed on top of the Crowd UI. Let’s say this banner checks data about users and their memberships and visualizes digested statistics. Traffic like that is internal and won’t be limited.

  • External: Apps whose requests are external to Crowd are limited. Let’s say we have an app that displays a wallboard on a TV. It asks Crowd for details about applications, directories, groups, and so on, and then reshuffles and displays them in its own way on the wallboard. An app like that sends external requests and behaves just like a user sending requests over a terminal.

It really depends on the app, but we’re assuming most of them shouldn’t be limited.

How rate limiting works in a cluster

Rate limiting is available for Data Center, so you most likely have a cluster of nodes behind a load balancer. You should know that each of your users will have a separate limit on each node (rate limits are applied per node, not per cluster).

In other words, if they have used their Requests allowed on one node and were rate limited, they could theoretically send requests again if they started a new session on a different node. Switching between the nodes isn’t something users can do, but keep in mind that this can happen.

Whatever limit you’ve chosen (for example, 100 requests every 1 hour), the same limit will apply to each node; you don’t have to set it separately. This means that each user’s ability to send requests is still limited, and Crowd will remain stable regardless of which node their requests are routed to.
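
Because limits are applied per node, the theoretical worst case for a single user across the whole cluster scales with node count. A quick back-of-the-envelope check (our own arithmetic with example values, not Crowd settings):

```python
# Per-node limit configured in Crowd (example values)
requests_allowed = 100   # Requests allowed per time interval
nodes = 3                # number of cluster nodes

# Worst case: a user's sessions land on every node in turn,
# draining a fresh token bucket on each one.
worst_case_per_interval = requests_allowed * nodes
print(worst_case_per_interval)  # 300 requests per interval, cluster-wide
```

This worst case is worth keeping in mind when you pick a limit for a large cluster, even though users can’t deliberately control which node serves them.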

What limit should I choose?

Setting the right limit depends on many factors, so we can’t give you a simple answer. We have some suggestions, though.

Finding the right limit

The first step is to understand the volume of traffic that your instance receives. You can do this by parsing the access log and finding the user that made the most REST requests over a day. Since UI traffic is not rate limited, this number will be higher than what you need as your rate limit. Now, that’s a base number — you need to modify it further based on the following questions:

  1. Can you afford to interrupt your users’ work? If your users’ integrations are mission-critical, consider upgrading your hardware instead. The more critical the integrations, the higher the limit should be — consider multiplying the number you found by two or three.

  2. Is your instance already experiencing problems due to the amount of REST traffic? If yes, then choose a limit that’s close to the base number you found on a day when the instance didn’t struggle. And if you’re not experiencing significant problems, consider adding an extra 50% to the base number — this shouldn’t interrupt your users and you still keep some capacity.

In general, the limit you choose should keep your instance safe, not control individual users. Rate limiting is more about protecting Crowd from integrations and scripts going haywire, rather than stopping users from getting their work done.
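
One way to find that base number is a small script over the access log. The sketch below assumes a combined-style log where the remote user is the third field; your actual access log format depends on your Tomcat valve configuration, so treat the pattern as a starting point:

```python
import re
from collections import Counter

# Hypothetical combined-style access log: remote user is the third field,
# the request line is quoted. Adjust the pattern to your log valve format.
LOG_LINE = re.compile(r'^\S+ \S+ (\S+) \[[^\]]+\] "(?:GET|POST|PUT|DELETE) (\S+)')

def busiest_rest_user(lines):
    """Count REST requests per user and return the busiest one."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and "/rest/" in m.group(2):
            counts[m.group(1)] += 1
    return counts.most_common(1)[0] if counts else None

sample = [
    '10.0.0.1 - alice [03/Dec/2023:19:51:24 +0000] "GET /rest/usermanagement/1/user HTTP/1.1" 200 512',
    '10.0.0.2 - bob [03/Dec/2023:19:51:25 +0000] "GET /crowd/console/secure HTTP/1.1" 200 812',
    '10.0.0.1 - alice [03/Dec/2023:19:52:01 +0000] "GET /rest/usermanagement/1/group HTTP/1.1" 200 256',
]
print(busiest_rest_user(sample))  # ('alice', 2)
```

Note that only lines containing `/rest/` are counted; UI traffic like bob’s console request above is ignored, matching the advice to size the limit on REST traffic.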

How to turn on rate limiting

You need to be a Crowd System Administrator to turn on rate limiting.

To turn on rate limiting:

  1. In Crowd, go to Administration > Rate limiting.

  2. Change the status to Enabled.

  3. Select one of the options: Allow unlimited requests, Block all requests, or Limit requests. The first and second are all about allowlisting and blocklisting. For the last option, you’ll need to enter actual limits. You can find more on these in the public documentation.

  4. Save your changes.

Make sure to add exemptions for users who really need those extra requests, especially if you’ve chosen allowlisting or blocklisting.

Limiting requests — what it’s all about

While allowlisting and blocklisting shouldn’t require additional explanation, you’ll probably be using the Limit requests option quite often, either as a global setting or in exemptions.

Let’s have a closer look at this option and how it works:

  1. Requests allowed: Every user is allowed a certain number of requests in a chosen time interval. It can be 10 requests every second, 100 requests every hour, or any other configuration you choose.

  2. Max requests (advanced): Allowed requests, if not sent frequently, can be accumulated up to a set maximum per user. This option allows users to make requests at a different frequency than their usual rate (for example, 20 every 2 minutes instead of 10 every 1 minute, as specified in their rate), or accumulate more requests over time and send them in a single burst. Too advanced? Just make it equal to Requests allowed, and forget about this field — nothing more will be accumulated.

Examples

Example 1

Requests allowed: 10/hour | Max requests: 100

  • One of the developers is sending requests on a regular basis, 10 per hour, throughout the day. If they try sending 20 requests in a single burst, only 10 of them will be successful. They could retry the remaining 10 in the next hour when they’re allowed new requests.

  • Another developer hasn’t sent any requests for the past 10 hours, so their allowed requests kept accumulating until they reached 100, which is the max requests they can have. They can now send a burst of 100 requests, and all of them will be successful. Once they’ve used up all available requests, they have to wait another hour, after which they’ll get only the allowed 10 requests.

  • If this same developer sent only 50 out of their 100 requests, they could send another 50 right away, or start accumulating again in the next hour.

Example 2

Requests allowed: 1/second | Max requests: 60

  • A developer can choose to send 1 request every second or 60 requests every minute (at any frequency).

  • Since they can use the available 60 requests at any frequency, they can also send all of them at once or in very short intervals. In such a case, they would be exceeding their usual rate of 1 request per second.


Adding exemptions

Exemptions are, well, special limits for users who really need to make more requests than others. Any exemptions you choose will take precedence over global settings.

After adding or editing an exemption, you’ll see the changes right away, but it takes up to 1 minute to apply the new settings to a user.

To add an exemption:

  1. Go to the Exemptions tab.

  2. Click Add exemption.

  3. Find the user and choose their new settings.

    • You can’t choose groups, but you can select multiple users.

    • The options available here are just the same as in global settings: Allow unlimited requests, Block all requests, or assign a custom limit.

  4. Save your changes.

If you want to edit an exemption later, just click Edit next to a user’s name in the Exemptions tab.

Recommended: Add an exemption for anonymous access

Crowd sees all anonymous traffic as made by one user: Anonymous. If your rate limits are not too high, it might happen that a single user drains the limit assigned to anonymous. It’s a good idea to add an exemption for this account with a higher limit, and then observe whether you need to increase it further. 

Identifying users who have been rate limited

When a user is rate limited, they’ll know immediately as they’ll receive an HTTP 429 error message (too many requests). You can identify users that have been rate limited by opening the List of limited accounts tab on the rate limiting settings page. The list shows all users from the whole cluster.

When a user is rate limited, it can take up to 5 minutes for them to appear in the table.

Unusual accounts

You’ll recognize the users shown on the list by their name. It might happen, though, that the list will show some unusual accounts, so here’s what they mean:

  • Unknown: That’s a user that has been deleted in Crowd. They shouldn’t appear on the list for more than 24 hours (as they can’t be rate limited anymore), but you might see them in the list of exemptions. Just delete any settings for them; they don’t need rate limiting anymore.

  • Anonymous: This entry gathers all requests that weren’t made from an authenticated account. Since one user can easily use the limit for anonymous access, it might be a good idea to add an exemption for anonymous traffic and give it a higher limit.

Adding limited requests to the log file

You can also view information about rate limited users and requests in the Crowd log file. This is useful if you want to get more details about the URLs that requests targeted or originated from.

To add limited requests to the log file:

  1. Go to Administration > Logging and profiling.

  2. Scroll down until you see the input prompt for Class/Package name.

  3. Set the package name to: com.atlassian.ratelimiting.internal.requesthandler.logging

  4. Set the logging level to DEBUG, and click Add.

  5. Every rate-limited request will now appear in the Crowd log file:

    2023-12-03 19:51:24,337 http-nio-8095-exec-17 url: /rest/appmanagement/1/application/32770; user: admin DEBUG [internal.requesthandler.logging.RateLimitedRequestLogger] User [admin] has been rate limited for URL [https://instenv-209516-tll5.instenv.internal.atlassian.com/rest/appmanagement/1/application/32770]
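
With DEBUG logging on, entries like the one above can be summarized per user with a short script. The regex below is keyed to the exact message shown here; adjust it if your log pattern differs:

```python
import re
from collections import Counter

# Matches the "User [x] has been rate limited for URL [y]" DEBUG message
RATE_LIMITED = re.compile(r"User \[([^\]]+)\] has been rate limited for URL \[([^\]]+)\]")

def rate_limited_summary(log_lines):
    """Count rate-limited requests per user from Crowd log lines."""
    hits = Counter()
    for line in log_lines:
        match = RATE_LIMITED.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits

log = [
    "2023-12-03 19:51:24,337 http-nio-8095-exec-17 DEBUG "
    "[internal.requesthandler.logging.RateLimitedRequestLogger] "
    "User [admin] has been rate limited for URL [https://example.com/rest/appmanagement/1/application/32770]",
]
print(rate_limited_summary(log))  # Counter({'admin': 1})
```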




Getting rate limited — user’s perspective

When you’re rate limited and your request doesn’t go through, you’ll see the HTTP 429 error message (too many requests). The response headers below describe your current limits; you can use them to adjust scripts and automations so they send requests at a reasonable frequency:

  • X-RateLimit-Limit: The maximum number of requests (tokens) you can ever have. New tokens won’t be added to your bucket after reaching this limit. Your admin configures this as Max requests.

  • X-RateLimit-Remaining: The remaining number of tokens. This is what you have and can use right now.

  • X-RateLimit-Interval-Seconds: The time interval in seconds. You get a batch of new tokens every such time interval.

  • X-RateLimit-FillRate: The number of tokens you get every time interval. Your admin configures this as Requests allowed.

  • retry-after: How long you need to wait until you get new tokens. After several failures with the HTTP status code 429, you can send a request successfully once this header is set to 0.
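
For example, a client-side helper (our own sketch, not an official API) can read these headers and decide how long to wait before the next request:

```python
def next_request_delay(status_code, headers):
    """Return how many seconds to wait before the next request,
    based on the rate limiting response headers described above."""
    if status_code == 429:
        # Blocked: wait until new tokens arrive
        return float(headers.get("retry-after", 60))
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining == 0:
        # Not blocked yet, but the bucket is empty: pace ourselves
        return float(headers.get("X-RateLimit-Interval-Seconds", 1))
    return 0.0  # tokens still available, no need to wait

print(next_request_delay(429, {"retry-after": "30"}))           # 30.0
print(next_request_delay(200, {"X-RateLimit-Remaining": "5"}))  # 0.0
```

The fallback values (60 and 1 seconds) are our own defensive defaults for responses that are missing a header, not values Crowd sends.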

Allowlisting URLs and external applications

Allowlisting URLs and resources

We’ve also added a way to allowlist whole URLs and resources on your Crowd instance. This should be used as a quick fix for something that gets rate limited but shouldn’t be.

When to use it?

For example, a Marketplace app added some new API to Crowd. The app itself is used from the UI, so it shouldn’t be limited, but it might happen that Crowd sees this traffic as external and applies the rate limit. In this case, you could disable the app or increase the rate limit, but this brings additional complications.

To work around issues like that, you can allowlist the whole resource added by the app so it works without any limits.

To allow specific URLs to be excluded from rate limiting:

  1. Stop Crowd.

  2. Add the com.atlassian.ratelimiting.whitelisted-url-patterns system property, and set the value to a comma-separated list of URLs, for example: 

    -Dcom.atlassian.ratelimiting.whitelisted-url-patterns=/**/rest/applinks/**,/**/rest/capabilities,/**/rest/someapi
  3. Restart Crowd.

Allowlisting external applications

You can also allowlist consumer keys, which lets you remove rate limits for external applications integrated through AppLinks.

If you're integrating Crowd with other Atlassian products, you don't have to allowlist them as this traffic isn't limited.

  1. Find the consumer key of your application.

    1. Go to Administration > Application links.

    2. Find your application, and click Edit.

    3. Open Incoming Authentication, and copy the Consumer Key.

  2. Allowlist the consumer key.

    1. Stop Crowd.

    2. Add the com.atlassian.ratelimiting.whitelisted-oauth-consumers system property, and set the value to a comma-separated list of consumer keys, for example: 

      -Dcom.atlassian.ratelimiting.whitelisted-oauth-consumers=app-connector-for-confluence-server
    3. Restart Crowd.

After entering the consumer key, the traffic coming from the related application will no longer be limited.

Adjusting your code for rate limiting

We’ve created a set of strategies you can apply in your code (scripts, integrations, apps) so it works with rate limits, whatever they are. For more info, see Adjusting your code for rate limiting.

Last modified on Apr 15, 2024
