How do “burst” rate limits relate to rps (requests per second)?
For examples here I’ll reference the management API rate limits page:
The examples use the numbers from the Enterprise (Production) rate limits.
There are two columns here indicating:
Requests per Second: 50
Bursts per Minute (Peak): 1000
The first thing to note is that the burst limit is not the requests per second multiplied by 60. There is a relationship between the two limits, but it is helpful to think of them as separate.
For example, you could make 50 requests per second, but you would burn through your per-minute rate limit in about 30 seconds, after which you could only make ~16.67 requests per second (we’ll cover the math on this later). On the other hand, if your traffic were spaced out at a steady 16.67 requests per second or less, you would never hit the rate limit.
The documentation follows the rate limit table with some bucket terminology that is difficult to parse.
In the example, the bucket size and y (the number of requests added back per minute) are the same value: the per-minute burst limit. They need not be the same in this kind of rate limit structure, but in this instance they are.
The key line is:
In other words, for each 60/y seconds, one additional request is added to the bucket.
60/1000 = 0.06. One request is added to the bucket every 0.06 seconds. Or, put another way:
1/0.06 = 16.67. About 16.67 requests are added to the bucket every second.
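The same arithmetic can be written as a small Python sketch. The function names here are made up for illustration, and the only input is the per-minute burst limit from the table above:

```python
def seconds_per_request(per_minute_limit: int) -> float:
    """How often one request is added back to the bucket, in seconds."""
    return 60 / per_minute_limit


def refill_rate_per_second(per_minute_limit: int) -> float:
    """How many requests are added back to the bucket each second."""
    return per_minute_limit / 60


# Enterprise (Production) example: 1000 bursts per minute
print(round(seconds_per_request(1000), 2))     # 0.06
print(round(refill_rate_per_second(1000), 2))  # 16.67
```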
Another way to read the rate limits is this:
- You cannot ever make more than 50 requests per second
- You have a “burst bucket” of 1000 requests you can consume at up to 50 rps
- Over time, you must average no more than 16.67 requests per second
The further your request rate rises above 16.67 rps, the sooner you will deplete the per-minute burst bucket:
- At 16 rps you will never deplete the burst bucket
- At 30 rps you will deplete the burst bucket in ~75 seconds
- At 50 rps you will deplete the burst bucket in ~30 seconds
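To check these figures for other request rates, here is a rough sketch of the calculation. It assumes the bucket starts full and that traffic arrives at a perfectly steady rate; `seconds_to_deplete` is a name invented for this example, not something the API provides:

```python
import math


def seconds_to_deplete(request_rate: float,
                       bucket_size: int = 1000,
                       per_minute_limit: int = 1000) -> float:
    """Seconds until a full bucket is empty at a steady request rate."""
    refill_rate = per_minute_limit / 60        # ~16.67 requests per second
    net_outflow = request_rate - refill_rate   # how fast the bucket level drops
    if net_outflow <= 0:
        # The bucket refills at least as fast as it drains.
        return math.inf
    return bucket_size / net_outflow


for rps in (16, 30, 50):
    print(f"{rps} rps -> {seconds_to_deplete(rps):.0f} seconds")
```

At 16 rps the result is infinity (the bucket never empties), and at 30 and 50 rps it comes out to about 75 and 30 seconds.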
Once you deplete the burst bucket you will have an effective rate limit of 16.67 rps until your traffic drops below 16.67 rps for some period of time.
Suppose my application makes 30 requests per second. With what we covered above we know the following:
- I have a burst bucket of 1000 requests
- 16.67 requests are added back into the bucket every second
- I am draining 30 requests out of the bucket every second
30-16.67 = 13.33 requests per second net outflow. This is the rate at which the level of the bucket is dropping. So the bucket will be empty in:
1000/13.33 = 75 seconds
At this point the effective rate limit for my tenant is the rate at which the bucket refills, so now I can only make 16.67 rps.
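If it helps to see this play out, below is a toy simulation of the bucket in one-second steps. It is only a sketch of the model described in this post, not the actual rate limiter: the one-second granularity, the fractional request counts, and the name `simulate_bucket` are all simplifying assumptions.

```python
def simulate_bucket(demand_rps: float,
                    seconds: int,
                    bucket_size: float = 1000,
                    per_minute_limit: float = 1000,
                    max_rps: float = 50) -> list[float]:
    """Return how many requests are allowed in each one-second step."""
    refill_rate = per_minute_limit / 60
    tokens = bucket_size              # assume the bucket starts full
    allowed_per_second = []
    for _ in range(seconds):
        # Never exceed the hard per-second cap or what is left in the bucket.
        allowed = min(demand_rps, max_rps, tokens)
        tokens -= allowed
        # Refill, but never beyond the bucket's capacity.
        tokens = min(bucket_size, tokens + refill_rate)
        allowed_per_second.append(allowed)
    return allowed_per_second


allowed = simulate_bucket(demand_rps=30, seconds=120)
first_throttled = next(i for i, a in enumerate(allowed, start=1) if a < 30)
print(f"First throttled second: {first_throttled}")
print(f"Allowed rate after depletion: {allowed[-1]:.2f} rps")
```

Because the steps are whole seconds, the first throttled second lands within a second or two of the 75 seconds calculated above, and from then on the allowed rate settles at the refill rate of about 16.67 rps.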