.NET 7 adds a Rate Limiting feature to easily limit the number of requests

.NET 7 has a built-in Rate Limiting feature, which limits the number of requests that can access a resource. For example, suppose a database can safely handle 1000 requests per minute, and it is unclear whether more than that would crash it. You can put a rate limiter in your application that allows only 1000 requests per minute and starts rejecting requests once that number is reached. This protects the resource and keeps the application from crashing under heavy traffic.

There are many different algorithms for controlling the flow of requests. Here are the 4 that are available in .NET 7:

Concurrency Limiting

As the name implies, a concurrency limiter limits how many concurrent requests can access a resource. If the limit is 10, then only 10 requests can access the resource at the same time, and the 11th request is denied.

Each time a request completes, the number of allowed requests increases by 1: when the first request completes, one slot becomes available, when a second completes, two are available, and so on. A slot is freed by releasing the RateLimitLease that the request acquired.
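
The behaviour described above can be sketched as a small simulation. This is a minimal model written in Python for illustration only, not the .NET API: the class ConcurrencyLimiter and its methods try_acquire and release are hypothetical names, and release stands in for releasing the lease.

```python
class ConcurrencyLimiter:
    """Toy model of a concurrency limiter (hypothetical, not the .NET class)."""

    def __init__(self, permit_limit: int) -> None:
        self.permit_limit = permit_limit
        self.in_use = 0  # permits currently held by running requests

    def try_acquire(self) -> bool:
        # Deny the request when every permit is already held.
        if self.in_use >= self.permit_limit:
            return False
        self.in_use += 1
        return True

    def release(self) -> None:
        # A finished request hands its permit back.
        if self.in_use > 0:
            self.in_use -= 1


limiter = ConcurrencyLimiter(permit_limit=10)
results = [limiter.try_acquire() for _ in range(11)]  # 10 granted, 11th denied
limiter.release()                                      # one request completes
retry = limiter.try_acquire()                          # a slot is free again
```

Note that the limit here is on requests in flight, not on requests per unit of time: nothing is given back until a request finishes.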

Token Bucket

A token bucket is another algorithm, and it works like a bucket full of tokens. At a fixed interval, a fixed number of tokens is added to the bucket, but the number of tokens can never exceed the maximum the bucket can hold. When a request comes in, it takes a token and keeps it. If the bucket is empty, a new request has no token to take and is denied access to the resource.

Suppose a bucket can hold 10 tokens and 2 tokens are added to it every minute. Three requests come in, leaving 7 tokens. After a minute, the bucket is automatically replenished to 9 tokens, and then 9 requests take all of the tokens at once. From then on, no request is allowed to access the resource until more tokens are added to the bucket. If no further requests arrive, the bucket refills to its full 10 tokens within 5 minutes and then waits for requests.
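
The numbers in this example can be replayed with a short simulation. This is a rough sketch in Python, not the .NET implementation: TokenBucket, replenish and try_acquire are hypothetical names, and the passing of time is modelled by calling replenish by hand instead of using a real timer.

```python
class TokenBucket:
    """Toy token bucket (hypothetical, not the .NET class)."""

    def __init__(self, capacity: int, tokens_per_period: int) -> None:
        self.capacity = capacity
        self.tokens_per_period = tokens_per_period
        self.tokens = capacity  # assumption: the bucket starts full

    def replenish(self, periods: int = 1) -> None:
        # Add tokens for each elapsed period, never exceeding the capacity.
        added = periods * self.tokens_per_period
        self.tokens = min(self.capacity, self.tokens + added)

    def try_acquire(self) -> bool:
        # A request needs one token; an empty bucket means it is denied.
        if self.tokens == 0:
            return False
        self.tokens -= 1
        return True


bucket = TokenBucket(capacity=10, tokens_per_period=2)
first = [bucket.try_acquire() for _ in range(3)]  # 3 requests take 3 tokens
after_first = bucket.tokens                       # 7 tokens left
bucket.replenish()                                # one minute passes
after_refill = bucket.tokens                      # 9 tokens
burst = [bucket.try_acquire() for _ in range(9)]  # 9 requests empty the bucket
denied = bucket.try_acquire()                     # no token left: denied
bucket.replenish(periods=5)                       # five quiet minutes
```

After the last call the bucket is back at 10 tokens, which is why a token bucket tolerates short bursts while still capping the long-run rate.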

Fixed Window Limit

The fixed window algorithm uses the concept of a “window”, which is a fixed period of time. It limits the maximum number of requests within each window and resets the request count when moving on to the next window.

Suppose there is a movie theater (the window) that seats a maximum of 100 people (the maximum number of requests), and each movie runs for 2 hours (the window duration). Once the movie has started, the rest of the audience (further requests) can only wait in line for the next showing. The line also holds a maximum of 100 people. Anyone beyond that is turned away and has to wait for the next window to start before joining the line again.
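
The counting and resetting part of this algorithm can be sketched as follows. This is a simplified Python model, not the .NET implementation: FixedWindowLimiter and try_acquire are hypothetical names, the current time is passed in as a number of seconds, and the waiting line from the movie theater example is left out, so extra requests are simply denied.

```python
class FixedWindowLimiter:
    """Toy fixed window limiter without a queue (hypothetical names)."""

    def __init__(self, permit_limit: int, window_seconds: int) -> None:
        self.permit_limit = permit_limit
        self.window_seconds = window_seconds
        self.window_index = 0  # which window the counter belongs to
        self.count = 0         # requests granted in that window

    def try_acquire(self, now: float) -> bool:
        # Assumes `now` never goes backwards.
        index = int(now // self.window_seconds)
        if index != self.window_index:
            # A new window has started: reset the counter.
            self.window_index = index
            self.count = 0
        if self.count >= self.permit_limit:
            return False
        self.count += 1
        return True


limiter = FixedWindowLimiter(permit_limit=100, window_seconds=2 * 60 * 60)
first_showing = [limiter.try_acquire(now=0) for _ in range(100)]  # seats fill up
late = limiter.try_acquire(now=60)            # theater is full: denied
next_showing = limiter.try_acquire(now=7200)  # next window: counter was reset
```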

Sliding window restrictions

The sliding window algorithm is similar to the fixed window algorithm, but adds the concept of “segments”.

A segment is a part of a window. If the 2-hour window from the previous example is divided into 4 segments, each segment is 30 minutes long. There is also a “segment index”, which always points to the newest segment in the window.
Requests that arrive within a 30-minute period go to the newest segment, and the window slides forward by one segment every 30 minutes. When the window slides, the oldest segment falls out of the window. If that segment contained any requests, they are refreshed and the available limit increases by that number. If it contained none, the limit stays the same.

For example, take a sliding window that contains three 10-minute segments and accepts a maximum of 100 requests. In its initial state all 3 segments have a count of 0, and the current segment index points to the 3rd segment.

In the first 10 minutes, we receive 50 requests, all of which go into segment 3 (where the segment index is located). After 10 minutes have passed, we slide the window by 1 segment and move the current segment index to segment 4.

Over the next 10 minutes, we receive another 20 requests, so there are now 50 in segment 3 and 20 in segment 4. After another 10 minutes the window slides again, and the current segment index points to segment 5. Since segments 3 and 4 are both still inside the window, 70 requests are in use and only 30 requests are left in the window.

After another 10 minutes, the window slides again and the segment index moves to 6. Segment 3 (the segment with 50 requests) is now outside the window, so the window gets those 50 requests back. Since segment 4 still holds 20 requests, the available limit for the sliding window becomes 80.
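
The walkthrough above can be replayed in code. This is a minimal Python sketch, not the .NET implementation: SlidingWindowLimiter, try_acquire, slide and available are hypothetical names, and the window is advanced by calling slide by hand where a real limiter would use a timer.

```python
from collections import deque


class SlidingWindowLimiter:
    """Toy sliding window limiter (hypothetical, not the .NET class)."""

    def __init__(self, permit_limit: int, segments_per_window: int) -> None:
        self.permit_limit = permit_limit
        # One counter per segment; the rightmost entry is the current segment.
        self.segments = deque([0] * segments_per_window,
                              maxlen=segments_per_window)

    @property
    def available(self) -> int:
        # Requests still allowed: the limit minus everything inside the window.
        return self.permit_limit - sum(self.segments)

    def try_acquire(self) -> bool:
        if self.available <= 0:
            return False
        self.segments[-1] += 1  # count the request in the current segment
        return True

    def slide(self) -> None:
        # Appending to a full deque drops the oldest segment, which gives
        # that segment's requests back to the limit.
        self.segments.append(0)


window = SlidingWindowLimiter(permit_limit=100, segments_per_window=3)
for _ in range(50):
    window.try_acquire()           # 50 requests land in segment 3
window.slide()                     # segment index moves to 4
for _ in range(20):
    window.try_acquire()           # 20 requests land in segment 4
window.slide()                     # index 5: segments 3 and 4 still inside
after_second_slide = window.available
window.slide()                     # index 6: segment 3 leaves the window
after_third_slide = window.available  # 80, matching the walkthrough
```

Compared with a fixed window, the limit comes back gradually, one segment at a time, so there is no single moment at which the whole allowance resets.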

The Microsoft blog has a detailed description of the rate limiting feature and its related APIs and middleware. Those interested in this feature can also find the System.Threading.RateLimiting package on NuGet.
