With the Rate Limiting filter, you can set a maximum number of requests per second, minute, hour or day that are allowed to access your service by user, IP address, header value, or URI. You can set the rate limits by HTTP method, or use a wildcard to have the same limit for all HTTP methods for your service. This will prevent your service from being flooded by deliberate or accidental requests that could harm your service.
Your current limits may include your absolute limits. Absolute limits are specific to a service domain and are therefore only provided by the origin service. However, Repose will integrate absolute and current limits as they apply to you. Repose does not enforce absolute limits, but it does inform the origin service of those limits.
General filter information
Filter name: rate-limiting
Filter configuration: rate-limiting.cfg.xml
Released: prior to version 1.0.6
To correctly rate limit a requested resource, the rate limiting component uses one required HTTP header precondition and one optional HTTP header precondition.
|X-PP-User||Required||X-PP-User is a single-value header. This header is used to describe the unique name of the client making the request. This name is used in part to cache and store request hits. |
|X-PP-Groups||Optional||X-PP-Groups is a list of string values. This header is used to describe all of the limit groups the client belongs to. A client may belong to zero, one, or more limit groups.|
These headers are used by the Rate Limiting filter and are populated by other Repose components such as the Keystone v2 filter.
Repose filters may add values to HTTP message headers to communicate a set of options to downstream filters. To aid downstream filters in selecting the most qualified value from a given set, the values themselves may be annotated with a relative quality factor. This allows downstream filters to make a decision based on the available options rather than on which filter overwrote the header last.
The X-PP-User header must be set by one of the filters before the Rate Limiting filter.
If you support percent-encoded or URL-encoded entities, you need to use the URI Normalization filter in front of the Rate Limiting filter. Percent-encoded URLs will fail to be properly rate limited unless normalized by the URI Normalization filter.
Distributed rate limiting configuration
1. Set Up Repose
Configure Repose using a cluster configuration (multiple servers). Note that all servers must use the same system model.
2. Add the Rate Limiting filter
Add the Rate Limiting filter to your system model configuration. Place the Rate Limiting filter below the logging, normalization, authentication, identity, and authorization filters.
3. Add the Distributed Datastore service
Add the Distributed Datastore service to your system model configuration below the filters.
In the following example, the System Model is configured to perform distributed rate limiting using the Rate Limiting filter and the Distributed Datastore service.
4. Configure the Distributed Datastore service
Configure the Distributed Datastore service to listen on a port.
Optional rate limiting configurations
Multiple rate limits
By editing your rate-limiting.cfg.xml file, you can configure multiple rate limits.
Within each <limit-group>, you can define one or more rate limits by using a regular expression to match against request URIs, the maximum number of requests allowed by the limit, and the unit of time before the count applied to the limit is reset. To further restrict the requests a limit applies to, you can also define HTTP methods, but this is not required. Each <limit> element has a "bucket" with an allowance described above. For each request, all applicable limits add a hit into their respective buckets. Multiple rate limits that match the same request, such as POST on * and POST on /servers, will all apply.
When the number of hits in a bucket exceeds the value of the corresponding limit, future requests are rate-limited and will not be forwarded to the origin service. After the configured unit of time for a limit has passed from the time of the first request the limit was applied to, the bucket will be emptied and the process will restart. When the allowance for one limit is reached, subsequent limits within the same limit group will never be applied. In other words, the most restrictive limit will prevent updates to subsequent limits. Previous limits within the group will continue to apply (i.e. continue to increment). If a request does not cause a bucket to go over limit, the request continues on to the next filter.
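The evaluation order described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the filter's implementation: the `Limit` class, the `handle` function, and the three limit values are hypothetical, and the sketch assumes every listed limit already matches the request's URI and method.

```python
from dataclasses import dataclass
from typing import List, Optional

# Seconds per configured unit of time.
UNIT_SECONDS = {"SECOND": 1, "MINUTE": 60, "HOUR": 3600, "DAY": 86400}


@dataclass
class Limit:
    """Hypothetical stand-in for one <limit> element and its bucket."""
    id: str
    value: int
    unit: str
    hits: int = 0
    window_start: Optional[float] = None

    def expire_if_due(self, now: float) -> None:
        # The bucket empties once the unit of time has passed since the
        # first request that the limit was applied to.
        if (self.window_start is not None
                and now - self.window_start >= UNIT_SECONDS[self.unit]):
            self.hits = 0
            self.window_start = None


def handle(limits: List[Limit], now: float) -> Optional[str]:
    """Return the id of the limit that rejects the request, or None if it passes."""
    for limit in limits:
        limit.expire_if_due(now)
        if limit.hits >= limit.value:
            # Earlier limits have already recorded this hit;
            # later limits in the group are not updated.
            return limit.id
        if limit.window_start is None:
            limit.window_start = now
        limit.hits += 1
    return None
```

With hypothetical limits of 10 per second, 2 per day, and 5 per minute, the third request in quick succession increments the first bucket and is then rejected by the second, while the third bucket is left untouched.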
Group limit order
Rate limits are only applied to the first matching <limit-group>. For example, for a user "test" in both the "admin" and "observer" groups, if the "admin" limit-group is defined in the configuration above the "observer" limit-group, the "admin" limit-group will apply its limits while the "observer" limit-group will not.
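A minimal Python sketch of this selection rule follows. The function name and the dictionary layout are hypothetical; they only mirror the id, groups, and default attributes of <limit-group>. Returning `None` when nothing matches and no default exists is an assumption made for the sketch.

```python
def select_limit_group(configured_groups, user_groups):
    """Pick the limit group for a user.

    configured_groups: list of dicts in configuration order, each with
        "id" (str), "groups" (list of str) and optionally "default" (bool).
    user_groups: group names taken from the X-PP-Groups header.
    """
    wanted = set(user_groups)
    # The first limit group (in configuration order) that lists any of
    # the user's groups wins; later matches are ignored.
    for group in configured_groups:
        if wanted & set(group["groups"]):
            return group["id"]
    # No match, or no groups supplied: fall back to the default group.
    for group in configured_groups:
        if group.get("default"):
            return group["id"]
    return None  # assumption: no match and no default configured
```

For a user in both "admin" and "observer", the result depends only on which limit group appears first in the configuration.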
Limits with multiple methods in the http-methods attribute or with the http-method of ALL will match against the listed methods, and matching requests will fall in the same bucket rather than in separate buckets for each method.
For example, if a limit has http-methods="GET POST" and value="3", then three GET or POST requests will be passed, but a fourth GET or POST request will be rejected because it exceeds the limit of 3. The following example shows how this would work.
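Separately from the configuration example in the next section, the shared-bucket behavior can be illustrated with a short Python sketch. The `SharedBucketLimit` class is hypothetical and ignores URI matching and time windows; it only shows that all listed methods draw from one counter.

```python
class SharedBucketLimit:
    """Hypothetical model of one <limit> with several HTTP methods."""

    def __init__(self, http_methods: str, value: int):
        # http_methods is a space-delimited list, e.g. "GET POST" or "ALL".
        self.methods = set(http_methods.split())
        self.value = value
        self.hits = 0  # one bucket shared by every listed method

    def matches(self, method: str) -> bool:
        return "ALL" in self.methods or method in self.methods

    def allow(self, method: str) -> bool:
        if not self.matches(method):
            return True  # this limit does not apply to the request
        if self.hits >= self.value:
            return False  # bucket is full, whichever method filled it
        self.hits += 1
        return True
```

With `SharedBucketLimit("GET POST", 3)`, any mix of three GET and POST requests is allowed and the fourth is rejected, while a DELETE request is unaffected by this limit.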
Example of multiple rate limiting
The following example shows a sample request, a configuration for multiple rate limiting, and a table showing how the requests are updated.
This is a valid request that could pass through the Rate Limiting filter multiple times depending on the rate limit intervals that are set in the rate-limiting.cfg.xml file.
GET /test/one HTTP/1.1
This section of the rate-limiting configuration shows only one <limit-group>. Your configuration may have numerous <limit-group> elements listed. Within each <limit-group>, you may have numerous <limit> elements.
This table details a one-second scenario of the preceding request and rate limiting configuration.
1. The first request passes all three limits because the methods match and the limit values have not been exceeded.
2. The second request passes all three limits. Limit "two" has been reached.
3. The third request passes the first limit and is then rejected because the allowance has already been fulfilled for the second limit until the next 24-hour period.
Bucket allowance after 1st request
Bucket allowance after 2nd request
Bucket allowance after 3rd request
Bucket allowance after 4th request
Bucket allowance after 5th request
With the time block approach, each time-block or unit is independent of other units. On average, the configured rate limit of X will be enforced. During a rare spike, up to 2X-1 requests can pass through in less than one unit of time.
In the following example, five requests are allowed per minute for each time-block, but the one-minute window that lands in between the two units has seven requests. You may see over-limits during particular windows; however, across multiple units of time, the allowed requests average less than or equal to five requests per minute.
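The 2X-1 bound can be checked with a small Python simulation. This is a sketch, not Repose code: both function names are hypothetical, and it assumes that a time block starts at the first request counted against the limit, as described earlier for bucket resets.

```python
def allowed_requests(timestamps, limit, unit_seconds):
    """Return the timestamps a time-block limiter would let through."""
    allowed = []
    block_start = None
    hits = 0
    for t in sorted(timestamps):
        # Start a new block on the first request, or once the unit has elapsed.
        if block_start is None or t - block_start >= unit_seconds:
            block_start, hits = t, 0
        if hits < limit:
            hits += 1
            allowed.append(t)
    return allowed


def max_in_sliding_window(timestamps, window_seconds):
    """Largest number of timestamps inside any window [t, t + window_seconds)."""
    ts = sorted(timestamps)
    best = 0
    start = 0
    for end, t in enumerate(ts):
        while t - ts[start] >= window_seconds:
            start += 1
        best = max(best, end - start + 1)
    return best
```

With a limit of five per minute, one request at second 0, four just before second 60, and five just after it are all allowed, so a window shorter than one minute sees nine requests, which is 2X-1 for X = 5.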
Global rate limits
Global rate limits prevent more than a configured number of requests from reaching the origin service for a given period of time. Global rate limits apply to all requests, regardless of user. Global rate limits can be defined to protect the end service as a whole rather than just from individual users. For example, if a service is known to only handle up to 500 requests per second in total, a global rate limit can be set to 500 requests per second to prevent further requests from reaching and compromising the origin service. Any request which breaks that limit will receive a response with status code 503.
You can implement global rate limiting by adding <global-limit-group> to your rate limiting filter configuration. If you are using <limit-group>, place <global-limit-group> above the <limit-group> elements.
The <global-limit-group> element uses the same <limit> children elements as <limit-group>.
Global rate limits are currently not queryable via the <request-endpoint>.
The following global rate limit configuration allows 50 requests per second across all endpoints, with only 10 requests per second allowed to the “/server/create” endpoint.
The rate limiting component caches rate limits by user. Consequently, to query rate limits, a user must be passed into the rate limiting component. The rate limiting component uses the X-PP-User header to identify a user whose limits will be queried. Without a value in this header, the rate limiting component will send back a '401 Unauthorized'.
The rate limiting component uses the X-PP-Groups header to determine which rate limits to apply to the user. Without this header present, the rate limiting component will assign limits from a default group specified in the rate-limiting.cfg.xml configuration file.
It is possible to query a user's rate limits before the user's limits are placed in the cache. This is the case when querying limits before the user has ever been rate limited. In this case, there are two possible results, based on the contents of the X-PP-Groups header:
- If the X-PP-Groups header is passed in and specifies a group, then the rate limits configured for the specified group in rate-limiting.cfg.xml are returned.
- If the X-PP-Groups header is not passed in, then the rate limits configured for the default limit group in rate-limiting.cfg.xml are returned.
The following rate limiting configuration defines a default group named My_Group; the limits defined for My_Group apply if no other group is specified.
In the request-endpoint element, the uri-regex attribute is set to /limits. This is the URI at which the user should query for rate limits. In this rate limiting configuration, the rate limiting information for a user whose limits are not in the cache is as follows:
Once the user has made a call which uses any of the defined limits, the user has rate limits in the cache. At this point, the rate limiting component will return the limits stored in the cache for that user.
This feature tracks the number of requests a user has made within the time window defined by the rate-limiting configuration. The Rate Limiting filter tracks limits by decrementing all of the rate limits which match the request URI and the verb. To track limits, specify a POST to /servers.
Initial rate limits
First POST to servers
Second POST to servers
Rate limit by IP
When the IP Identity filter receives a request, it will add the X-PP-User and X-PP-Groups headers to the request. The X-PP-User will be given the value of the originating IP. The X-PP-Groups header will be given the value of IP_Standard unless you configure the whitelist in the IP User filter. If you have configured the whitelist, the X-PP-Groups will be given the value of IP_Super.
The Rate Limiting filter makes use of the X-PP-User and the X-PP-Groups headers to determine and keep track of a user's rate limits. Because the Rate Limiting filter cannot identify users, it expects the X-PP-User and X-PP-Groups headers to be present when it receives a request. If X-PP-User is not present, it will return a 401 status code because it cannot determine who the requester is. If X-PP-User is present, it will look for the X-PP-Groups header. If the X-PP-Groups header is not present, the Rate Limiting filter will use the default rate-limit group configured in the rate-limiting.cfg.xml file.
The following example shows a section of a configuration for the IP User filter and the Rate Limiting filter set up for rate limiting by IP.
Rate limit by Role
The following example shows a section of a configuration for the Identity v3, Header Translation, and Rate Limiting filters which set up rate limiting by role.
Rate limit by query parameter key
To set up rate limiting based on query strings, configure the query-param-names attribute to match all required parameters in any given request where you want this to apply. The query-param-names attribute is located within the limit element. The parameters are space delimited and are not exclusive.
The following configuration rate limits requests based on the name, age, and gender query parameters.
The following table shows the limits that are applied for each request. Notice that the parameters are not exclusive. For instance, request #3 contains both name and age specifications yet is limited by limit id 1, which only specifies the name parameter.
|3||http://openrepose.org/devs?name=Joe&age=31 ||0, 1, 2|
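The non-exclusive matching described above can be sketched in Python. The function name is hypothetical, and the sketch assumes that a limit applies when every name in its query-param-names attribute appears in the request's query string, regardless of any extra parameters; it does not reproduce the configuration or table from this section.

```python
from urllib.parse import parse_qs, urlsplit


def limit_applies(query_param_names: str, url: str) -> bool:
    """Check whether a limit's query-param-names are all present in the URL.

    query_param_names: space-delimited parameter names from the <limit>.
    url: the full request URL.
    """
    required = set(query_param_names.split())
    present = set(parse_qs(urlsplit(url).query, keep_blank_values=True))
    # Extra parameters on the request do not prevent a match.
    return required <= present
```

Under this assumption, a limit that lists only `name` still applies to `http://openrepose.org/devs?name=Joe&age=31`, while a limit that lists `name age gender` does not.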
To rate limit by header, follow the same instructions as you would to rate limit by role, but translate the header you care about instead of the X-Roles header. Since the X-Roles header is no longer necessary, you may not need to include and configure an authentication filter.
The rate-limiting.cfg.xml file contains the following elements and attributes. Add the filter to your Repose deployment through the System Model Configuration.
|<rate-limiting>||-||Required||Specifies the sub-elements and attributes to define your rate-limiting configuration.|| |
Stores rate limiting information. If not specified, rate limiting will use the first distributed datastore available. If no distributed datastores are available, it will revert to using a local datastore. Valid values are local/default and distributed/hash-ring (requires dist-datastore service).
|datastore-warn-limit ||Optional||Defines the size limit above which a warning is logged when an object is stored in the datastore. When the limit is met, Repose will issue a warning message in the logs. The default limit is 1000 cache keys per user.||Implemented in version 2.8.1|
|overLimit-429-responseCode||Optional||When set to true, it will send a 429 response code instead of the default 413 response code. The 429 response code in conjunction with the Response Messaging Service Configuration will provide a custom over-limit message. ||Implemented in version 2.4.1|
|use-capture-groups||Optional||When set to false, it will count all the requests with the <uri-regex> that have the capture group towards the limit count specified. By default, this attribute is set to true. If it is set to false, the first rate limit with a uri-regex that matches the request URI will be used to apply the rate limit. ||Implemented in version 2.6.12|
Defines an endpoint with a matching regex to bind GET requests for returning live rate limiting information.
|uri-regex||Optional||A regular expression (regex) for the URI at which the user can query their limits.|| |
|uri||Required||Defines a human-readable URI describing the endpoint for a given configured limit.|| |
|include-absolute-limits||Optional||Enables or disables integration with absolute limits. || |
|<limit-group>||-||Required||Defines a list of rate limits to be applied to a user, based on the user's membership in a group.|| |
|id||Required||Defines the unique identifier for a given limit group.|| |
|groups||Required||Defines a space-delimited list of the groups to which this limit group will apply.|| |
|default||Optional||Identifies the limit group that will be applied to a user if either no group is specified or no group in the rate limiting configuration matches the group or groups specified.|| |
|<limit>||-||Optional||Describes limits configured for a given endpoint.|| |
|id||Required||Defines the unique identifier for a given limit. Each <limit> element must have an id attribute that is unique within each <limit-group>.|| |
|uri||Required||Human readable version of the matcher used to enforce this rate limit.|| |
The regex used to match a passing request to current limit group. Within the regex, each capture group is allowed the number of hits specified in the value attribute of the limit element.
|http-methods||Optional||Lists the HTTP methods associated with this limit. Valid values include: GET, DELETE, POST, PUT, HEAD, OPTIONS, CONNECT, TRACE, ALL.|| |
|unit||Required||Defines the unit of time associated with this limit. Valid values include: SECOND, MINUTE, HOUR, DAY.|| |
|value||Optional||Defines the number of requests allowed within the configured unit of time.|| |
|query-param-names||Optional||Lists query parameter names or keys that this rate limit will match on.|| |
Return codes and conditions
If a request is over the per-user limit, the Rate Limiting filter will return a 429 or 413 error code depending on what is configured. If a request is over the global limit, the Rate Limiting filter will return a 503 error code. It will return a '401 Unauthorized' if there is no value in the X-PP-User header.
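These conditions can be summarized in a short Python sketch. The function and its parameters are hypothetical; `use_429` stands in for the overLimit-429-responseCode attribute. The order in which the conditions are checked is an assumption, since the text above does not say which takes precedence.

```python
from typing import Optional


def response_status(user: Optional[str],
                    over_global_limit: bool,
                    over_user_limit: bool,
                    use_429: bool = False) -> Optional[int]:
    """Return the status code the filter would send, or None to pass the request on."""
    if not user:
        return 401  # no value in the X-PP-User header
    if over_global_limit:
        return 503  # global limit exceeded
    if over_user_limit:
        # 413 is the default; 429 when overLimit-429-responseCode is true.
        return 429 if use_429 else 413
    return None  # request continues to the next filter
```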
The Rate Limiting filter does not create any request headers.
Version 126.96.36.199: rate limits for a user are no longer reset based on the shortest duration limit.