Announcing Ratelimit: Go/gRPC service for generic rate limiting

Today we are excited to open source Ratelimit, a Go/gRPC service designed to enable generic rate limiting scenarios for different types of applications: for example, per-IP rate limiting, or rate limiting the number of connections per second made to a database. Applications request rate limit decisions based on a domain and a set of descriptors. The Ratelimit service takes the request, matches it against the loaded configuration, checks against a Redis cache (although an in-memory store can easily be swapped in), and returns a decision to the caller.

Ratelimit is in production use at Lyft, where it handles tens of thousands of rate limit requests per second. It is a reference implementation of the rate limit API that Envoy, our proxy and communications bus, uses. We use Ratelimit in both our edge proxies and our internal service mesh to rate limit requests.


The Ratelimit service conforms to the Ratelimit protobuf defined here. It receives RateLimitRequests, which are composed of a domain and a set of descriptors. A domain is a unique string that enables configuration to be application specific, without overlap. Descriptors are a list of hierarchical entries that are used to determine the final key to look up in the Redis cache. Combining these two concepts, Ratelimit provides a flexible framework for rate limiting a wide variety of scenarios. For example, to set a rate limit on connections per second to your datastore, you could define a configuration like this:

domain: mongo_cps
descriptors:
  - key: database
    value: users
    rate_limit:
      unit: second
      requests_per_unit: 10

Then, to check whether your service should be allowed to create a new connection to the database, send a request to the Ratelimit service like:

domain: mongo_cps
descriptors: ("database", "users")

You can find more request examples and explanations of how they work here and here.


Ratelimit loads YAML configuration files from disk, using a library we recently open sourced called runtime. The configuration file reflects the Ratelimit API described above: it contains a domain, the descriptors for that domain, and a rate_limit for each descriptor. The service currently supports per-second, per-minute, per-hour, and per-day limits. For more information on how to set up and configure the service, please read the docs.
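As another illustration of the schema, the per-IP limiting mentioned earlier might be configured along these lines. This is a sketch based on the configuration format described above; the domain name and descriptor key are made up for the example, and a descriptor entry that specifies a key with no value applies the limit to each value seen for that key.

```yaml
domain: edge_proxy_per_ip
descriptors:
  # No value: each distinct remote_address gets its own counter.
  - key: remote_address
    rate_limit:
      unit: minute
      requests_per_unit: 100
```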

Envoy ratelimit filters

Envoy integrates with the Ratelimit service via two filters:

1. Network Level Filter: Envoy calls the Ratelimit service for every new connection on the listener where the filter is installed. This way you can rate limit the connections per second that transit the listener.

2. HTTP Level Filter: Envoy calls the Ratelimit service for every new request on the listener where the filter is installed and the route table specifies that the Ratelimit service should be called. A lot of work is going into expanding the capabilities of the HTTP filter.
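As a rough sketch of how the HTTP filter is wired up on the Envoy side, the listener configuration points the filter at the Ratelimit service over gRPC. The field names below follow current Envoy documentation and may differ across Envoy versions; `ratelimit_cluster` is an assumed cluster name pointing at a running Ratelimit instance, and `mongo_cps` reuses the domain from the earlier example.

```yaml
http_filters:
  - name: envoy.filters.http.ratelimit
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
      # Must match a domain in the Ratelimit service's configuration.
      domain: mongo_cps
      rate_limit_service:
        grpc_service:
          envoy_grpc:
            cluster_name: ratelimit_cluster
```

Routes in the route table then attach rate limit actions, which generate the descriptors sent to the service.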

Like discovery, Ratelimit expands the Envoy ecosystem and extends what Envoy can accomplish in your infrastructure. We are excited to be releasing this additional piece of the Envoy ecosystem, and we can't wait to hear what you think about it and do with it.

Interested in open source work and having a big impact? Lyft is hiring! Drop me a note on Twitter or at

Announcing Ratelimit was originally published in Lyft Engineering on Medium.