Add rate limiting and cache for m2m token authentication endpoints


@dayeye2006 Nice job! Can we do that in an Auth0 Hook? We can’t really work with Auth0 M2M under this pricing model!



Not being able to cache M2M tokens or to rate-limit the generation of these tokens by application is a real show-stopper for us.

We wanted to open up our APIs to partners, but this is impossible in the current state. Within a few minutes, one single partner/customer can block the APIs for everyone else.

The two solutions that I see here, are:

  • Put a rate limit on token generation per application, so that an M2M API consumer that abuses generation can only block itself.
  • Implement a cache for generated tokens, so that API owners can control generation through the token lifetime.
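
For illustration, the second idea can be sketched as a small in-memory cache keyed by application that reuses a token until shortly before its lifetime ends (all names here are hypothetical; in practice `fetch_token` would call the tenant’s /oauth/token endpoint):

```python
import time

class TokenCache:
    """Reuse an access token per client until shortly before it expires."""

    def __init__(self, fetch_token, skew_seconds=60):
        # fetch_token(client_id) -> (access_token, expires_in_seconds)
        self._fetch = fetch_token
        self._skew = skew_seconds
        self._cache = {}  # client_id -> (token, expiry_timestamp)

    def get(self, client_id):
        entry = self._cache.get(client_id)
        if entry and time.time() < entry[1]:
            return entry[0]  # cached token is still valid, no new generation
        token, expires_in = self._fetch(client_id)
        # Refresh a bit early (skew) so callers never hold an expired token.
        self._cache[client_id] = (token, time.time() + expires_in - self._skew)
        return token
```

The point being: this is trivial when the token issuer supports it, which is exactly why people want it on the Auth0 side rather than reimplemented by every tenant.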

What has been suggested in the past is to use a Hook/Action to control token generation or the cache. But this would need something like a Redis instance openly available on the Internet to cache lifetimes, application IDs, etc. That is a problem for both high availability and security.

We would gladly pay more, but the upgraded account tiers only include more tokens per month, so the problem would remain the same.

This has also been discussed a long time ago here:

This feature would be highly appreciated!


If we have a solution where we use a Redis cache and then throw an error when we hit a specific limit that we set, does the token usage only increment on successful passes, or does it count for each invocation of the API?
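
For what it’s worth, the limit described here is usually a fixed-window counter (in Redis, `INCR` plus `EXPIRE` on a per-client key). A purely illustrative in-memory equivalent, with hypothetical names:

```python
import time

class FixedWindowLimiter:
    """Allow at most `limit` token requests per client per time window."""

    def __init__(self, limit, window_seconds):
        self._limit = limit
        self._window = window_seconds
        self._counts = {}  # client_id -> (window_start, count)

    def allow(self, client_id, now=None):
        now = time.time() if now is None else now
        start, count = self._counts.get(client_id, (now, 0))
        if now - start >= self._window:
            start, count = now, 0  # window elapsed, reset the counter
        if count >= self._limit:
            return False  # over the limit: reject before issuing a token
        self._counts[client_id] = (start, count + 1)
        return True
```

Whether a rejected request still counts against the Auth0 token quota is exactly the open question above; this sketch only shows the gating logic.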

Thanks, Konrad.

To be frank, this seems like a big, gaping hole in Auth0’s feature offering. Who wouldn’t be interested in this improvement? In a world where we pay for tokens, this is a problem that every user of tokens has to solve; there is no way around it. This has to be a DRY violation on a massive scale : )

Thanks again,




Is there a way to use actions caching to accomplish this? We are having serious issues with B2B partners blowing through our M2M token quotas because we have no way to rate limit. Looking through the docs on the M2M flow, it isn’t clear to me how one would go about fetching a cached access token and returning that instead of a new one to avoid this problem.

Hey @chris45,

We don’t currently have a solution for this. The “caching in actions” that just rolled out is entirely unrelated and covers caching external resources within an action. (e.g. caching a token you retrieve from Slack). It doesn’t control issuing tokens to your clients.

Totally agree. We recently released a feature in which our customers can use our API, and within 3 weeks we reached a whopping number that covers about 85% of our monthly quota, even though we currently have only 10 API clients. From what I can see, our customers request a new token upon each request, but I can’t really control their usage. I don’t want to cache tokens on my side; I don’t think that’s an appropriate solution.
We are now starting to investigate solutions other than Auth0, because this is a really big issue for us.


Thanks for the added context @yanivs

We had an incident this week: a client failed to cache their token and burned through 360,000 tokens in one day. This was way over the quota we’d paid for. We are charged for these tokens (I think on our plan, this number of tokens comes out to ~$2k).

I can see why Auth0 wouldn’t prioritize this: reducing the number of M2M tokens consumed certainly cuts into their bottom line. I’m not sure we would prioritize it either if I worked there.

That said, as a former champion of Auth0 at my company, I’m now on a hastily assembled cross-functional team investigating a move off of it.

For AWS users, one thing that could be done is to contribute a community-maintained construct to a library of infrastructure-as-code modules built with the AWS CDK tool.

We could use API Gateway, Lambda, and DynamoDB to make a serverless endpoint that caches tokens based on a client ID and client secret.

There could also be a non-serverless option that uses Redis.

Users needing a cache could deploy this construct to their AWS account, fetching tokens from an API Gateway /token endpoint rather than going to Auth0 directly.

The Lambda function written for this could be reused if someone wanted to implement the same solution in Terraform or Pulumi.
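
As a rough sketch of what that Lambda could look like (names are hypothetical, a plain dict stands in for the DynamoDB table, and `fetch_token` stands in for the real call to Auth0’s /oauth/token endpoint):

```python
import hashlib
import time

def make_token_handler(store, fetch_token, skew_seconds=60):
    """Build a /token handler that caches tokens per client.

    `store` is any dict-like key/value store (DynamoDB in practice);
    `fetch_token(client_id, client_secret)` would perform the real
    client_credentials exchange and return (access_token, expires_in).
    """
    def handler(client_id, client_secret):
        # Key by a hash so raw credentials never become a table key.
        key = hashlib.sha256(f"{client_id}:{client_secret}".encode()).hexdigest()
        entry = store.get(key)
        if entry and time.time() < entry["expires_at"]:
            return entry["token"]  # cache hit: no call to Auth0
        token, expires_in = fetch_token(client_id, client_secret)
        store[key] = {
            "token": token,
            # Expire early (skew) so a cached token is never stale in flight.
            "expires_at": time.time() + expires_in - skew_seconds,
        }
        return token
    return handler
```

In the real deployment the DynamoDB item would carry a TTL attribute so stale entries are garbage-collected automatically; that detail is omitted here.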

Hi @phitoduck,

Thank you for being candid; that sounds like an admittedly poor experience.

We’ve recently passed this thread directly to our product team and they are aware of the limitations of this feature and have plans to investigate possible solutions.

I don’t have any timeline, but I appreciate your patience while our team takes some time to evaluate.


Yeah, the caching available within Actions is a good start, but I think the client credentials flow needs some kind of mechanism for a) getting a reference to a token after it is generated, and b) telling the flow to exit successfully with a token retrieved from the cache before a new token is generated.


Thanks for the context, @matt.howard.

Actually, a follow-up on this for anyone who comes across it. We ran out of our tokens for February, and I just implemented a workaround that works for us since we control all the clients: we only use M2M tokens for some service-to-service calls, as well as bash scripts and some Lambdas.

I created a new database connection named “system” with a single user in it, and authorized only our M2M application to use it. Then I changed our core auth class, which generated the M2M tokens, to use normal username/password auth with the realm “system”. I had to slightly change our Login Action, which sets some custom claims and validates the user against our API, but that was pretty minor.

So essentially we’re not using M2M tokens / the client_credentials grant anymore; it’s just a normal user. We manipulate the token as needed in a custom Action, and this seems to work fine. Our apps internally cache the tokens, but there are a lot of edge cases (like local development) that basically mean we can still generate more than our quota. Since normal tokens don’t have the quota, this works well for us and probably will for many others. The only slight pain is configuring an extra two parameters for this system user/password on top of the client ID/secret.
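
For anyone considering the same workaround: the switch is essentially from the `client_credentials` grant to Auth0’s password-realm grant, both POSTed to the tenant’s /oauth/token endpoint. A sketch of the two request bodies (the audience value is a placeholder):

```python
# client_credentials grant: counts against the M2M token quota.
m2m_request = {
    "grant_type": "client_credentials",
    "client_id": "<client id>",
    "client_secret": "<client secret>",
    "audience": "https://api.example.com",
}

# Resource Owner Password grant against the "system" realm:
# a regular user token, so the M2M quota does not apply.
password_request = {
    "grant_type": "http://auth0.com/oauth/grant-type/password-realm",
    "client_id": "<client id>",
    "client_secret": "<client secret>",
    "username": "<system user>",
    "password": "<system password>",
    "realm": "system",
    "audience": "https://api.example.com",
}
```

The extra two parameters mentioned above are exactly the `username` and `password` fields (plus the `realm`) that the second payload adds.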

I would love a solution to this problem. It leaves clients of Auth0 wide open to uncontrollable costs. I’m fine if it requires some coding, but I don’t even see how it’s possible right now.

As @matt.howard says, we need… “some kind of mechanism for a) getting a reference to a token after it is generated, and b) a mechanism for telling the flow to exit successfully with a token retrieved from the cache before the token is generated.”


I really wonder how this approach of billing for M2M tokens was ever supposed to work in real life. Again, the entities requesting tokens (and generating cost) can be completely outside the tenant’s control. How is the tenant supposed to be responsible for a cost they can’t control? :triumph:

Auth0, before you even try to change anything, what is your guidance?

We use M2M tokens as part of our automated QA process. There have at times been issues where more tokens were requested than necessary, which pushed us over our limit. We would really like to see a per-application/client-ID quota feature to prevent overages.


Thanks for the added feedback everyone. Make sure to hit the “like” button if you haven’t already.