Guidance on Managing M2M Token Quota with Caching and Testing, and Azure Token Integration

We are in the process of integrating Auth0 as our identity provider, transitioning from Identity Server 4. A significant challenge we’ve encountered relates to the quota on issuing M2M (machine-to-machine) tokens. In our previous setup, token issuance was unlimited, simplifying our implementation.

We understand the recommendation to cache tokens to comply with this quota. However, this approach primarily benefits our applications in a production environment. We also conduct daily integration tests and load testing on our services, where caching tokens might not be as straightforward. These testing scenarios could exhaust the token quota or require a separate mechanism to manage token usage efficiently.
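
To make the caching concern concrete, here is a minimal sketch of the kind of in-process cache we have in mind. The class name `TokenCache`, the `fetch_token` callable, and the `leeway_seconds` margin are our own placeholders, not part of any Auth0 SDK. `fetch_token` is assumed to wrap a client-credentials request and return the access token together with its lifetime in seconds.

```python
import threading
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Reuse one M2M token until shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        leeway_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # fetch_token is a hypothetical callable returning
        # (access_token, expires_in_seconds).
        self._fetch_token = fetch_token
        self._leeway = leeway_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # The lock keeps concurrent callers from each requesting a new token.
        with self._lock:
            now = self._clock()
            # Refresh when there is no token yet, or when the cached one
            # is within the leeway window of its expiry.
            if self._token is None or now >= self._expires_at - self._leeway:
                token, expires_in = self._fetch_token()
                self._token = token
                self._expires_at = now + expires_in
            return self._token
```

If a whole integration test run shared one such instance, it would request one token per run rather than one per test. It is less clear to us how this helps with load tests that start many separate processes, which is part of what we are asking.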

Additionally, we are contemplating storing tokens in Azure Key Vault and treating them like API keys. However, we recognize that API keys are typically long-lived, whereas tokens are designed to be short-lived. This raises the question of how best to balance security with practicality.

Given this context, our questions are:

Caching in Testing Environments: How can we effectively manage token quotas in scenarios like integration testing and load testing, where token caching might not be as feasible?

Token Storage and Lifecycle Management: Is storing tokens in a system like Azure Key Vault and treating them like API keys a recommended practice, considering their inherently short-lived nature? How can we best balance the need for security with the practical aspects of token management?

Azure Tokens vs. Auth0 for Server-to-Server Calls: Would it be advisable to use Azure tokens for server-to-server calls and reserve Auth0 primarily for user login grant types? What are the pros and cons of this approach, especially in the context of our current challenges with token quota and caching?

Alternative Strategies and Best Practices: Are there alternative strategies or best practices you would recommend for managing M2M tokens within the quota limits, considering our unique testing and implementation requirements?