Increase the number of concurrent user import jobs for enterprise tenants

Problem statement

We have an external identity system from which we need to import users using Bulk User Import. Auth0 provided a script to load users from that system into Auth0. At the moment, the script can run only 2 concurrent jobs with a file size of 500, so loading users is taking longer than expected.

We are migrating 6 million users from our external identity system to Auth0. For the migration to succeed, we need to know how long importing all 6 million users will take so that we can schedule the migration activities during low-traffic periods only.

Troubleshooting

We typically upgrade only the production tenants of Enterprise customers to the enterprise tier.

Generally, the production tier has about 4x the throughput of the free tier, and the enterprise tier has about 3x-4x the throughput of the production tier. We’ve seen enterprise tenants import at around 400k users/hour when running 2 imports in parallel and maximizing the number of users per import file.
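As a rough illustration, the observed figure above can be turned into a duration estimate for a given user count. This is a minimal sketch: the function name is hypothetical, the 6 million figure comes from the problem statement, and 400k users/hour is an observed rate rather than a guaranteed one.

```python
def estimate_import_hours(total_users: int, users_per_hour: float) -> float:
    """Return the estimated import duration in hours at a constant throughput."""
    if users_per_hour <= 0:
        raise ValueError("users_per_hour must be positive")
    return total_users / users_per_hour


# 6 million users at the ~400k users/hour observed for enterprise tenants.
hours = estimate_import_hours(6_000_000, 400_000)
print(f"Estimated duration: {hours:.1f} hours")  # prints "Estimated duration: 15.0 hours"
```

At that rate the import would need roughly 15 hours of run time, which may have to be spread across more than one low-traffic window.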

Cause

We usually allow the enterprise tier for bulk user imports only for production use cases. This tier puts much more stress on the databases (bulk imports mean a lot of write operations), so we want to avoid customers using it for non-production purposes such as testing the imports.

Solution

Based on the above, we recommend one of the following approaches:

  • If your release tenant is currently on the free bulk import tier (because it has the staging environment tag), benchmark with a subset of your imports there and multiply the measured throughput by 10 to get an estimate. In typical scenarios, we have observed bulk import throughput at the enterprise tier to be around 10x that of the free tier.

  • If you want to benchmark the bulk import duration on the enterprise tier directly, you can still do so with a small subset of the import in the production tenant, provided it is set to the enterprise tier.

In both cases, these are estimates only, and the final result will depend on the traffic in the environment at the time you run it.
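The first approach can be sketched as a small calculation. This is only an illustration: the function name is hypothetical, the sample size and timing in the usage example are made-up placeholders rather than measurements, and the default speedup of 10 is the rough free-to-enterprise factor quoted above.

```python
def extrapolate_enterprise_hours(
    sample_users: int,
    sample_seconds: float,
    total_users: int,
    speedup: float = 10.0,  # assumption: ~10x free-to-enterprise factor from the text
) -> float:
    """Estimate enterprise-tier import hours from a free-tier benchmark run."""
    if sample_users <= 0 or sample_seconds <= 0 or speedup <= 0:
        raise ValueError("sample_users, sample_seconds and speedup must be positive")
    free_rate = sample_users / sample_seconds   # users per second on the free tier
    enterprise_rate = free_rate * speedup       # scaled by the assumed speedup
    return total_users / enterprise_rate / 3600  # seconds converted to hours


# Hypothetical benchmark: 10,000 users imported in 900 seconds on the free tier.
estimate = extrapolate_enterprise_hours(10_000, 900, 6_000_000)
print(f"Estimated enterprise-tier duration: {estimate:.1f} hours")
```

Replace the placeholder sample values with your own benchmark results, and treat the output as a planning estimate only, since actual throughput depends on traffic at the time of the import.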
