Hi all,
I’m not entirely sure where the problem may lie, so I’m dumping everything I can think of here.
Context
I am using Auth0's Express middleware, express-openid-connect (2.16.0), to authenticate users attempting to access my Node-based webapp.
This is my configuration:
auth({
  session: {
    store: new RedisStore({ client: redisClient }),
  },
  // I manually specify which endpoints need auth when defining them
  authRequired: false,
  issuerBaseURL: process.env.ISSUER_BASE_URL,
  baseURL: process.env.BASE_URL,
  clientID: process.env.CLIENT_ID,
  clientSecret: process.env.CLIENT_SECRET,
  secret: process.env.SESSION_SECRET,
  idpLogout: true,
  errorOnRequiredAuth: true,
  routes: {
    postLogoutRedirect: '/logged-out',
  },
  authorizationParams: {
    response_type: 'code',
    audience: process.env.EXTERNAL_API_AUDIENCE,
    scope: process.env.EXTERNAL_API_SCOPE,
  },
})
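For completeness, this is roughly how it's mounted and how individual routes opt in to auth (a simplified sketch; the route and module names here are just illustrative):

const express = require('express');
const { auth, requiresAuth } = require('express-openid-connect');

const app = express();

// Attaches req.oidc plus the /login, /logout and /callback routes
app.use(auth({ /* config shown above */ }));

// Because authRequired is false, endpoints opt in individually
app.get('/dashboard', requiresAuth(), (req, res) => {
  res.json(req.oidc.user);
});

module.exports = app;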
It’s been working great for a long time.
What changed
Recently I decided to move this webapp from being hosted on a virtual machine (AWS EC2) to being hosted on serverless lambdas (AWS Lambda). This was a seemingly trivial change using the serverless-http library.
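The wrapper itself is about as thin as it gets (file names are illustrative):

// lambda.js - the Lambda entry point
const serverless = require('serverless-http');
const app = require('./app'); // the same Express app the auth() middleware is mounted on

module.exports.handler = serverless(app);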
The problem
Locally this works fine. I've been using the Serverless Framework to simulate Lambda on my machine (my trimmed serverless.yml is below the list), and the auth flow works correctly:
1. I hit /login
2. The middleware redirects to my issuer URL
3. Auth0 redirects back to /callback
4. The middleware saves info to the session and redirects back to /
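For reference, this is roughly the Serverless Framework config I run locally, trimmed down (service name, runtime and handler path are placeholders):

# serverless.yml (trimmed)
service: my-webapp

provider:
  name: aws
  runtime: nodejs18.x

functions:
  app:
    handler: lambda.handler
    events:
      - httpApi: '*'   # catch-all route so every path reaches the Express app

plugins:
  - serverless-offline   # what I use to simulate Lambda on my machine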
In production, I also go through the auth flow all the way to step 4, and I even see a "Success Login" entry in the Auth0 logs. However, I am not authenticated, despite a session being persisted (in Redis) with an access token.
Observations
- Locally the /callback endpoint sets an appSession cookie when it gets a code back. This doesn't seem to be happening in production. Another cookie is present on the final requests instead: auth_verification.
- Despite not having attemptSilentLogin set to true, in production I can see the skipSilentLogin cookie being set, indicating silent login is being attempted. Locally I have to log in every time I log out, but in production I don't. This seems to indicate my config is not being respected in production, and yet response_type is respected.
- Locally the stored sessions have only two keys present: id_token and state. In prod there are more keys: id_token, scope, access_token, refresh_token, expires_at and token_type. (The throwaway script I use to inspect these is below the error output.)
- When I removed response_type so the default (id_token) is used, this again works fine locally, but in production I get a 400 (Bad Request) on my /callback route. I can see an id_token and state as part of the request, indicating the call from my tenant is correct. However, I get the following error:
BadRequestError: checks.state argument is missing
    at ResponseContext.callback (/var/task/node_modules/express-openid-connect/lib/context.js:354:15)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
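If it's useful, this is the sort of throwaway script I've been using to inspect what actually ends up in Redis locally vs in prod (ioredis assumed; the key pattern and payload shape depend on the store, so treat this as a sketch):

// dump-sessions.js - throwaway inspection script
const Redis = require('ioredis');
const redis = new Redis(process.env.REDIS_URL);

(async () => {
  // KEYS is fine here because this is a tiny instance used only for debugging
  const keys = await redis.keys('*');
  for (const key of keys) {
    console.log(key, await redis.get(key)); // stored session payload (JSON)
  }
  redis.disconnect();
})();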
Thoughts
- My first guess is that despite specifying Redis for session storage, the requests still rely on maintaining state in memory in some way (as indicated by the "checks.state argument is missing" error above). In production, each request is potentially hitting an entirely different running process. However, this doesn't explain why I see the information persisted to my Redis store. (See the logging sketch after this list.)
- My second guess is that I have messed up the configuration of my production tenant in some way, which would be surprising since it has worked well up until now.
- My last guess is that /callback is failing to parse the request due to how my lambda is set up in production (as indicated by the 400 when using id_token as the response_type).
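To test the first and third guesses, I'm planning to drop a purely diagnostic middleware in front of the callback in production, something like this (the auth_verification cookie name is the one I already see in the browser):

// Diagnostic only: log what the callback request actually contains.
// Mounted before auth() so it runs before the middleware handles /callback.
app.use('/callback', (req, res, next) => {
  console.log('incoming cookies:', req.headers.cookie);       // is auth_verification making it back?
  console.log('query params:', req.query);                     // code/state when response_type is 'code'
  console.log('content-type:', req.headers['content-type']);   // form_post body when response_type is 'id_token'
  next();
});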
Question
Does anyone have any advice on how I can troubleshoot this issue, or any ideas on what may be going wrong?