Auth0 Logs to Segment Extension: Duplicated and Missing Entries

Problem Statement

We noticed that some logs were either missing or duplicated in Segment, and the run history was also blank.

Symptoms

  • No run history / missing runs
  • Log events missing
  • Log events duplicated multiple times

Steps to Reproduce

Reproducing the issue requires an artificial load of roughly 100 logs per batch.

Cause

Segment was taking too long to process 100 log events, so the webtask running the extension was killed before the run completed, presumably because Segment’s response time exceeded the lifetime of the webtask container.

As a result, the extension's checkpoint ID was never updated, so every time the extension retried a run at its configured frequency, it simply fetched the most recent 100 logs.

Any logs that had already been sent and appeared again in those 100 logs were duplicated on the Segment side.
Logs that failed to send before the webtask terminated never reached Segment.

Combined with the extension's hardcoded behavior of intentionally omitting some logs, this produced a mixture of missing logs (through timeout or omission) and duplicated logs (through the lack of a checkpoint ID).
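
The failure mode described above can be illustrated with a small simulation. This is a simplified model written for this article, not the extension's actual code: the function names, the integer log IDs, and the `budget` parameter (the number of events Segment accepts before the webtask is killed) are all assumptions.

```python
from collections import Counter

def run_once(logs, checkpoint, batch_size, budget, delivered):
    """Simulate one run; return the checkpoint to store afterwards."""
    if checkpoint is None:
        # No checkpoint stored: fall back to the most recent logs.
        batch = logs[-batch_size:]
    else:
        # Checkpoint stored: continue right after the last delivered log.
        batch = [log_id for log_id in logs if log_id > checkpoint][:batch_size]
    for sent, log_id in enumerate(batch):
        if sent >= budget:
            # Hypothetical kill: the webtask dies mid-run, so the
            # checkpoint is never updated.
            return checkpoint
        delivered.append(log_id)
    return batch[-1] if batch else checkpoint

def simulate(batch_size, budget, runs=3, new_logs_per_run=70):
    """Return (duplicated_ids, missing_ids) after a number of runs."""
    logs, delivered, checkpoint = [], [], None
    next_id = 1
    for _ in range(runs):
        # New logs arrive between runs.
        logs.extend(range(next_id, next_id + new_logs_per_run))
        next_id += new_logs_per_run
        checkpoint = run_once(logs, checkpoint, batch_size, budget, delivered)
    counts = Counter(delivered)
    duplicated = sorted(i for i, n in counts.items() if n > 1)
    missing = sorted(set(logs) - set(delivered))
    return duplicated, missing

# Slow destination: only 60 of 100 events are sent before the kill.
duplicated, missing = simulate(batch_size=100, budget=60)
print(len(duplicated), len(missing))  # 20 50

# Fast destination: every run completes and the checkpoint advances.
print(simulate(batch_size=100, budget=100))  # ([], [])
```

In this model, a run that is killed leaves the checkpoint untouched, so the next run re-reads an overlapping window of recent logs (duplicates) while logs that fall outside that window are never picked up (missing).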

Solution

We recommend reducing the batch size. With fewer logs in each batch, the extension can finish sending all events before the webtask is terminated, even if Segment is slow.

However, for heavier use where frequency × batch size cannot keep up with the volume of logs generated, the best recommendation with the current implementation is to use log streaming, which is more performant.
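
As a rough check of whether a given frequency and batch size can keep up, the arithmetic can be sketched as follows. The function names and the example numbers are illustrative assumptions, not values taken from the extension's settings.

```python
def max_logs_per_hour(batch_size, interval_minutes):
    """Upper bound on logs sent per hour: batch size times runs per hour."""
    runs_per_hour = 60 / interval_minutes
    return batch_size * runs_per_hour

def can_keep_up(logs_generated_per_hour, batch_size, interval_minutes):
    """True if the extension's ceiling covers the tenant's log volume."""
    return max_logs_per_hour(batch_size, interval_minutes) >= logs_generated_per_hour

# Example: batches of 20 logs, one run every 5 minutes.
print(max_logs_per_hour(20, 5))   # 240.0
print(can_keep_up(200, 20, 5))    # True
print(can_keep_up(1000, 20, 5))   # False
```

If the tenant generates more logs per hour than this ceiling, the extension falls behind regardless of how quickly Segment responds.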

Since Segment is not supported natively, log streaming will require you to host a custom webhook on your own infrastructure to receive the logs and pass them on to Segment.
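
A minimal sketch of such a webhook is shown below, using only the Python standard library. It rests on several assumptions that should be verified against the current Auth0 and Segment documentation: that the log stream delivers a JSON array of entries with `log_id` and `data` fields, that Segment's HTTP batch endpoint accepts a `batch` array authenticated with the write key as the Basic auth username, and that `messageId` is a suitable place for the Auth0 log ID. The event naming, the `anonymousId` fallback value, and the port are hypothetical choices.

```python
import base64
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

SEGMENT_BATCH_URL = "https://api.segment.io/v1/batch"  # assumed endpoint
SEGMENT_WRITE_KEY = "YOUR_WRITE_KEY"  # placeholder, replace before use

def to_segment_batch(auth0_logs):
    """Map Auth0 log entries (assumed shape) to a Segment batch payload."""
    batch = []
    for entry in auth0_logs:
        data = entry.get("data", {})
        message = {
            "type": "track",
            "event": "Auth0 log: " + str(data.get("type", "unknown")),
            # Reusing the log ID lets the receiving side recognize repeats.
            "messageId": entry.get("log_id"),
            "properties": data,
        }
        if data.get("date"):
            message["timestamp"] = data["date"]
        if data.get("user_id"):
            message["userId"] = data["user_id"]
        else:
            # Hypothetical fallback for logs without a user.
            message["anonymousId"] = "auth0-log-stream"
        batch.append(message)
    return {"batch": batch}

def forward_to_segment(payload):
    """POST the payload to Segment; returns the HTTP status code."""
    token = base64.b64encode((SEGMENT_WRITE_KEY + ":").encode()).decode()
    request = urllib.request.Request(
        SEGMENT_BATCH_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status

class LogStreamHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            logs = json.loads(self.rfile.read(length))
            forward_to_segment(to_segment_batch(logs))
            self.send_response(204)
        except Exception:
            # Report failure so the sender can see the delivery failed.
            self.send_response(500)
        self.end_headers()

def serve(port=8080):
    """Start the webhook; call this explicitly to run the server."""
    HTTPServer(("", port), LogStreamHandler).serve_forever()
```

Calling `serve()` starts the listener. A production deployment would also need HTTPS and a check of the authorization token configured on the log stream, both of which are omitted here for brevity.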