Managed Prometheus drops our 53 KiB metrics response and Grafana data becomes intermittent

Hi folks.
Today we ran into a problem. Our app exposes a metrics endpoint that Fly.io's managed Prometheus scrapes. We added some new metrics on top of the existing ones and deployed the update at 17:30. After the deployment, the lines on our graphs became intermittent and broken.

This is a sample response from our metrics endpoint.

We later found in the official documentation that responses larger than 16 KiB are discarded. Could that be the cause?

Hi there, where did you find that responses larger than 16 KiB are discarded?

You can find the 16 KiB limit on this page.

Thanks! That’s an error, off by three orders of magnitude. The Prometheus metrics payload size limit is 16 MB.
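For anyone who wants to sanity-check their own endpoint against these numbers, here is a minimal sketch in Python. It compares a payload's size against both the 16 KiB figure quoted from the docs and the corrected 16 MB figure. The helper names (`exceeds_limit`, `build_sample_payload`) and the `demo_metric` line are hypothetical, made up for illustration, and the sketch assumes "16 MB" means binary megabytes. In practice you would pass in the raw bytes of your real metrics response instead of the synthetic payload.

```python
KIB = 1024
MIB = 1024 * KIB

# Figure quoted from the docs earlier in this thread.
DOCUMENTED_LIMIT = 16 * KIB
# Corrected figure from the reply above (assumption: binary megabytes).
CORRECTED_LIMIT = 16 * MIB


def exceeds_limit(payload: bytes, limit_bytes: int) -> bool:
    """Return True if the payload is larger than the given limit in bytes."""
    return len(payload) > limit_bytes


def build_sample_payload(size_bytes: int) -> bytes:
    """Build a synthetic metrics-like payload of exactly size_bytes bytes.

    Hypothetical stand-in for a real scrape response.
    """
    line = b'demo_metric{label="x"} 1\n'
    repeats = size_bytes // len(line) + 1
    return (line * repeats)[:size_bytes]


# A payload the same size as the one in the thread title (53 KiB).
payload = build_sample_payload(53 * KIB)

print(len(payload))                              # 54272
print(exceeds_limit(payload, DOCUMENTED_LIMIT))  # True
print(exceeds_limit(payload, CORRECTED_LIMIT))   # False
```

With the corrected figure, a 53 KiB response sits far below the limit, so on these numbers alone payload size would not explain the gaps.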

Cheers!