Prometheus endpoint appears to be timing out / returning 5xx

What the title says. Grafana spins while loading metrics values and then either times out or fails with a 502/503 error. Making the request manually with curl gives me this fun error message:

{"status":"error","errorType":"422","error":"error when executing query=\"...\" on the time range (start=1642607729000, end=1642608029000, step=30000): cannot execute query: timeout exceeded during query execution: 30.000 seconds (elapsed 32.001 seconds); the timeout can be adjusted with `-search.maxQueryDuration` command-line flag"}
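
For anyone scripting around this, here is a rough Python sketch of how a client could tell this kind of query timeout apart from other failures by parsing the response body. The function name `describe_query_error`, the keys of the returned dictionary, and the regular expression are my own assumptions, based only on the single error message above rather than on any documented format, so treat it as illustrative.

```python
import json
import re

# Pattern inferred from the one error message quoted above (an assumption,
# not a documented format): captures the configured limit and elapsed time.
TIMEOUT_RE = re.compile(
    r"timeout exceeded during query execution: "
    r"([\d.]+) seconds \(elapsed ([\d.]+) seconds\)"
)

# The response body from the post, with the query left elided as in the original.
SAMPLE = r'''{"status":"error","errorType":"422","error":"error when executing query=\"...\" on the time range (start=1642607729000, end=1642608029000, step=30000): cannot execute query: timeout exceeded during query execution: 30.000 seconds (elapsed 32.001 seconds); the timeout can be adjusted with `-search.maxQueryDuration` command-line flag"}'''


def describe_query_error(body: str):
    """Return None for a non-error response, else a small summary dict.

    Hypothetical helper: 'kind' is "timeout" when the message matches
    TIMEOUT_RE and "other" for any other error response.
    """
    payload = json.loads(body)
    if payload.get("status") != "error":
        return None

    message = payload.get("error", "")
    match = TIMEOUT_RE.search(message)
    if match is None:
        return {
            "kind": "other",
            "error_type": payload.get("errorType"),
            "message": message,
        }

    return {
        "kind": "timeout",
        "error_type": payload.get("errorType"),
        "limit_seconds": float(match.group(1)),
        "elapsed_seconds": float(match.group(2)),
    }


if __name__ == "__main__":
    print(describe_query_error(SAMPLE))
```

A timeout result like this points at the metrics backend being slow or overloaded rather than at the query itself, which is what sent me looking at the service status.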

Seems like the metrics cluster is on the fritz again :slight_smile:

It seems to be addressed. See the "Metrics availability issue" entry on the Fly.io Status page.

Is it working ok for you too now?

Yep, it’s working again - thanks for letting me know!