Dashboard metrics are not available

It looks like the Fly dashboard metrics have disappeared for all my apps, and the dashboard is showing “No data available” (dc: lax).


Can confirm also.

Seems like metrics are dead across the board. My Grafana server can’t read metrics either.

Yeah, all metrics queries are coming back with a 500 it seems.

Still down for me as well.

Have been down for me since yesterday.

The status page isn’t showing any downtime or issues, though: https://status.flyio.net/

It seems to be affecting at least a handful of customers. It would be good to have someone from Fly give an update or assistance.

Thanks for the reports, we’re looking into this and will keep you updated.

Confirmed the issue: there is a bug that’s currently causing 500 errors for all Prometheus queries to personal organizations.

A fix for the bug has been deployed, and metrics queries should be working for personal organizations again. Please let me know if you’re experiencing any further issues!


Would it be possible to apply a permanent fix for the other Prometheus (vector) issue, “Lack of Prometheus metrics, sometimes, after deploy (host specific?)”?

I appreciate that Prometheus metrics may be “beta” functionality, but the problem manifested over a week ago and I don’t believe the root cause has been resolved yet, nor has there been any mention of the issue on https://status.flyio.net/.

I still can’t get any metrics in my Grafana dashboards, so it seems the issue is still present. I can see all the metrics at my “/metrics” endpoint with no problem.

The issue described in this thread has been fixed, and I can confirm that Prometheus queries to personal organizations are still working properly.

It’s possible you’re facing a separate configuration issue. To diagnose further, first take a look at our docs on setting up a Grafana datasource to make sure your configuration matches. If you’re setting up any custom metrics, make sure that you’ve configured your app so that your /metrics endpoint gets scraped. If that doesn’t work, try doing a manual curl request against the Prometheus API and share the results if the API is not working correctly for you.
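
For anyone who would rather script the manual request than use curl, here is a rough Python sketch of the same check. It relies only on the standard Prometheus HTTP API path (/api/v1/query) and its JSON response shape. The base URL, the token, and the use of a Bearer authorization header are assumptions on my part, so take the real values from the Grafana datasource docs. The function names are made up for illustration.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def build_query_request(base_url: str, token: str, promql: str) -> urllib.request.Request:
    """Build a GET request for the Prometheus instant-query endpoint."""
    query_string = urllib.parse.urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/api/v1/query?{query_string}"
    # Assumption: the API accepts a Bearer token; adjust if your setup differs.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def summarize_response(body: str) -> str:
    """Turn a Prometheus API JSON body into a one-line summary."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        return f"error: {payload.get('errorType')}: {payload.get('error')}"
    results = payload["data"]["result"]
    return f"success: {len(results)} series"


def run_query(base_url: str, token: str, promql: str) -> str:
    """Send the query and summarize the outcome, including HTTP errors."""
    request = build_query_request(base_url, token, promql)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return summarize_response(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # A 500 here would point at the API itself rather than at your app.
        return f"HTTP {err.code}: {err.read().decode('utf-8', 'replace')}"
```

Calling run_query with your datasource URL, your token, and a metric name such as http_requests_total separates the two cases discussed in this thread: an HTTP 500 suggests the API is failing, while "success: 0 series" suggests the query works but no samples have been ingested for that metric.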

I’m still experiencing this (or something similar). My app exposes a /metrics endpoint and I configured my app with the following metrics block:

[metrics]
  port = 8080
  path = "/metrics"

Example /metrics response:

# TYPE http_request_duration_seconds summary
http_request_duration_seconds{quantile="0.01"} 0.000987601
http_request_duration_seconds{quantile="0.05"} 0.000987601
http_request_duration_seconds{quantile="0.5"} 0.000987601
http_request_duration_seconds{quantile="0.9"} 0.000987601
http_request_duration_seconds{quantile="0.95"} 0.000987601
http_request_duration_seconds{quantile="0.99"} 0.000987601
http_request_duration_seconds{quantile="0.999"} 0.000987601
http_request_duration_seconds_count 1
http_request_duration_seconds_sum 0.000987601
http_request_duration_seconds{status="404", quantile="0.01", path="vapor_route_undefined", method="undefined"} 0.000987601
http_request_duration_seconds{status="404", quantile="0.05", path="vapor_route_undefined", method="undefined"} 0.000987601
http_request_duration_seconds{status="404", quantile="0.5", path="vapor_route_undefined", method="undefined"} 0.000987601
http_request_duration_seconds{status="404", quantile="0.9", path="vapor_route_undefined", method="undefined"} 0.000987601
http_request_duration_seconds{status="404", quantile="0.95", path="vapor_route_undefined", method="undefined"} 0.000987601
http_request_duration_seconds{status="404", quantile="0.99", path="vapor_route_undefined", method="undefined"} 0.000987601
http_request_duration_seconds{status="404", quantile="0.999", path="vapor_route_undefined", method="undefined"} 0.000987601
http_request_duration_seconds_count{status="404", path="vapor_route_undefined", method="undefined"} 1
http_request_duration_seconds_sum{status="404", path="vapor_route_undefined", method="undefined"} 0.000987601
# TYPE http_requests_total counter
http_requests_total 0
http_requests_total{status="404", path="vapor_route_undefined", method="undefined"} 1
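
To rule out a formatting problem in output like the above, a small script can check that every sample line at least parses as `name{labels} value`. This is a simplified sketch, not a full implementation of the Prometheus text exposition format: it ignores comment lines, tolerates an optional trailing timestamp, and does not handle label values containing `}`. The function name is hypothetical.

```python
import re

# name, optional {labels}, value, optional integer timestamp
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# key="value" pairs, allowing backslash escapes inside the value
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Parse exposition text into (metric name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # TYPE / # HELP comments
        match = SAMPLE_RE.match(line)
        if match is None:
            raise ValueError(f"unparseable line: {line!r}")
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

Feeding it the body of the /metrics response and printing the distinct metric names gives a list to compare against what is being queried in Grafana. If every line parses and the names match, the problem is more likely on the scraping or query side than in the app's output.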

In the logs I can also see the /metrics endpoint being hit every 15 seconds.
However, my metrics dashboard still looks like this:

Any ideas?