Metrics Dashboard not populating

I have a setup where an HAProxy instance proxies to my real backend server, which uses actix-web (Rust). Accordingly, my main API app instance is bound with .bind(("::", 8080)). As a side effect of this, many of the graphs in my Fly.io Metrics dashboard aren't getting populated (e.g. HTTP Status Code, HTTP Response Times, etc.). I've tried setting up a metrics-dedicated server in the same instance, bound with .bind(("0.0.0.0", 9091)), but no dice, even though the /metrics endpoint works as expected.

Where could I be going wrong here?

fly.toml:

app = '<redacted>'
primary_region = '<redacted>'

[build]

[env]
  PORT = '8080'

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = 'stop'
  auto_start_machines = true
  min_machines_running = 1
  processes = ['app']

  [[http_service.ports]]
    handlers = ["http"]
    port = 80

  [[http_service.ports]]
    handlers = ["tls", "http"]
    port = 443

[metrics]
  port = 9091
  path = "/metrics"

[[vm]]
  memory = '512mb'
  cpu_kind = 'shared'
  cpus = 1

My actix-web main() method:

use actix_web::{get, App, HttpServer, Responder};
use actix_web_prom::PrometheusMetricsBuilder;

#[get("/")]
async fn hello() -> impl Responder {
    "Hello from fly.io!"
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {

    // One PrometheusMetrics instance shared between both servers; it
    // implements Clone, and clones share the same underlying registry,
    // so no Arc is needed.
    let prometheus = PrometheusMetricsBuilder::new("app")
        .endpoint("/metrics")
        .build()
        .unwrap();

    // Metrics-only server on the port the [metrics] section of fly.toml points at.
    let metrics_server = HttpServer::new({
        let prometheus = prometheus.clone();
        move || App::new().wrap(prometheus.clone())
    })
    .bind(("0.0.0.0", 9091))?
    .run();

    // Main API server, bound on IPv6 so HAProxy can reach it over 6PN.
    let api_server = HttpServer::new(move || {
        App::new()
            .wrap(prometheus.clone())
            .service(hello)
    })
    .bind(("::", 8080))?
    .run();

    tokio::try_join!(api_server, metrics_server)?;
    Ok(())
}

Sample curl-ing:

% curl "https://<redacted>/v1/"
Hello from fly.io!
% curl "https://<redacted>/v1/"
Hello from fly.io!
% curl "https://<redacted>/v1/metrics"
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.005"} 5
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.01"} 5
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.025"} 5
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.05"} 5
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.1"} 5
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.25"} 5
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.5"} 5
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="1"} 5
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="2.5"} 5
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="5"} 5
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="10"} 5
app_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="+Inf"} 5
app_http_requests_duration_seconds_sum{endpoint="/",method="GET",status="200"} 0.00016875
app_http_requests_duration_seconds_count{endpoint="/",method="GET",status="200"} 5
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="0.005"} 44
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="0.01"} 44
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="0.025"} 44
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="0.05"} 44
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="0.1"} 44
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="0.25"} 44
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="0.5"} 44
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="1"} 44
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="2.5"} 44
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="5"} 44
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="10"} 44
app_http_requests_duration_seconds_bucket{endpoint="/metrics",method="GET",status="200",le="+Inf"} 44
app_http_requests_duration_seconds_sum{endpoint="/metrics",method="GET",status="200"} 0.0026970429999999997
app_http_requests_duration_seconds_count{endpoint="/metrics",method="GET",status="200"} 44
# HELP app_http_requests_total Total number of HTTP requests
# TYPE app_http_requests_total counter
app_http_requests_total{endpoint="/",method="GET",status="200"} 5
app_http_requests_total{endpoint="/metrics",method="GET",status="200"} 44

It looks like you're accessing /v1/metrics rather than /metrics (the path defined in your fly.toml) in your test; maybe that's the issue.
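One way to sanity-check what the Fly metrics collector actually scrapes is to hit the [metrics] endpoint from inside the Machine itself, e.g. (assuming curl is available in your image):

% fly ssh console -C "curl -s http://localhost:9091/metrics"

If that returns your app_* series, the endpoint itself is fine and the problem is somewhere else in the pipeline.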

In this sample, I've temporarily added the /v1/metrics endpoint to my regular server for the sake of easier debugging. metrics_server is a separate server running in the same process, and the URL for that is just a plain /metrics on port 9091.

That said, I've made progress and have learned a lot about metrics on Fly.io in the process! It turns out the immediate issue here is that the middleware library I'm using doesn't automatically emit the various Fly.io-specific metrics that are displayed in the Metrics dashboard (e.g. fly_edge_http_responses_count). I've manually added code to do that, and now I'm finally seeing data!
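For anyone landing here later, here's a rough sketch of what that manual step can look like with the prometheus crate: register an extra series under the dashboard's metric name on the same registry the middleware already serves. The "status" label is my assumption, not a documented contract, so match it to whatever the dashboard query expects:

use prometheus::{IntCounterVec, Opts};

// Hypothetical: a counter named after the dashboard's fly_edge series.
let edge_responses = IntCounterVec::new(
    Opts::new("fly_edge_http_responses_count", "HTTP responses by status"),
    &["status"], // assumed label set
)
.unwrap();

// PrometheusMetrics exposes its registry publicly, so the new series
// shows up on the same /metrics endpoint the middleware serves.
prometheus
    .registry
    .register(Box::new(edge_responses.clone()))
    .unwrap();

// Increment it wherever responses are produced:
edge_responses.with_label_values(&["200"]).inc();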

However, this still raises the question: why aren't these metrics generated automatically by Fly.io itself when my backend serves requests over IPv6 (by way of HAProxy and 6PN)?

Ah, in that case, I think the behavior you're seeing can happen when HAProxy sends requests to your Machine directly over its 6PN address (.internal) rather than through the fly-proxy service. (In fact, the http_service config isn't used at all for direct 6PN traffic.) If you use a Flycast address (.flycast) instead, requests will go through the fly-proxy layer and will then be tracked by the fly_edge_* metrics. In many cases, using Flycast makes the HAProxy layer redundant altogether. Hopefully this helps clarify things a bit further for you.
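Concretely, that switch might look something like this (a sketch; the backend/server names are placeholders). First allocate a private (Flycast) IPv6 address for the app:

% fly ips allocate-v6 --private

Then point HAProxy's backend at the .flycast hostname on one of the service ports from fly.toml instead of .internal:8080:

backend fly_app
    # port 80 matches the http handler in [[http_service.ports]];
    # note that with force_https = true, plain HTTP on port 80 gets a
    # redirect, so you may need to adjust that for Flycast traffic.
    server app1 <redacted>.flycast:80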

That did the trick! Thank you, @wjordan!
