Fly Prometheus not scraping custom metrics - instant queries return empty results

Problem

Fly Prometheus stopped scraping metrics from our apps approximately 2 hours ago. Custom metrics and fly_instance_up return empty results in instant queries, while the metrics endpoints are accessible and working correctly.

Environment

  • Apps affected:

    • Main NestJS app (port 3000)

    • postgres-exporter (port 9187)

Configuration

fly.toml (main app):

[[metrics]]
  port = 3000
  path = "/metrics"
  processes = ["app"]

postgres-exporter fly.toml:

[metrics]
  port = 9187
  path = "/metrics"

What works

  1. Metrics endpoints are accessible directly via HTTPS

  2. Metrics are accessible via SSH from inside the machine

  3. Health checks are passing (1 total, 1 passing)

  4. fly_edge_* metrics are available (these don’t require scraping from apps)

What doesn’t work

  1. Instant queries return empty:

    curl "https://api.fly.io/prometheus/<org>/api/v1/query" \
      --data-urlencode 'query=<custom_metric>' \
      -H "Authorization: FlyV1 $TOKEN"
    # Returns: {"data":{"result":[]}}

  2. Even fly_instance_up, a built-in Fly metric, returns empty

  3. Range queries show data stopped ~2 hours ago

Expected behavior

Prometheus should scrape the /metrics endpoints every 15 seconds, and the data should be available via API queries.

Steps already tried

  • Redeployed both apps

  • Added health checks

  • Changed [metrics] to [[metrics]] with a processes parameter (main app)

  • Verified metrics format is valid Prometheus format

Additional info: Metrics were working fine before. We have historical data in Prometheus showing metrics were being scraped until ~08:27 UTC today. After that, scraping completely stopped.

Range query confirms this:

# Last data point: 2026-01-27T08:27:00Z
# No new data since then

The same setup was working for weeks/months.
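For reference, here is a minimal sketch of a range query that would surface the cutoff. It assumes the standard Prometheus query_range path is served under the same base URL as the instant query above. The function names prom_range_url and query_range_last_hours and the 60-second step are illustrative choices, and the org slug and $TOKEN are placeholders.

```shell
# Sketch only: helper names are hypothetical, and the query_range path is
# assumed to follow the standard Prometheus HTTP API layout.

prom_range_url() {
  # $1 = org slug
  printf 'https://api.fly.io/prometheus/%s/api/v1/query_range' "$1"
}

query_range_last_hours() {
  # $1 = org slug, $2 = PromQL expression, $3 = hours to look back
  end=$(date -u +%s)
  start=$((end - $3 * 3600))
  curl -sS "$(prom_range_url "$1")" \
    --data-urlencode "query=$2" \
    --data-urlencode "start=$start" \
    --data-urlencode "end=$end" \
    --data-urlencode "step=60" \
    -H "Authorization: FlyV1 $TOKEN"
}

# Example (not run here): query_range_last_hours my-org 'fly_instance_up' 6
```

The last timestamp in the returned values array for each series is the point where scraping stopped.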

Question: Are there any cardinality limits or quotas for custom metrics? Could we have hit some limit that caused scraping to stop?

Our apps expose relatively few metrics (~50-100 unique series), so this shouldn’t be the issue, but we wanted to confirm.
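One way to check the series count ourselves would be to count series at a moment when scraping still worked. This is a sketch, assuming Fly attaches an app label to scraped series and that the instant-query endpoint accepts the standard Prometheus time parameter. The names series_count_query and count_series_at and the app name are hypothetical.

```shell
# Sketch only: assumes an `app` label on scraped series and support for the
# standard `time` parameter on /api/v1/query. Helper names are hypothetical.

series_count_query() {
  # $1 = app name; builds a PromQL expression counting all series for that app
  printf 'count({app="%s"})' "$1"
}

count_series_at() {
  # $1 = org slug, $2 = app name, $3 = RFC 3339 evaluation time
  curl -sS "https://api.fly.io/prometheus/$1/api/v1/query" \
    --data-urlencode "query=$(series_count_query "$2")" \
    --data-urlencode "time=$3" \
    -H "Authorization: FlyV1 $TOKEN"
}

# Example (not run here): count_series_at my-org my-app 2026-01-27T08:00:00Z
```

Evaluating at a time shortly before 08:27 UTC avoids the empty result that a query at the current time would give.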
