Early access: build Grafana dashboards from Fly metrics

Check out our shiny new metrics engine: Feature preview: Custom metrics

We have a chunky Prometheus (well, VictoriaMetrics) cluster with all kinds of useful application metrics. It’s what powers the graphs on our web UI, but there’s a bunch more in there.

You can use these with Grafana to make neat dashboards, alert on metrics, etc.

We’re working up a prebuilt Grafana dashboard with some interesting graphs, but if you want to try it out before we have instructions and examples, you should!

API Details

  1. The Prometheus API is available at https://api.fly.io/prometheus/api/v1/
  2. Send an Authorization: Bearer <TOKEN> header to authenticate (you can run flyctl auth token to get your token)
  3. Run some queries
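
For anyone scripting against the API instead of using Grafana, here is a minimal Python sketch of the steps above. It only builds the request and doesn't send it. The `query` path is the standard Prometheus instant-query endpoint and is an assumption here, and `build_query_request`, the `my-app` name, and the token value are placeholders for illustration.

```python
import urllib.parse
import urllib.request

# Base URL from the post. The "query" path appended below is the standard
# Prometheus instant-query endpoint; that Fly serves it is an assumption.
BASE_URL = "https://api.fly.io/prometheus/api/v1/"


def build_query_request(token: str, promql: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated instant-query request."""
    params = urllib.parse.urlencode({"query": promql})
    url = urllib.parse.urljoin(BASE_URL, "query") + "?" + params
    # The token is whatever `flyctl auth token` prints.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


if __name__ == "__main__":
    # "<TOKEN>" and "my-app" are placeholders.
    req = build_query_request("<TOKEN>", 'sum(edge_http_responses_count{app="my-app"})')
    print(req.full_url)
    # To actually run the query: urllib.request.urlopen(req).read()
```

Building the request separately from sending it makes it easy to check the URL and header before spending a real token on it.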

Grafana Cloud Setup

If you don’t have Grafana yet, the free Grafana Cloud plan will work fine for this: Grafana Cloud | Grafana Labs

Once you’re in your spiffy new Grafana instance, add a Prometheus source like so (note the URL and “Custom HTTP Headers” section):

From there, you can create a Dashboard, then add a Panel. Try a query like this:

sum(rate(edge_http_responses_count{app="<APP NAME>"}[$__interval])) by (status)

Series and Labels

We’ve exposed a bunch of different series. You will need to supply an app="<NAME>" label to each. Here’s a quick list:
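
Since every query needs the app label, a small helper can keep it from being forgotten when queries are generated from code. This is just a sketch: `with_app` is a made-up name, and `my-app` is a placeholder app.

```python
def with_app(metric: str, app: str, **labels: str) -> str:
    """Return a PromQL selector for `metric` scoped to a single app."""

    def quote(value: str) -> str:
        # Escape backslashes and double quotes for a PromQL string literal.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    # The app matcher always comes first; extra labels are optional.
    matchers = {"app": app, **labels}
    inner = ",".join(f"{name}={quote(value)}" for name, value in matchers.items())
    return f"{metric}{{{inner}}}"


if __name__ == "__main__":
    selector = with_app("edge_http_responses_count", "my-app")
    # Same shape as the example query in the Grafana setup section.
    print(f"sum(rate({selector}[5m])) by (status)")
```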

// responses from load balancer

// connections to anycast IPs

// responses from app vms using http handler

// tcp connections to app vms

// data out through load balancer

// vm memory metrics

// other vm metrics

// network interface metrics

Give it a try, let us know what you think.

Known missing pieces

We’re missing a few features to really make Grafana nice:

  1. Support for autocomplete in queries. This will be fixed before we officially™️ ship this.
  2. Querying metrics for multiple apps: right now, queries need to include an app name, so there’s no way to get metrics that combine apps.

Grafana has a neat map visualization:

Just add this: https://grafana.com/grafana/plugins/grafana-worldmap-panel

Use a query like this:

sum(rate(edge_http_responses_count{app="<NAME>"}[$__interval])) by (region)

Then set the map data JSON endpoint to https://api.fly.io/meta/regions.json:


This is amazing!


Heatmaps are pretty great for showing response times:


  1. Query:
     sum(increase(edge_http_response_time_ns_bucket{app="$app"}[$__interval])) by (le)
    • Legend: {{le}}
    • Min step: 1m
    • Format: Heatmap
    • Visualization: Heat map
  2. Axes: Y-Axis
    • Unit: seconds
    • Data format: Time series buckets
  3. Display
    • Colors: spectrum
    • Scheme: Plasma
    • Color scale min: 0
    • Show legend:
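
Some background on why the "Format: Heatmap" setting matters: Prometheus histogram `_bucket` series are cumulative, so each `le` bucket counts every observation at or below its bound. Grafana's heatmap format turns those into per-bucket counts before drawing. A rough Python sketch of that conversion, with made-up bucket bounds and counts (`per_bucket_counts` is an illustrative name, not Grafana's code):

```python
def per_bucket_counts(cumulative: dict) -> dict:
    """Convert cumulative `le` bucket counts into per-bucket counts."""
    # Sort by numeric upper bound; float("+Inf") parses to infinity.
    ordered = sorted(cumulative.items(), key=lambda item: float(item[0]))
    counts = {}
    previous = 0.0
    for le, total in ordered:
        # Each bucket's own count is its total minus the bucket below it.
        counts[le] = total - previous
        previous = total
    return counts


if __name__ == "__main__":
    # Hypothetical values for one time step, keyed by the `le` label.
    print(per_bucket_counts({"0.1": 5, "0.5": 8, "+Inf": 10}))
```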

Grafana JSON

{
  "type": "heatmap",
  "title": "Response Times",
  "gridPos": {
    "x": 9,
    "y": 14,
    "w": 9,
    "h": 9
  },
  "id": 23763571993,
  "targets": [
    {
      "expr": "sum(increase(edge_http_response_time_ns_bucket{app=\"$app\"}[$__interval])) by (le)",
      "legendFormat": "{{le}}",
      "interval": "1m",
      "refId": "A",
      "format": "heatmap"
    }
  ],
  "fieldConfig": {
    "defaults": {
      "custom": {}
    },
    "overrides": []
  },
  "pluginVersion": "7.1.5",
  "legend": {
    "show": true
  },
  "tooltip": {
    "show": true,
    "showHistogram": false
  },
  "heatmap": {},
  "cards": {
    "cardPadding": null,
    "cardRound": null
  },
  "color": {
    "mode": "spectrum",
    "cardColor": "#b4ff00",
    "colorScale": "sqrt",
    "exponent": 0.5,
    "colorScheme": "interpolatePlasma",
    "min": 0
  },
  "dataFormat": "tsbuckets",
  "yBucketBound": "middle",
  "xAxis": {
    "show": true
  },
  "yAxis": {
    "show": true,
    "format": "s",
    "decimals": 0,
    "logBase": 1,
    "splitFactor": null,
    "min": "0",
    "max": null
  },
  "highlightCards": true,
  "timeFrom": null,
  "timeShift": null,
  "reverseYBuckets": false,
  "xBucketSize": null,
  "xBucketNumber": null,
  "yBucketSize": null,
  "yBucketNumber": null,
  "hideZeroBuckets": false,
  "datasource": null
}

Hi! I’d be eager to try this out but I can’t get it to work.
https://api.fly.io/prometheus/api/v1 responds with 404.
Is this still available for us to play with?

It should be, but the endpoint should be https://api.fly.io/prometheus/ for the Grafana data sources (as per the screenshot).


That doesn’t work either :(

Ah! We’ve only implemented these two URLs so far:

Grafana “knows” about these, so when you set it up all you have to tell it is https://api.fly.io/prometheus/. We use the Ruby Prometheus client, which also just needs the base URL. But for something like cURL you’ll need a more specific endpoint.


Now it works, thanks!

Good deal! Definitely let us know if you run into any problems. This could be quite powerful and we want to make sure it’s nice and solid before we launch it.

Works fine so far, impressive work! I can’t get the worldmap working though.

The worldmap seems to be a little flaky. Here’s the panel JSON for the one I have working:

{
  "circleMaxSize": "10",
  "circleMinSize": 2,
  "colors": [
    "rgba(245, 54, 54, 0.9)",
    "rgba(237, 129, 40, 0.89)",
    "rgba(50, 172, 45, 0.97)"
  ],
  "decimals": 0,
  "esMetric": "Count",
  "fieldConfig": {
    "defaults": {
      "custom": {
        "align": null
      },
      "mappings": [],
      "thresholds": {
        "mode": "absolute",
        "steps": [
          {
            "color": "green",
            "value": null
          },
          {
            "color": "red",
            "value": 80
          }
        ]
      }
    },
    "overrides": []
  },
  "gridPos": {
    "h": 14,
    "w": 18,
    "x": 0,
    "y": 0
  },
  "hideEmpty": false,
  "hideZero": false,
  "id": 4,
  "initialZoom": 1,
  "jsonUrl": "https://api.fly.io/meta/regions.json",
  "locationData": "json endpoint",
  "mapCenter": "(0°, 0°)",
  "mapCenterLatitude": 0,
  "mapCenterLongitude": 0,
  "maxDataPoints": 1,
  "mouseWheelZoom": false,
  "pluginVersion": "7.1.5",
  "showLegend": true,
  "stickyLabels": false,
  "tableQueryOptions": {
    "geohashField": "geohash",
    "latitudeField": "latitude",
    "longitudeField": "longitude",
    "metricField": "metric",
    "queryType": "geohash"
  },
  "targets": [
    {
      "expr": "sum(rate(edge_http_responses_count{app=\"$app\"}[$__interval])) by (region)",
      "interval": "",
      "legendFormat": "{{status}}",
      "refId": "A"
    }
  ],
  "thresholds": "0,10",
  "timeFrom": null,
  "timeShift": null,
  "title": "Requests by Region",
  "type": "grafana-worldmap-panel",
  "unitPlural": "",
  "unitSingle": "",
  "valueName": "total",
  "datasource": null
}

Thanks, it helped. "legendFormat": "{{status}}" was missing for me.


Oh nice! If you end up doing other interesting stuff in Grafana, will you post about it here?

Sure. Everything looks pretty solid – maybe the negative memory consumption once today was a bit strange.

Oh that’s odd! We’re not that good at VMs yet.

What query gave you negatives on memory? I can have a look and see why it might’ve done that.

avg(firecracker_vm_memory_mem_total{app="$__app"}) - avg(firecracker_vm_memory_mem_free{app="$__app"})

Btw, would it be possible to collect app specific metrics too? Maybe as a backing service similarly to Redis?

Yes, we’d like to let people collect custom metrics. It’s a fun technical challenge because Prometheus-like databases have problems with too many different series (metric + label). High cardinality breaks things and/or makes them expensive to run.
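
To put rough numbers on the cardinality problem: the worst-case series count for one metric is the product of the distinct values of each label. A back-of-the-envelope Python sketch, where the label names and counts are invented for illustration and `series_count` is a made-up helper:

```python
from math import prod


def series_count(label_values: dict) -> int:
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(len(values) for values in label_values.values())


if __name__ == "__main__":
    # Hypothetical label sets; real counts will differ.
    labels = {"region": range(20), "status": range(50)}
    print(series_count(labels))  # 20 * 50 combinations

    # One unbounded label (say, a per-user ID) multiplies everything.
    labels["user_id"] = range(10_000)
    print(series_count(labels))
```

That multiplication is why a single free-form label on a custom metric can be much more expensive than adding a whole new metric.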

We’ve experimented with this a little: if you send a fly-cache-status: HIT header, you’ll see a chart appear on our UI. You can use any status you want there. We might expose other named metrics and let apps populate them.

The next step is to make it easy to scrape metrics from your app instances so you can turn on something like paid Grafana cloud and at least get them there!

Great, keep us informed!
Not sure what you mean by sending a fly-cache-status. Send it from an instance in an HTTP response?