Early access: build Grafana dashboard from Fly metrics

Yep! Something like response.headers.set("fly-cache-status", "HIT").

Indeed, and it fits my use case well.
I’m looking forward to metrics scraping as well.

This URL seems to be incorrect.

Currently, the only two endpoints we’ve implemented are:

If you’re using Grafana, it just wants https://api.fly.io/prometheus/.
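
For anyone else who pasted the longer URL: Grafana's Prometheus data source appends the api/v1/... path itself (you can see it in the proxied request URLs), so the data source field only wants the base. Here's a rough sketch of trimming a pasted URL down to that base. The helper name grafana_datasource_url is made up for illustration.

```python
def grafana_datasource_url(url: str) -> str:
    """Return the base URL to paste into Grafana's Prometheus data source.

    Hypothetical helper: it only trims a trailing /api/v1, because Grafana
    adds that part of the path on its own.
    """
    base = url.strip().rstrip("/")
    suffix = "/api/v1"
    if base.endswith(suffix):
        base = base[: -len(suffix)]
    # Keep a single trailing slash, matching the URL given in the thread.
    return base + "/"


if __name__ == "__main__":
    print(grafana_datasource_url("https://api.fly.io/prometheus/api/v1/"))
```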

Thanks for clearing that up.

Here’s an error I’m getting:

request:Object
url:"api/datasources/proxy/8/api/v1/query_range?query=%20sum(increase(edge_http_response_time_ns_bucket%7Bapp%3D%22%24app%22%7D%5B15s%5D))%20by%20(le)%0D&start=1603112040&end=1603133640&step=15"
method:"GET"
hideFromInspector:false
response:Object
status:"error"
errorType:"internal"
error:"unknown finding app"

Any ideas for where to debug? It said the data source connected without a problem.

Ah, you’ll need to replace $app with the name of your app in the query:

sum(increase(edge_http_response_time_ns_bucket{app="<NAME>"}[$__interval])) by (le)
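
One way to confirm this from the error output: decode the query parameter in the failing request URL and check whether a literal $app was sent. A small sketch using only the standard library; the function name decode_promql is just for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# Request URL copied from the error output earlier in the thread.
FAILING_URL = (
    "api/datasources/proxy/8/api/v1/query_range?query=%20sum(increase("
    "edge_http_response_time_ns_bucket%7Bapp%3D%22%24app%22%7D%5B15s%5D))"
    "%20by%20(le)%0D&start=1603112040&end=1603133640&step=15"
)


def decode_promql(url: str) -> str:
    """Extract and percent-decode the PromQL expression from a request URL."""
    params = parse_qs(urlsplit(url).query)
    # Strip the stray leading space and trailing carriage return.
    return params["query"][0].strip()


if __name__ == "__main__":
    expr = decode_promql(FAILING_URL)
    print(expr)
    print("unsubstituted $app:", "$app" in expr)
```

If the decoded expression still contains $app, the dashboard variable was never substituted and the API is being asked for an app literally named "$app".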

Looks like it stopped working recently? Gives empty response to me.

It is! Are you using this with Grafana? That $__interval variable is specific to Grafana; to query directly, you’d need something like 1m instead.
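
If you want to reuse a dashboard query outside Grafana, the variables have to be filled in by hand first. A minimal sketch of that substitution, assuming plain string replacement is enough for these two variables; render_query and the app name "my-app" are placeholders.

```python
# Query as it appears in a Grafana panel, with template variables.
TEMPLATE = (
    'sum(increase(edge_http_response_time_ns_bucket{app="$app"}'
    "[$__interval])) by (le)"
)


def render_query(expr: str, app: str, interval: str = "1m") -> str:
    """Replace Grafana's $__interval and $app with concrete values."""
    return expr.replace("$__interval", interval).replace("$app", app)


if __name__ == "__main__":
    # "my-app" is a placeholder; use the name of one of your own apps.
    print(render_query(TEMPLATE, app="my-app"))
```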

Sure, I’m using 1m but got empty results. It was working fine until recently.

Is this on Grafana or directly? I just tried it on several apps and I get the normal Prometheus response.

Can you try it with curl and see what you get?
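
To get a URL you can hand to curl, something like the sketch below may help. It assumes the base URL mentioned in this thread plus the standard Prometheus query_range path and parameters. It only builds the URL and makes no request, and it does not cover whatever authentication the endpoint may require. The app name "my-app" is a placeholder.

```python
from urllib.parse import urlencode, urljoin

# Assumption: base URL as given in this thread for the Grafana data source.
BASE_URL = "https://api.fly.io/prometheus/"


def query_range_url(base: str, expr: str, start: int, end: int, step: int) -> str:
    """Build a Prometheus query_range URL (start/end are Unix seconds)."""
    params = urlencode({"query": expr, "start": start, "end": end, "step": step})
    # Normalise the trailing slash so urljoin keeps the full base path.
    return urljoin(base.rstrip("/") + "/", "api/v1/query_range") + "?" + params


if __name__ == "__main__":
    expr = (
        'sum(increase(edge_http_response_time_ns_bucket{app="my-app"}[1m])) '
        "by (le)"
    )
    # Time range taken from the failing request earlier in the thread.
    print(query_range_url(BASE_URL, expr, 1603112040, 1603133640, 60))
```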

You can actually deploy Grafana to Fly now, as well. Here’s a quick demo: https://github.com/fly-examples/grafana

Seems like the World Map Grafana plugin changed and wants some extra Prometheus endpoints we haven’t implemented yet. We’ll see if we can get those in.

I would edit the first post in this thread to use that URL rather than the https://api.fly.io/prometheus/api/v1/ URL; it also tripped me up.

Try importing this as a dashboard and see if the regions show properly on the map. You’ll need to change the app name variable in the top text field to one of your apps:

{
  "__inputs": [
    {
      "name": "DS_PROMETHEUS",
      "label": "Prometheus",
      "description": "",
      "type": "datasource",
      "pluginId": "prometheus",
      "pluginName": "Prometheus"
    }
  ],
  "__requires": [
    {
      "type": "grafana",
      "id": "grafana",
      "name": "Grafana",
      "version": "7.3.1"
    },
    {
      "type": "panel",
      "id": "grafana-worldmap-panel",
      "name": "Worldmap Panel",
      "version": "0.3.2"
    },
    {
      "type": "panel",
      "id": "graph",
      "name": "Graph",
      "version": ""
    },
    {
      "type": "panel",
      "id": "heatmap",
      "name": "Heatmap",
      "version": ""
    },
    {
      "type": "datasource",
      "id": "prometheus",
      "name": "Prometheus",
      "version": "1.0.0"
    }
  ],
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "gnetId": null,
  "graphTooltip": 0,
  "id": null,
  "iteration": 1604594142522,
  "links": [],
  "panels": [
    {
      "circleMaxSize": "10",
      "circleMinSize": 2,
      "colors": [
        "rgba(245, 54, 54, 0.9)",
        "rgba(237, 129, 40, 0.89)",
        "rgba(50, 172, 45, 0.97)"
      ],
      "datasource": "${DS_PROMETHEUS}",
      "decimals": 0,
      "esMetric": "Count",
      "fieldConfig": {
        "defaults": {
          "custom": {
            "align": null
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 14,
        "w": 18,
        "x": 0,
        "y": 0
      },
      "hideEmpty": false,
      "hideZero": false,
      "id": 4,
      "initialZoom": 1,
      "jsonUrl": "https://api.fly.io/meta/regions.json",
      "locationData": "json endpoint",
      "mapCenter": "(0°, 0°)",
      "mapCenterLatitude": 0,
      "mapCenterLongitude": 0,
      "maxDataPoints": 1,
      "mouseWheelZoom": false,
      "pluginVersion": "7.1.5",
      "showLegend": true,
      "stickyLabels": false,
      "tableQueryOptions": {
        "geohashField": "geohash",
        "latitudeField": "latitude",
        "longitudeField": "longitude",
        "metricField": "metric",
        "queryType": "geohash"
      },
      "targets": [
        {
          "expr": "sum(rate(edge_http_responses_count{app=\"$app\"}[$__interval])) by (region)",
          "format": "time_series",
          "instant": true,
          "interval": "",
          "legendFormat": "{{region}}",
          "refId": "A"
        }
      ],
      "thresholds": "0,10",
      "timeFrom": null,
      "timeShift": null,
      "title": "Requests by Region",
      "type": "grafana-worldmap-panel",
      "unitPlural": "",
      "unitSingle": "",
      "valueName": "total"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_PROMETHEUS}",
      "fieldConfig": {
        "defaults": {
          "custom": {}
        },
        "overrides": []
      },
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 9,
        "w": 9,
        "x": 0,
        "y": 14
      },
      "hiddenSeries": false,
      "id": 2,
      "legend": {
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "7.3.1",
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(rate(edge_http_responses_count{app=\"$app\"}[$__interval])) by (status)",
          "interval": "",
          "legendFormat": "{{status}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "HTTP Response Codes",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "cards": {
        "cardPadding": null,
        "cardRound": null
      },
      "color": {
        "cardColor": "#b4ff00",
        "colorScale": "sqrt",
        "colorScheme": "interpolatePlasma",
        "exponent": 0.5,
        "min": 0,
        "mode": "spectrum"
      },
      "dataFormat": "tsbuckets",
      "datasource": "${DS_PROMETHEUS}",
      "fieldConfig": {
        "defaults": {
          "custom": {}
        },
        "overrides": []
      },
      "gridPos": {
        "h": 9,
        "w": 9,
        "x": 9,
        "y": 14
      },
      "heatmap": {},
      "hideZeroBuckets": false,
      "highlightCards": true,
      "id": 9,
      "legend": {
        "show": true
      },
      "pluginVersion": "7.1.5",
      "reverseYBuckets": false,
      "targets": [
        {
          "expr": "sum(increase(edge_http_response_time_ns_bucket{app=\"$app\"}[$__interval])) by (le)",
          "format": "heatmap",
          "interval": "1m",
          "legendFormat": "{{le}}",
          "refId": "A"
        }
      ],
      "timeFrom": null,
      "timeShift": null,
      "title": "Response Times",
      "tooltip": {
        "show": true,
        "showHistogram": false
      },
      "type": "heatmap",
      "xAxis": {
        "show": true
      },
      "xBucketNumber": null,
      "xBucketSize": null,
      "yAxis": {
        "decimals": 0,
        "format": "ns",
        "logBase": 1,
        "max": null,
        "min": "0",
        "show": true,
        "splitFactor": null
      },
      "yBucketBound": "middle",
      "yBucketNumber": null,
      "yBucketSize": null
    }
  ],
  "refresh": "30s",
  "schemaVersion": 26,
  "style": "dark",
  "tags": [],
  "templating": {
    "list": [
      {
        "current": {
          "selected": false,
          "text": "app-name",
          "value": "app-name"
        },
        "error": null,
        "hide": 0,
        "label": "App Name",
        "name": "app",
        "options": [
          {
            "selected": true,
            "text": "app-name",
            "value": "app-name"
          }
        ],
        "query": "app-name",
        "skipUrlSync": false,
        "type": "textbox"
      }
    ]
  },
  "time": {
    "from": "now-1h",
    "to": "now"
  },
  "timepicker": {
    "refresh_intervals": [
      "5s",
      "10s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h",
      "2h",
      "1d"
    ]
  },
  "timezone": "",
  "title": "Fly App",
  "uid": "snSx6_vGk",
  "version": 4
}
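
If you'd rather set the app name before importing than type it into the text field afterwards, here is a rough sketch that patches the "app" textbox variable in the exported JSON. It assumes the templating layout shown in the dashboard above. The function name and the sample app name are made up.

```python
import copy


def set_app_variable(dashboard: dict, app_name: str) -> dict:
    """Return a copy of the dashboard with the 'app' variable set to app_name."""
    updated = copy.deepcopy(dashboard)
    for var in updated.get("templating", {}).get("list", []):
        if var.get("name") != "app":
            continue
        # Mirror the three places the export stores the textbox value.
        var["current"] = {"selected": False, "text": app_name, "value": app_name}
        var["options"] = [{"selected": True, "text": app_name, "value": app_name}]
        var["query"] = app_name
    return updated


if __name__ == "__main__":
    # Trimmed stand-in for the full export; load the real file with json.load.
    sample = {
        "templating": {
            "list": [
                {
                    "name": "app",
                    "type": "textbox",
                    "query": "app-name",
                    "current": {"selected": False, "text": "app-name", "value": "app-name"},
                    "options": [{"selected": True, "text": "app-name", "value": "app-name"}],
                }
            ]
        }
    }
    patched = set_app_variable(sample, "my-app")
    print(patched["templating"]["list"][0]["current"]["value"])
```

The original dict is left untouched, so you can generate one dashboard per app from the same export.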

Thanks @kurt, it works. Not sure what makes the difference, though.

I think the label: {{region}} was wrong and an update to the map plugin started using it.

It works now using {{region}} instead of {{status}}.

How’s this going for you all? We have only soft-launched this while we continue to work on metrics, so it’s pretty easy to add or tweak things over the next few weeks if you have requests.

It’s going great! One metric that would be good to have, though, is how many VMs exist at any point in time and in which regions, so we can see the history of where the VMs were.

We’ve switched from nanoseconds (u64) to seconds (f64). The main post has been edited to reflect this.

Please adjust your dashboards!
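
For dashboards exported as JSON, the adjustment could be scripted roughly like this. It's a sketch under assumptions: the new metric name is not given in this thread, so it is passed in by the caller (the one in the demo is a made-up placeholder; check the main post for the real name). The only other change is switching the heatmap's y-axis unit from Grafana's "ns" to "s".

```python
import copy


def switch_to_seconds(dashboard: dict, old_metric: str, new_metric: str) -> dict:
    """Return a copy with old_metric renamed and matching 'ns' axes set to 's'."""
    updated = copy.deepcopy(dashboard)
    for panel in updated.get("panels", []):
        touched = False
        for target in panel.get("targets", []):
            expr = target.get("expr", "")
            if old_metric in expr:
                target["expr"] = expr.replace(old_metric, new_metric)
                touched = True
        # Only change the unit on panels whose queries were actually renamed.
        y_axis = panel.get("yAxis")
        if touched and isinstance(y_axis, dict) and y_axis.get("format") == "ns":
            y_axis["format"] = "s"
    return updated


if __name__ == "__main__":
    # Trimmed stand-in for the heatmap panel in the dashboard shared above.
    sample = {
        "panels": [
            {
                "type": "heatmap",
                "targets": [
                    {"expr": 'sum(increase(edge_http_response_time_ns_bucket{app="$app"}[$__interval])) by (le)'}
                ],
                "yAxis": {"format": "ns"},
            }
        ]
    }
    # Placeholder new name, NOT taken from the thread.
    out = switch_to_seconds(
        sample, "edge_http_response_time_ns_bucket", "example_response_time_seconds_bucket"
    )
    print(out["panels"][0]["targets"][0]["expr"])
    print(out["panels"][0]["yAxis"]["format"])
```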