Feature preview: Custom metrics

jerome · April 21, 2021, 5:13pm

We’ve just “soft” launched our new custom metrics feature.

Here are some docs if you want to get all technical right away.

In essence: we’re starting to collect and store custom prometheus metrics for apps hosted on Fly.

Inserting metrics

Instrument your app with a prometheus library
Expose metrics (make sure it binds to 0.0.0.0)
Configure your fly.toml:

[metrics]
port = 9091
path = "/metrics"

We’ll pull from your instances a few times per minute.
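For anyone who wants to see the moving parts without a client library, here is a minimal standard-library sketch of an endpoint matching the config above. The metric name myapp_requests_total and the helpers render_metrics and make_server are made up for illustration; a real app would normally use a Prometheus client library instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counter, standing in for a real client library.
REQUESTS_TOTAL = {"GET": 0, "POST": 0}


def render_metrics(counts):
    """Render one counter family in the Prometheus text exposition format."""
    lines = [
        "# HELP myapp_requests_total Requests handled, by method.",
        "# TYPE myapp_requests_total counter",
    ]
    for method, value in sorted(counts.items()):
        lines.append(f'myapp_requests_total{{method="{method}"}} {value}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve only the path configured under [metrics] in fly.toml.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUESTS_TOTAL).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=9091):
    # Bind to 0.0.0.0 rather than 127.0.0.1, as the steps above require.
    return HTTPServer(("0.0.0.0", port), MetricsHandler)
```

Calling make_server().serve_forever() would start serving on port 9091.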

Querying metrics

With this new feature, we’re slowly deprecating our current prometheus API and introducing a new one with a few differences:

Base URL: https://api.fly.io/prometheus/{org_slug}
- Find your {org_slug} by listing orgs with flyctl orgs list
Headers: Authorization: Bearer {token}
- Find {token} with flyctl auth token
All apps for the same organization are accessible from the same endpoint
Custom app metrics and Fly metrics (proxy, instance, volumes, prefixed with fly_) are all available through the same endpoint.
Your own metrics are not namespaced, but we will automatically add the app, instance, host and region labels.
Full list of Fly metrics available in the docs
pg_ series are not prefixed with fly_ because they work just like custom metrics (the Postgres app’s fly.toml is configured for it).

This new API works just like the prometheus API

curl "https://api.fly.io/prometheus/{org_slug}/api/v1/query_range?step=30" \
	--data-urlencode 'query=sum(rate(fly_edge_http_responses_count{app="{app}"}[5m])) by (status)' \
	-H "Authorization: Bearer {token}"
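The same request can be built from a script. This is a rough standard-library sketch, not an official client: build_query_range is a hypothetical helper, and it assumes the PromQL expression is sent as a form-encoded query parameter, as in the standard Prometheus HTTP API.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_query_range(org_slug, token, promql, step=30):
    """Build (but do not send) a query_range request for the org's endpoint."""
    url = f"https://api.fly.io/prometheus/{org_slug}/api/v1/query_range?step={step}"
    # Passing a body makes this a POST, like curl --data-urlencode does.
    data = urlencode({"query": promql}).encode("ascii")
    return Request(url, data=data, headers={"Authorization": f"Bearer {token}"})
```

Passing the result to urllib.request.urlopen should send it, and the response body can then be decoded with json.load.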

A grafana setup looks like this:

“Normal” grafana features for discovering labels and series should work as-is. The old API did not support that.

Pricing

We’re still working on this. Keeping your series count under 10,000 should always be free.

You’ve probably noted that we’re including our own fly-specific metrics in there too. We’re hoping they don’t account for too many series! However, with the number of regions we have and if you have a lot of instances in different regions, that could grow fast. We’ll decide on pricing and limits based on what we observe here.

During this “beta” phase, we won’t be enforcing any specific limits, but we will be on the lookout for excessive usage.

rugwiro · April 21, 2021, 5:44pm

Wow, this came in faster than I expected. I will deploy a couple of my apps tomorrow to test; I just need to modify them a little. The prometheus endpoint is currently exposed on the same port as the main app API.

jerome · April 21, 2021, 5:53pm

That works too. Set the port to the same internal port for your app and it should be fine.

arnodirlam · May 2, 2021, 6:38pm

That’s amazing!

One step further that would be fantastic is to ship logs to the hosted Grafana instance using Loki. Seeing that you’re already using Vector for log collection (nice blog post btw), which supports Loki as a sink, this would enable inspecting metrics and logs combined. Maybe it could be opt-in for using it instead of Elasticsearch, if that helps.

jerome · May 3, 2021, 12:19pm

We’re working on this, yes. Vector is great, but it’s not realistic to configure it with, potentially, thousands of log sinks.

We’re going to be using a different model for this, but it will allow us to send app logs to whatever service, eventually.

charsleysa · May 4, 2021, 2:11am

What’s the difference between host and instance?
Is host the physical server while instance is the vm?

michael · May 4, 2021, 2:55am

That’s right

johan · July 29, 2021, 2:30am

Nice work. Got my Grafana dashboard up and humming along nicely.

rhodee · July 29, 2021, 4:42am

I’ve updated a metric to add a label. That metric no longer shows up. Is it possible to relabel metrics through Fly?

jerome · July 29, 2021, 11:44am

How do you mean they don’t show up anymore?

Can you give us an example of the metric and the query you’re using?

rhodee · July 29, 2021, 6:47pm

Yes. I’ve got a metric create_offer_total. It is a counter which prior to yesterday did not have a label. When I added a label, it disappeared and was no longer visible in the Grafana metric browser. I believe once a metric is created it cannot be modified.

This post appears to confirm this.

I just created a new metric, exactly as the original one, with a label.

johan · August 3, 2021, 5:14pm

@jerome I have the default setup and Grafana dashboard working great. I’m now trying to expose Apache metrics and set up a Grafana dashboard for monitoring it. I think I have everything running but I’m struggling with the dashboard part.

I’m using the Apache Exporter for Prometheus, using:

[metrics]
  port = 9117
  path = "/metrics"

This seems to be working fine:

curl "0.0.0.0:9117/metrics" shows the metrics, and apache_up is 1, which indicates that it’s connected.
curl "https://api.fly.io/prometheus/joomlatools/api/v1/series?match%5B%5D=apache_up" -H 'Authorization: Bearer {token}' returns the following:

{"status":"success","isPartial":false,"data":[{"__name__":"apache_up","instance":"xxxx","host":"xxx","app":"xxx","region":"xxx"}]}

So it seems the data is being pulled in.
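A quick way to check which labels were attached to a series is to parse that response. This is only a sketch: label_sets and added_labels are hypothetical helpers, and the label names are the four the announcement says are added automatically.

```python
import json


def label_sets(body):
    """Return the list of label sets from a /api/v1/series response body."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload}")
    return payload["data"]


def added_labels(series):
    """Pick out the labels the announcement says are added automatically."""
    fly_labels = ("app", "instance", "host", "region")
    return {name: series[name] for name in fly_labels if name in series}
```

Running added_labels over each entry from label_sets shows the actual instance value, which is useful when a dashboard expects a different format for that label.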

Trying to get the data to show in Grafana using: Apache dashboard for Grafana | Grafana Labs but that’s not working.

Am I right in thinking that default dashboards will not just work and need to be modified to handle the extra labels (region, host, app) that you are adding? Or is there anything else I am missing?

Thanks for the help!

jerome · August 3, 2021, 5:29pm

Looking at the dashboard’s JSON quickly, it seems like it expects the instance label to be in the form of $host:$port. Our instance label is just your instance ID. So you’ll probably have to modify all the queries and even the dashboard variables.

johan · August 3, 2021, 5:35pm

Thanks that helps to put me on the right track.

johan · August 4, 2021, 10:12am

@jerome One more question for you: how would I handle 2 metric exporters? I now have both the Apache and the PHP-FPM exporter running. One uses port 9117, the other port 9253, and it seems that fly.toml only accepts a single [metrics] block?

steveberryman · August 4, 2021, 10:35am

It adds (further) complexity, but probably the best option is to run a third process to merge the exporters’ metrics. Vector, for example, could ingest metrics from both exporters as a prometheus scrape source, and then have the prometheus exporter sink configured to output the merged metrics. You’d then use this URL/port in the metrics block.
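To illustrate the merging idea only (this is not how Vector works internally), here is a small standard-library sketch. The scrape and merge_expositions helpers are hypothetical, and it assumes the two exporters emit disjoint metric names, so their text output can simply be concatenated.

```python
from urllib.request import urlopen


def scrape(url, timeout=5):
    """Fetch one exporter's text exposition."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def merge_expositions(texts):
    """Concatenate expositions, dropping blank lines and repeated comment lines.

    Assumes the inputs do not share metric names; families that appear in
    more than one input would need to be regrouped, which this skips.
    """
    seen_comments = set()
    merged = []
    for text in texts:
        for line in text.splitlines():
            if not line.strip():
                continue
            if line.startswith("#"):
                if line in seen_comments:
                    continue
                seen_comments.add(line)
            merged.append(line)
    return "\n".join(merged) + "\n"
```

The merged text would then be served on a single port, for example from scrape("http://127.0.0.1:9117/metrics") and scrape("http://127.0.0.1:9253/metrics"), assuming both exporters use the /metrics path.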

johan · August 6, 2021, 12:15am

Thanks @steveberryman! I found: GitHub - rebuy-de/exporter-merger: Merges Prometheus metrics from multiple sources and that seems to work great. Been able to merge both apache and php-fpm and output them.

Vector would probably be the better choice, but the merger binary is a little smaller and I’m trying to keep the size of the VM down.

FrequentFlyer · January 20, 2022, 4:37am

Resurrecting this thread…

Any thoughts on using ChaosSearch as the back-end for logs/metrics?

johan · May 5, 2022, 3:00pm

@jerome Picking up my work on Grafana and Prometheus where I left off last year.

Quick question: I’m running 2 apps with volumes attached but don’t seem to be able to find fly_volume_size_bytes as documented here: Metrics on Fly. Any ideas?

jerome · May 5, 2022, 3:48pm

This seems related to a bug that’s been happening for a while. Sounds like you might be the only user of this metric!

I’m working on a fix, it might take a little bit.
