Authenticated requests for health_checks and Prometheus Metric Scraping?

I’m excited to see that I can add metrics scraping to my service if it exposes a Prometheus-compatible endpoint.

However, reading the documentation (Metrics on Fly · Fly Docs), it doesn’t seem like it’s possible to supply some sort of authentication header secret so that only Fly is able to scrape my metrics, and not any internet accessible computer.

I’m using Spring Boot, so I have the option of moving these endpoints to a separate management port and firewalling it off. However, that seems like it would conflict with the HTTP health check: Spring Boot Actuator’s health endpoint would also run on the management port, and the health check port does not seem to be configurable in Fly.

Ideally, I’d like either to make it extremely difficult for someone to scrape my metrics from an Internet-accessible computer (via some shared-secret capability), or to firewall off some endpoints entirely so that only Fly can access them.

Is this possible with existing features that I’m just not seeing?

Thanks.

Aren’t Fly.io docs just the best? :wink:

If you read between the lines of the blog posts on Fly Metrics (which is where Fly engs write docs), you’d see that if you declare a [metrics] port that is not exposed to fly-proxy (via the [[services]] internal_port section of the app’s fly.toml), the corresponding Prometheus path isn’t going to be reachable from the Internet. See:

You can export stats on the default Go HTTP handler (they’ll be exposed) or on a private handler on a different port or address; either way, you tell us about your metrics in fly.toml

From: Hooking up Fly Metrics.

In another blog post, Fly engs go:

In each Firecracker instance, we run our custom init, which launches the user’s application. Our Nomad driver, our init, and Firecracker conspire to establish a vsock — a host Unix domain socket that presents as a synthetic virtio device in the guest — that allows init to communicate with the host; we bundle node-exporter-type JSON stats over the vsock for Nomad to collect and relay to Vicky [the TSDB that stores metrics].

The Firecracker… VMs are all IP-addressable. If you like (and you should!), you can expose a Prometheus exporter in your app, and then tell us about it in your fly.toml

So, there is apparently a lot of Fly-specific plumbing (which only Fly’s code can do) going on before app metrics can be scraped. And so, I don’t think there’s really a need for auth.

That said, I’m not privy to the exact Fly Metrics architecture or how it has evolved (since the blog posts were written) to know for sure. Prudent to wait for Fly engs to blog to confirm this one way or the other…

@ignoramous is correct, you can publish metrics on any not-exposed port and they’ll remain private, so there’s no need to add authentication to your metrics endpoint as long as it’s on a different port. This is what I’d recommend, but it sounds like this setup is difficult to configure with Spring Boot.
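As a rough sketch of that layout in fly.toml: the public service listens on one port, and the metrics endpoint lives on a second port that appears only under [metrics]. The port numbers (8080, 9091) and the /actuator/prometheus path are assumptions chosen for a Spring Boot app whose Actuator endpoints have been moved to a separate management port (for example via the management.server.port property); adjust them to whatever your app actually uses.

```toml
# fly.toml (illustrative sketch; ports and path are assumptions)

# Public service: fly-proxy forwards Internet traffic to this port only.
[[services]]
  internal_port = 8080
  protocol = "tcp"

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

# Private metrics endpoint: port 9091 is not listed under any [[services]]
# section, so it is not reachable through fly-proxy.
[metrics]
  port = 9091
  path = "/actuator/prometheus"
```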

Adding authentication to the custom metrics feature is something we do hope to add soon; supporting authenticated metrics endpoints on the same port as exposed services is an important use case that’s not uncommon.

In the meantime, you can build a workaround with a local background process that scrapes your internal metrics endpoint with auth, then exports unauthenticated metrics to a private port configured with [metrics].

As an example workaround using Telegraf’s Prometheus input plugin, the configuration file would look something like this (where METRICS_URL is set to something like http://localhost:[port]/metrics):

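# Scrape the app's authenticated metrics endpoint.
[[inputs.prometheus]]
  urls = [ "$METRICS_URL" ]
  bearer_token_string = "$METRICS_TOKEN"

# Re-export the scraped metrics without authentication.
[[outputs.prometheus_client]]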
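
To complete the picture, the [metrics] section of fly.toml would then point at Telegraf’s exporter rather than at the app itself. A minimal sketch, assuming the prometheus_client output is left on its defaults (listening on port 9273 and serving /metrics) and that port 9273 is not listed under any [[services]] section:

```toml
# fly.toml (illustrative sketch; assumes Telegraf's default exporter settings)
[metrics]
  port = 9273
  path = "/metrics"
```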

Hope this is helpful!

Thanks for the detailed responses.

I don’t think I was super clear in my question though (or perhaps some of the nuance was just missed).

So I want both the metrics and health endpoints firewalled off or protected.

From what I can tell, I can easily set a port for metrics which isn’t exposed externally; that’s fine. However, if I do that, I would lose the ability to have HTTP health checks, as the documentation (App Configuration (fly.toml) · Fly Docs) doesn’t seem to permit changing the port. This honestly makes sense from an implementation standpoint, because you typically want to check health by accessing the port the users would actually access. At the same time, it means I have to be exceptionally careful from a security standpoint not to put anything sensitive in the HTTP response of the health check endpoint.

Does that make a bit more sense? So really I love the Fly security model of having everything firewalled off by default, but I would want to have all health checking/metric capabilities firewalled off.

Are your metrics and health endpoints not separate? Or, do you mean, you want to http health-check the (private) metrics endpoint?

If the latter, then one can run local checks (custom, built-in), if that makes sense, for (private) ports not exposed via fly-proxy.

Sorry about the slow response, I didn’t get a notification of your reply.

My metrics and health endpoints are separate. The problem is they’re either going to both be exposed on the same web port, or they’ll both be exposed on an internal only port.

If I expose them both on the web port, then I have no way of protecting both of them at the moment.

If I expose them both on the internal only port, then I lose the HTTP health checks unless I drop down to writing the script check that you alluded to.

Or am I misreading the docs?

You can set a different port than your internal_port for the health checks.

For example:

[[services.http_checks]]
    port = 11111
    interval = 10000
    grace_period = "5s"
    method = "get"
    path = "/"
    protocol = "http"

This will not expose the service over the internet.

Hope this helps!

I actually just noticed there’s a separate checks section (unsure if this existed when I originally asked this question), and it covers precisely what I wanted.
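
For readers who haven’t used it, a top-level checks section in fly.toml might look roughly like the following. This is an illustrative sketch, not the poster’s actual configuration: the check name (actuator_health), port 9091, the /actuator/health path, and the timing values are all assumptions for a Spring Boot app with a private management port.

```toml
# fly.toml (illustrative sketch; name, port, path and timings are assumptions)
[checks]
  [checks.actuator_health]
    type = "http"
    port = 9091                 # private management port, not under [[services]]
    method = "get"
    path = "/actuator/health"
    interval = "15s"
    timeout = "10s"
    grace_period = "30s"
```

Because the check targets a port that is not listed under [[services]], the health endpoint can stay off the public port alongside the metrics endpoint.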
