Does Prometheus keep scraping after a kill signal? How to avoid losing custom metrics during shutdown?

sean-ahn · May 23, 2025, 5:25am

Hi all,

I’m running a service on Fly.io with custom Prometheus metrics (exposed via :9091/metrics). I have a question about how Fly handles shutdown and how it affects Prometheus scraping:

When Fly sends a kill signal (SIGINT) to a Machine during a deploy or scale-down event, does Prometheus still attempt to scrape metrics after that point? Or is the instance removed from routing immediately, making it unreachable?

If it’s the latter (no more scrapes after the kill signal), what’s the recommended way to avoid losing in-flight custom metrics during shutdown? For example, should I delay the shutdown with a setTimeout to give Prometheus time to perform a final scrape? Is there a best practice for this on Fly?
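For concreteness, the setTimeout idea could be sketched roughly as below. This is only an illustration under stated assumptions, not a confirmed Fly best practice: the helper names `drainDelayMs` and `onKillSignal` are hypothetical, and the 15-second scrape interval and 5-second kill timeout in the usage comment are assumed values that should be checked against your own configuration. The delay only helps if the scraper can still reach the Machine after the signal, which is exactly the open question here.

```typescript
// Hypothetical helper: how long to keep serving /metrics after the kill signal.
// Waits up to one scrape interval, but never longer than the kill timeout
// minus a safety margin, since the process is force-killed after that.
function drainDelayMs(
  scrapeIntervalMs: number,
  killTimeoutMs: number,
  safetyMs: number = 500,
): number {
  const budget = Math.max(0, killTimeoutMs - safetyMs);
  return Math.min(scrapeIntervalMs, budget);
}

// Hypothetical helper: builds a signal handler that delays `stop` by `delayMs`.
// `schedule` defaults to setTimeout and is injectable so the logic can be tested.
function onKillSignal(
  stop: () => void,
  delayMs: number,
  schedule: (fn: () => void, ms: number) => unknown = setTimeout,
): () => void {
  let draining = false;
  return () => {
    if (draining) return; // ignore repeated signals while already draining
    draining = true;
    schedule(stop, delayMs);
  };
}

// Usage in a Node service (assumed values: 15 s scrape interval, 5 s kill timeout):
// process.on(
//   "SIGINT",
//   onKillSignal(() => server.close(() => process.exit(0)), drainDelayMs(15_000, 5_000)),
// );
```

With the assumed values, the delay is capped by the kill timeout rather than the scrape interval, so a final scrape is not guaranteed unless the kill timeout is raised above the scrape interval.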

Thanks in advance!

mayailurus · May 23, 2025, 5:27pm

Hi… Last I heard, logs and metrics were in a kind of limbo state, and I would avoid relying on any specific behavior there until that fog has fully cleared up.

(For logs, there’s a branch in the flyctl repository which seems to revolve around them being stored on S3 or Tigris—which would be a neat compromise! Not all branches get merged into the final product, though.)

