Hi all,
I’m running a service on Fly.io with custom Prometheus metrics, exposed on port 9091 at /metrics. I have a question about how Fly handles shutdown and how that affects Prometheus scraping:
When Fly sends a kill signal (SIGINT) to a Machine during a deploy or scale-down event, does Prometheus still attempt to scrape metrics after that point? Or is the instance removed from routing immediately, making it unreachable?
If it’s the latter (no more scrapes after the kill signal), what’s the recommended way to avoid losing custom metric values recorded since the last scrape? For example, should I delay the shutdown with a setTimeout to give Prometheus time to perform a final scrape? Is there a best practice for this on Fly?
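To make the setTimeout idea concrete, here is a minimal sketch of what I have in mind, assuming a Node.js process. The 15-second scrape interval and the 30-second kill timeout are placeholder assumptions on my part, not values I have confirmed for Fly, and `shutdownDelayMs` and `onKillSignal` are just names I made up for illustration.

```javascript
// Sketch only: delay exit after the kill signal so one more scrape can land.
// Both constants are assumptions, not confirmed Fly values.
const SCRAPE_INTERVAL_MS = 15000; // assumed Prometheus scrape interval
const KILL_TIMEOUT_MS = 30000;    // assumed time before the Machine is force-killed

// Wait one scrape interval plus a margin, but never longer than
// the kill timeout minus the same margin.
function shutdownDelayMs(scrapeIntervalMs, killTimeoutMs, marginMs = 2000) {
  const wanted = scrapeIntervalMs + marginMs;
  const ceiling = Math.max(0, killTimeoutMs - marginMs);
  return Math.min(wanted, ceiling);
}

let shuttingDown = false;

// Returns the pending timer, or null if shutdown was already started.
function onKillSignal(exit = process.exit) {
  if (shuttingDown) return null;
  shuttingDown = true;
  const delay = shutdownDelayMs(SCRAPE_INTERVAL_MS, KILL_TIMEOUT_MS);
  // The metrics server on :9091 keeps serving during this window.
  return setTimeout(() => exit(0), delay);
}

process.on("SIGINT", () => onKillSignal());
```

With these placeholder numbers the process would stay up for 17 seconds after SIGINT. If the kill timeout is shorter than the scrape interval, the delay gets clamped and a final scrape is not guaranteed, which is part of why I'm asking whether this approach makes sense at all.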
Thanks in advance!