Questions about running FaaS

Hi there, I’ve come across this article in the docs:

What would be a good way to track usage of these machines, though? I think that just “logging” the start time is insufficient, because the task the machine runs exits automatically when it becomes idle, so I have no way of knowing when that happens.

I’ve thought of some kind of heartbeat tracking that checks via the Fly Machines API every second which machines are running and increments the related usage key in my Redis database by one. That adds up to a lot of API requests, though, so I don’t know whether it is actually recommended.

Basically, I want a Redis database to track the usage of every machine (each of which is tied to a user), and every user has a certain amount of usage included every month. I’m fairly certain that I’d have to implement this usage tracking myself; I’m just asking here what I should do to keep track of which machines are running.

The second question I have: Is there a way to set a max TTL for a given machine upon launch via the Machines API?

Thanks in advance to everyone!

Tracking machine usage should be easy: just periodically ingest the created/updated data for each machine ID until you detect that the machine has stopped. You can use Get Machine via REST for this. You’ll thus need to store the machine ID for every machine that your users create.
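As a rough sketch of that idea, here is how a lifetime could be derived from one Get Machine response in Python. The field names `state`, `created_at` and `updated_at`, and the state values checked, are assumptions about the response shape, and the calculation only holds if the machine ran once, continuously, from creation until its last update.

```python
from datetime import datetime
from typing import Optional


def parse_ts(value: str) -> datetime:
    # Replace a trailing "Z" so older Python versions can parse the timestamp.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def lifetime_seconds(machine: dict) -> Optional[float]:
    """Return the created-to-updated lifetime of a stopped machine, else None.

    `machine` is the decoded JSON of a Get Machine response (assumed shape).
    """
    if machine.get("state") not in ("stopped", "destroyed"):
        return None  # still running (or in another state): nothing to record yet
    created = parse_ts(machine["created_at"])
    updated = parse_ts(machine["updated_at"])
    return (updated - created).total_seconds()
```

A machine that is restarted several times would need per-run accounting instead, so treat this as a starting point only.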

Sure, so you are basically saying I should query the Fly API every second like a “heartbeat”, to see if the machine is still running?

Or do you mean that I should actually use the “started” and “stopped” events of the Machines instead?

Please don’t do that, it doesn’t scale :)

The best way to calculate uptime for your Machines is to use the Prometheus metrics. You can check the fly_instance_up metric for this; see “Metrics on Fly.io” in the Fly Docs.

No, there would be no need. You will already have a database record of machines that you believe are running, so every hour read the status of them, until you find they are stopped, at which point you can update your database with their new status.

For each machine you newly discover as stopped, use the API I suggested to get a lifetime in seconds.
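A minimal Python sketch of that hourly pass might look like the following. The `records` dict stands in for the database, and `fetch_machine` is a hypothetical callable that wraps the Get Machine request and returns the decoded response; the `state` field and its values are assumptions about that response.

```python
from typing import Callable, Dict, List


def reconcile(records: Dict[str, str],
              fetch_machine: Callable[[str], dict]) -> List[str]:
    """Re-check machines recorded as running; return IDs newly found stopped.

    `records` maps machine ID -> last known status and is updated in place.
    """
    newly_stopped = []
    for machine_id, status in records.items():
        if status != "running":
            continue  # already settled, no API call needed
        state = fetch_machine(machine_id).get("state")
        if state in ("stopped", "destroyed"):
            records[machine_id] = "stopped"
            newly_stopped.append(machine_id)
    return newly_stopped
```

Running this from an hourly scheduler means one request per machine still believed to be running, rather than one request per machine per second.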

Or listen to Lillian’s advice; after all, they work for Fly! :zany_face:

First of all, thank you both for your replies!

Sounds reasonable, thank you! Does Fly also bill for Machines based on this metric? I want a 1:1 relationship between the seconds I bill and the seconds I get billed, just for the sake of transparency.

Just to note:
The Fly Prometheus Metrics seem to have a scraping interval of 15s. How should I treat that inaccuracy within my service? Is there a way to get the uptime more accurately?
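To put a number on that inaccuracy, here is a small Python sketch of the arithmetic, not an official method. It assumes the metric has already been exported as a list of 0/1 samples taken every `interval_s` seconds, and that the window fully contains every run. A run of k consecutive "up" samples then corresponds to a true uptime somewhere between (k-1) and (k+1) intervals, so each run adds up to one interval of error in either direction.

```python
from typing import List, Tuple


def uptime_bounds(samples: List[int], interval_s: int = 15) -> Tuple[int, int, int]:
    """Return (lower, estimate, upper) uptime in seconds from 0/1 samples."""
    up_samples = sum(samples)
    # A run starts wherever a 1 is not preceded by another 1.
    runs = sum(
        1 for i, value in enumerate(samples)
        if value == 1 and (i == 0 or samples[i - 1] == 0)
    )
    estimate = up_samples * interval_s
    lower = max(0, estimate - runs * interval_s)
    upper = estimate + runs * interval_s
    return lower, estimate, upper
```

For example, `uptime_bounds([0, 1, 1, 1, 0, 0, 1, 0])` gives `(30, 60, 90)`. Runs shorter than one interval can fall between two samples and be missed entirely, so per-second accuracy is out of reach with sampled data alone.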

Just dumping a thought here.

Fly.io itself has to have some kind of method to determine machine uptime, right?
After all, Fly needs to calculate usage in seconds to bill me the right amount for machine use at the end of the month.
Could there be a way for me to access just that “uptime number” for a specific machine? With the new usage insights, the data has to be stored somewhere, at least.

I still haven’t found a way to more correctly determine a specific machine’s total uptime.

I might go back to the “heartbeat” idea, just scaling it down to something like every 10 seconds and using the machine events like created, started, suspended, destroyed and so on that I can get from the Fly Machines API.
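For the event-based part, a rough Python sketch of summing uptime from a machine's events could look like this. The event shape (a `status` string plus a millisecond `timestamp`) and the status names are assumptions, and the result is only as complete as the event list itself.

```python
from typing import Iterable, Optional

UP_STATUSES = {"started"}
DOWN_STATUSES = {"stopped", "suspended", "destroyed"}


def uptime_from_events(events: Iterable[dict],
                       now_ms: Optional[int] = None) -> float:
    """Sum the seconds between each "started" event and the next down event.

    If the machine is still up and `now_ms` is given, the open run is
    counted up to `now_ms`.
    """
    total_ms = 0
    started_at = None
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if event["status"] in UP_STATUSES and started_at is None:
            started_at = event["timestamp"]
        elif event["status"] in DOWN_STATUSES and started_at is not None:
            total_ms += event["timestamp"] - started_at
            started_at = None
    if started_at is not None and now_ms is not None:
        total_ms += now_ms - started_at
    return total_ms / 1000.0
```

With timestamps taken from the events themselves, the polling interval only affects how quickly a stop is noticed, not the uptime that gets computed.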
