Billed high amounts for auto start/stop machines

Biswas · September 21, 2023, 5:18pm

I’ve been migrating a high-traffic service from AWS Lambda to fly.io and have been running performance tests.

Each request is handled in a single container, so I spun up ~440 machines of shared-cpu-4x type, with auto_start/stop_machines=true, hard/soft_limits=1 in fly.toml.

I was under the impression that I was only billed for the period during which the VM runs. However, after running a 20-minute performance test that ramped the load from 10 to 430 concurrent requests, I ended up with an unexpectedly large bill of $77.

I believe I’m being billed for suspended VMs as well, or VMs aren’t actually marked as idle once they’ve been provisioned or they’ve handled a request. However, since the usage data on the dashboard isn’t granular enough, I have no way to confirm that this is indeed the case, or if there is another billing issue.
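To make the question concrete, here is a rough back-of-the-envelope sketch of the comparison being asked about. It assumes the ramp was linear and that exactly one machine runs per in-flight request (the hard/soft limit of 1 above). `HYPOTHETICAL_RATE` is a placeholder value, not a published price, so substitute the real per-machine rate before drawing conclusions.

```python
def ramp_machine_hours(start: int, end: int, minutes: float) -> float:
    """Machine-hours used if one machine runs per in-flight request
    during a linear ramp from `start` to `end` concurrent requests."""
    average_concurrency = (start + end) / 2
    return average_concurrency * minutes / 60


def machine_hours_for_bill(bill_usd: float, hourly_rate_usd: float) -> float:
    """Machine-hours that a bill implies at a given per-machine hourly rate."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly_rate_usd must be positive")
    return bill_usd / hourly_rate_usd


# Figures from the post: 10 -> 430 concurrent requests over 20 minutes, $77 billed.
expected = ramp_machine_hours(10, 430, 20)

# ASSUMPTION: placeholder hourly price for one machine, not a real quote.
HYPOTHETICAL_RATE = 0.03
implied = machine_hours_for_bill(77.0, HYPOTHETICAL_RATE)

print(f"machine-hours if billed only while serving: {expected:.1f}")
print(f"machine-hours implied by the bill at ${HYPOTHETICAL_RATE}/h: {implied:.1f}")
```

If the implied figure is much larger than the expected one at the real rate, that would be consistent with machines staying up (and billed) well after the test ended.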

Can anyone help me understand what happened here? Thanks in advance for your time.

Zane_Milakovic · September 21, 2023, 10:47pm

Do you have the auto_stop feature set up in your fly.toml?

You have to exit the main process to let fly.io know the machine can shut down.
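One way to read that advice is an app that exits on its own after a quiet period. The following is only a minimal sketch using Python's standard library; the class name `IdleExitServer`, port 8080, and the 60-second idle window are hypothetical choices, and whether exiting is actually required for a machine to stop is this reply's claim rather than something the sketch verifies.

```python
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


class IdleExitServer(HTTPServer):
    """HTTP server whose serve loop ends after `idle_seconds` with no request."""

    def __init__(self, address, handler, idle_seconds):
        super().__init__(address, handler)
        # handle_request() gives up after this many seconds without a request
        # and then calls handle_timeout().
        self.timeout = idle_seconds
        self.idle = False

    def handle_timeout(self):
        self.idle = True

    def serve_until_idle(self):
        while not self.idle:
            self.handle_request()  # serves one request, or times out
        self.server_close()


def main():
    # ASSUMPTION: port and idle window are illustrative values only.
    IdleExitServer(("0.0.0.0", 8080), Handler, idle_seconds=60).serve_until_idle()
    sys.exit(0)  # the main process ends once the server has been idle
```

Calling `main()` as the image's entrypoint would make the process exit after a minute without traffic.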

Do you have health checks set up? You may want to try turning them off for your purpose, though I don’t think they keep the machine alive.

These are VMs, not containers, and not Lambdas, so there is overhead in spinning them up as well. Depending on how quickly they become ready, that can eat into your time. A smaller, lighter Docker image may help here.

You may want to contact billing for more information.

charsleysa · September 21, 2023, 11:38pm

Hi @Biswas

To clarify, you still pay for the server when it’s idle. You stop paying when the server stops.

When using the auto stop feature, it doesn’t stop all the servers at once. It stops them slowly over time.

Fly provides a Grafana dashboard that you can use at https://fly-metrics.net/

You can check to see the history of how many servers were running at any point using this link
https://fly-metrics.net/explore?left={"datasource":"prometheus_on_fly","queries":[{"refId":"A","datasource":{"type":"prometheus","uid":"prometheus_on_fly"},"editorMode":"builder","expr":"sum(fly_instance_up)","legendFormat":"__auto","range":true,"instant":true}],"range":{"from":"now-1h","to":"now"}}

If you’ve got multiple apps, you’ll need to select app in the label filters and enter the desired app name as the value.
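For anyone who would rather generate that link per app than edit the label filters by hand, here is a small sketch that rebuilds the same explore URL with the app filter applied. The function names and the `my-app` value are hypothetical. The query fields mirror the link above, and the `app` label is the one mentioned for the label filters.

```python
import json
from urllib.parse import quote


def running_machines_expr(app=None):
    """PromQL for the number of running instances, optionally for one app."""
    if app is None:
        return "sum(fly_instance_up)"
    return f'sum(fly_instance_up{{app="{app}"}})'


def explore_url(app=None, window="now-1h"):
    """Build the Grafana explore link for the running-machine count."""
    left = {
        "datasource": "prometheus_on_fly",
        "queries": [{
            "refId": "A",
            "datasource": {"type": "prometheus", "uid": "prometheus_on_fly"},
            "editorMode": "builder",
            "expr": running_machines_expr(app),
            "legendFormat": "__auto",
            "range": True,
            "instant": True,
        }],
        "range": {"from": window, "to": "now"},
    }
    # URL-encode the JSON so quotes and braces survive copy and paste.
    encoded = quote(json.dumps(left, separators=(",", ":")))
    return "https://fly-metrics.net/explore?left=" + encoded


# "my-app" is a placeholder app name.
print(explore_url("my-app"))
```

Passing a wider `window`, such as `"now-6h"`, shows how long machines stayed up after a test finished.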

Biswas · September 22, 2023, 3:28am

The reply from @charsleysa suggests that machines are shut down eventually.

Are they shut down eventually if the app process keeps running, but shut down immediately if the app process exits?

Somewhat related to this, I faced a few HTTP 502 errors when running performance tests. Is this usually because fly was unable to provision a machine (after all, allocated machines are not provisioned immediately), or is it because the app didn’t start up in time?

If it’s the latter I’m wondering whether health checks may be helpful here.

system · September 29, 2023, 3:29am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
