machine creation takes 1 hour

I recently tried playing with flyctl machine run.
It hangs and/or is very, very slow.

Here’s one machine in region sjc – it took 59min46sec to start.

2023-02-23T19:56:11Z runner[24d89603c93587] sjc [info]Pulling container image
2023-02-23T19:56:14Z runner[24d89603c93587] sjc [info]Unpacking image
2023-02-23T20:56:00Z app[24d89603c93587] sjc [info]Starting init (commit: 08b4c2b)...
2023-02-23T20:56:00Z app[24d89603c93587] sjc [info]Preparing to run: `deno run --allow-run=deno /container_entrypoint.ts` as deno
2023-02-23T20:56:00Z app[24d89603c93587] sjc [info]2023/02/23 20:56:00 listening on [fdaa:0:57f1:a7b:a160:3a49:3e1f:2]:22 (DNS: [fdaa::3]:53)

Here’s another one in region dfw that took 59min22sec:

2023-02-23T20:04:33Z runner[06e82955a02e87] dfw [info]Pulling container image
2023-02-23T20:04:34Z runner[06e82955a02e87] dfw [info]Unpacking image
2023-02-23T20:04:38Z runner[06e82955a02e87] dfw [info]Configuring firecracker
2023-02-23T21:04:00Z app[06e82955a02e87] dfw [info]Starting init (commit: 08b4c2b)...
2023-02-23T21:04:00Z app[06e82955a02e87] dfw [info]Preparing to run: `deno run --allow-run=deno /container_entrypoint.ts` as deno
2023-02-23T21:04:00Z app[06e82955a02e87] dfw [info]2023/02/23 21:04:00 listening on [fdaa:0:57f1:a7b:cf99:e1ed:e2b6:2]:22 (DNS: [fdaa::3]:53)
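The two durations quoted above can be checked against the log timestamps. Here is a minimal Python sketch, assuming the startup gap is measured from the last runner[...] line to the first app[...] line of each snippet; the helper names (parse_ts, startup_gap, fmt) are made up for illustration and are not part of flyctl.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional

SJC_LOG = [
    "2023-02-23T19:56:11Z runner[24d89603c93587] sjc [info]Pulling container image",
    "2023-02-23T19:56:14Z runner[24d89603c93587] sjc [info]Unpacking image",
    "2023-02-23T20:56:00Z app[24d89603c93587] sjc [info]Starting init (commit: 08b4c2b)...",
]

DFW_LOG = [
    "2023-02-23T20:04:33Z runner[06e82955a02e87] dfw [info]Pulling container image",
    "2023-02-23T20:04:34Z runner[06e82955a02e87] dfw [info]Unpacking image",
    "2023-02-23T20:04:38Z runner[06e82955a02e87] dfw [info]Configuring firecracker",
    "2023-02-23T21:04:00Z app[06e82955a02e87] dfw [info]Starting init (commit: 08b4c2b)...",
]


def parse_ts(line: str) -> datetime:
    """Parse the leading UTC timestamp (first whitespace-separated field)."""
    return datetime.strptime(line.split()[0], "%Y-%m-%dT%H:%M:%SZ")


def startup_gap(lines: Iterable[str]) -> Optional[timedelta]:
    """Time between the last runner[...] line and the first app[...] line."""
    last_runner = None
    for line in lines:
        source = line.split()[1]  # e.g. "runner[24d89603c93587]"
        if source.startswith("runner["):
            last_runner = parse_ts(line)
        elif source.startswith("app[") and last_runner is not None:
            return parse_ts(line) - last_runner
    return None  # no runner -> app transition found


def fmt(delta: timedelta) -> str:
    """Format a timedelta as e.g. '59min46sec'."""
    minutes, seconds = divmod(int(delta.total_seconds()), 60)
    return f"{minutes}min{seconds:02d}sec"


if __name__ == "__main__":
    print("sjc:", fmt(startup_gap(SJC_LOG)))  # sjc: 59min46sec
    print("dfw:", fmt(startup_gap(DFW_LOG)))  # dfw: 59min22sec
```

Both gaps fall just under one hour, which is the detail that matters for the rest of this thread.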

Also, machines in state created cannot be destroyed. That means one can’t give up and tear down the experiment to move on to something else; you have to monitor the machine until it completes just to tear it down and avoid leaving idle machines around.

Also really worrying to me is how your logging mechanism (NATS?) still loses log entries. See how only one of those two snippets says “Configuring firecracker”. This makes debugging incredibly frustrating, as you simply can’t trust what you’re seeing.

It looks like both of these machines are scheduled machines configured to run hourly. For scheduled machines, we set up the VM but delay starting it until the next execution based on the schedule, which explains the delay.

We haven’t seen much use of scheduled machines, so we’d be curious to know what you’d expect to happen.

Ohh so if I add schedule=hourly it’ll delay up to an hour before the first launch! That explains things.

I guess I need to set them up as “normal” machines first, and then change them to hourly, to get the intuitive behavior I want – see 1 round of batch processing complete now, then refresh every now and then.

That doesn’t seem like the UX you’d want, though? It seems like the expectation is that the scheduled machine will run when first created, stop, and then start again according to the defined schedule?

Yes, the UX I expected was “the machine runs once at flyctl machine run time, then roughly hourly after that”.

My mental model: --schedule=hourly injects a roughly hourly flyctl machine start for me, without me needing to run that command somewhere, mess with API keys, etc.
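To make the difference between the two behaviors concrete, here is a small Python sketch of the run times each one would produce. It is only a model of the discussion, not Fly's scheduler: the function name run_times and the run_at_creation flag are hypothetical, and real scheduled runs are only "roughly" hourly rather than exact.

```python
from datetime import datetime, timedelta
from typing import List


def run_times(
    created: datetime,
    interval: timedelta,
    count: int,
    run_at_creation: bool,
) -> List[datetime]:
    """Return the first `count` run times for a scheduled machine.

    run_at_creation=False models the behavior seen in the logs: the VM is
    set up, but the first run waits one full interval.
    run_at_creation=True models the expected UX: run once at creation,
    then once per interval after that.
    """
    first = created if run_at_creation else created + interval
    return [first + i * interval for i in range(count)]


if __name__ == "__main__":
    created = datetime(2023, 2, 23, 19, 56, 11)  # sjc machine creation time
    hourly = timedelta(hours=1)

    print("delayed first run:")
    for t in run_times(created, hourly, 3, run_at_creation=False):
        print("  ", t.isoformat())

    print("run at creation:")
    for t in run_times(created, hourly, 3, run_at_creation=True):
        print("  ", t.isoformat())
```

With the delayed variant, nothing runs for the first hour after creation, which is consistent with the gap seen in the sjc and dfw logs above.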

For scheduled machines, the flyctl code waits for the stopped state rather than started, which is what flyctl m [run|clone|update] otherwise wait for. Ref: https://github.com/superfly/flyctl/blob/a3afbfcb83a/internal/machine/update.go#L94-L97

@Tv1 Quick update: a scheduled machine will run when initially created, transition to the stopped state once it exits, and stay stopped until the next scheduled execution.
