Machine environment errors

I observed that a newly created machine does not get the env vars set in its TOML file. Looking further, environment config for new machine apps seems to misbehave:

$ flyctl apps create --machines machine-env-is-broken
$ flyctl config env -a machine-env-is-broken
Secrets
NAME	DIGEST	CREATED AT

Oops, something went wrong! Could you try that again?
EXIT:3

Here’s a demonstrator of the missing env vars:

printenv.toml:

app = "machine-env-is-broken"

[build]
dockerfile = "Containerfile-printenv"

[env]
FOO = "foo"
BAR = "bar"

Containerfile-printenv:

FROM alpine
ENV THIS=works
ENTRYPOINT ["/bin/sh", "-c", "echo START; printenv; echo DONE; sleep 5"]

Run it:

flyctl machine run . -c printenv.toml --region dfw

Observe the logs (FOO and BAR never appear, only THIS from the Containerfile):

2023-03-13T20:14:23Z runner[148e425f165989] dfw [info]Pulling container image registry.fly.io/machine-env-is-broken:deployment-01GVE9QX1N8FB5ZWA6ACPTHSZ9
2023-03-13T20:14:24Z runner[148e425f165989] dfw [info]Unpacking image
2023-03-13T20:14:24Z runner[148e425f165989] dfw [info]Configuring firecracker
2023-03-13T20:14:24Z app[148e425f165989] dfw [info]Starting init (commit: 08b4c2b)...
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]Preparing to run: `/bin/sh -c echo START; printenv; echo DONE; sleep 5` as root
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]2023/03/13 20:14:25 listening on [fdaa:0:57f1:a7b:a062:496d:e29e:2]:22 (DNS: [fdaa::3]:53)
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]START
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]FLY_PUBLIC_IP=2605:4c40:208:83f5:0:496d:e29e:1
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]FLY_VM_MEMORY_MB=256
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]SHLVL=1
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]HOME=/root
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]FLY_IMAGE_REF=registry.fly.io/machine-env-is-broken:deployment-01GVE9QX1N8FB5ZWA6ACPTHSZ9
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]TERM=linux
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]THIS=works
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]FLY_ALLOC_ID=148e425f165989
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]cgroup_enable=memory
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]FLY_REGION=dfw
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]FLY_APP_NAME=machine-env-is-broken
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]PWD=/
2023-03-13T20:14:25Z app[148e425f165989] dfw [info]DONE
2023-03-13T20:14:31Z app[148e425f165989] dfw [info]Starting clean up.
2023-03-13T20:14:32Z app[148e425f165989] dfw [info][    7.136288] reboot: Restarting system
2023-03-13T20:14:32Z runner[148e425f165989] dfw [info]machine exited with exit code 0, not restarting
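
Rather than scanning the printenv output by eye, the check can be sketched as a small shell helper. This is only an illustration: the function name missing_env_vars and the file logs.txt are made up for this example, and it assumes log lines have the "[info]NAME=value" shape shown above.

```shell
# Print each expected variable name that never shows up as "]NAME=" in the
# log text read from stdin. Returns 1 if at least one name is missing.
missing_env_vars() {
    logs=$(cat)
    status=0
    for name in "$@"; do
        # Log lines look like "... [info]FOO=foo", so search for the fixed
        # string "]NAME=" to avoid matching e.g. FLY_FOO= by accident.
        if ! printf '%s\n' "$logs" | grep -qF "]${name}="; then
            printf '%s\n' "$name"
            status=1
        fi
    done
    return "$status"
}

# Example, assuming the log output was saved to a (hypothetical) logs.txt:
# missing_env_vars FOO BAR THIS < logs.txt
```

Run against the log above, this would list FOO and BAR as missing, while THIS (set via ENV in the Containerfile) is found.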

The same seems to be true of secrets: even after I kludged the env vars by moving them from fly.toml into the Containerfile (ENV FOO=foo), my secrets are still not being set.

A workaround seems to be to give the machine the metadata fly_platform_version=v2 (e.g. flyctl machine update 5683dde2a70778 --metadata fly_platform_version=v2) and then run flyctl deploy.

To do that in a single shot, apparently flyctl deploy --force-machines works. Hooray! (And if you use scheduled machines, set the schedule after the deploy with flyctl machines update XXX --schedule hourly.)
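
For reference, the two variants of the workaround can be wrapped in shell functions roughly like this. It is a sketch, not a tested recipe: the function names and the FLYCTL variable are invented for this example (FLYCTL exists only so the commands can be dry-run with echo), while the flyctl commands themselves are the ones quoted above.

```shell
# Binary to invoke; override with FLYCTL=echo to print the commands instead
# of running them (hypothetical convenience, not a flyctl feature).
FLYCTL=${FLYCTL:-flyctl}

# Two-step variant: tag one machine as Apps V2, then deploy.
mark_v2_and_deploy() {
    machine_id=$1
    "$FLYCTL" machine update "$machine_id" --metadata fly_platform_version=v2 &&
        "$FLYCTL" deploy
}

# Single-shot variant.
force_machines_deploy() {
    "$FLYCTL" deploy --force-machines
}

# Example:
# mark_v2_and_deploy 5683dde2a70778
```

Both functions assume they are run from the directory containing the app's config file, as flyctl deploy itself does.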

Thanks for the bug reports! flyctl config env -a machine-env-is-broken certainly shouldn’t crash like that, and machines should probably be able to inherit the environment from the config file. I’m working on a fix for both of these.

I’m glad you were able to find a solution!

Word of warning: as you probably know, giving a machine that metadata means that it’s managed as part of the Apps V2 platform, which means that certain kinds of changes you may make to it are liable to be reverted on deploy. I don’t think that schedules should end up getting wiped, but it’s something to watch out for!

How did you set the secrets? I’m not able to reproduce this one.

Just flyctl secrets import. First, that triggered a bug (Internal error setting secret for machines apps), then it worked (I have an earlier machines experiment that’s happy), and then it stopped working (this reproduction).

Actually, it turns out that you can’t get flyctl deploy and --schedule hourly to cooperate. Running flyctl machine update XXX --schedule hourly wipes out the configuration, leaving an empty environment again (status shows a new instance ID and the history is truncated), and a subsequent flyctl deploy forgets the schedule.

Also, flyctl machine update times out more often than it completes successfully, and re-running the same command says the config change has already been done.

I need to do other things again for a while, but when I come back to this (if flyctl hasn’t improved in the meantime), I guess I’ll look at using the API directly.
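
As a rough idea of what "using the API directly" could look like, here is a sketch that creates a machine with its env and schedule stated explicitly in one request, so neither depends on flyctl merging configs. Everything here is an assumption rather than something confirmed in this thread: the endpoint (https://api.machines.dev/v1/apps/<app>/machines), the config.env and config.schedule fields, and the FLY_API_TOKEN variable are taken from my understanding of the Machines API and should be checked against the current docs. The function names are hypothetical.

```shell
# Build a create-machine request body with env and schedule set explicitly.
# Assumption: the Machines API accepts "env" and "schedule" under "config".
machine_payload() {
    image=$1
    cat <<EOF
{
  "region": "dfw",
  "config": {
    "image": "$image",
    "env": { "FOO": "foo", "BAR": "bar" },
    "schedule": "hourly"
  }
}
EOF
}

# POST the payload. Assumes FLY_API_TOKEN holds a valid API token and that
# the endpoint below is the right one for your account.
create_machine() {
    app=$1
    image=$2
    machine_payload "$image" | curl -sS -X POST \
        -H "Authorization: Bearer ${FLY_API_TOKEN}" \
        -H "Content-Type: application/json" \
        --data @- \
        "https://api.machines.dev/v1/apps/${app}/machines"
}

# Example (image tag taken from the log above):
# create_machine machine-env-is-broken \
#     registry.fly.io/machine-env-is-broken:deployment-01GVE9QX1N8FB5ZWA6ACPTHSZ9
```

Keeping the payload builder separate from the curl call makes it easy to inspect the JSON before sending anything.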
