Fly Apps on machine prerelease

The first post about this was Fly Apps on machines: prerelease of fly deploy, and we’ve added a lot more since then! This fairly long post goes into detail on how things work in Apps v2.

We’re working to migrate the Fly Apps Platform to Machines. We’re calling this Apps v2. Apps v1 are the apps that use Nomad.

The first big step will be making new apps use the machines platform instead of Nomad. There are a bunch of small steps to get there. Today we updated the prerelease and PR to support fly launch and fly deploy, making it possible to launch and manage apps built entirely on machines.

Install the prerelease or build the PR to try out the new fly deploy behavior:

To install the latest prerelease:

curl -L https://fly.io/install.sh | sh -s -- prerelease
fly version
fly v0.0.451-pre-4 darwin/arm64 Commit: a52f9a0c BuildDate: 2023-01-20T14:37:55Z

Try it out and let us know what you think! We’re looking for bugs we can squash, missing features, and things you’d like to see. Let us know in this thread.

Creating a v2 app

Launch apps with fly launch. It should work like it does for apps on Nomad. All the scanners and builders are supported, as usual.

To try out Apps v2 use an app that does not require statics. Apps v2 doesn’t support statics, yet. We’ll announce when that changes.

I’ll start an nginx app, and use that for the rest of the examples in this post.

fly launch --image nginx --internal-port 80
...
Created app dry-pond-1475 in organization tvd-testorg
Admin URL: https://fly.io/apps/dry-pond-1475
Hostname: dry-pond-1475.fly.dev
Wrote config file fly.toml
? Would you like to set up a Postgresql database now? No
? Would you like to set up an Upstash Redis database now? No
? Would you like to deploy now? Yes
? Will you use statics for this app (see https://fly.io/docs/reference/configuration/#the-statics-sections)? No
==> Building image
Searching for image 'nginx' remotely...
image found: img_wd57v5nge95v38o0
Provisioning ips for dry-pond-1475
  Dedicated ipv6: 2a09:8280:1::ce9b
  Shared ipv4: 66.241.124.2
  Add a dedicated ipv4 with: fly ips allocate-v4
No machines in dry-pond-1475 app, launching one new machine
  Machine 21781973f03e89 update finished: success
  Finished deploying

Once fly launch finishes, use fly open to open the app’s homepage in a browser. The “Welcome to nginx!” page will show if everything worked.

Deployments

fly deploy continues to work to update the app:

fly deploy
==> Building image
Searching for image 'nginx' remotely...
image found: img_wd57v5nge95v38o0
Deploying dry-pond-1475 app with rolling strategy
  Machine 21781973f03e89 update finished: success
  Finished deploying

release_command, rolling and immediate strategies, and the other deploy flags and settings are supported. There are some small variations in apps v2, and we’ve limited them as much as possible and updated flyctl to tell you about them.

Mounts and volumes

Volumes need to be created and manually attached to machines. The source setting in the [mounts] section is no longer supported in fly.toml. There is no enforcement around volume names.

Attach volumes with the Machines API for now. We’re working on making this easier with fly machine update and fly machine clone.

Once attached, the destination setting in fly.toml will be used to update the destination of the volumes. For example, the volumes will be mounted at /my/new/directory with this fly.toml config after running fly deploy:

[mounts]
destination = "/my/new/directory"

Scaling

Scaling an app is different with Apps v2. Use fly machine clone to horizontally scale the app, even across regions:

fly machine clone 21781973f03e89
fly machine clone --region syd 21781973f03e89
fly machine clone --region ams 21781973f03e89

Now 4 machines are running for this app: the original machine plus three new ones. Use fly machine stop and fly machine remove to scale down the app:

fly machine stop 9080524f610e87
fly machine remove 9080524f610e87
fly machine remove --force 0e286039f42e86
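
As a rough illustration of the clone/stop/remove workflow above, here is a minimal Python sketch that turns a wanted per-region machine count into the list of fly commands to run. It only builds the command lines; it does not call flyctl. The function name plan_scaling, the input shapes, and the "ord" region in the usage note are assumptions made up for this example, not part of flyctl.

```python
def plan_scaling(machines, desired):
    """Build fly commands (as argv lists) to reach the desired machine counts.

    machines: list of (machine_id, region) tuples for the current machines.
    desired:  dict mapping region -> wanted number of machines.
    Hypothetical helper: it only plans commands, it never runs them.
    """
    if not machines:
        raise ValueError("need at least one existing machine to clone from")

    # Group current machine IDs by region, keeping input order.
    by_region = {}
    for machine_id, region in machines:
        by_region.setdefault(region, []).append(machine_id)

    source_id = machines[0][0]  # clone from the first machine we know about
    clones, removals = [], []
    for region in sorted(set(by_region) | set(desired)):
        have = by_region.get(region, [])
        want = desired.get(region, 0)
        # Too few machines in this region: clone the source machine into it.
        for _ in range(want - len(have)):
            clones.append(["fly", "machine", "clone", "--region", region, source_id])
        # Too many machines in this region: stop, then remove, the extras.
        for machine_id in have[want:]:
            removals.append(["fly", "machine", "stop", machine_id])
            removals.append(["fly", "machine", "remove", machine_id])

    # Emit clones first so the source machine still exists when they run.
    return clones + removals
```

For example, with two machines in a hypothetical "ord" region and a wanted state of one machine in ord and one in syd, the plan is a single clone into syd followed by a stop and remove of the extra ord machine.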

Scale memory and cpu with fly machine update:

fly machine update --memory 1024 21781973f03e89
fly machine update --cpus 2 21781973f03e89

Processes

Processes continue to be supported in fly.toml. The big difference with apps v2 is you need to specify which machines are assigned to which processes.

fly deploy will update each machine based on its process group, applying only the services, cmd, and checks for that process.

Use fly machine update to assign a process group to a machine with:

fly machine update --metadata fly_process_group=app 21781973f03e89
fly machine update --metadata fly_process_group=app 9e784925ad9683
fly machine update --metadata fly_process_group=worker 148ed21a031189
fly deploy
==> Building image
Searching for image 'nginx' remotely...
image found: img_wd57v5nge95v38o0
Deploying dry-pond-1475 app with rolling strategy
  Machine 21781973f03e89 [app] update finished: success
  Machine 148ed21a031189 [worker] update finished: success
  Machine 9e784925ad9683 [app] update finished: success
  Finished deploying

Make sure to run fly deploy after updating these groups to ensure each machine gets the appropriate services, checks, and cmd. These are the key pieces of the fly.toml that configure the processes, with the one service using the "app" process group:

[processes]
  app = "nginx -g 'daemon off;'"
  worker = "tail -F /dev/null" # not a very useful worker!

[[services]]
  processes = ["app"]
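
To make the per-process-group behavior a bit more concrete, here is a small Python sketch of how the cmd and services for one machine could be picked from a parsed fly.toml. This is not flyctl’s implementation. The function name config_for_machine and the dictionary shapes are assumptions for illustration, and checks are left out for brevity.

```python
def config_for_machine(app_config, machine_metadata):
    """Pick the cmd and services that apply to one machine.

    app_config:       dict shaped like a parsed fly.toml (assumed shape).
    machine_metadata: the machine's metadata dict, which is expected to
                      carry a fly_process_group entry.
    """
    group = machine_metadata.get("fly_process_group")
    processes = app_config.get("processes", {})
    if group not in processes:
        raise KeyError(f"unknown process group: {group!r}")

    # Keep only the services that explicitly list this process group.
    services = [
        service
        for service in app_config.get("services", [])
        if group in service.get("processes", [])
    ]
    return {"cmd": processes[group], "services": services}
```

With the fly.toml above, a machine tagged fly_process_group=app would get the nginx cmd plus the one service, while a worker machine would get the tail cmd and no services.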

fly machine clone can then be used to build out multiple instances within a process group, or to clone a machine and put it in a different process group:

fly machine clone --region gru 21781973f03e89
fly machine clone --process-group worker 21781973f03e89

Checks

Checks defined in fly.toml are translated to checks on each machine. We don’t restart or stop routing traffic to machines based on these health checks, yet. We’re working on that!

Failing checks will cause the deployment to fail. Use --detach to skip waiting for health checks to pass during fly deploy.

Secrets

Secrets continue to work with machines in Apps v2. Setting or unsetting secrets will result in a deployment that calls the update API on each machine to change some metadata. This causes the machine to update the secrets it is using.

fly secrets set DB_PASSWORD=supersecret
INFO Using wait timeout: 2m0s and lease timeout: 30m0s
Deploying long-sun-1337 app with rolling strategy
  Machine 06e8297ef31587 update finished: success
  Finished deploying

Release commands

Release commands defined in fly.toml are run in a new machine. flyctl will wait for the machine to finish and check the exit code. If the exit code is 0, the machine will be automatically destroyed and the deployment will continue. Otherwise, the deployment fails and the release_command machine is put into a destroying state for about an hour, after which it is destroyed.
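
The exit-code gate can be sketched locally in Python. This is only a stand-in: it runs the command as a local subprocess rather than in a new machine, and the function name release_gate and its return values are made up for the example.

```python
import subprocess
import sys


def release_gate(argv):
    """Run a command, wait for it to finish, and decide whether to continue.

    Local stand-in for the release command step: the real thing runs in a
    new machine, this sketch just runs argv as a subprocess.
    """
    result = subprocess.run(argv)
    if result.returncode == 0:
        return "continue"  # exit code 0: the deployment goes ahead
    return "fail"  # any other exit code: the deployment fails


if __name__ == "__main__":
    # Simulate a failing release command with a tiny Python one-liner.
    print(release_gate([sys.executable, "-c", "import sys; sys.exit(1)"]))
```

The point is just that the deployment hinges on one value, the exit code of the release command.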

We keep the failed release_command machine for a little bit in case it’s useful for debugging issues. Unfortunately it’s not possible to access the machine once it’s in the destroying state. We can clone it, though, with:

fly machine clone --clear-auto-destroy --clear-cmd MACHINE_ID

We set --clear-auto-destroy so the new machine won’t destroy itself on exit. We also unset the CMD, so the release command won’t run again. The default cmd for the machine will run. Use --override-cmd, instead of --clear-cmd, to manually set a cmd to run.

After that, we can use fly machine list to get the ip address for the machine, then use fly ssh console -A <ip> to access the machine. Then we can debug!

We have some ideas about how to simplify debugging release_commands. Let us know what you’d like to see. We’ll announce those once they are available.

Restarting the app

fly apps restart APP_NAME will restart all machines in the app. This includes both machines that have the Fly Apps Platform metadata as well as other machines that may have been created.

Migrating existing apps

If you already have an app with machines, running fly deploy will convert it to an apps v2 app. You’ll be prompted to do the migration.

We don’t currently support migrating apps from Nomad to Machines. We’ll announce when that’s available.

Note/Warning: converting an existing machines app with fly deploy will overwrite the config for all of its machines, based on the values set in fly.toml and the existing config on the machines. As an example, the services and environment values will come from fly.toml, replacing whatever was present before. Mounts will not change, though fly deploy may change the mount path if the destination path under the [mounts] section in fly.toml is different from what’s currently on a machine.
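
Here is a minimal Python sketch of that overwrite behavior, to show which parts of a machine’s config would be replaced and which would be kept. The function name migrated_config and the field names (services, env, mounts, path) are assumptions for illustration and may not match the real machine config schema.

```python
import copy


def migrated_config(machine_config, fly_toml):
    """Return the machine config as it might look after the conversion.

    machine_config: the machine's existing config (assumed dict shape).
    fly_toml:       dict shaped like a parsed fly.toml (assumed shape).
    """
    new_config = copy.deepcopy(machine_config)

    # Services and environment come from fly.toml, replacing what was there.
    new_config["services"] = copy.deepcopy(fly_toml.get("services", []))
    new_config["env"] = dict(fly_toml.get("env", {}))

    # Attached volumes stay attached; only the mount path may change.
    destination = fly_toml.get("mounts", {}).get("destination")
    if destination:
        for mount in new_config.get("mounts", []):
            mount["path"] = destination

    return new_config
```

Anything not mentioned in fly.toml, such as the image in this sketch, is carried over from the existing machine config.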


Do the release commands respect the environment variable PRIMARY_REGION like apps v1? I mean, is the machine spawned on PRIMARY_REGION?

Is there an ETA for implementing canary deployments?

We considered rolling our own zero downtime strategy using the API but we don’t really want to do it right now :sweat_smile:.

The release command doesn’t pay attention to PRIMARY_REGION right now, that’s a good thing to bring up. Added an issue here: Apps v2: Release command should attempt PRIMARY_REGION env var · Issue #1608 · superfly/flyctl · GitHub

We aren’t sure when we’re going to do canary deployments. We’re focusing on things that don’t require creating new machines, vs just updating existing ones. This is much more resilient. Most problems we run into with our current infrastructure are caused by VM churn.

It’s worth trying a rolling deploy with machines to see if you really need canary deploys. Deploys with machines do some magic to make an update incredibly fast. We pull the new image and prep everything before restarting the VM. It happens so fast, Postgres doesn’t even failover.

If you’re running 2+ machines per app, there’s a good chance a machines rolling deploy will be zero downtime.


Nice.


I want to use this new release.

I submitted a bunch of fixes for flyctl wrt Machines back in the day that I needed (and since then continue to maintain my very old fork) but couldn’t be merged: Improvements to deploy and run commands for machines by ignoramous · Pull Request #1327 · superfly/flyctl · GitHub

If they have been fixed in main, I’d move to it.

The blocking issues for me are:

  1. leases for apps with 40+ Machines just didn’t work as expected with rolling deploys. Always had (and still have) to do an immediate deploy.
  2. flyctl deploy -a <app-name> ... didn’t really work.

The PR fixed (or tried to fix) the underlying issues.

It’d be nice if we could use Machine names instead of IDs that increasingly have a lot of hex chars common for some reason. I always have to double check the ID before execing some of the more onerous commands (remove -f for example): [FR] flyctl m update with machine-name · Issue #1293 · superfly/flyctl · GitHub (I volunteer to impl if this is something that folks at Fly are open to merging).


Would be awesome to have the machine metadata available to the instance as environment variables.


Is there any way for us to get this information? I played a bit with the GraphQL API but checks are empty for my app (I tried digging into app.healthChecks, app.allocations, app.machines. ...checkState, etc.).

I’m wondering how we can do a sanity check once in a while to ensure the desired state. Do you have a suggestion? Any input is valuable, we are willing to do something on our side using the API.

I see it’s possible to get a list of running machines (machines(state: "started")), and we can use it to detect issues with the underlying host/orchestrator. But I suppose the state will still be started in case of problems on our side (e.g. app exited with an error).

By the way, is there a way to get the desired state using the API? As in, “wants 4 machines in iad, 2 machines in fra, 1 machine in syd, 1 machine in gru”.
