fly migrate-to-v2 - Automatic migration to Apps V2

Apps V2 is the future - and now, it’s getting a lot easier to get your hands on it.

We’ve been hard at work making Apps V2 a stable and reliable base for your applications, but we’ve also been working to ensure that everyone gets to reap the benefits of this effort. Instructions for manually migrating your Nomad/V1 apps to Apps V2 have been available for a little while, but we now have a zero-downtime, automatic solution to handle this: fly migrate-to-v2!

This is a prerelease, so you might have to tinker a bit after the fact to get your app configured perfectly. If you encounter any surprises not listed as a known issue, please let us know! In fact, we’d appreciate any and all feedback.

Unsupported Configurations

  • You must be in the same directory as fly.toml, or pass the path to the app’s config file, like fly migrate-to-v2 --config /path/to/some/cfg.toml
    • You cannot just specify an app with -a, because the migration needs to modify the fly.toml config file.

Usage

Install flyctl 0.0.506 or later. For example with:

curl -L https://fly.io/install.sh | sh

From there, you can run fly migrate-to-v2 to migrate a V1 app to V2.
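
If you want to confirm that your flyctl is new enough before migrating, a minimal sketch follows. It assumes your `sort` supports `-V` (version sort). `version_ge` is a hypothetical helper name, and `INSTALLED` is a placeholder you fill in yourself from what `fly version` reports, since the exact output format isn't covered in this post.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper; relies on `sort -V` (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

MIN_VERSION="0.0.506"
# Placeholder: set INSTALLED to the version that `fly version` reports.
INSTALLED="${INSTALLED:-0.0.506}"

if version_ge "$INSTALLED" "$MIN_VERSION"; then
  echo "flyctl $INSTALLED is new enough; run: fly migrate-to-v2"
else
  echo "flyctl $INSTALLED is too old; re-run the install script first"
fi
```

If the check passes, run fly migrate-to-v2 from the directory containing fly.toml, or pass --config as described above.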

Known Issues

  • Sometimes, you can end up with a machine that has a release version 0.
    • This is a cosmetic bug: the machine still runs the correct image and has the right configuration.
  • The process will begin whether or not payment is configured, but it cannot finish without a valid payment method.
    • Don’t worry, though! When this happens, your app will be rolled back to how it was before migration was attempted, so nothing breaks.
  • Rarely, the error recovery rollback process leaves your app in a “suspended” state. This is a consequence of the app being in between Nomad and Machines for a brief period. This has to be fixed on our backend; once it is, the problem should go away without any action on your part.
    • For the time being, if you run into issues and your app has to be rolled back to Nomad, you might have to fly resume your app. This should fix the issue. Sorry about this!

Why not to do this (yet)

  • Apps V2 isn’t yet as fully featured as Apps V1. There’s not as much abstraction.
  • Less magic: One big conceptual shift that may give you pause is that flyd does not move VMs magically. Nomad continuously tries to make sure that if you’ve scaled to (say) one instance, there’s one instance running on some hardware, somewhere. So if a host has issues, in principle Nomad will move a failed VM to other hardware. With Apps V2, the idea is that you provision more than one Machine, and either run them continuously (simple) or spin the extra one(s) down when idle, and have them wake on request (more complex).
  • HA, horizontal scaling, and “autoscaling” UX improvements for Apps V2 are still in the pipeline.
  • We haven’t yet implemented canary or blue-green deployment strategies for V2. Right now there’s only rolling and immediate.
  • There are still some sharp edges with Apps V2. You could find new ones! (These are generally in getting up and running, not after-the-fact failures)
  • (Outdated) This can’t migrate apps with volumes yet. If you use Volumes, you’ll have to create new volumes and manually migrate your data, and we don’t yet have a seamless way to move data from one volume to another.
    • Update: We now support migration for apps with volumes! See here for more details about that.

Why to do this

Remember, we’re holding Nomad wrong (scroll down a bit in that article for specifics).

  • When Nomad gets jammed up it can’t do the things it’s good at anyway.
  • Nomad can’t move volumes, so on an app with volumes, Nomad can’t do the things it’s good at anyway.
  • Autoscaling on V1 doesn’t work the way you’d hope. It’s slow, for one thing.
  • Less magic—maybe this suits you! You know where you put your Machines and you can tell them to start and stop. If you ask for a new Machine you’ll find out quickly if there isn’t capacity to put it where you asked, rather than waiting for Nomad to come to a consensus on where to put it. You won’t see new Machines spin up in surprising regions like sometimes happens with V1 instances.

Why to do this ASAPractical

  • Reliability for you: You’re much less likely to be affected by V1-adjacent incidents.
  • Reliability for you plus everyone else: The fewer Apps V1 apps we have, the fewer V1-adjacent incidents are likely to happen.

Does this mean that it prevents you from migrating apps that have volumes entirely? Does it ignore them completely? Or if I have new volumes already available it’ll use them (and I can manually remove the old ones)?

(I only have volumes that contain caches, so it’s not a problem.)

For safety reasons, we go the most cautious route right now and prevent migrations that use volumes at all. In cases like yours, though, you should be able to just remove the volumes from the app, migrate the app, then attach new volumes.

@allison is it possible to detach a volume ? Don’t see it in fly volumes commands unless you meant it’s all done by modifying the toml configs

❯ fly volumes
Commands for managing Fly Volumes associated with an application

Usage:
  flyctl volumes [command]

Aliases:
  volumes, volume, vol

Available Commands:
  create      Create new volume for app
  destroy     Destroy a volume
  extend      Extend a target volume
  list        List the volumes for app
  show        Show details of an app's volume
  snapshots   Manage volume snapshots

Yes, you’ll have to remove it from the fly.toml, then re-add it once the migration’s done. (note that if you just comment out the volume config, the migrator does read->modify->write the fly.toml file, which unfortunately strips comments out)
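
Since the migrator rewrites fly.toml and drops comments, keeping a copy of the original file before migrating is a cheap precaution. Here is a minimal sketch; `backup_config` and the `.pre-v2.bak` suffix are hypothetical choices for illustration, not a flyctl feature.

```shell
#!/bin/sh
# backup_config FILE: copy FILE to FILE.pre-v2.bak and print the backup path.
# Hypothetical helper; refuses to overwrite an existing backup.
backup_config() {
  src="$1"
  dst="$src.pre-v2.bak"
  if [ ! -f "$src" ]; then
    echo "no such file: $src" >&2
    return 1
  fi
  if [ -e "$dst" ]; then
    echo "backup already exists: $dst" >&2
    return 1
  fi
  # -p preserves the original file's mode and timestamps.
  cp -p "$src" "$dst" && echo "$dst"
}
```

One way to use it would be `backup_config fly.toml && fly migrate-to-v2`, then copy the volume config and any comments you care about back from the backup once the migration is done.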

Out of an abundance of caution, I would recommend creating new volumes entirely if the data within is only cache. If I recall correctly, there are still some unresolved questions about how volumes that have been referenced by Nomad will behave after an app is migrated (one of the reasons we don’t currently migrate volumes automatically), so I’d advise regenerating them if at all possible.

Hey,

I tried the outlined points, but I am getting this result: Error unknown command "migrate-to-v2" for "flyctl"
I ran the curl command, even deleted the brew install and ran it again. Flyctl works, but migrate-to-v2 is not available.

xxx@pro:~/projects/work/proj$ curl -L https://fly.io/install.sh | sh -s pre
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1475    0  1475    0     0   2301      0 --:--:-- --:--:-- --:--:--  2322
######################################################################## 100.0%
set channel to shell-prerel
flyctl was installed successfully to /Users/xxx/.fly/bin/flyctl
Run 'flyctl --help' to get started
xxx@pro:~/projects/work/proj$ fly migrate-to-v2
Error: unknown command "migrate-to-v2" for "flyctl"
Run 'flyctl --help' for usage.
Error unknown command "migrate-to-v2" for "flyctl"

Since v0.0.505 was released, another -pre release will need to be created. Meanwhile you can specify sh -s 0.0.504-pre-3 instead.

Hey, I tried:
curl -L https://fly.io/install.sh | sh -s 0.0.504-pre-4
and getting:
Error: Unable to find a flyctl release for Darwin/arm64/0.0.504-pre-4 - see github.com/superfly/flyctl/releases for all versions

Sorry about that, typo fixed. Try `pre-3`.

This command is now available in the regular release channel. Install flyctl version 0.0.506 or later. For example use: curl -L https://fly.io/install.sh | sh

This looks great. We’ve got a mix of V1 and V2 apps right now.

Our V2 apps are things like postgres (of course), and other applications which don’t so much need horizontal scaling, placement control and deployment rollout.

Our V1 apps are things like our API services which need flexible horizontal scaling, and deployment rollouts like canary or blue/green. Once these things are available in V2 we would definitely be keen to migrate.

I understand that Nomad, which underpins V1 apps, is also the source of the growth-based issues and that migration to V2 apps is preferable, but our choice of Fly was based significantly on some V1 features that are not yet available in V2, and those gaps currently block migration for many of our apps.

We’re working on autoscaling. The previous autoscaling wasn’t actually very good; we have a better setup now. :slight_smile:

Are you getting value out of canary deploys you wouldn’t get from rolling deploys? Bluegreen is not actually very good on Nomad, just because new VMs start getting traffic immediately. The bluegreen most people want is for all the new VMs to come up, then traffic to shift over, then old VMs to go away.

One interesting feature of machine backed deploys is that they’re actually much faster. We pull the new image and prep the whole VM before we stop what’s running. If your app boots fast enough, you can get near zero downtime deploys with a single Machine running. And deploys with 2+ machines should always be zero downtime.

The biggest problem you’ll have with Nomad based apps is reliability. If you’re running into repeated deploy issues, you might consider switching to new apps earlier than you otherwise would.

Oh, and one other thing I’m not sure has been mentioned anywhere: release commands, which we use for running DB migrations before starting a deploy. Are those supported in Apps V2? (We’ve not yet moved any apps using this to V2, as they’re also the apps using autoscaling and canary deploys.)

Canary deploys are good for us because they allow for smoke tests and health checks as one final gate before we push code into production. So if something worked in our CI workflow but suddenly the app doesn’t start in prod, we’ve got a final chance to catch it.

Release commands should Just Work on V2.

As for deploy strategies, I think I can help you get what you want out of what’s already here.

When the default (rolling) strategy goes to update machines, it updates a single machine then waits for health checks to pass. If they fail, the entire deployment fails (with a handy non-zero exit status on the flyctl process!)
If this happens, all you need to do is redeploy the previous version of the app.

The only thing you couldn’t catch with this is subtle misbehavior (the kind an automated health check wouldn’t figure out, but a human could), but Nomad couldn’t handle this either. Many people solve this with staging apps where they test-run deployments before they get deployed to prod, but that’s not a solution that works for every case!
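
The "deploy, and redeploy the previous version on failure" approach described above can be sketched as a small wrapper around flyctl's exit status. This is an illustration under assumptions, not an official workflow: `deploy_or_rollback` is a hypothetical helper, the `FLY` override exists only so the function can be dry-run against a stub, and the previous image reference is something you would have to record yourself from the last good release.

```shell
#!/bin/sh
# deploy_or_rollback PREVIOUS_IMAGE
# Tries a rolling deploy; if flyctl exits non-zero (for example because
# health checks failed), redeploys PREVIOUS_IMAGE instead.
# Hypothetical helper. FLY may be set to another command for a dry run.
deploy_or_rollback() {
  previous_image="$1"
  fly_cmd="${FLY:-fly}"
  if "$fly_cmd" deploy --strategy rolling; then
    echo "deploy succeeded"
  else
    echo "deploy failed; redeploying $previous_image" >&2
    "$fly_cmd" deploy --image "$previous_image"
  fi
}
```

In a CI job this might be called as `deploy_or_rollback "$LAST_GOOD_IMAGE"`, where `LAST_GOOD_IMAGE` is a variable name made up for this example.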

So how to do this for an app with a volume? (pocketbase database). Should I simply follow this?

@eipe Here’s a post about that: fly migrate-to-v2: Apps with volumes support 🎉

@allison I tried to migrate my app, but the machines it created were all the base shared-cpu-1x with 256mb of memory.

My Rails app needs more than that, so the machines weren’t able to launch and the migration failed.

Is there something I need to do to make sure the new machines are launched with a copy of the existing VM specifications?

Hi, it looks like the Nomad app is using micro-1x VMs, a VM size that isn’t supported at the moment. We will look at handling that for migrations.

In the meantime, try changing the Nomad app to the shared-cpu-1x VM size, or whichever size works best for your app, with fly scale vm shared-cpu-1x. Then run the migration.