Manual migration to Apps V2

( see also: fly migrate-to-v2 - Automatic migration to Apps V2 )

Apps V2 is now the default platform for new Fly.io customers’ deployments. You can also flip the switch yourself, so that any new apps you create are V2 apps by default. (TLDR: fly orgs apps-v2 default-on <org-slug>)

Our main docs now come at things from a V2-first perspective, though most procedures haven’t really changed and Nomad-specific instructions are still included where there is a difference. (Let us know if something’s missing!)

But what about my existing V1 apps?

We’re working on a tool to migrate apps to V2, but in the meantime, you may want to migrate manually. We now have a doc covering manual migration of a basic web app. In addition, community members have contributed resources here on the forum:

@tj1 got out ahead of our docs, sharing their process for moving over.

@pier generously emphasized the good practice of checking that your certs are good before swapping DNS over.

@sb8244 also concentrated the fruits of their migration experience for our benefit today.

Why not to do this (yet)

  • Apps V2 isn’t yet as fully featured as Apps V1. There’s not as much abstraction.

  • Less magic: One big conceptual shift that may give you pause is that flyd does not move VMs magically. Nomad continuously tries to make sure that if you’ve scaled to (say) one instance, there’s one instance running on some hardware, somewhere. So if a host has issues, in principle Nomad will move a failed VM to other hardware. With Apps V2, the idea is that you provision more than one Machine, and either run them continuously (simple) or spin the extra one(s) down when idle, and have them wake on request (more complex).

    HA, horizontal scaling, and “autoscaling” UX improvements for Apps V2 are still in the pipeline.

  • We haven’t yet implemented canary or blue-green deployment strategies for V2. Right now there’s only rolling and immediate.

  • There are still some sharp edges with Apps V2. You could find new ones! (These are generally in getting up and running, not after-the-fact failures)

  • These instructions aren’t a complete walk-through for more complex apps. For example, you can’t just unplug Fly Volumes from a V1 app and plug them into a V2 app. If you use Volumes, you’ll have to create new volumes and migrate your data, and we don’t yet have a seamless way to move data from one volume to another.

Why to do this

Remember, we’re holding Nomad wrong (scroll down a bit in that article for specifics).

  • When Nomad gets jammed up it can’t do the things it’s good at anyway.

  • Nomad can’t move volumes, so on an app with volumes, Nomad can’t do the things it’s good at anyway.

  • Autoscaling on V1 doesn’t work the way you’d hope. It’s slow, for one thing.

  • Less magic—maybe this suits you! You know where you put your Machines and you can tell them to start and stop. If you ask for a new Machine you’ll find out quickly if there isn’t capacity to put it where you asked, rather than waiting for Nomad to come to a consensus on where to put it. You won’t see new Machines spin up in surprising regions like sometimes happens with V1 instances.

Why to do this ASAPractical

  • Reliability for you: You’re much less likely to be affected by V1-adjacent incidents.

  • Reliability for you plus everyone else: The fewer Apps V1 apps we have, the fewer V1-adjacent incidents are likely to happen.

If you feel confident in performing a manual V1 → V2 migration, we encourage you to go ahead and do so! If you’re not quite sure about that, we encourage you to smash that default-on button and make your new apps V2 apps!

Thanks for this; I think I’m going to try it.

I have a non-critical app that I don’t really want to rename: if I delete it first and then re-create it, can I reuse the same name? (I have prereview and prereview-sandbox apps; I don’t mind renaming the former to prereview-prod, but the latter doesn’t really have a good alternative, and adding a -v2 suffix looks weird!)

Does the tool that’s being developed require apps to be renamed? (IIRC it’s not possible to rename apps.)

What’s the suggestion for apps that need a process group with a single machine (eg. running cron)? Examples:

Yes, I do this all the time! Once you destroy the app, its name becomes available to use again. (Caveat: it becomes available for everybody)

Sometimes it seems to take a while for DNS and/or health checks to stabilize. If it’s not a critical app, that’s less of a worry.

I’m not sure Apps V2 has a solution for this yet.

Hi @catflydotio !

I am really excited about this new improvement. Related to this, I would like to ask whether a Fly VM now has a static IP address that I can whitelist?

Thanks!

Not at this time, sorry. VMs have outgoing IPv6 addresses, but these are not guaranteed to remain the same.

hey @catflydotio

This article Carving The Scheduler Out Of Our Orchestrator · The Fly Blog says:

one thing we do is: when you do a deploy, flyctl creates multiple machines for each instance. Only one is started, but others are prepped on different workers. If a worker goes down, fly-proxy notices, and sends a signal to start a spare.

Is this not what is actually happening? Considering the following

Hi!

Can you expand on what you mean by this? The [processes] config created multiple VMs in Apps V1 and continues to do the same (creates multiple Machines) in V2.

At one point in Laravel’s history here at Fly, we had instructions on running CRON inside the same VM as the web server. That’s still possible but isn’t related to apps v1 vs apps v2.

Let me know if I’m making it more confusing instead of less confusing - I’m not entirely sure what you meant there!

fly launch and fly deploy have been rewritten since then. I’m pretty confident that we don’t currently do this, and that this sort of default HA behaviour is still a WIP for the shape of Apps V2 as released.

Added another “why not to migrate yet”:

  • We haven’t yet implemented canary or blue-green deployment strategies for V2. Right now there’s only rolling and immediate.

Thanks for these instructions! I successfully migrated an app to V2 today and so far it’s working well, but I did find that services.script_checks no longer seems to work. Has support for script_checks been removed in Apps V2? I know it was never officially documented, but it was very useful.

Here’s the script_checks config I was using successfully with my V1 app:

[[services]]
  processes = ["sidekiq"]

  [[services.script_checks]]
    command = "/app/bin/sidekiq-healthcheck"
    args = []
    grace_period = "10s"
    interval = "30s"
    restart_limit = 4
    timeout = "15s"

Something that I think should be emphasized is that machines need to have a restart policy of always. It’s strange because this seems important and it’s not even documented?

At some point this was made the default, but that wasn’t the case when I migrated my apps from v1 to v2.

I had to manually do fly machines update <machine-id> for every machine of every v2 app to apply the new default setting. My PG machines didn’t need to be updated though.
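For anyone facing the same chore, here is a minimal sketch of how the per-Machine update could be scripted. It only builds the command lines and does not run anything. The `--restart` flag, the use of `fly machines list --json`, and the JSON shape (`config.restart.policy`) are assumptions rather than something confirmed in this thread, so check them against `fly machines update --help` first. `restart_update_commands` is a hypothetical helper name.

```python
import json
import shlex
from typing import List


def restart_update_commands(machines_json: str, app: str, policy: str = "always") -> List[str]:
    """Build one `fly machines update` command per Machine whose restart
    policy differs from the wanted one. Nothing is executed here."""
    commands = []
    for machine in json.loads(machines_json):
        # Assumed shape: {"id": "...", "config": {"restart": {"policy": "..."}}}
        config = machine.get("config") or {}
        restart = config.get("restart") or {}
        if restart.get("policy") != policy:
            # `--restart` is assumed to be the flag that sets the policy.
            commands.append(
                shlex.join(
                    ["fly", "machines", "update", machine["id"],
                     "--restart", policy, "--app", app]
                )
            )
    return commands
```

You would feed it the output of something like `fly machines list --json -a <app-name>`, read the printed commands, and run them once you are happy with the list. Machines that already have the wanted policy are skipped.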

More info here:

I’ve migrated one of our API (Node) services. It was not super straightforward to get it working, but I finally did.

One thing I’m still missing is the best process for deploying/releasing a new version. Should we run fly m update on all the Machines we have running inside an app, or should we just run fly deploy as usual?

We’re running into issues with fly deploy: the Machine that gets booted has 256 MB of memory and 1 CPU, but we need more (1024 MB) to run our seed process. When we deploy, the Machine gets killed for running out of memory (OOM), and we can’t adjust the Machine size afterwards. I would expect the Machine to boot up with the default Machine size, but that isn’t working properly.

@Moreno fly deploy deploys the new version to all Machines within the app.

You can set more memory before the first deploy, or add a swap file if it’s a one-time burst of memory usage.

We finally completed our migration, so I’m adding our anecdotal experience here. Most of the issues we bumped into can be chalked up to user error on our part (we ran into the same issue with our certs as @pier, for instance; we should have been more on top of things here :man_facepalming:). The experience was largely quick and relatively painless. Since the new app is a separate setup from the currently running one, we were able to cut over easily, after validating that the new one was set up correctly, while keeping our main V1 instance chugging along as normal. This was great, and it’s something I’d honestly worry about a bit with the new automatic migration tool, as some of that peace of mind would be lost.

One of the things that would be a definite help would be the ability to preconfigure preferred machine size on initial deployment, as other folks have mentioned in this and other threads. For the large prod-type deploys this isn’t as big a deal, since I could manually deploy, see that it failed, scale it up, and then redeploy with more memory to get the app up and running and then subsequent deploys would just get pushed on top of that already-big-enough-machine.

Our use case also includes automated preview environments we spin up through a pipeline, however, and those are a lot more annoying to spin up with this strategy, as it would require us (I think) to attempt a deploy, wait for it to be like “yep, out of memory alright”, swallow that error, and then scale and redeploy. (Weirdly, we’ve found that older versions of flyctl don’t seem to throw errors or time out on that initial deploy step, so we’re pinned to older ones for now, and it’s working well with the automated deploy → scale approach.) This would definitely be cleaner with the option to put a size preset or CPU/memory setting in the toml file, much like primary_region, though.
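For illustration, the deploy → scale → redeploy workaround described above could be wrapped up roughly like this in a pipeline step. This is a sketch under assumptions, not the poster’s actual script: it assumes `fly deploy --app` and `fly scale memory <MB> --app` behave this way for the app in question, and `deploy_with_scale_fallback` and `run` are hypothetical names. The command runner is injectable so the flow can be dry-run without touching a real app.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], int]


def run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def deploy_with_scale_fallback(app: str, memory_mb: int, runner: Runner = run) -> List[List[str]]:
    """Deploy; if that fails, scale memory up and deploy once more.
    Returns the commands issued, in order."""
    deploy = ["fly", "deploy", "--app", app]
    issued = [deploy]
    if runner(deploy) == 0:
        return issued  # first deploy worked, nothing else to do

    # Any failure is swallowed here, not only out-of-memory ones.
    # Assumed command for resizing; verify against your flyctl version.
    scale = ["fly", "scale", "memory", str(memory_mb), "--app", app]
    issued.append(scale)
    if runner(scale) != 0:
        raise RuntimeError("scaling failed")

    issued.append(deploy)
    if runner(deploy) != 0:
        raise RuntimeError("redeploy failed after scaling")
    return issued
```

Because the first failure is swallowed whatever its cause, a real pipeline would probably want to inspect the deploy output before deciding to scale.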

Aside from that hiccup, the documentation assembled was very helpful, especially scripts for copying over app secrets, so thanks for adding that to the docs @catflydotio.

Overall, it was certainly a bit of work but not nearly as bad as I had worried it would be. Would recommend to anybody that is considering manually migrating and was on the fence like I was.

So, always is a good restart policy for always-on Machines. It’s not a good restart policy for Machines you stop yourself and would like to stay stopped*, or Machines, with services, that exit when idle so that Fly Proxy can wake them on request, which is closer to the generic web app use case our default fly deploy expects to be most common.

I’m not saying we nailed the UX on this one :grimacing:. It will improve, on several fronts! Just thought it might be less mysterious with that info.

*In fact I missed that the restart policy is about whether to try to get a Machine up and running again without reaching the stopped state, so if you stop a Machine, it’ll stay stopped unless the Fly Proxy wakes it up.

Maybe a little menu in the flyctl CLI when creating an app would help :slight_smile:

We have a beta tool to migrate an app in place now: fly migrate-to-v2 - Automatic migration to Apps V2 (beta)

The comment I quoted made keeping a single VM running sound dangerous, because if the host has problems, I could end up with zero machines running.