Apps V2 is now the default platform for new Fly.io customers’ deployments. You can also flip the switch yourself, so that any new apps you create are V2 apps by default. (TL;DR: fly orgs apps-v2 default-on <org-slug>)
Our main docs now come at things from a V2-first perspective, though most procedures haven’t really changed and Nomad-specific instructions are still included where there is a difference. (Let us know if something’s missing!)
But what about my existing V1 apps?
We’re working on a tool to migrate apps to V2, but in the meantime, you may want to migrate manually. We now have a doc covering manual migration of a basic web app. In addition, community members have contributed resources here on the forum:
Apps V2 isn’t yet as fully featured as Apps V1. There’s not as much abstraction.
Less magic: One big conceptual shift that may give you pause is that flyd does not move VMs magically. Nomad continuously tries to make sure that if you’ve scaled to (say) one instance, there’s one instance running on some hardware, somewhere. So if a host has issues, in principle Nomad will move a failed VM to other hardware. With Apps V2, the idea is that you provision more than one Machine, and either run them continuously (simple) or spin the extra one(s) down when idle, and have them wake on request (more complex).
HA, horizontal scaling, and “autoscaling” UX improvements for Apps V2 are still in the pipeline.
We haven’t yet implemented canary or blue-green deployment strategies for V2. Right now there’s only rolling and immediate.
There are still some sharp edges with Apps V2. You could find new ones! (These are generally in getting up and running, not after-the-fact failures.)
These instructions aren’t a complete walk-through for more complex apps. For example, you can’t just unplug Fly Volumes from a V1 app and plug them into a V2 app. If you use Volumes, you’ll have to create new volumes and migrate your data, and we don’t yet have a seamless way to move data from one volume to another.
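For the volume case, one possible manual approach is sketched below. This is not an official procedure: the app names, volume name, region, and paths are all placeholders, and it assumes your data fits on your local disk in transit.

```shell
# Create a fresh volume for the new V2 app, in the same region as the old one.
fly volumes create data --region ams --size 10 -a my-app-v2

# Copy the data out of the old app over SSH (example path).
fly ssh sftp get /data/app.db ./app.db -a my-app-v1

# Once the new Machine is running with the volume mounted, push the data in:
fly ssh sftp shell -a my-app-v2
# ...then inside the sftp shell: put ./app.db /data/app.db
```

For anything bigger than a single file, you’d want to stop writes to the old app first so the copy is consistent.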
When Nomad gets jammed up, it can’t do the things it’s good at anyway.
Nomad can’t move volumes, so on an app with volumes, Nomad can’t do the things it’s good at anyway.
Autoscaling on V1 doesn’t work the way you’d hope. It’s slow, for one thing.
Less magic—maybe this suits you! You know where you put your Machines and you can tell them to start and stop. If you ask for a new Machine you’ll find out quickly if there isn’t capacity to put it where you asked, rather than waiting for Nomad to come to a consensus on where to put it. You won’t see new Machines spin up in surprising regions like sometimes happens with V1 instances.
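As a sketch of the stop/start-on-request setup described above: the Fly Proxy wake-on-request behaviour can be configured in fly.toml’s [http_service] section. The app name, region, and port below are placeholders, and the auto_* settings assume a reasonably recent flyctl; check the fly.toml docs for your version.

```toml
app = "my-app"          # placeholder app name
primary_region = "ams"  # placeholder region

[http_service]
  internal_port = 8080  # placeholder: the port your app listens on
  force_https = true
  # Let Fly Proxy stop Machines when idle and start them again on request.
  auto_stop_machines = true
  auto_start_machines = true
  # Keep one Machine always running; 0 would allow full scale-to-zero.
  min_machines_running = 1
```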
Why to do this ASAP
Reliability for you: You’re much less likely to be affected by V1-adjacent incidents.
Reliability for you plus everyone else: The fewer Apps V1 apps we have, the fewer V1-adjacent incidents are likely to happen.
If you feel confident in performing a manual V1 → V2 migration, we encourage you to go ahead and do so! If you’re not quite sure about that, we encourage you to smash that default-on button and make your new apps V2 apps!
I have a non-critical app that I don’t really want to rename: if I delete it first and then re-create it, can I reuse the same name? (I have prereview and prereview-sandbox apps; I don’t mind renaming the former to prereview-prod, but the latter doesn’t really have a good alternative, and adding a -v2 suffix looks weird!)
Does the tool that’s being developed require apps to be renamed? (IIRC it’s not possible to rename apps.)
One thing we do is: when you do a deploy, flyctl creates multiple Machines for each instance. Only one is started, but the others are prepped on different workers. If a worker goes down, fly-proxy notices and sends a signal to start a spare.
Is this not what’s actually happening? Consider the following:
Can you expand on what you mean by this? The [processes] config created multiple VMs in Apps V1 and continues to do the same (create multiple Machines) in V2.
At one point in Laravel’s history here at Fly, we had instructions on running cron inside the same VM as the web server. That’s still possible, but it isn’t related to Apps V1 vs Apps V2.
Let me know if I’m making it more confusing instead of less confusing - I’m not entirely sure what you meant there!
fly launch and fly deploy have been rewritten since then. I’m pretty confident that we don’t currently do this, and that this sort of default HA behaviour is still a WIP for the shape of Apps V2 as released.
Thanks for these instructions! I successfully migrated an app to V2 today and so far it’s working well, but I did find that services.script_checks no longer seems to work. Has support for script_checks been removed in Apps V2? I know it was never officially documented, but it was very useful.
Here’s the script_checks config I was using successfully with my V1 app:
Something that I think should be emphasized is that machines need to have a restart policy of always. It’s strange because this seems important and it’s not even documented?
At some point this was made the default, but that wasn’t the case when I migrated my apps from v1 to v2.
I had to manually do fly machines update <machine-id> for every machine of every v2 app to apply the new default setting. My PG machines didn’t need to be updated though.
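For reference, a sketch of what that per-Machine update looks like. The app name and Machine ID are placeholders, and the --restart flag assumes a reasonably recent flyctl; check fly machine update --help on your version.

```shell
# List the Machines in the app to get their IDs.
fly machine list -a my-app

# Apply an explicit restart policy to one Machine (repeat per Machine).
fly machine update 1781112f0d5678 --restart always -a my-app
```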
I’ve migrated one of our API (Node) services. It wasn’t super straightforward to get it working, but I finally did.
One thing I missed is: what’s the best process for deploying/releasing a new version? Should we run fly m update on all the machines we have running inside an app, or should we just run fly deploy as usual?
We’re running into issues with fly deploy because the machine that gets booted up runs with 256MB and 1 CPU, but we need something more (1024MB) for running our seed process. When we deploy, the machine gets destroyed by an out-of-memory (OOM) error, and we can’t adjust the machine size afterwards. I would expect the machine to boot up with the app’s default machine size, but it’s not working properly.
We finally completed our migration, so I’m adding our anecdotal experience here. Most of the issues we bumped into can be chalked up to user error on our part (we ran into the same issue with our certs as @pier did, for instance; we should have been more on top of things there). The experience was largely quick and relatively painless, and since the new app is a separate setup from the currently running one, we were able to cut over easily after validating that it was configured correctly (while keeping our main V1 instance chugging along as normal). This was great, and it’s honestly something I’d worry a bit about with the new automatic migration tool, as that peace of mind would be lost.
One thing that would be a definite help is the ability to preconfigure the preferred machine size on initial deployment, as other folks have mentioned in this and other threads. For large prod-type deploys this isn’t as big a deal, since I could manually deploy, see that it failed, scale it up, and then redeploy with more memory to get the app up and running; subsequent deploys would just get pushed on top of that already-big-enough machine.
Our use case also includes automated preview environments we spin up through a pipeline, however, and those are a lot more annoying with this strategy, as it would require us (I think) to attempt a deploy, wait for it to say “yep, out of memory alright”, swallow that error, and then scale and redeploy. (Weirdly, we’ve found that older versions of flyctl don’t seem to throw errors or time out on that initial deploy step, so we’re pinned to older ones for now, and it’s working well with the automated deploy → scale approach.) This would definitely be cleaner with the option to put a size preset or CPU/memory setting in the toml file, much like primary_region.
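For anyone else stuck on this, the deploy → scale → redeploy workaround described above looks roughly like the following. The app name and memory size are placeholders; this is a sketch of our pipeline steps, not an official recipe.

```shell
# The first deploy may OOM on the default 256MB Machine; don't let that kill the pipeline.
fly deploy -a preview-123 || true

# Scale the Machine up, then redeploy onto the bigger size.
fly scale memory 1024 -a preview-123
fly deploy -a preview-123
```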
Aside from that hiccup, the documentation assembled was very helpful, especially scripts for copying over app secrets, so thanks for adding that to the docs @catflydotio.
Overall, it was certainly a bit of work but not nearly as bad as I had worried it would be. Would recommend to anybody that is considering manually migrating and was on the fence like I was.
So, always is a good restart policy for always-on Machines. It’s not a good restart policy for Machines you stop yourself and would like to stay stopped*, or for Machines with services that exit when idle so that Fly Proxy can wake them on request; that second case is closer to the generic web app use case our default fly deploy expects to be most common.
I’m not saying we nailed the UX on this one . It will improve, on several fronts! Just thought it might be less mysterious with that info.
*In fact, I’d missed that the restart policy is only about whether to try to get a Machine up and running again before it reaches the stopped state. So if you stop a Machine yourself, it’ll stay stopped unless Fly Proxy wakes it up.