Hi,
We are again facing issues deploying to the MAA region. The worst part is that when the deployment failed, it did not roll back to the previous stable version, which again caused our app to stop. It's been failing for the last 30 minutes now.
202v96 failed - Failed due to unhealthy allocations - not rolling back to stable job version 96 as current job has same specification
203***v96 failed - Failed due to unhealthy allocations - not rolling back to stable job version 96 as current job has same specification and deploying as v97
That shouldn't take your app offline; it should just fail the current deploy. It looks like your app has one volume and is running a single instance. You'll need to add a second volume and run `fly scale count 2` to prevent downtime during deploys.
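Something like this, assuming your existing volume is named "data" (the name, size, and region below are examples, so match them to what `fly volumes list` shows for your app):

```sh
# Sketch: add a second volume with the same name as the existing one, in MAA,
# so a second VM can be placed there, then scale to two instances.
fly volumes list                                # check the name/region/size of the current volume
fly volumes create data --region maa --size 10  # "data" and the 10 GB size are examples
fly scale count 2                               # one VM per volume; deploys can then roll one VM at a time
```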
I'm getting your app running again now; give it a few minutes.
Yes, in the same region. They do not sync, though. Our volumes are meant to be run in pairs for better uptime. Sync is left to the application, however. What are you storing on the volume?
You’re likely better off using a postgres cluster for storing data. That handles all the high availability / sync / failover for you. You can run other clustered database-like software if you’d rather.
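If you go that route, it's roughly the following (the app and database names are placeholders, and the exact attach flags depend on your flyctl version, so check `fly postgres attach --help`):

```sh
# Sketch: create a Postgres cluster near your users and attach it to the app,
# which sets a DATABASE_URL secret on the app. Names are placeholders.
fly postgres create --name my-app-db --region maa
fly postgres attach my-app-db --app my-app   # argument/flag form varies by flyctl version
```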
Again, would scale count 2 in one region be better, or one instance in two regions? I would like to be in MAA as much as possible due to latency; our primary customers are in India.
Can you please expand on this? The docs mention there's some kind of "anchoring" between scale count and volumes, and I can sort of see how volumes would affect deployments, but I'm not sure I understand it well enough.
Is my understanding right that if my setup is a single instance with a single volume, then new deploys always incur downtime until that volume can be detached from the current instance and attached to the new one? I'd expect a new volume to be attached to the newly deployed instance instead, though…
Yes, that's correct. If you have one volume, the VM using it has to stop before a new one can start. If you have two volumes, this can happen with no downtime.
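To tie it back to the "anchoring" wording in the docs: the volume is declared in fly.toml, and a VM can only be placed where a volume with that name exists. Roughly like this (the volume name "data" is just an example):

```sh
# The "anchoring" comes from the [mounts] section of fly.toml, which looks
# something like this ("data" is an example volume name):
#
#   [mounts]
#     source = "data"        # the VM can only be placed where a volume named "data" exists
#     destination = "/data"  # mount path inside the VM
#
# With a single "data" volume there is exactly one place a VM can run, so the
# old VM has to stop before the new one starts. Each extra volume adds a slot:
fly volumes list
```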
You don't need to use volumes to "anchor" things to regions like this anymore. We now support `fly scale count 2 --max-per-region 1`, which should let you keep two VMs in two different regions full time.
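For the two-region option, it would be roughly the following ("sin" is just an example second region, and if you keep volumes you'd also need one volume in each region):

```sh
# Sketch: allow a second region in the pool, then keep one VM in each.
fly regions add sin                     # "sin" is only an example; pick whatever is closest to your users
fly scale count 2 --max-per-region 1    # two VMs total, at most one per region
```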