Yes, we will not start the automatic migrations until we can get this figured out.
Previously, I deployed my app to 14 regions. If I migrate to V2, do I need to set up the regions again? Also, do I need to set secrets, certificates …other settings again?
I tested it with my staging app (deployed to only one region) and Fly silently added one instance without notifying me.
No, you shouldn’t have to change any of those settings. There are a few things we don’t yet support in the migration (autoscaling is one I know of).
I wouldn’t have expected this to be the case. We’re looking into this and will let you know if we see anything.
@huytrinh with Apps V2, by default we create more than 1 instance because they are also configured to scale to zero. For apps getting migrated, we don’t automatically set the scale to zero configuration though, so to maintain the old behavior of only 1 instance you’ll need to add --ha=false to your fly deploy command.
@danwetherald Forgot to respond earlier yesterday but the issue with old image references has been fixed and you should be all clear to migrate your apps.
I ran the migration script fly migrate-to-v2 but got this error. I tried to pick fra or cdg as the default region. Could you check what this error is about?
My original setup
We also have a reverse proxy with NGINX in front of the app. I guess this migration doesn’t change the DNS.
Update: it seems there was an incident. I have now migrated the app successfully.
Hello. I can’t auto-migrate my app to v2.
My toml file:
# fly.toml file generated for platinum on 2022-12-13T14:19:08+05:00
app = "platinum"
kill_signal = "SIGINT"
kill_timeout = 5
[env]
[experimental]
allowed_public_ports = []
auto_rollback = true
cmd = []
entrypoint = []
exec = []
[processes]
worker = "python main.py"
[[services]]
http_checks = []
internal_port = 8080
processes = []
protocol = "tcp"
script_checks = []
[services.concurrency]
hard_limit = 25
soft_limit = 20
type = "connections"
[[services.ports]]
force_https = true
handlers = ["http"]
port = 80
[[services.ports]]
handlers = ["tls", "http"]
port = 443
[[services.tcp_checks]]
grace_period = "1s"
interval = "15s"
restart_limit = 0
timeout = "2s"
I receive this error:
Service specifies 'app' as one of its processes, but no processes are defined with that name; update fly.toml [processes] to add 'app' process or remove it from service's processes list
✘invalid app configuration
Error: failed to validate config for Apps V2 platform: App configuration is not valid
But I don’t understand what is required of me.
You need to link your service to your process:
[[services]]
http_checks = []
internal_port = 8080
processes = ["worker"]
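One way to read the validation error: a service with an empty processes list is treated as belonging to a process group named app, and this config only defines worker. Below is a minimal Python sketch of that check, under that assumption. The function name and the dict layout (what a TOML parser such as Python 3.11's tomllib would return for fly.toml) are illustrative; this is not flyctl's actual validation code.

```python
def undefined_service_processes(config: dict) -> list:
    """Return process names that services reference but [processes] does not define."""
    # Assumption: with no [processes] table at all, a single implicit "app" group exists.
    defined = set(config.get("processes") or {"app": ""})
    missing = []
    for service in config.get("services", []):
        # Assumption (inferred from the error message): an empty or absent
        # processes list means the service is attached to "app".
        referenced = service.get("processes") or ["app"]
        for name in referenced:
            if name not in defined and name not in missing:
                missing.append(name)
    return missing


# Parsed equivalents of the relevant parts of the fly.toml above.
BROKEN = {
    "processes": {"worker": "python main.py"},
    "services": [{"internal_port": 8080, "processes": []}],
}
FIXED = {
    "processes": {"worker": "python main.py"},
    "services": [{"internal_port": 8080, "processes": ["worker"]}],
}

print(undefined_service_processes(BROKEN))  # ['app']
print(undefined_service_processes(FIXED))   # []
```

With the original config the check reports app as undefined; once the service lists worker, nothing is reported.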
Hello,
I’m trying to migrate one of my apps to V2. All other app migrations went smoothly, but in this case I get an error I’m struggling to figure out.
The app in question has 3 processes (app, queue, schedule; it’s a Laravel app). The migrate-to-v2 command fails with server returned a non-200 status code: 504.
The schedule process seems stuck and that is blocking the migration from completing. The current situation is:
- app and queue are scaled to 0 and thus not running
- schedule is running and seemingly impossible to kill
- all machines start and get to running state just fine
- the app platform is detached, manually changing it to v2 indicates that the command fails because there are still active allocations
After some digging I’ve identified a couple of problems:
- fly scale count 0 does not set schedule to 0, it remains at 1
- fly scale count schedule=0 ends in status 504
Here’s the result for LOG_LEVEL=debug fly scale count schedule=0
DEBUG --> POST https://api.fly.io/graphql
{
"query": "mutation ($input: SetVMCountInput!) { setVmCount(input: $input) { taskGroupCounts { name count } warnings } }",
"variables": {
"input": {
"appId": "contentwrite",
"groupCounts": [
{
"group": "schedule",
"count": 0,
"maxPerRegion": null
}
]
}
}
}
DEBUG {}
DEBUG <-- 504 https://api.fly.io/graphql (1m0.14s)
DEBUG <html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>
I’ve also tried fly vm stop XXXX, which works, but then the VM just restarts and goes back to running, and I cannot find any command to just kill it.
My flyctl version is fly v0.1.39 darwin/arm64 Commit: bcddf1f7e3fbe18d19a6a071fd1a2dee998facf4-dirty BuildDate: 2023-06-20T14:46:42Z
Any ideas on how to make this work? Thanks!
Our GraphQL API (which Nomad operations go through) has been pretty unstable today. It’s still under maintenance, I believe, but I suspect once that is cleared up you should be able to scale down.
Hi! I’ve just tried again but no luck. It’s related to that VM only, I’ve just tried and I can scale everything else in the same project without errors.
Anything else I could try?
I think the API has finally stabilized. Sorry about that!! Can you give it one final try?
Still the same error unfortunately
Ugh, sorry! I definitely thought that’d be the cause. I know this has got to be frustrating, but I’m going to make sure we figure out what’s going on here.
Can you check the output of fly autoscale show? I did some digging, found an error trace that might be related to your issue, and it looks like it might be failing to scale down due to autoscaling rules conflicting with your scale-down request.
If autoscaling is enabled, run fly autoscale disable to disable it.
If it is already disabled, try enabling it and then disabling it again: fly autoscale set min=1 max=2, then fly autoscale disable.
None of this should cause straight timeouts from the API, there’s definitely a bug lurking in here, but before I tackle that I want to get you fixed up.
There’s definitely something weird going on, I’ve tried your suggestion but it didn’t work.
fly autoscale show returns Autoscaling: Disabled.
fly autoscale set min=1 max=2 fails with Error: Autoscaling settings are not supported for Machine apps, please see https://community.fly.io/t/increasing-apps-v2-availability/12357
Now, the app is not on V2 yet; here’s the result of fly status showing the platform as detached.
❯ fly status -a contentwrite
App
Name = contentwrite
Owner = contentwrite
Version = 102
Status = deployed
Hostname = contentwrite.fly.dev
Platform = detached
Instances
ID PROCESS VERSION REGION DESIRED STATUS HEALTH CHECKS RESTARTS CREATED
8558d2c6 schedule 102 fra run running 0 2023-06-21T12:22:14Z
and a simple screenshot from my dashboard, as you can see the text indicating Apps V2 next to the hostname is not showing yet.
To be fair there are also machines running that have been started successfully by fly migrate-to-v2
I’ve noticed that the app process has 3 machines, which is a bit strange. My scaling should be set to a single instance. It used to be 3 in the past, in multiple regions, but it shouldn’t be anymore.
Here’s the result of fly scale show
❯ fly scale show -a contentwrite
VM Resources for contentwrite
Count: app=0 queue=0 schedule=1
Max Per Region: app=0 queue=0 schedule=0 websocket=0
Process group app
VM Size: shared-cpu-1x
VM Memory: 256 MB
Max Per Region: 0
Process group queue
VM Size: shared-cpu-1x
VM Memory: 256 MB
Max Per Region: 0
Process group schedule
VM Size: shared-cpu-1x
VM Memory: 256 MB
Max Per Region: 0
Process group websocket
VM Size: shared-cpu-1x
VM Memory: 256 MB
Max Per Region: 0
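Since the migration reportedly refuses to finish while there are still active allocations, it can help to pull the per-group counts out of this output and list the groups that are not at zero. A small sketch follows, assuming the Count: line has the format shown in the paste above (this may differ between flyctl versions); parse_counts is a hypothetical helper, not part of flyctl.

```python
def parse_counts(output: str) -> dict:
    """Extract {process_group: count} from the 'Count:' line of `fly scale show` output."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Count:"):
            pairs = line[len("Count:"):].split()
            return {name: int(value) for name, value in (p.split("=", 1) for p in pairs)}
    return {}


# Sample copied from the output quoted above.
SAMPLE = """VM Resources for contentwrite
Count: app=0 queue=0 schedule=1
Max Per Region: app=0 queue=0 schedule=0 websocket=0
"""

counts = parse_counts(SAMPLE)
still_running = [name for name, count in counts.items() if count > 0]
print(still_running)  # ['schedule']
```

Here only schedule is left above zero, which matches the single instance listed by fly status.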
I can see that the problem with scaling between 0 and 1 instances I had earlier has been fixed. Thank you, now I no longer lose my machine settings, and no longer is a second instance created automatically. Fantastic news, thank you Fly team
I have recently received an email that the remaining apps I have on V1 nomad will be automatically migrated to V2. The only reason these last few apps were still on V1 was due to the issues when migrating apps with volumes.
I was able to migrate two of the last three apps but I have one last app left (better-cart-postgres) which returns an error when running the migration command. I am guessing this is due to an old image that is not retrievable by the migrator but not sure.
Error: 404: 404 page not found
Thanks!
Can you rerun the migration like LOG_LEVEL=debug fly migrate-to-v2? The additional context would make tracking this down much easier.
I also shared this via email:
DEBUG gqlGetInstances() result: &{[ord.better-cart-postgres.internal (fdaa:0:190:a7b:8c31:0:d966:2) ord.better-cart-postgres.internal (fdaa:0:190:a7b:8ba9:0:d988:2) iad.better-cart-postgres.internal (fdaa:0:190:a7b:ab8:0:57f8:2)] [fdaa:0:190:a7b:8c31:0:d966:2 fdaa:0:190:a7b:8ba9:0:d988:2 fdaa:0:190:a7b:ab8:0:57f8:2]}
DEBUG gqlErr: <nil> agentErr: <nil>
DEBUG flypg will connect to: http://fdaa:0:190:a7b:8c31:0:d966:2:5500
DEBUG --> GET http://fdaa:0:190:a7b:8c31:0:d966:2:5500/commands/admin/role
DEBUG <-- 404 http://fdaa:0:190:a7b:8c31:0:d966:2:5500/commands/admin/role (43.38ms)
DEBUG 404 page not found
DEBUG 404 response from /commands/admin/role endpoint. Calling legacy endpoint.
DEBUG --> GET http://fdaa:0:190:a7b:8c31:0:d966:2:5500/flycheck/role
DEBUG <-- 200 http://fdaa:0:190:a7b:8c31:0:d966:2:5500/flycheck/role (23.5ms)
DEBUG "leader"
DEBUG flypg will connect to: http://fdaa:0:190:a7b:8c31:0:d966:2:5500
DEBUG --> GET http://fdaa:0:190:a7b:8c31:0:d966:2:5500/commands/admin/settings/view
DEBUG [
"max_wal_senders",
"max_replication_slots"
]
DEBUG {0x14000ec4720}
DEBUG <-- 404 http://fdaa:0:190:a7b:8c31:0:d966:2:5500/commands/admin/settings/view (42.8ms)
DEBUG 404 page not found
DEBUG Task manager done
Error: 404: 404 page not found
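Reading the debug log: the role lookup gets a 404 from /commands/admin/role and falls back to the legacy /flycheck/role endpoint, but the settings lookup at /commands/admin/settings/view has no fallback, so its 404 ends the migration. The Python sketch below models that behaviour as it appears in the log. The function names and the fetch callable are hypothetical, and this is an interpretation of the log, not flyctl's source.

```python
from typing import Callable, Tuple

# fetch(path) -> (HTTP status, response body); a stand-in for a real HTTP client.
Fetch = Callable[[str], Tuple[int, str]]


def get_role(fetch: Fetch) -> str:
    status, body = fetch("/commands/admin/role")
    if status == 404:
        # The log shows a fallback to the legacy endpoint on 404.
        status, body = fetch("/flycheck/role")
    if status != 200:
        raise RuntimeError(f"{status}: {body}")
    return body


def view_settings(fetch: Fetch) -> str:
    # No legacy fallback appears in the log for this endpoint.
    status, body = fetch("/commands/admin/settings/view")
    if status != 200:
        raise RuntimeError(f"{status}: {body}")
    return body


def old_image(path: str) -> Tuple[int, str]:
    """Fake server that only knows the legacy role endpoint, like the image in the log."""
    responses = {"/flycheck/role": (200, "leader")}
    return responses.get(path, (404, "404 page not found"))


print(get_role(old_image))  # leader
try:
    view_settings(old_image)
except RuntimeError as err:
    print(f"Error: {err}")  # Error: 404: 404 page not found
```

This is consistent with the suggestion to update the image first: a newer image would presumably serve the /commands/admin endpoints directly.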
Try fly image update before starting the migration. It should update the postgres image to a version that has the expected endpoints.