Migrating to v2 broke my app 😢

After receiving the email about the upcoming migration, I figured I’d run fly migrate-to-v2 myself, just to be sure it ran properly. Unfortunately, the migration failed, but the tool seems to believe it succeeded. I chose to enter troubleshooting mode, not realising that declining it was my only chance to roll back to a working version.

% fly migrate-to-v2 -c ./fly.toml 
This migration process will do the following, in order:
 * Lock your application, preventing changes during the migration
 * Remove legacy VMs 
   * Remove 1 alloc
   * NOTE: Because your app uses volumes, there will be a short downtime during migration while your machines start up.
 * Create clones of each volume in use, for the new machines
   * These cloned volumes will have the same name but different id
   * Please note that your old volumes will not be removed.
     (you can do this manually, after making sure the migration was a success)
 * Create machines, copying the configuration of each existing VM
   * Create 1 "app" machine
 * Set the application platform version to "machines"
 * Unlock your application
 * Overwrite the config file at '/Users/.../fly.toml'
? Would you like to continue? Yes
==> Migrating tidyweather-api to the V2 platform
>  Locking app to prevent changes during the migration
>  Making snapshots of volumes for the new machines
>  Scaling down to zero VMs. This will cause temporary downtime until new VMs come up.
>  Enabling machine creation on app
>  Creating an app release to register this migration
>  Starting machines
INFO Using wait timeout: 5m0s lease timeout: 13s delay between lease refreshes: 4s

Updating existing machines in 'tidyweather-api' with rolling strategy
  [1/1] Waiting for 32874792fd1085 [app] to become healthy: 0/1

WARNING The app is not listening on the expected address and will not be reachable by fly-proxy.
You can fix this by configuring your app to listen on the following addresses:
  - 0.0.0.0:8080
Found these processes inside the machine with open listening sockets:
  PROCESS                                         | ADDRESSES                            
--------------------------------------------------*--------------------------------------
  litefs mount -- tidyweather-api -dsn /litefs/db | [::]:20202                           
  /.fly/hallpass                                  | [fdaa:1:3065:a7b:b2:e94c:755d:2]:22  

failed while migrating: timeout reached waiting for healthchecks to pass for machine 32874792fd1085 failed to get VM 32874792fd1085: Get "https://api.machines.dev/v1/apps/tidyweather-api/machines/32874792fd1085": net/http: request canceled
note: you can change this timeout with the --wait-timeout flag
? Would you like to enter interactive troubleshooting mode? If not, the migration will be rolled back. Yes

Oops! We ran into issues migrating your app.
We're constantly working to improve the migration and squash bugs, but for
now please let this troubleshooting wizard guide you down a yellow brick road
of potential solutions...
               ,,,,,
       ,,.,,,,,,,,, .
   .,,,,,,,
  ,,,,,,,,,.,,
     ,,,,,,,,,,,,,,,,,,,
         ,,,,,,,,,,,,,,,,,,,,
            ,,,,,,,,,,,,,,,,,,,,,
           ,,,,,,,,,,,,,,,,,,,,,,,
        ,,,,,,,,,,,,,,,,,,,,,,,,,,,,.
   , ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

The app's platform version is 'detached'
This means that the app is stuck in a half-migrated state, and wasn't able to
be fully recovered during the migration error rollback process.

Fixing this depends on how far the app got in the migration process.
Please use these tools to troubleshoot and attempt to repair the app.
No legacy Nomad VMs found. Setting platform version to machines/Apps V2.

% fly migrate-to-v2 troubleshoot
No issues detected.

The only actionable advice in this output seems to be that my app needs to listen on port 8080, which it already does.

The monitoring page shows this error over and over:

cannot acquire lease or find primary, retrying: fetch primary url: Get "http://127.0.0.1:8500/v1/kv/litefs/tidyweather-api": dial tcp 127.0.0.1:8500: connect: connection refused

I haven’t had any luck figuring out what’s gone wrong. I’d appreciate any suggestions for where I can look further. 🙂

Hi @pento

I can see that your app is no longer in the detached state; its platform version is now machines.

The log line you posted suggests you’re having trouble connecting to Consul. Ben Johnson helped someone with a similar issue; try fly consul attach. For reference:

Also keep an eye on the logs after that to see if you can spot any other issues that might appear.
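
For anyone who lands here with the same log line, here is a minimal sketch of how the symptom could be recognised before trying the fix. It assumes the error text matches the one posted above (LiteFS trying to reach Consul on the local default address 127.0.0.1:8500 and being refused). The helper name looks_like_missing_consul is hypothetical and not part of flyctl. fly consul attach is the command suggested above, and fly logs is the usual flyctl command for watching logs; the sketch only prints them so they can be reviewed before running.

```shell
# Hypothetical helper (not part of flyctl): succeeds when a log line looks like
# LiteFS failing to reach Consul on the local default address.
looks_like_missing_consul() {
  case "$1" in
    *"cannot acquire lease or find primary"*"127.0.0.1:8500"*"connection refused"*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# The log line quoted earlier in this thread.
line='cannot acquire lease or find primary, retrying: fetch primary url: Get "http://127.0.0.1:8500/v1/kv/litefs/tidyweather-api": dial tcp 127.0.0.1:8500: connect: connection refused'

if looks_like_missing_consul "$line"; then
  # Print the suggested commands rather than executing them.
  echo "fly consul attach"
  echo "fly logs"
fi
```

If the check matches, running the two printed commands from the app directory (or with -a and the app name) is the path that was suggested in this thread. Any other error text may have a different cause.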


That did it, thank you for your help, @lubien!


I’m glad it’s fixed! Thanks for following up.

