fly migrate-to-v2 - postgres edition

Looks like 0.530 isn't the latest yet.

@DAlperin We are still getting the same error as before when trying to upgrade an old pg deployment on version 0.532.

@DAlperin I am still not able to migrate old PG apps to v2, any ideas?

I am still getting Error: 404: 404 page not found

app:rotator-api-pg

I also cannot restart these pg apps:

fly apps restart rotator-api-pg
Error: postgres apps should use `fly pg restart` instead

fly pg restart
Error: command is not compatible with this image

running fly version 0.549


Hello, getting the following error when running the fly migrate-to-v2 command for my Postgres app:

 2 desired, 1 placed, 0 healthy, 1 unhealthy
--> v1 failed - Failed due to unhealthy allocations and deploying as v2 

--> Troubleshooting guide at https://fly.io/docs/getting-started/troubleshooting/
failed while migrating: abort
==> (!) An error has occurred. Attempting to rollback changes...
>  Disabling readonly
failed while rolling back application: can't get role for fdaa:0:476d:a7b:5adc:1:6276:2: Get "http://fdaa:0:476d:a7b:5adc:1:6276:2:5500/commands/admin/role": connect tcp [fdaa:0:476d:a7b:5adc:1:6276:2]:5500: operation timed out
Error: abort

Then I tried a couple more times, but I kept getting:

==> Migrating db to the V2 platform
>  Upgrading postgres image
>  Setting postgres primary to readonly
failed while migrating: 404: 404 page not found

==> (!) An error has occurred. Attempting to rollback changes...
>  Disabling readonly
failed while rolling back application: 404: 404 page not found

Error: 404: 404 page not found

Running fly v0.0.549

I am still stuck with a postgres app running on v1 with a VM that is stuck in a bad state. I am not able to re-deploy, restart, or migrate. For now I have been able to scale to 2 VMs so that it's at least working, but it would be nice to get migrated to v2 and back to a 100% healthy state.

Any ideas @DAlperin ?

@stakindotcom Could you try running fly agent restart and try again?

@danwetherald

Can you confirm everything looks good on your end?

@shaun nope, nothing has changed, haven’t been directed to try anything new. Everything is still the same on my end.

@danwetherald What app are you having trouble with?

@shaun rotator-api-pg

What issues are you experiencing? It looks healthy on my end.

It’s not 100% healthy: one of the two VMs is unhealthy. More errors were shared in the other topic referenced above.

I also referenced the issues we are having migrating to V2 in this topic.

$ fly status

Instances
ID      	PROCESS	VERSION	REGION	DESIRED	STATUS           	HEALTH CHECKS     	RESTARTS	CREATED
4de836d7	app    	15     	ord   	run    	running (leader) 	3 total, 3 passing	0       	2023-05-02T21:48:38Z
19ba1e0e	app    	15     	ord   	run    	running (replica)	3 total, 3 passing	1       	2023-05-02T21:22:00Z
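
For anyone comparing output like this across threads, here is a rough sketch of a filter that lists instances whose checks are not all passing. It assumes the v1 `fly status` table format shown above ("N total, M passing" on each instance row); `unhealthy_instances` is a made-up helper name, not part of flyctl.

```shell
#!/bin/sh
# Hypothetical helper: reads `fly status` output on stdin and prints the ID
# (first column) of every instance row where "N total" is not followed by
# "N passing". Assumes the v1 table layout quoted in this thread.
unhealthy_instances() {
  awk '
    match($0, /[0-9]+ total/) {
      # Strip the trailing " total" (6 characters) to get the number.
      total = substr($0, RSTART, RLENGTH - 6)
      if (index($0, total " total, " total " passing") == 0) print $1
    }
  '
}
```

Usage would be something like `fly status -a rotator-api-pg | unhealthy_instances`. For the two rows above it prints nothing, since both show 3 of 3 passing.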

@shaun This is new as of today, then. One of the two has been failing for some time; again, you can see the errors that were happening all week in the other thread I referenced above.

At the end of the day, we want to be able to move these to V2 with fly flex, and we have no idea how to do so.

Unfortunately, your only option to move onto Flex right now is to spin up a new application that’s running Flex and import your data via the fly pg import tool.

Are there docs on this? I have never seen this tool before.

Edit: Found it, we were able to migrate by creating a new PG app and running the pg import tool.
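
For anyone else taking the import route, a minimal sketch of the steps as a dry run. All names here (`old-pg`, `new-pg`, `mydb`, the credentials) are placeholders, and the `.internal` hostname and port 5432 are assumptions about how the old cluster is reachable; check `fly pg create --help` and `fly pg import --help` for the flags your flyctl version supports.

```shell
#!/bin/sh
# Hypothetical helpers; they only print text and do not touch any app.

# Build a connection URI for the old cluster.
# $1 = user, $2 = password, $3 = old app name, $4 = database name
# Assumption: the old app is reachable at <app>.internal on port 5432.
source_uri() {
  printf 'postgres://%s:%s@%s.internal:5432/%s\n' "$1" "$2" "$3" "$4"
}

# Print (not run) the commands, so they can be reviewed before use.
# $1 = new app name, $2 = source URI
plan_import() {
  printf 'fly pg create --name %s\n' "$1"
  printf 'fly pg import "%s" --app %s\n' "$2" "$1"
}
```

For example, `plan_import new-pg "$(source_uri postgres secret old-pg mydb)"` prints the two commands to run in order. Your apps would then need to be pointed at the new cluster.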

I ran it again today and it worked!

I encountered this error when trying to migrate:

 can't get role for xxxx:xxxx:xxxx Get "http://xxxx:xxxx:xxxx /commands/admin/role": connect tcp [xxxx:xxxx:xxxx ]:5500: operation timed out

The reason this was happening, in my case, is that another dev was proxying the database at the time.

So, if you are migrating, make sure to tell everyone else with access to disconnect.

After everyone has disconnected, just do this before migrating:

fly agent restart
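
If you script the migration, a small check like the following can tell this timeout apart from other failures (such as the 404 reported earlier in this thread). It is only a sketch: `is_role_timeout` is a made-up helper name, and it matches on the error wording quoted above, which may change between flyctl versions.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given error text looks like the
# "can't get role ... operation timed out" failure quoted in this thread.
is_role_timeout() {
  case "$1" in
    *"can't get role for"*"operation timed out"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

You could capture the output of the migrate command into a variable and, if `is_role_timeout "$err"` succeeds, ask everyone to close their proxies and run `fly agent restart` before retrying.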

I had an issue where the migrate-to-v2 command on the postgres cluster timed out. It left our database in read-only state. I wrote a postmortem on it here.

In short, if your DB gets stuck in read-only mode as a result of a timeout during a migration, you will want to set it back to writable using these commands.

# connect to postgres psql
fly postgres connect -a <postgres_fly_app_name>
-- this will show "off" for the postgres DB
SHOW default_transaction_read_only;
-- connect to the production db
\connect <production_db_name>
-- this will now show "on", the root cause of the issue
SHOW default_transaction_read_only;
-- the following command only affects the current connection
SET default_transaction_read_only TO off;
-- connect back to postgres
\connect postgres
-- fix the real problem
ALTER DATABASE <production_db_name> SET default_transaction_read_only = off;
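
If you have several databases to check, here is a rough sketch that generates the statements instead of typing them by hand. `make_writable_sql` is a made-up helper name, and it assumes the database name contains no double quotes. The first statement reads the standard `pg_db_role_setting` catalog to list per-database overrides; the second clears the read-only default.

```shell
#!/bin/sh
# Hypothetical helper: prints SQL only, it does not connect to anything.
# $1 = database name (assumed to contain no double quotes)
make_writable_sql() {
  # List per-database setting overrides so the read-only flag is visible.
  printf 'SELECT datname, setconfig FROM pg_db_role_setting s JOIN pg_database d ON d.oid = s.setdatabase;\n'
  # Clear the read-only default for the given database.
  printf 'ALTER DATABASE "%s" SET default_transaction_read_only = off;\n' "$1"
}
```

Paste the output into the psql session opened with `fly postgres connect`. Note that `ALTER DATABASE ... SET` only applies to new sessions, so existing connections need to reconnect to pick up the change.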

I hope this helps someone with a failed postgres migration stuck in a read only state.

Hi,

I tried migrating two Postgres instances connected to two apps to v2. The first one worked fine, but the second one failed.

After the migration, I first got an error about “No leader found”, and after trying to restart the app and scale it up/down, I’m now getting “Error: no 6pn ips found for…”

Any ideas on what I should do next to get back up and running?

I also tried to create a new volume from some of the snapshots, but that fails with:

Error: failed to create volume: failed to create volume: EOF