Fly Postgres won't start after config change

alechartung · October 16, 2024, 10:31pm

I tried updating the config like this:

$ fly postgres config update \
  --max-connections 300 \
  --shared-buffers 524288 \
  --maintenance-work-mem 131072 \
  --work-mem 893952 \
  --app foobar
NAME                  VALUE  TARGET VALUE  RESTART REQUIRED
shared-buffers        65536  524288        true
work-mem              4096   893952        false
maintenance-work-mem  65536  131072        false
...

The idea was to get it matching this config:

shared_buffers = 512MB
maintenance_work_mem = 128MB
work_mem = 873kB

But it’s failing the health checks now and not coming back up:

 failed post-init: failed to establish connection to local node: failed to connect to `host=fdaa:a:87f5:a7b:a:87cd:572a:2 user=postgres database=postgres`: dial error (dial tcp [fdaa:a:87f5:a7b:a:87cd:572a:2]:5433: connect: connection refused). Retrying...

Now that I took another look, I totally messed up work_mem when converting units.
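For reference, the conversion can be checked with a little shell arithmetic. This is only a sketch: it assumes the work-mem and maintenance-work-mem flags take kilobytes, which is what the current values in the table above suggest (4096 and 65536 match the PostgreSQL defaults of 4MB and 64MB). to_kb is a hypothetical helper, not part of flyctl.

```shell
#!/bin/sh
# Hypothetical helper (not part of flyctl): convert a PostgreSQL-style
# size such as "873kB", "128MB" or "1GB" to a plain number of kilobytes.
to_kb() {
  case "$1" in
    *kB) echo "${1%kB}" ;;
    *MB) echo $(( ${1%MB} * 1024 )) ;;
    *GB) echo $(( ${1%GB} * 1024 * 1024 )) ;;
    *)   echo "unsupported unit: $1" >&2; return 1 ;;
  esac
}

to_kb 873kB                 # 873    -> the intended work_mem, in kB
to_kb 128MB                 # 131072 -> matches the maintenance-work-mem value passed
echo $(( 893952 / 1024 ))   # 873    -> 893952 is 873kB expressed in bytes
```

Under that assumption, 893952 read as kilobytes comes to roughly 873MB of work_mem rather than 873kB, so the value to pass would have been 873.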

Is there anything I can do to recover from this, or am I stuck deleting the app/machine and recreating from scratch?

If I’m stuck deleting it this time, is there anything I can do in the future to make recovering from something similar easier?

I tried reverting the config, but now I’m getting this: Error: no active leader found

mayailurus · October 16, 2024, 10:46pm

Hi… Don’t panic; at worst you’ll just have to restore from a day-old snapshot.

How about temporarily scaling up the machine to match that amount of memory?

https://fly.io/docs/postgres/managing/scaling/

You’ll only pay for the extra memory while the machine is actually running…

Aside: It is generally advisable to periodically do your own pg_dump backups, though.
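A minimal sketch of such a backup, under a few assumptions: the database app is called foobar as in the post above, fly proxy is already forwarding it to local port 15432, and PGPASSWORD holds the postgres password. The backup_name and run_backup helpers, the port, and the database name are all illustrative.

```shell
#!/bin/sh
# Hypothetical helper: build a dated dump filename, e.g. foobar-2024-10-16.dump.
# $1 = app name, $2 = date (optional, defaults to today in UTC).
backup_name() {
  echo "$1-${2:-$(date -u +%Y-%m-%d)}.dump"
}

# Sketch of the backup itself. Assumes this is running in another terminal:
#   fly proxy 15432:5432 --app foobar
# and that PGPASSWORD is set for the postgres user.
# $1 = app name (used for the filename), $2 = database to dump.
run_backup() {
  pg_dump --host=localhost --port=15432 --username=postgres \
    --format=custom --file="$(backup_name "$1")" "$2"
}

# Usage (both names are placeholders):
# run_backup foobar foobar
```

A custom-format dump like this can be restored with pg_restore, so it does not depend on the snapshot mechanism at all.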

alechartung · October 16, 2024, 11:23pm

Right now I’m evaluating moving a DigitalOcean workload over, so I’m mostly figuring out (and breaking) things as I go.

Was able to make a new database using a snapshot:

$ fly postgres create --snapshot-id foobar --image-ref flyio/postgres-flex-timescaledb:16

Swapping the attachment was a bit wonky, but worked.

$ fly postgres attach foobar2 --app foobar
Checking for existing attachments
Error: consumer app "foobar" already contains a secret named DATABASE_URL
$ fly postgres detach foobar1 --app foobar
Error: no active leader found
$ fly secrets unset DATABASE_URL --app foobar
Updating existing machines in 'foobar' with rolling strategy
$ fly postgres attach foobar2 --app foobar
Checking for existing attachments
? Database "foobar" already exists. Continue with the attachment process? Yes
Error: database user "foobar" already exists. Please specify a new database user via --database-user
# manually `DROP USER foobar;`, then retry
$ fly postgres attach foobar2 --app foobar
# success

While that was happening, my application hit its max restarts and stopped. Pressing start in the web UI didn’t bring it back up, but fly machine restart did, and it came up with the new database config.

Mostly documented for my future reference.

Everything’s running now - thanks!

system · October 23, 2024, 11:24pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
