Postgres machine stuck in "stopping"

What can I do with a Postgres machine stuck in the "stopping" state? I can’t start or restart it, because the preconditions for those operations are not met.

2025-03-17 16:45:44.169	
[PC05] timed out while connecting to your instance. this indicates a problem with your app (hint: look at your logs and metrics)
2025-03-17 16:45:43.860	
[PC05] timed out while connecting to your instance. this indicates a problem with your app (hint: look at your logs and metrics)
2025-03-17 16:45:41.526	
[253203.643626] reboot: Restarting system
2025-03-17 16:45:41.525	
 WARN could not unmount /rootfs: EINVAL: Invalid argument
2025-03-17 16:45:41.492	
 INFO Umounting /dev/vdc from /data
2025-03-17 16:45:41.489	
 INFO Starting clean up.
2025-03-17 16:45:41.471	
 INFO Main child exited normally with code: 0
2025-03-17 16:45:40.727	
[PP02] could not proxy TCP data to/from instance: failed to copy (direction=client->server, op=shutdown_write, error=Transport endpoint is not connected (os error 107))
2025-03-17 16:45:40.727	
[PP02] could not proxy TCP data to/from instance: failed to copy (direction=client->server, op=shutdown_write, error=Transport endpoint is not connected (os error 107))
2025-03-17 16:45:40.727	
[PP02] could not proxy TCP data to/from instance: failed to copy (direction=client->server, op=shutdown_write, error=Transport endpoint is not connected (os error 107))


2025-03-17 16:45:40.727	
[PP02] could not proxy TCP data to/from instance: failed to copy (direction=client->server, op=shutdown_write, error=Transport endpoint is not connected (os error 107))
2025-03-17 16:45:40.727	
[PP02] could not proxy TCP data to/from instance: failed to copy (direction=client->server, op=shutdown_write, error=Transport endpoint is not connected (os error 107))
2025-03-17 16:45:40.727	
[PP02] could not proxy TCP data to/from instance: failed to copy (direction=client->server, op=shutdown_write, error=Transport endpoint is not connected (os error 107))
2025-03-17 16:45:40.727	
[PP02] could not proxy TCP data to/from instance: failed to copy (direction=client->server, op=shutdown_write, error=Transport endpoint is not connected (os error 107))
2025-03-17 16:45:40.727	
[PP02] could not proxy TCP data to/from instance: failed to copy (direction=client->server, op=shutdown_write, error=Transport endpoint is not connected (os error 107))
2025-03-17 16:45:40.726	
[PP02] could not proxy TCP data to/from instance: failed to copy (direction=client->server, op=shutdown_write, error=Transport endpoint is not connected (os error 107))
2025-03-17 16:45:40.472	
postgres | Process exited 0
2025-03-17 16:45:40.470	
postgres | 2025-03-17 15:45:40.469 UTC [653] LOG:  database system is shut down
2025-03-17 16:45:40.443	
postgres | 2025-03-17 15:45:40.443 UTC [16625] LOG:  checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.002 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=48439 kB
2025-03-17 16:45:40.441	
postgres | 2025-03-17 15:45:40.441 UTC [16625] LOG:  checkpoint starting: shutdown immediate
2025-03-17 16:45:40.441	
postgres | 2025-03-17 15:45:40.440 UTC [16625] LOG:  shutting down
2025-03-17 16:45:40.441	
postgres | 2025-03-17 15:45:40.440 UTC [16625] LOG:  checkpoint complete: wrote 3224 buffers (24.7%); 0 WAL file(s) added, 0 removed, 2 recycled; write=229.706 s, sync=0.001 s, total=229.708 s; sync files=8, longest=0.001 s, average=0.001 s; distance=41135 kB, estimate=53821 kB
2025-03-17 16:45:40.425	
repmgrd  | Process exited 0
2025-03-17 16:45:40.423	
repmgrd  | [2025-03-17 15:45:40] [INFO] repmgrd terminating...
2025-03-17 16:45:40.423	
admin    | signal: interrupt
2025-03-17 16:45:40.419	
proxy    | exit status 130
2025-03-17 16:45:40.416	
postgres | 2025-03-17 15:45:40.416 UTC [653] LOG:  background worker "logical replication launcher" (PID 16634) exited with exit code 1
2025-03-17 16:45:40.414	
postgres | 2025-03-17 15:45:40.412 UTC [16640] FATAL:  terminating connection due to administrator command
2025-03-17 16:45:40.408	
postgres | 2025-03-17 15:45:40.406 UTC [653] LOG:  aborting any active transactions
2025-03-17 16:45:40.408	
proxy    | [WARNING]  (728) : All workers exited. Exiting... (130)
2025-03-17 16:45:40.408	
proxy    | [ALERT]    (728) : Current worker (730) exited with code 130 (Interrupt)
2025-03-17 16:45:40.408	
proxy    | [WARNING]  (728) : Exiting Master process...
2025-03-17 16:45:40.408	
proxy    | [NOTICE]   (728) : path to executable is /usr/sbin/haproxy
2025-03-17 16:45:40.408	
proxy    | [NOTICE]   (728) : haproxy version is 2.8.3-1~bpo12+1
2025-03-17 16:45:40.401	
monitor  | signal: interrupt
2025-03-17 16:45:40.401	
postgres | 2025-03-17 15:45:40.352 UTC [653] LOG:  received fast shutdown request
2025-03-17 16:45:40.401	
exporter | signal: interrupt
2025-03-17 16:45:40.355	
repmgrd  | [2025-03-17 15:45:40] [NOTICE] INT signal received
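
For anyone trying to read a paste like the one above: the entries are listed newest first, with each timestamp on its own line, so the shutdown sequence is hard to follow. Here is a minimal Python sketch, assuming exactly that two-line layout, that pairs each timestamp with its message, flips the order to oldest first, and collapses consecutive repeats. The function names are made up for illustration, and `sample` is a shortened excerpt of the log, not the full text.

```python
import re
from itertools import groupby

# A timestamp line as it appears in the paste, e.g. "2025-03-17 16:45:41.471<TAB>"
TS = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}\s*$")


def pair_entries(text):
    """Pair each timestamp line with the first non-blank line that follows it."""
    entries = []
    pending = None
    for line in text.splitlines():
        if TS.match(line):
            pending = line.strip()
        elif line.strip() and pending is not None:
            entries.append((pending, line.strip()))
            pending = None
    return entries


def collapse(entries):
    """Merge consecutive entries with the same message into (first_ts, message, count)."""
    out = []
    for message, group in groupby(entries, key=lambda e: e[1]):
        group = list(group)
        out.append((group[0][0], message, len(group)))
    return out


# Shortened excerpt of the paste above (assumption: messages trimmed for brevity).
sample = (
    "2025-03-17 16:45:41.471\t\n"
    " INFO Main child exited normally with code: 0\n"
    "2025-03-17 16:45:40.727\t\n"
    "[PP02] could not proxy TCP data to/from instance\n"
    "\n"
    "2025-03-17 16:45:40.727\t\n"
    "[PP02] could not proxy TCP data to/from instance\n"
    "2025-03-17 16:45:40.472\t\n"
    "postgres | Process exited 0\n"
)

# The paste is newest first, so reverse it to read the sequence in order.
for ts, message, count in collapse(list(reversed(pair_entries(sample)))):
    suffix = f"  (x{count})" if count > 1 else ""
    print(f"{ts}  {message}{suffix}")
```

Read oldest first, the log above shows Postgres receiving a fast shutdown request, completing its checkpoint, and shutting down, followed by the main child exiting with code 0 and the system rebooting.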

It looks like several of us are having this issue. In case it helps or someone figures out a fix, this is what I reported earlier today.

I just want to get a backup so I can move my database to another provider.

Please see "Suspended database".

Thanks, but it seemed different from your issue. Neither restoring from a snapshot nor forking the volume worked. At the same time, the web part of the app was failing to deploy. In the end we concluded it had to be an internal Fly issue. At some point Postgres started without any action on our part, and since then it seems to be working correctly.


I was a bit too quick to declare a solution. I just tried to increase the RAM on the instance to 1GB to enable backups, and we’re back to being stuck, this time in “replacing”. The machine ID is 148ed127a00208, if some good soul from Fly would be willing to look into it.

There is clearly something wrong in the WAW region. I tried to create an empty Postgres app from two different accounts, and both attempts failed. The same attempt succeeded in the FRA region.

