I started to see strange behavior on one of the pages. Just one page out of the whole project could not be loaded, and a 502 HTTP error was thrown. I checked the logs on the VM and found a “resource limit reached” error, so I tried to scale up memory for that machine (flyctl scale memory) … then everything went sideways, and my VM went into an infinite loop with these error messages:
2025-11-20T06:06:15.523 app[78432debe79658] sin [info] proxy | exit status 1
2025-11-20T06:06:15.523 app[78432debe79658] sin [info] reader error: read ptm: input/output error
2025-11-20T06:06:15.524 app[78432debe79658] sin [info] proxy | restarting in 1s [attempt 283]
2025-11-20T06:06:16.063 app[6830311bd62e18] sin [info] monitor | Voting member(s): 2, Active: 2, Inactive: 0, Conflicts: 0
2025-11-20T06:06:16.201 app[6830311bd62e18] sin [info] proxy | Running...
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [NOTICE] (1949) : haproxy version is 2.8.5-1ubuntu3.4
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [NOTICE] (1949) : path to executable is /usr/sbin/haproxy
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [ALERT] (1949) : config : parsing [/fly/haproxy.cfg:38] : backend 'bk_db', another server named 'pg1' was already defined at line 37, please use distinct names.
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [ALERT] (1949) : config : parsing [/fly/haproxy.cfg:38] : backend 'bk_db', another server named 'pg2' was already defined at line 37, please use distinct names.
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [ALERT] (1949) : config : parsing [/fly/haproxy.cfg:38] : backend 'bk_db', another server named 'pg3' was already defined at line 37, please use distinct names.
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [ALERT] (1949) : config : parsing [/fly/haproxy.cfg:38] : backend 'bk_db', another server named 'pg4' was already defined at line 37, please use distinct names.
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [ALERT] (1949) : config : parsing [/fly/haproxy.cfg:38] : backend 'bk_db', another server named 'pg5' was already defined at line 37, please use distinct names.
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [ALERT] (1949) : config : parsing [/fly/haproxy.cfg:38] : backend 'bk_db', another server named 'pg6' was already defined at line 37, please use distinct names.
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [ALERT] (1949) : config : parsing [/fly/haproxy.cfg:38] : backend 'bk_db', another server named 'pg7' was already defined at line 37, please use distinct names.
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [ALERT] (1949) : config : parsing [/fly/haproxy.cfg:38] : backend 'bk_db', another server named 'pg8' was already defined at line 37, please use distinct names.
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [ALERT] (1949) : config : parsing [/fly/haproxy.cfg:38] : backend 'bk_db', another server named 'pg9' was already defined at line 37, please use distinct names.
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [ALERT] (1949) : config : parsing [/fly/haproxy.cfg:38] : backend 'bk_db', another server named 'pg10' was already defined at line 37, please use distinct names.
2025-11-20T06:06:16.254 app[6830311bd62e18] sin [info] proxy | [ALERT] (1949) : config : Fatal errors found in configuration.
2025-11-20T06:06:16.256 app[6830311bd62e18] sin [info] repmgrd | [2025-11-20 06:06:16] [INFO] monitoring primary node "6830311bd62e18" (ID: 377450399) in normal state
2025-11-20T06:06:16.258 app[6830311bd62e18] sin [info] proxy | exit status 1
All PG machines are up and running and all health checks pass, but I can’t reach any page and a 502 is thrown. Can anyone suggest where to look, please?
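For context, the scale-up I ran was along these lines (the app name here is a placeholder):

```shell
# Add RAM to the legacy Postgres app's VMs; 2048 MB is just an example value
fly scale memory 2048 -a my-legacy-pg
```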
Hi again… Managed Postgres is available in Singapore now, so that would be my main recommendation. All these scaling, poking around, and snapshot recovery headaches will go away!
There’s a separate procedure for scaling Legacy Postgres, so I’m not entirely surprised that the above caused problems.
Legacy Postgres was really only intended for people who were expert database administrators already—and just wanted to save a little time on typing, etc. The puzzles and emergencies will only get worse as the months wear on and all this gets more and more deprecated…
I understand… and I need to explore this possibility further. Unfortunately I have another project running and had no time to do proper maintenance on this one.
What can be done to restore the project? Right now it is all down, as the server is overwhelmed with those errors and can’t accept any more requests.
I managed to create a new PG cluster using a volume image. After altering the old app’s DATABASE_URL to point to the new cluster, I got all the data back.
My initial problem remains, though … one page does not want to load! I don’t see anything in the machine logs for either PG or the app.
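For anyone hitting the same thing, the recovery was roughly this (names, IDs, and the connection string below are placeholders, not real values):

```shell
# List snapshots of the old cluster's volume
fly volumes snapshots list vol_xxxxxxxx

# Create a fresh legacy Postgres cluster from one of those snapshots
fly postgres create --name new-pg-app --region sin --snapshot-id vs_xxxxxxxx

# Point the web app at the new cluster
fly secrets set DATABASE_URL="postgres://user:pass@new-pg-app.flycast:5432/mydb" -a my-web-app
```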
Logs from my app
2025-11-24T20:31:41.363 app[1781345a465e58] arn [info] [ 440.533086] reboot: Restarting system
2025-11-24T20:34:25.773 app[1781034b441038] fra [info] [2025-11-24 20:34:25 +0000] [650] [CRITICAL] WORKER TIMEOUT (pid:668)
2025-11-24T20:34:25.774 app[1781034b441038] fra [info] [2025-11-24 20:34:25 +0000] [668] [INFO] Worker exiting (pid: 668)
2025-11-24T20:34:25.774 proxy[1781034b441038] fra [error] [PU02] could not complete HTTP request to instance: connection closed before message completed
2025-11-24T20:34:25.951 app[1781034b441038] fra [info] [2025-11-24 20:34:25 +0000] [669] [INFO] Booting worker with pid: 669
Logs from PG
2025-11-24T20:17:38.239 app[287454ea4e0418] sin [info] repmgrd | [2025-11-24 20:17:38] [INFO] monitoring primary node "287454ea4e0418" (ID: 377450399) in normal state
2025-11-24T20:22:31.191 app[287454ea4e0418] sin [info] monitor | Voting member(s): 3, Active: 3, Inactive: 0, Conflicts: 0
2025-11-24T20:22:40.189 app[287454ea4e0418] sin [info] repmgrd | [2025-11-24 20:22:40] [INFO] monitoring primary node "287454ea4e0418" (ID: 377450399) in normal state
2025-11-24T20:27:26.879 app[287454ea4e0418] sin [info] postgres | 2025-11-24 20:27:26.878 UTC [708] LOG: checkpoint starting: time
2025-11-24T20:27:27.384 app[287454ea4e0418] sin [info] postgres | 2025-11-24 20:27:27.384 UTC [708] LOG: checkpoint complete: wrote 6 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.503 s, sync=0.001 s, total=0.506 s; sync files=6, longest=0.001 s, average=0.001 s; distance=13 kB, estimate=13 kB; lsn=0/13259240, redo lsn=0/13259208
2025-11-24T20:27:31.142 app[287454ea4e0418] sin [info] monitor | Voting member(s): 3, Active: 3, Inactive: 0, Conflicts: 0
2025-11-24T20:27:42.190 app[287454ea4e0418] sin [info] repmgrd | [2025-11-24 20:27:42] [INFO] monitoring primary node "287454ea4e0418" (ID: 377450399) in normal state
2025-11-24T20:32:31.125 app[287454ea4e0418] sin [info] monitor | Voting member(s): 3, Active: 3, Inactive: 0, Conflicts: 0
2025-11-24T20:32:44.201 app[287454ea4e0418] sin [info] repmgrd | [2025-11-24 20:32:44] [INFO] monitoring primary node "287454ea4e0418" (ID: 377450399) in normal state
2025-11-24T20:37:31.185 app[287454ea4e0418] sin [info] monitor | Voting member(s): 3, Active: 3, Inactive: 0, Conflicts: 0
2025-11-24T20:37:44.318 app[287454ea4e0418] sin [info] repmgrd | [2025-11-24 20:37:44] [INFO] monitoring primary node "287454ea4e0418" (ID: 377450399) in normal state
Hm… You do have an error in the web-app logs there…
This Machine is in Germany, but the database is way over in Singapore.
Long-distance Postgres is something that is really best avoided, when you can. What is the full list of regions that your web app has Machines in?
Also, you can try increasing the timeout on that worker. Possibly you’re doing something computationally intensive that just won’t fit into the framework’s default time slice.
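Assuming the app is served by Gunicorn (the WORKER TIMEOUT lines in the app logs look like its output), the timeout can be raised in a gunicorn.conf.py; a minimal sketch, with illustrative values:

```python
# gunicorn.conf.py -- illustrative values, not a recommendation
# Gunicorn kills and restarts any sync worker that stays silent for `timeout`
# seconds (default 30), which produces exactly the WORKER TIMEOUT message above.
timeout = 120           # give slow views (or long-distance DB round trips) more room
graceful_timeout = 30   # seconds to finish in-flight requests on restart
workers = 2             # unchanged; shown only for context
```

The same thing works on the command line, e.g. `gunicorn --timeout 120 app:app`. Raising the timeout only hides the symptom, though; if the slowness is the Germany-to-Singapore round trips, moving the app and database into the same region is the real fix.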