Need Help With Postgresql

I have a development PostgreSQL instance that has been running for a few weeks. Over the weekend, however, it looks like it became unavailable.

Instances
ID       PROCESS VERSION REGION DESIRED STATUS                 HEALTH CHECKS       RESTARTS CREATED
afe2b24f app     0       sjc    run     running (failed to co) 3 total, 3 critical 1        2022-01-22T03:03:10Z

Running fly logs also showed a "No space left on device" error:

2022-02-07T15:45:43.989 app[afe2b24f] sjc [info]keeper | 2022-02-07 15:45:43.989 UTC [825] PANIC: could not write to file "pg_wal/xlogtemp.825": No space left on device

I was monitoring my DB size, and hoping to prevent banging against any limits, but I guess I failed.

Just wondering what my options are at this point?

It looks like the volume size was set to 1 GB by default. Oops. I had assumed it was 3 GB for the free test environment. Would it be possible for me to increase the size to 2 GB or 3 GB, so I can at least recover the data?

Thanks!
Jae

If your data isn’t taking up the space, sounds like it could be due to log files accumulating:

If you feel comfortable resizing it yourself, and since it's development data that isn't critical, this is how they suggest resizing the volume (up, rather than down). So it sounds fixable:

Or you may want someone from Fly to handle that.

… but maybe hold off doing any changes until this is resolved:

Cool. Thanks for the links. I'm pretty sure it just ran out of storage space. I did not realize only 1 GB was allocated. Oh well. It def would be nice to get a warning before things break. :wink:

I am just evaluating and testing things out. Thus, no harm done. I guess for any others, just be careful about hitting storage limits!


This is something we should automate, but you can get your DB healthy again by clearing out WAL files.

  1. fly ssh console
  2. Run pg_controldata -D /data/postgres/, then look for a line like this:
    Latest checkpoint's REDO WAL file:    00000001000000020000002F
    
  3. Then run
    pg_archivecleanup /data/postgres/pg_wal 00000001000000020000002F
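
If you would rather not copy the file name by hand, the steps above can be sketched as a small shell helper to paste into the SSH session. This is only a sketch: the function names are made up for illustration, and the /data/postgres path is taken from the commands above. It uses the -n (dry run) flag of pg_archivecleanup, so it only lists the WAL files that would be removed; rerun the command without -n once the list looks right.

```shell
# Read `pg_controldata` output on stdin and print the REDO WAL file name.
# (redo_wal_file is a hypothetical helper name, not part of PostgreSQL.)
redo_wal_file() {
  awk '/REDO WAL file:/ { print $NF }'
}

# List the WAL segments older than the latest checkpoint's REDO WAL file.
# The -n flag makes pg_archivecleanup print the files instead of deleting them.
list_removable_wal() {
  data_dir="${1:-/data/postgres}"   # data directory used in the steps above
  wal="$(pg_controldata -D "$data_dir" | redo_wal_file)"
  if [ -z "$wal" ]; then
    echo "REDO WAL file not found in pg_controldata output" >&2
    return 1
  fi
  pg_archivecleanup -n "$data_dir/pg_wal" "$wal"
}
```

Calling list_removable_wal with no arguments checks /data/postgres; pass a different data directory as the first argument if yours differs.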
    

Once it’s healthy again, you can add a 3GB volume, temporarily scale to 2 instances, then remove the old volume and scale back to 1 instance.
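
As a rough sketch of that sequence, the helper below only prints the commands for review and does not run anything. Several things here are assumptions rather than details from this thread: the function name, the app name and volume ID you pass in, and the volume name pg_data. The delete subcommand may also be named differently in your flyctl version, so check fly volumes --help before running any of it.

```shell
# Print the volume swap steps for review; nothing is executed.
# Usage: volume_swap_plan APP REGION OLD_VOLUME_ID
# APP and OLD_VOLUME_ID are placeholders (look up the ID with: fly volumes list -a APP).
# pg_data is an assumed volume name; use whatever name your existing volume has.
volume_swap_plan() {
  app="$1"
  region="$2"
  old_vol="$3"
  cat <<EOF
fly volumes create pg_data --region $region --size 3 -a $app
fly scale count 2 -a $app
# wait until the new instance is healthy and caught up (fly status -a $app)
fly volumes delete $old_vol
fly scale count 1 -a $app
EOF
}
```

For example, volume_swap_plan my-pg-app sjc vol_123 prints the four fly commands in order, with a reminder to wait for the new instance before deleting the old volume.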