VM Postgres : 'Cpu: system spent x.xxs of the last 10 seconds waiting on cpu'

Hi,
I have an app, and a Postgres app serving as its database.
I scaled the db app (shared x4) because I was having trouble with CPU.
I'm still getting this health check error:

2023-06-12T09:23:38.525 health[9080559f54dd58] cdg [error] Health check for your postgres vm has failed. Your instance has hit resource limits. Upgrading your instance / volume size or reducing your usage might help. [✗] cpu: system spent 1.46s of the last 10 seconds waiting on cpu (73.72µs)

2023-06-12T09:23:53.519 health[9080559f54dd58] cdg [info] Health check for your postgres vm is now passing.

When I look at the monitoring, there is no CPU issue…

Moreover, when I run fly status on my db app, I get a strange warning:

C:\Users\Pierre\display_geodata_api>fly status -a display-geodata-api-db
WARNING: Cluster size within your primary region "cdg" does not meet HA requirements. (expected >= 3, got 1)
ID              STATE   ROLE    REGION  CHECKS                  IMAGE                                   CREATED                 UPDATED
9080559f54dd58  started primary cdg     3 total, 3 passing      flyio/postgres-flex:15.3 (v0.0.42)      2023-06-06T15:18:44Z    2023-06-11T18:39:26Z

No doubt I'm doing (or did) something wrong, or I'm misunderstanding something.

Best

Hi @ThomasPoum

It seems that you are experiencing health check failures due to hitting resource limits on your Postgres instance. The warning message you see when running fly status indicates that your primary region “cdg” does not meet the High Availability (HA) requirements, which require at least 3 instances for redundancy.

More info:

Hello @francoab ,

Thanks for your reply and the docs about HA.

To be sure I understand the second point: 3 is a kind of magic number that ensures that, in case of failure, a second or third machine can take over. Am I right?

As for my CPU usage issue, I don't understand where I'm hitting a resource limit, since the CPU seems to be almost unused…

@ThomasPoum That’s correct! High Availability & Global Replication · Fly Docs
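
As a rough illustration of why 3 is the usual minimum, here is a small sketch of generic majority-quorum arithmetic. It is not Fly's actual failover logic; the function names are hypothetical and the only assumption is that a cluster needs a strict majority of members to agree before promoting a new primary.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return cluster_size - majority(cluster_size)


# Compare the single-node cluster from the warning with 2 and 3 nodes.
for size in (1, 2, 3):
    print(f"{size} node(s): majority={majority(size)}, "
          f"tolerated failures={tolerated_failures(size)}")
```

Under that assumption, 1 or 2 nodes tolerate no failure at all, while 3 nodes tolerate one, which matches the "expected >= 3, got 1" warning.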

I'm not sure about the CPU usage issue, though.
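
One possible reading of the health message, offered as an assumption rather than a confirmed diagnosis: its wording resembles Linux pressure stall information (PSI), which Linux exposes in /proc/pressure/cpu when PSI is enabled. PSI reports the percentage of time that tasks were runnable but waiting for a CPU, so it can be nonzero even when average utilization looks low. The sketch below parses PSI-style text and converts the 10-second average into seconds. The sample values are made up, chosen so that avg10 matches the 1.46s in the log above, and the function names are hypothetical.

```python
def parse_psi(text: str) -> dict:
    """Parse PSI-style lines such as 'some avg10=14.60 avg60=... total=...'."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=") for field in fields)
        }
    return result


def stalled_seconds(avg10_percent: float, window_seconds: float = 10.0) -> float:
    """Convert a percentage over the window into seconds spent waiting."""
    return avg10_percent / 100.0 * window_seconds


# Illustrative sample only; on a real Linux host the text would come from
# /proc/pressure/cpu (assuming PSI is enabled in the kernel).
sample = (
    "some avg10=14.60 avg60=3.20 avg300=0.80 total=123456789\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)

psi = parse_psi(sample)
waiting = stalled_seconds(psi["some"]["avg10"])
print(f"system spent {waiting:.2f}s of the last 10 seconds waiting on cpu")
```

If that reading is right, 1.46s corresponds to an avg10 of about 14.6%, which a utilization graph averaged over longer intervals could easily hide.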

Thanks @francoab, HA is much clearer now!

Let's see if someone can enlighten me about the CPU usage.

best,

