Fly Postgres HA Cluster: do apps ever read from the standby in the primary region?

roadmr · June 20, 2024, 5:31pm

Hi there,

Connections to port 5432 are handled by an HAProxy server running on each Postgres unit and are always forwarded to a writable instance (i.e. the leader in the primary region). You can see the configuration used for this here.

Connections to port 5433 go directly to the Postgres units. So you can conceivably connect to read replicas by using port 5433, which bypasses the “redirect to primary” behavior.
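To make the port difference concrete, here is a minimal Python sketch that builds a connection URL for either path. The helper name `postgres_url` and the example credentials are made up for illustration; only the two port numbers and the `your-db-app.internal` name come from the explanation above.

```python
from urllib.parse import quote


def postgres_url(host, user, password, dbname, direct=False):
    """Build a Postgres connection URL for a Fly Postgres app.

    direct=False -> port 5432: goes through HAProxy, which forwards to
                    the writable leader in the primary region.
    direct=True  -> port 5433: talks to whichever Postgres unit the
                    host resolves to, which may be a read replica.
    """
    port = 5433 if direct else 5432
    # Percent-encode credentials so characters like '@' or ':' do not
    # break the URL.
    user_enc = quote(user, safe="")
    pass_enc = quote(password, safe="")
    return f"postgres://{user_enc}:{pass_enc}@{host}:{port}/{dbname}"


# Hypothetical credentials, for illustration only.
write_url = postgres_url("your-db-app.internal", "app", "s3cret", "appdb")
read_url = postgres_url("your-db-app.internal", "app", "s3cret", "appdb", direct=True)
```

An app would typically keep both URLs around and send only read-only queries over the port 5433 one.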

The main wrinkle is that this works great when connecting to non-primary-region replicas, but within the primary region all units look basically equal, so working out which one is a read replica is more involved.
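One way to tell the units apart, sketched here in Python and not something the Fly tooling does for you as far as I know, is to ask each unit directly: Postgres's built-in `pg_is_in_recovery()` returns true on a standby and false on the primary. The helper names and the `connect` argument are assumptions; `connect` stands for any function that takes an address and returns a DB-API style connection (for example a small wrapper around your driver's connect call, pointed at port 5433).

```python
def is_read_replica(conn):
    """Return True if this connection points at a standby (read replica)."""
    cur = conn.cursor()
    try:
        # pg_is_in_recovery() is true while the server is replaying WAL,
        # i.e. it is a standby rather than the writable primary.
        cur.execute("SELECT pg_is_in_recovery()")
        return bool(cur.fetchone()[0])
    finally:
        cur.close()


def split_by_role(addresses, connect):
    """Probe each address and return (primaries, replicas).

    `connect` is a caller-supplied function (hypothetical here) that
    opens a direct connection to one address on port 5433.
    """
    primaries, replicas = [], []
    for addr in addresses:
        conn = connect(addr)
        try:
            if is_read_replica(conn):
                replicas.append(addr)
            else:
                primaries.append(addr)
        finally:
            conn.close()
    return primaries, replicas
```

Roles can change after a failover, so a result like this is only a snapshot and would need to be refreshed periodically.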

One way to do it is to resolve the name (your-db-app.internal) and round-robin requests across all the IP addresses it returns.

Resolving the name returns a list of IP addresses, typically in the same order (by proximity / RTT). One of those is the primary. If you round-robin the requests, though, then with two units in the region you'll end up sending about 50% to the primary and the other 50% to the read-only replica within the same region.
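A rough Python sketch of the resolve-and-round-robin idea, using only the standard library. The function names are mine, and the `resolver` parameter exists only so the lookup can be swapped out for testing; by default it is `socket.getaddrinfo`.

```python
import itertools
import socket


def resolve_all(hostname, port=5433, resolver=socket.getaddrinfo):
    """Return every distinct IP address the name resolves to, in the
    order the resolver returned them."""
    infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for info in infos:
        ip = info[4][0]  # sockaddr is (ip, port, ...) for IPv4 and IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def round_robin(addresses):
    """Yield the addresses in turn, forever."""
    return itertools.cycle(addresses)


def demo():
    # Only meaningful inside the private network, where the name resolves.
    targets = round_robin(resolve_all("your-db-app.internal"))
    for _ in range(4):
        print("next read query goes to", next(targets), "port 5433")
```

Since the address list can change when units are added, removed, or restarted, it is probably worth re-resolving the name every so often instead of cycling over one lookup indefinitely.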

The fly-replay logic you want to implement is likely similar to what LiteFS uses and there’s also some Elixir code to achieve that here, in addition to the Ruby code in the global replication page.

Let me know if this helps!

Daniel