Hi all,
Anyone running a Node.js app on Fly Machines against Managed Postgres (Crunchy, PG 17.4) with the pg library and queries returning large jsonb payloads (~600 KB+ rows)?
We see intermittent hangs at modest load (3 websites sites with large traffic, “Launch” MPG at 60/200 connections, 3% CPU): roughly 10% of requests on queries returning large payloads never resolve, and the client-side timeout fires after 10s. At that moment, the TCP socket is alive, ~950 KB of bytes did arrive on it, but pg.activeQuery stays true forever (no ReadyForQuery parsed). A fresh SELECT 1 from the same pool replies in 3 ms, and PG-side the query completed normally.
Our guess: TCP chunking on Fly 6PN (smaller MTU than AWS VPC) → many more small segments per response → exposes a slow path in the pure-JS pg-protocol parser. But it’s a guess.
Anyone seen this, or know of a workaround beyond “retry on a fresh conn”? Happy to share more diagnostic details if useful.
Thanks!