This started happening for us around April 22.
Summary
Fly Proxy intermittently returns an empty 400 Bad Request before the request reaches our app.
This reproduces on both our prod and staging Fly apps when Node fetch (undici) sends requests in rapid succession with Connection: close. The same requests succeed with keep-alive, or with a 5-second delay between requests.
Apps
prod app: findgood-work-f4fb
staging app: findgood-work-f4fb-staging
Endpoint
GET /hq/service-titan-proxy/jpm/v2/tenant/1053411235/jobs/316870122/notes?page=1&pageSize=5
Prod host:
https://hq.hepisontheway.com
Staging host:
https://hep.staging-findgood.work
Client Conditions
Fails with Node fetch/undici when sending:
Connection: close
flyio-debug: doit
Authorization: Bearer <token>
Accept: application/json
Does not fail with:
Connection: keep-alive / default Node fetch pooling
Also does not fail with Connection: close when a 5-second delay is added between requests.
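For reference, the client conditions above can be sketched as a small Node loop. This is a minimal illustration, not our exact test script: runAttempts, fetchImpl and the option names are hypothetical, and it assumes a Node version with a global fetch that passes the Connection header through.

```javascript
// Sketch of the repro loop. runAttempts, fetchImpl and the option names are
// illustrative; they are not taken from the original test script.
async function runAttempts({
  url,
  token,
  attempts = 50,
  delayMs = 0,               // 5000 corresponds to the DELAY_MS=5000 run
  connection = 'close',      // 'keep-alive' for the non-failing comparison
  fetchImpl = globalThis.fetch,
}) {
  const results = [];
  for (let i = 0; i < attempts; i++) {
    const started = Date.now();
    const res = await fetchImpl(url, {
      method: 'GET',
      headers: {
        Connection: connection,
        'flyio-debug': 'doit',
        Authorization: `Bearer ${token}`,
        Accept: 'application/json',
      },
    });
    // Read the whole body so every attempt is fully consumed before the next.
    const body = await res.arrayBuffer();
    results.push({
      status: res.status,
      bodyLength: body.byteLength,
      durationMs: Date.now() - started,
    });
    // Optional pause between attempts (skipped after the last one).
    if (delayMs > 0 && i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return results;
}
```

Calling runAttempts({ url, token }) gives the rapid close-style case; passing delayMs: 5000 or connection: 'keep-alive' gives the two comparison runs.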
Observed Staging Result
50 rapid attempts against staging, strictly alternating between 200 and 400 (25 of each):
200, 400, 200, 400, 200, 400, 200, 400, 200, 400,
200, 400, 200, 400, 200, 400, 200, 400, 200, 400,
200, 400, 200, 400, 200, 400, 200, 400, 200, 400,
200, 400, 200, 400, 200, 400, 200, 400, 200, 400,
200, 400, 200, 400, 200, 400, 200, 400, 200, 400
Same staging test with DELAY_MS=5000:
200 x 12
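A small helper along these lines can tally a run and check whether the statuses strictly alternate, as in the 50-attempt result above. summarizeStatuses is a hypothetical name used only for illustration.

```javascript
// Tally status codes and report whether the run strictly alternates
// between exactly two statuses (for example 200, 400, 200, 400, ...).
function summarizeStatuses(statuses) {
  const counts = {};
  for (const s of statuses) {
    counts[s] = (counts[s] || 0) + 1;
  }
  const noRepeats = statuses.every((s, i) => i === 0 || s !== statuses[i - 1]);
  const alternating = Object.keys(counts).length === 2 && noRepeats;
  return { counts, alternating };
}
```

Applied to the two runs above, the rapid run should report 25 of each status with alternating set to true, and the delayed run should report only 200s with alternating set to false.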
400 Response
status: 400
statusText: Bad Request
bodyLength: 0
content-type: null
server: Fly/9f7e98291c (2026-04-30)
via: 1.1 fly.io
duration: ~90-130ms
200 Response
status: 200
bodyLength: 4429
content-type: application/json; charset=utf-8
duration: ~250-600ms
Staging Failing Request IDs
01KQT0R3N8A7CHKKBRVRNK53Q1-dfw
01KQT0R453YQ70V7BZ3RRFRWRB-dfw
01KQT0R4P9TABQ9TASE3V7QRZ6-dfw
01KQT0R565W6F0VT1Y68V5DQFJ-dfw
01KQT0R5KRCFYNVPWTR62333SR-dfw
01KQT0RDYKY8DGREJ248TJTF7H-dfw
Prod Failing Request IDs
01KQT0AD4DSSJY750QMK085FRD-dfw
01KQT0ADFXETNW9WFZV3FPT5RA-dfw
01KQT0ADTRW7Z3H827ZDZ354P5-dfw
01KQT0AEE25S4JR10V250EGVM1-dfw
01KQT0AEGS143QZG9T018FF9C2-dfw
01KQT0AEWB607MD0HZ4EB03CNT-dfw
01KQT0AHHGFJSJNMGZ7M8E8DG6-dfw
01KQT0AJDCSAZ5RCVG8EBQQX7C-dfw
Example Staging flyio-debug for Failing 400
{
"n": "edge-cf-dfw1-2432",
"nr": "dfw",
"ra": "162.81.188.54",
"rf": "Verbatim",
"sr": "dfw",
"sdc": "dfw1",
"sid": "32872dd3f70438",
"st": 0,
"nrtt": 0,
"bn": "worker-lsh-dfw1-f574",
"mhn": null,
"mrtt": null
}
Example Prod flyio-debug for Failing 400
{
"n": "edge-cf-dfw1-9a88",
"nr": "dfw",
"ra": "162.81.188.54",
"rf": "Verbatim",
"sr": "dfw",
"sdc": "dfw1",
"sid": "2872470a143078",
"st": 0,
"nrtt": 0,
"bn": "worker-dp-dfw1-fe80",
"mhn": null,
"mrtt": null
}
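The request IDs and debug objects above can be collected per response with something like the sketch below. It assumes the debug JSON comes back in a flyio-debug response header and the request ID in a fly-request-id response header; both header names, and the function name extractFlyDiagnostics, are assumptions here rather than confirmed Fly behavior.

```javascript
// Pull Fly-related diagnostics off a fetch Response-like object.
// Assumed header names: 'flyio-debug' (JSON) and 'fly-request-id'.
function extractFlyDiagnostics(res) {
  const raw = res.headers.get('flyio-debug');
  let debug = null;
  try {
    debug = raw ? JSON.parse(raw) : null;
  } catch {
    debug = null; // header present but not valid JSON
  }
  return {
    status: res.status,
    requestId: res.headers.get('fly-request-id'),
    server: res.headers.get('server'),
    via: res.headers.get('via'),
    // Keys are kept under their raw names from the debug object.
    n: debug ? debug.n : null,
    bn: debug ? debug.bn : null,
    sid: debug ? debug.sid : null,
  };
}
```

Logging this object only for responses with status 400 keeps the output limited to the failing attempts.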
App Log Evidence
fly logs -a findgood-work-f4fb --no-tail shows only morgan entries with status 200 for the endpoint during the repro window. The 400 attempts never appear in the app logs.
This suggests Fly Proxy selects a machine (sid is present in flyio-debug) but returns the 400 before the request reaches Express/morgan.
Non-Fly Comparisons
Local app proxy:
http://hep.localhost.localhost:3000/hq/service-titan-proxy/...
Results:
100/100 requests returned 200 with Connection: close
Direct requests to the upstream ServiceTitan API also did not reproduce the 400.
Question for Fly
Why does Fly Proxy return an empty 400 for rapid HTTP/1.1 close-style Node/undici requests after selecting a machine, while the app never logs the failed request?
Is this a known Fly Proxy connection teardown/reuse issue around Connection: close?