Hi everyone,
Hoping someone might have insights into a persistent issue we’re facing with our Remix app deployed on Fly.io.
Context:
- Stack: Remix v2, remix-serve, Node 18 (Alpine), Fly Postgres.
- VM: Currently shared-cpu-2x (512MB RAM, scaled up during testing).
- Workflow: A Shopify theme extension sends a POST request with multipart/form-data (containing a ~1-2MB image) via Shopify’s App Proxy to our backend endpoint (/apps/our-app-subpath/process-image?shop=…) for image processing.
The Problem:
The browser consistently gets a 500 Internal Server Error for these specific POST requests. The core issue seems to be that the request never reaches our Remix application code.
- We’ve added logging to the absolute entry point (app/entry.server.tsx).
- fly logs show standard GET requests (health checks, etc.) hitting entry.server.tsx successfully.
- However, the POST requests with multipart/form-data do not appear in the entry.server.tsx logs at all.
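To separate "the request never reaches the VM" from "the request reaches the VM but never reaches our Remix logging", one option we are considering is temporarily running a bare Node HTTP server that logs every request from its headers alone. The sketch below is only an illustration under assumptions: `summarizeRequest`, `createProbeServer`, and the `PROBE_PORT` environment variable are hypothetical names, not part of Remix, remix-serve, or Fly.

```typescript
import { createServer, type IncomingMessage } from "node:http";

// Summarize a request from its headers only, so logging happens
// before any body buffering or framework code runs.
function summarizeRequest(
  req: Pick<IncomingMessage, "method" | "url" | "headers">
): string {
  const type = req.headers["content-type"] ?? "(none)";
  const length = req.headers["content-length"] ?? "(none)";
  return `${req.method ?? "?"} ${req.url ?? "?"} content-type=${type} content-length=${length}`;
}

// A probe server that logs each request, counts body bytes, and replies 200.
function createProbeServer() {
  return createServer((req, res) => {
    console.log("[probe]", summarizeRequest(req));
    let bytes = 0;
    req.on("data", (chunk: Buffer) => {
      bytes += chunk.length;
    });
    req.on("end", () => {
      console.log("[probe] body bytes received:", bytes);
      res.writeHead(200, { "content-type": "text/plain" });
      res.end("ok");
    });
    req.on("error", (err) => {
      console.log("[probe] request error:", err.message);
    });
  });
}

// PROBE_PORT is a hypothetical variable; the server starts only when it is set.
if (process.env.PROBE_PORT) {
  createProbeServer().listen(Number(process.env.PROBE_PORT), "0.0.0.0");
}
```

If the multipart POST shows up in the probe's output but not in the Remix logs, the problem is inside the container; if it shows up in neither, that points further upstream.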
What We’ve Tried:
- Verified App Proxy Config: Shopify App Proxy URL points correctly to our base Fly hostname (https://app2-virtual-try-on.fly.dev).
- Scaled VM: Increased RAM to rule out memory exhaustion during request buffering. Problem persists.
- Fixed Startup Issues: Resolved previous unrelated Prisma client init errors that caused crashes. The app now starts cleanly and passes health checks reliably (previous deployment warnings about the app “not listening” are gone).
- Simplified Handler: Deployed a minimal Remix action handler returning only 200 OK; the request still didn’t reach it.
- Local Testing: Saw a similar 500 symptom locally with Shopify CLI tunnel, but curl tests did reach the local handler, suggesting the local problem was tunnel-specific.
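The same kind of direct test can be pointed at the deployed app, bypassing the App Proxy, to see whether body size alone triggers the failure. Below is a rough sketch using the `fetch`, `FormData`, and `Blob` globals available in Node 18. The field name `image`, the file name `test.jpg`, the `TARGET_URL` environment variable, and both helper functions are assumptions made for illustration; the real field name and route path need to be substituted.

```typescript
// Build a multipart body carrying a zero-filled dummy "image" of the given size.
// The field name "image" is an assumption, not our actual field name.
function buildImageForm(sizeBytes: number, fieldName = "image"): FormData {
  const form = new FormData();
  const payload = new Uint8Array(sizeBytes); // stand-in for real image bytes
  form.append(fieldName, new Blob([payload], { type: "image/jpeg" }), "test.jpg");
  return form;
}

// POST the form and return the status plus all response headers,
// which may help show which layer produced the response.
async function postForm(
  url: string,
  form: FormData
): Promise<{ status: number; headers: Record<string, string> }> {
  const res = await fetch(url, { method: "POST", body: form });
  const headers: Record<string, string> = {};
  res.headers.forEach((value, key) => {
    headers[key] = value;
  });
  return { status: res.status, headers };
}

// Try a range of payload sizes against TARGET_URL (hypothetical variable).
async function main(target: string): Promise<void> {
  for (const size of [100 * 1024, 1024 * 1024, 2 * 1024 * 1024]) {
    const result = await postForm(target, buildImageForm(size));
    console.log(size, result.status, result.headers);
  }
}

const target = process.env.TARGET_URL;
if (target) {
  main(target).catch((err) => console.error("request failed:", err));
}
```

If small payloads succeed and larger ones fail, that would support a size-related cause; if every size reaches the handler, the App Proxy hop becomes the more likely suspect.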
Hypothesis:
Since GETs work, the app is healthy, and the POST request vanishes before entry.server.tsx, we suspect the Fly.io proxy/load balancer might be dropping or rejecting these specific multipart/form-data POST requests (possibly due to size, content, or interactions with the Shopify App Proxy forwarding) before they hit our app container.
Questions:
- Has anyone else hit issues with multipart/form-data POSTs (especially >1MB or via App Proxy) getting dropped before reaching their app on Fly.io?
- Are there specific Fly proxy configurations (body size limits, timeouts, filters) that might silently cause a 500 without logging anything in fly logs?
- Any other ideas for diagnosing requests that seem to fail at the Fly infrastructure level before the application VM?
We’re a bit stuck debugging this “black hole”. Any pointers would be hugely appreciated!
Thanks!