502 Bad Gateway on Image Uploads

Hi, my application supports uploading images via a multipart request to our server, which manipulates them and then stores them in Tigris.

I got the feature working locally; however, once deployed I begin to see 502 Bad Gateway errors. These errors don’t even appear in “Live Logs”, so I’m inclined to believe that Fly.io infra is blocking these requests.

Is there any configuration I need to add to my fly.toml to enable image uploads via multipart requests? Or does anyone know why the 502 Bad Gateway error might be happening?

Hi,

Strange that the requests aren’t even making it to the app (I’d assume if they were, they would appear in the logs).

I can’t imagine Fly would be blocking them.

A 502 can result from the proxy/frontend not getting a response back (or the app closing the connection too soon). The default is for Fly to start up machines only when needed. That reduces the cost, but also means that one may not be available to handle the request, possibly resulting in a 502. So at least for now, I’d recommend making sure there is at least one machine available to rule that out as the cause of the 502. In the fly.toml:

auto_stop_machines = false
auto_start_machines = true
min_machines_running = 2

… then tail the logs, then do an upload, and then see if you still get a 502. If you do, there could still be a timeout (if the file is large). But it would rule out one variable.

@greg thank you for the response!

I’ve updated the fly.toml config to always have machines running and set auto_stop_machines to false. However, the issue still persists. The behavior is quite odd, because there is no timeout. It is an instant 502 on the upload file request.

I have my code instrumented to emit traces and info, warning, and error logs whenever an endpoint is hit. However, only this request gets an instant 502 Bad Gateway, and none of the logs show that the handler was ever hit. This is the only request that is a multipart HTTP request, so I don’t think that is a coincidence, but there is also no timeout.

Locally everything works seamlessly; the issue only happens in the deployed environment.


Strange :thinking:

Ok, it’s not the absence of a machine causing the 502 then.

The fact that the 502 is instant suggests there is some issue with the request itself; perhaps it is malformed, which could explain why Fly’s proxy is stopping it.

Does the app return a 200 for other requests? Like … if you stick a /healthcheck route in there and call that, do you get a success back? That would confirm requests are getting to it and the 502 is only for this request.
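As a rough illustration of the kind of route meant here, below is a minimal sketch using only Python's standard library. The app's actual language and framework aren't known from this thread, so `HealthHandler`, `make_server`, and the port 8080 are all hypothetical names and values, not anything taken from the real app.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers GET /healthcheck with a 200 and 'ok'."""

    def do_GET(self):
        if self.path == "/healthcheck":
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Anything else is unknown to this sketch.
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence the default per-request logging to keep the sketch quiet.
        pass


def make_server(host="0.0.0.0", port=8080):
    """Build (but do not start) a server exposing only /healthcheck."""
    return HTTPServer((host, port), HealthHandler)
```

Calling `make_server().serve_forever()` would start it. If a request to /healthcheck on the deployed app comes back with a 200 while the upload still gets a 502, that points at the upload request path rather than at routing in general.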

Total guess, but the other thing could be the involvement of Tigris (presumably you are not pushing to it locally but running MinIO or some other S3 simulator, so that would differ). If the file is being uploaded to your server and then afterwards uploaded to Tigris (e.g. by a cron), I’d perhaps remove Tigris from the equation: comment it out, comment out any secrets for the AWS S3 SDK, etc. At least get a file up to your server before adding another service/variable that could be causing the issue.

@greg all other requests return a 200. This happens to be the only request that sends multipart data containing an image.

It’s really odd behavior to get a 502; I would expect a different type of request error that would not be instant. I think this really might be due to something on Fly.io, since I’m not even getting any logs at the endpoint about the network request starting.


I figured it out. It had to do with my container not installing ca-certificates and updating them prior to building our application.
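For anyone landing here with the same symptom, one way to check whether a container has any CA roots at all is to ask the TLS library where it looks for them. The sketch below does that with Python's `ssl` module; it assumes Python is available in the image, and `ca_bundle_status` is a hypothetical helper name. On Debian/Ubuntu-based images the roots come from the `ca-certificates` package, and `update-ca-certificates` refreshes the bundle.

```python
import os
import ssl


def ca_bundle_status():
    """Report where OpenSSL (via Python's ssl module) looks for CA roots.

    Hypothetical helper for debugging a container image; it only inspects
    paths and does not make any network connection.
    """
    paths = ssl.get_default_verify_paths()
    # cafile/capath are None when missing, so fall back to the
    # compiled-in OpenSSL defaults to show where they were expected.
    cafile = paths.cafile or paths.openssl_cafile
    capath = paths.capath or paths.openssl_capath
    cafile_exists = os.path.isfile(cafile)
    capath_has_entries = os.path.isdir(capath) and len(os.listdir(capath)) > 0
    return {
        "cafile": cafile,
        "capath": capath,
        "cafile_exists": cafile_exists,
        "capath_has_entries": capath_has_entries,
        "roots_found": cafile_exists or capath_has_entries,
    }
```

If `roots_found` is false inside the deployed image, outbound HTTPS calls (for example to an S3-compatible store) are likely to fail certificate verification even though the same code works on a development machine.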


@kennet Ah! Nice catch. I never would have guessed that.
