Hi there, I run a file hosting service with @wowjesus. As a short description so you can get the gist of things: users upload files to our service and share them with a link.
We’ve been using Fly for a while now and it works fine with smaller files for the most part, but uploading a large file (1MB+) to our service takes an extremely long time, around 70 seconds, whereas smaller files take 70ms-500ms, which is expected. We’re coming from Google Cloud, which handled these files fine; we moved to Fly because it was much cheaper.
What we’re wondering is whether anything we’re doing wrong (such as VM sizes) could be causing this huge processing delay for larger files. Or is it something to do with Fly?
In this thread @wowjesus mentions that you’re writing directly over the private network instead of sending upload requests to the primary region. Maybe this is related?
In that scenario, it’s possible that the upload is being transferred twice: once to the secondary region VM, then again across the private network to the primary.
Are uploads from the primary region also slow? Is there a way I could test your app?
A random string (the file’s name) is generated and the response is sent back before the file is parsed (or after it is partially parsed, if the user’s response parsing setting requires it). After that, the file is uploaded to an AWS bucket and the write to the database happens.
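The flow described above can be sketched roughly as follows. This is only an illustration under assumptions, not the actual backend: `generateName`, `handleUpload`, and the `store` and `save` callbacks are hypothetical names standing in for the real bucket upload and database write.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical stand-ins for the bucket upload and the database write.
type Store = (name: string, body: Uint8Array) => Promise<void>;
type Save = (name: string, size: number) => Promise<void>;

// Generate a random hex string to use as the file's name.
function generateName(bytes = 8): string {
  return randomBytes(bytes).toString("hex");
}

// Reply with the name first, then do the slow work (storage, then database).
async function handleUpload(
  body: Uint8Array,
  respond: (name: string) => void,
  store: Store,
  save: Save,
): Promise<string> {
  const name = generateName();
  respond(name); // the client gets its answer before any slow work starts
  await store(name, body); // assumed: upload to the bucket
  await save(name, body.byteLength); // assumed: record the file in the database
  return name;
}
```

Because the response goes out first, a failure in `store` or `save` can no longer be reported to the client, so a real handler would need to log or retry those errors.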
Oh yes, we are facing a few issues with our recently rewritten backend, so uploading from the dashboard is not yet possible. Here is a Postman collection to upload files: https://www.getpostman.com/collections/81ada7f32e950499e07d
The body and the access_key query param need to be filled in. You can create an API (access) key at: Fileglass — Dashboard
I was trying to set up an example with NestJS, but I see that multipart form uploads are not supported with the Fastify adapter. How are you handling the uploads?
Will you add the fly-request-id header to your logs then trigger a slow upload for us? We can look at specific requests in logs. I see a fair number of errors caused by timeouts reading request bodies on your app, but I’m not sure these are exactly the same.
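As a minimal sketch of what that logging could look like: the helpers below read the `fly-request-id` header from an incoming request's headers and build one JSON log line per upload. The function names and the log fields (`bytes`, `elapsedMs`) are assumptions for illustration; it also assumes the framework exposes headers with lowercased names, as Node's `http` module does.

```typescript
import type { IncomingHttpHeaders } from "node:http";

// Read the request ID header; Node lowercases incoming header names.
function flyRequestId(headers: IncomingHttpHeaders): string {
  const value = headers["fly-request-id"];
  if (Array.isArray(value)) return value[0] ?? "unknown";
  return value ?? "unknown";
}

// Build one structured log line for an upload (field names are hypothetical).
function uploadLogLine(
  headers: IncomingHttpHeaders,
  bytes: number,
  elapsedMs: number,
): string {
  return JSON.stringify({ flyRequestId: flyRequestId(headers), bytes, elapsedMs });
}
```

Logging the ID next to the upload size and duration makes it possible to match a slow upload in the app logs against the same request on the proxy side.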
I noticed this is proxying through Cloudflare. It would be worth testing directly against Fly.io without Cloudflare in the mix. My upload was routed through Toronto (I’m in Chicago), so one possibility is that Cloudflare is routing you through a region that’s not close.
After some discussion with the users experiencing this, it turns out they were being routed to Singapore (they are in Qatar), which is further away than the European nodes (Frankfurt, Amsterdam), so that could be the issue. I’ll tweak the logs in a second.