I’m running an Express.js + Remix server in production. For the past two days, we’ve noticed really slow performance when fetching JS or even translation files. A basic components.json file can take around 10s to be sent to the browser.
My config is 3 shared-cpu-8x machines in the cdg region, and I’ve tried several configurations since then to try to fix this issue.
Moreover, when I request my files locally on the machine, after doing a basic fly ssh console and curl http://localhost:3000/locales/en/common.json, I get the response almost immediately, in about 20ms. When I use the fly.dev DNS name, it can be fast, but it can also take a long time sometimes, like this:
I’ve been told they made some kernel changes that may have caused memory issues. Look at your memory usage: if it’s near capacity, you’ll get massive latency and degradation of your app.
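If it helps, here is a rough sketch of how you could check memory headroom from inside the Node process itself, using only the built-in `os` and `process` modules. The helper names and the 0.9 threshold are my own assumptions for illustration, not anything Fly.io recommends, and `os.freemem()` is only an approximation of what is really available on Linux.

```typescript
import * as os from "node:os";

// Fraction of a total that is in use, clamped to the range [0, 1].
function usedFraction(usedBytes: number, totalBytes: number): number {
  if (totalBytes <= 0) return 0;
  return Math.min(1, Math.max(0, usedBytes / totalBytes));
}

// Snapshot of process and machine memory.
// The default threshold (0.9) is an arbitrary illustrative cut-off.
function memorySnapshot(threshold = 0.9) {
  const total = os.totalmem();
  const free = os.freemem();
  const rss = process.memoryUsage().rss; // resident set size of this process
  const systemUsed = usedFraction(total - free, total);
  return {
    rssMb: Math.round(rss / 1024 / 1024),
    systemUsedPct: Math.round(systemUsed * 100),
    nearCapacity: systemUsed >= threshold,
  };
}
```

Logging `memorySnapshot()` every few seconds while the slow requests happen should show whether latency spikes line up with memory pressure.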
Further to @khuezy’s great suggestion (that’s entirely possible), it would also be worth looking at whether the slow requests are due to a machine having to be started on-demand. I believe that is the default. Normally that’s great, as it saves a lot of money by not having idle machines doing nothing. But it can result in the first request (the one triggering the machine to start) being slow. You could also try temporarily turning that off to see if it makes a difference.
Thanks for the answer!
Yup, it started to be really slow 2 or 3 days ago. We didn’t change anything. The app was not the fastest on the market, but it was not the slowest. We are really fine on memory, as you can see here:
I’ve tried setting swap_memory to 512MB to see, but it doesn’t do much.
Regarding the autostop feature: I’ve redeployed my app with autostop set to ‘off’ and autostart set to true. I will keep you updated if that solves anything.
Maybe a routing issue? That would need Fly to investigate. They would need a traceroute from you, I guess. It might be worth seeing if it’s only slow for you (because of your ISP) or for everyone. If the app is public, you could put its URL into a tool like the KeyCDN Tools Website Speed Test and choose different locations to see how quickly it loads. If they are fast, it’s likely your route/ISP. If they are also really slow, it can’t be that.
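To put numbers on "fast" versus "slow" from wherever you run it, something like the following sketch could time a handful of requests and summarize them. It assumes Node 18+ (for the global `fetch` and `performance`); the function names and the hostname in the example are hypothetical, and only the /locales/en/common.json path comes from the post above.

```typescript
// Summary statistics over latency samples, in milliseconds.
function summarize(samplesMs: number[]): { min: number; median: number; max: number } {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return { min: sorted[0], median, max: sorted[sorted.length - 1] };
}

// Time a single GET request, including reading the full response body.
async function timeRequest(url: string): Promise<number> {
  const start = performance.now();
  const res = await fetch(url);
  await res.arrayBuffer();
  return performance.now() - start;
}

// Run `count` sequential requests against `url` and summarize the timings.
async function probe(url: string, count = 10) {
  const samples: number[] = [];
  for (let i = 0; i < count; i++) {
    samples.push(await timeRequest(url));
  }
  return summarize(samples);
}

// Example (hypothetical hostname):
// probe("https://my-app.fly.dev/locales/en/common.json").then(console.log);
```

Running the same probe against http://localhost:3000 inside the machine and against the public hostname from a few networks would show whether the delay is in the app or somewhere on the route. Note that the first sample usually includes connection setup, so a large gap between the median and the max is more telling than the min.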
I’ve moved all my machines to LHR and it seems to be much faster. There is definitely an issue with the CDG region. I’ve checked https://status.flyio.net/ and other monitoring pages but wasn’t warned about this.
I’m still facing the issue, recreating the machines did not work. I deployed the app machine to AMS without any luck either, probably because the DB and Redis are still in CDG.
Can we have an update from fly.io’s team?