Fly response times

I’ve recently moved my personal website to Fly. I have uptime tracking set up for the website, and since moving to Fly I’m seeing response times regularly spike quite high. Before the move I had fairly constant 100-600ms response times; now I regularly see spikes of up to 10s, interspersed with periods where response times are as expected.

I’m wondering where those spikes would be coming from.

Hi,

I recall other mentions of delays of around 10 seconds, which makes me wonder if that’s a new machine being started on demand.

The default is for machines to stop when not in use, which is perhaps not obvious to new users: Autostop/autostart Machines · Fly Docs

As a result, if a machine is not running when a request arrives, there is a delay while the machine starts. 10s seems long, but it’s possibly caused by that. I’d start by disabling autostop and seeing if the problem goes away. If it doesn’t, you can continue to debug. If it does, the extra cost of keeping a machine running is the trade-off.

To add to what Greg said, what tech stack are you using? 10s is extremely long.
Looking at the timeline, I don’t think the issue is from a cold boot.

This is using Elixir and Phoenix. The uptime check runs every 5 minutes, so given the documented “Fly Proxy should take when the app is idle for several minutes” it might or might not run into this, depending on what “several” means.

If the uptime check is pinging your app every 5 minutes, that should keep the instance awake. Do you see “excess capacity” anywhere in your logs? That message means the machine was autostopped due to inactivity.

Try setting auto_stop_machines = 'suspend' and redeploy to see if you still get those big spikes. If the spikes go away, then your app stack’s initialization is what’s slow.
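For reference, here is a minimal sketch of where that setting would go in fly.toml. It assumes the app is configured through an [http_service] section; the rest of your file will differ, and the auto_start_machines line is only shown for context.

```toml
# fly.toml (fragment; assumes an [http_service] section, other keys omitted)
[http_service]
  # Suspend idle Machines instead of fully stopping them
  auto_stop_machines = 'suspend'
  # Assumed to already be enabled so a request can wake the Machine again
  auto_start_machines = true
```

The change only takes effect after a redeploy.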

Yeah, I do see such messages from ams (which is my closest region and the primary one), so it could indeed be scaling. For now I’ve set min_machines_running = 1 and will continue to monitor.

Edit: comparing the logs with my spikes does certainly suggest correlation.
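In case it helps someone else, this is roughly what that change looks like in fly.toml. It is a sketch that assumes an [http_service] section; the auto_stop_machines value shown is an assumption for illustration, not copied from my actual config.

```toml
# fly.toml (fragment; assumes an [http_service] section, other keys omitted)
[http_service]
  # Assumed value: autostop stays enabled for any additional Machines
  auto_stop_machines = 'stop'
  auto_start_machines = true
  # Keep at least one Machine running so requests never wait for a start
  min_machines_running = 1
```

As far as I understand the docs, min_machines_running only applies to the primary region, which is ams in my case.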

You should investigate why your app is taking 10s to start though.

The spikes have been gone for the last two days, so this was indeed the autostop behaviour. While I was aware of it, I wasn’t expecting it to regularly trigger within the 5-minute interval of the uptime check.
