Requests to a suspended machine are taking a long time

paulactually · July 25, 2025, 12:01am

When my machine is put into suspend mode due to inactivity, the first request that triggers the wake-up takes a long time to return (or never returns) to the user.

The logs actually show the machine waking up correctly:

		2025-07-25 10:15:50.980	machine became reachable in 6.717712ms
		2025-07-25 10:15:50.974	machine started in 223.935944ms
		2025-07-25 10:15:50.972	Machine started in 218ms
		2025-07-25 10:15:50.839	2025-07-24T22:15:50.839963167 [01K0Z6JEQ0E2TWFP5712F38WPH:fc_api] The API server received a Put request on "/logger" with body "{\"log_path\":\"logs.fifo\",\"level\":\"info\"}".
		2025-07-25 10:15:50.839	2025-07-24T22:15:50.839889975 [01K0Z6JEQ0E2TWFP5712F38WPH:fc_api] API server started.
		2025-07-25 10:15:50.839	2025-07-24T22:15:50.839571880 [01K0Z6JEQ0E2TWFP5712F38WPH:main] Listening on API socket ("/fc.sock").
		2025-07-25 10:15:50.839	2025-07-24T22:15:50.839442498 [01K0Z6JEQ0E2TWFP5712F38WPH:main] Running Firecracker v1.12.1
		2025-07-25 10:15:50.750	Starting machine
		2025-07-25 10:02:13.631	Virtual machine has been suspended

But the response takes over 30 seconds, so to the user the request appears to hang. Refreshing the page resolves the issue, as the machine is already awake for the next request.

Here is the response time from Postman:

When I do it from the browser, it seems to never return (stays in the “pending” status indefinitely).

When I manually suspend the machine (as opposed to waiting for it to suspend itself), it seems to wake up and respond correctly.

I have also seen it wake up correctly, but the majority of the time, the response is never returned.

paulactually · July 25, 2025, 1:18am

I have tried using auto_stop_machines = true rather than ‘suspend’, and that seems to reliably return from the stopped state. It would obviously be better to get the same behaviour from the suspended state.

pavel · July 25, 2025, 8:37am

Hey @paulactually

Could you set the flyio-debug: doit header on this request and post the fly-request-id value from the response here, please?
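
In case it helps, here is a rough sketch of one way to send that header and read the two values back, using only the Python standard library. The function names are made up for illustration, and the URL you pass in would be your own app's address.

```python
import urllib.request


def build_debug_request(url: str) -> urllib.request.Request:
    """Build a GET request that carries the flyio-debug: doit header."""
    req = urllib.request.Request(url, method="GET")
    req.add_header("flyio-debug", "doit")
    return req


def debug_ids(headers) -> dict:
    """Pick the two debugging headers out of a response header mapping."""
    return {
        "fly-request-id": headers.get("fly-request-id"),
        "flyio-debug": headers.get("flyio-debug"),
    }


def fetch_debug_ids(url: str, timeout: float = 60.0) -> dict:
    """Send the request and return the debugging headers from the response.

    The generous timeout is an assumption, chosen so a slow wake-up
    still yields a response to inspect.
    """
    with urllib.request.urlopen(build_debug_request(url), timeout=timeout) as resp:
        return debug_ids(resp.headers)
```

Calling fetch_debug_ids with the app URL should return a dict holding both header values, or None for any header the response did not include.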

paulactually · July 27, 2025, 9:16pm

Hi @pavel.

I made a request with that header this morning. The fly-request-id was: 01K16V7AZ4ZJ1VRBT2NDS0AD7T-syd. There was also the flyio-debug header with the value:

{"n":"edge-cf-syd1-777c","nr":"syd","ra":"125.236.220.56","rf":"Verbatim","sr":"syd","sdc":"syd1","sid":"0801693a191618","st":0,"nrtt":1,"bn":"worker-cf-syd1-519a","mhn":null,"mrtt":null}

Here are the full request and response headers:

Also, the logs for that request:

		2025-07-28 09:04:37.576	machine became reachable in 10.463414ms
		2025-07-28 09:04:37.566	machine started in 212.047468ms
		2025-07-28 09:04:37.564	Machine started in 206ms
		2025-07-28 09:04:37.447	2025-07-27T21:04:37.447628939 [01K16SX00PT0HVST9PY8HNS3YG:fc_api] The API server received a Put request on "/logger" with body "{\"log_path\":\"logs.fifo\",\"level\":\"info\"}".
		2025-07-28 09:04:37.445	2025-07-27T21:04:37.445903630 [01K16SX00PT0HVST9PY8HNS3YG:fc_api] API server started.
		2025-07-28 09:04:37.445	2025-07-27T21:04:37.445614085 [01K16SX00PT0HVST9PY8HNS3YG:main] Listening on API socket ("/fc.sock").
		2025-07-28 09:04:37.445	2025-07-27T21:04:37.445436032 [01K16SX00PT0HVST9PY8HNS3YG:main] Running Firecracker v1.12.1
		2025-07-28 09:04:37.353	Starting machine
		2025-07-28 08:48:09.628	Virtual machine has been suspended

Thanks. Paul.

pavel · July 28, 2025, 8:55am

Hmm, I don’t see anything wrong in our logs.

When you made the request, the proxy woke up the machine and established a new connection to it. It took the app ~30s to respond:

21:04:37.576626000: backhaul -> backend: Request { method: GET, ... }
21:05:08.654473000: backhaul <- backend: Response { status: 200, ... }
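
For reference, the gap between those two proxy timestamps can be worked out with a few lines of Python. This is only a sketch: the helper name is made up, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two HH:MM:SS.nnnnnnnnn timestamps on the same day."""

    def parse(ts: str) -> datetime:
        clock, frac = ts.split(".")
        # strptime's %f accepts at most 6 digits, so trim nanoseconds to microseconds
        return datetime.strptime(f"{clock}.{frac[:6]}", "%H:%M:%S.%f")

    return (parse(end) - parse(start)).total_seconds()


print(elapsed_seconds("21:04:37.576626000", "21:05:08.654473000"))  # about 31.08 seconds
```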

Does your app need to talk to some external resource (e.g. a database) to serve such a request? If so, it could be that some connections to the external resource in the pool are already dead (because the machine was suspended), but it takes a while for the TCP/IP stack and client libraries to realize that once the machine is resumed.
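
If stale pooled connections do turn out to be the cause, one common mitigation is to health-check a connection before reusing it and replace it when the check fails. Below is a minimal sketch of that idea. PrePingPool and its callbacks are hypothetical names, not part of any particular library, and it assumes is_alive is a cheap check with a short timeout of its own (e.g. a trivial query).

```python
from collections import deque
from typing import Callable, Deque, Generic, TypeVar

C = TypeVar("C")


class PrePingPool(Generic[C]):
    """Tiny connection pool that health-checks a connection before reuse."""

    def __init__(self, connect: Callable[[], C], is_alive: Callable[[C], bool]) -> None:
        self._connect = connect    # opens a brand-new connection
        self._is_alive = is_alive  # returns False for a dead connection
        self._idle: Deque[C] = deque()

    def acquire(self) -> C:
        # Try idle connections first, dropping any that fail the health check.
        while self._idle:
            conn = self._idle.popleft()
            if self._is_alive(conn):
                return conn
        # Nothing usable in the pool: open a fresh connection.
        return self._connect()

    def release(self, conn: C) -> None:
        # Hand a connection back so a later request can reuse it.
        self._idle.append(conn)
```

Many database client libraries have a built-in option along these lines, so it is worth checking the pool settings of whatever client the app uses before writing a custom one.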

Could you add some logs to make it easier to understand where the app spends the time while serving the request?
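
As a rough illustration of the kind of logging I mean, here is a small timing helper in Python. The names (timed, the step labels) are placeholders, and it assumes the app can wrap each stage of request handling in a with block.

```python
import logging
import time
from contextlib import contextmanager

log = logging.getLogger("request_timing")


@contextmanager
def timed(step: str, sink=None):
    """Log how long the wrapped block took; optionally record it in sink."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        log.info("%s took %.1f ms", step, elapsed_ms)
        if sink is not None:
            sink.append((step, elapsed_ms))


# Example usage inside a request handler (step names are hypothetical):
#
#     with timed("db query"):
#         rows = run_query()
#     with timed("render"):
#         body = render(rows)
```

With one line per step in the logs, it should be clear whether the ~30s is spent waiting on an external resource or somewhere inside the app itself.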
