Machines not waking up for HTTP requests

pier · May 5, 2023, 2:34pm

I have an app with all the machines stopped. Supposedly they should be waking up on new requests, but they don’t.

In the logs I see this error:

Failed to proxy HTTP request (error: no known healthy instances found for route tcp/443. (hint: is your app shutdown? is there an ongoing deployment with a volume or using the ‘immediate’ strategy? if not, this could be a delayed state issue)). Retrying in 998 ms (attempt 70)

I haven’t updated this app in 10 days or so and it has been working fine since then until now.
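For anyone triaging a similar message, here is a minimal Python sketch that pulls the route, retry delay, and attempt count out of that proxy error. The regular expression and the `parse_proxy_retry` helper are assumptions inferred only from the single line quoted above. They are not a documented log format and may need adjusting for other variants of the message.

```python
import re

# Pattern inferred from the one proxy error quoted in this thread (an assumption,
# not an official format): "... for route <route>. ... Retrying in <N> ms (attempt <M>)"
PROXY_RETRY = re.compile(
    r"no known healthy instances found for route (?P<route>\S+?)\."
    r".*Retrying in (?P<delay_ms>\d+) ms \(attempt (?P<attempt>\d+)\)",
    re.DOTALL,
)


def parse_proxy_retry(line):
    """Return route, retry delay (ms) and attempt number, or None if no match."""
    match = PROXY_RETRY.search(line)
    if match is None:
        return None
    return {
        "route": match.group("route"),
        "delay_ms": int(match.group("delay_ms")),
        "attempt": int(match.group("attempt")),
    }
```

A high attempt count (70 in the message above) suggests the proxy had been retrying for a while without finding any started machine.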

jerome · May 5, 2023, 2:40pm

Is that a recent log? We made a change last night and rolled it back this morning, thinking it was probably the cause: Automatically starting/stopping Apps v2 instances - #34 by AsymetricalData

pier · May 5, 2023, 2:44pm

I’ve been able to start the machine manually so the app is ok.

I have a number of those logs. The last one is from 2023-05-05T08:05:32Z

pier · May 5, 2023, 2:46pm

So the machine went to sleep after some inactivity and the last log now is:

2023-05-05T14:42:35.306 health[148e460b773238] ams [error] Health check on port 8080 has failed. Your app is not responding properly. Services exposed on ports [80, 443] will have intermittent failures until the health check passes.

pier · May 5, 2023, 2:48pm

Ok I’ve confirmed the machines are waking up now.

After processing a request the last log is:

2023-05-05T14:47:51.306 health[148e460b773238] ams [info] Health check on port 8080 is now passing.

pier · May 5, 2023, 2:57pm

Something is still not working properly though.

Some requests are timing out:

2023-05-05T14:51:10.689 proxy[4d89696ae79438] ams [error] could not make HTTP request to instance: connection error: timed out

This app has like 8 machines so that shouldn’t be happening. The autoscaling policy is set to 1 soft limit and 3 hard limit.

And I’m still seeing this weird error message when the machine goes to sleep:

2023-05-05T14:53:31.700 health[6e82576a022258] ams [error] Health check on port 8080 has failed. Your app is not responding properly. Services exposed on ports [80, 443] will have intermittent failures until the health check passes.
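To see which machines the health and proxy errors belong to, the timestamped lines quoted in this thread can be split into fields. The sketch below assumes the layout `timestamp source[instance] region [level] message`, inferred only from the lines pasted here. The helper names `parse_line` and `errors_by_instance` are hypothetical, not part of any Fly.io tooling.

```python
import re
from datetime import datetime

# Layout inferred from the log lines in this thread (an assumption):
#   2023-05-05T14:42:35.306 health[148e460b773238] ams [error] message...
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) (?P<source>\w+)\[(?P<instance>[0-9a-f]+)\] "
    r"(?P<region>\w+) \[(?P<level>\w+)\] (?P<message>.*)$"
)


def parse_line(line):
    """Split one log line into a dict of fields, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["ts"] = datetime.strptime(record["ts"], "%Y-%m-%dT%H:%M:%S.%f")
    return record


def errors_by_instance(lines):
    """Group error-level records by the instance ID in square brackets."""
    grouped = {}
    for line in lines:
        record = parse_line(line)
        if record is not None and record["level"] == "error":
            grouped.setdefault(record["instance"], []).append(record)
    return grouped
```

Running `errors_by_instance` over the lines above would show the health check failures and the proxy timeout arriving under different instance IDs, which is the kind of detail worth including when reporting the issue.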

jerome · May 5, 2023, 2:57pm

We’re looking into it. I’m not sure what’s happening yet, but it looks like services are sticking around after the machine is stopped and they shouldn’t.

pier · May 5, 2023, 3:08pm

Not sure if you’ve done anything on your end, but it all appears to be working as expected regarding machines waking up and processing requests.

Only this funky error remains after the machine goes to sleep:

Health check on port 8080 has failed. Your app is not responding properly. Services exposed on ports [80, 443] will have intermittent failures until the health check passes.

jerome · May 5, 2023, 3:09pm

That log is unrelated to the problems, I believe. That’s a separate service on our nodes sending those in. I suppose they shouldn’t if the machine isn’t running!

jerome · May 5, 2023, 3:17pm

Ok it looks like 2 edge nodes didn’t get the rollback, one in LHR and one in CHI.

I have now fixed that. Hopefully the issue doesn’t happen anymore.

pier · May 5, 2023, 3:19pm

Thanks Jerome!

system · May 12, 2023, 3:19pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
