Machines not waking up for HTTP requests

I have an app with all the machines stopped. Supposedly they should wake up on new requests, but they don’t?

In the logs I see this error:

Failed to proxy HTTP request (error: no known healthy instances found for route tcp/443. (hint: is your app shutdown? is there an ongoing deployment with a volume or using the ‘immediate’ strategy? if not, this could be a delayed state issue)). Retrying in 998 ms (attempt 70)

I haven’t updated this app in 10 days or so and it has been working fine since then until now.

Is that a recent log? We made a change last night, rolled back this morning thinking it was probably the cause: Automatically starting/stopping Apps v2 instances - #34 by AsymetricalData

I’ve been able to start the machine manually so the app is ok.

I have a number of those logs. The last one is from 2023-05-05T08:05:32Z

So the machine went to sleep after some inactivity and the last log now is:

2023-05-05T14:42:35.306 health[148e460b773238] ams [error] Health check on port 8080 has failed. Your app is not responding properly. Services exposed on ports [80, 443] will have intermittent failures until the health check passes.

Ok I’ve confirmed the machines are waking up now.

After processing a request the last log is:

2023-05-05T14:47:51.306 health[148e460b773238] ams [info] Health check on port 8080 is now passing.


Something is still not working properly though.

Some requests are timing out:

2023-05-05T14:51:10.689 proxy[4d89696ae79438] ams [error] could not make HTTP request to instance: connection error: timed out

This app has about 8 machines, so that shouldn’t be happening. The autoscaling policy is set to a soft limit of 1 and a hard limit of 3.

And I’m still seeing this weird error message when the machine goes to sleep:

2023-05-05T14:53:31.700 health[6e82576a022258] ams [error] Health check on port 8080 has failed. Your app is not responding properly. Services exposed on ports [80, 443] will have intermittent failures until the health check passes.
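For anyone sifting through a pile of these lines, here is a minimal sketch that splits them into fields and counts errors per source. The line layout (timestamp, source with instance ID in brackets, region, level, message) is only inferred from the log lines quoted in this thread, and `parse_log_line` and `errors_by_source` are hypothetical helper names, not part of any Fly tooling.

```python
import re
from typing import Dict, Iterable, Optional

# Layout assumed from the lines quoted in this thread, e.g.
# "<timestamp> <source>[<instance>] <region> [<level>] <message>".
# This is an inference, not a documented log format.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+) "
    r"(?P<source>\w+)\[(?P<instance>[0-9a-f]+)\] "
    r"(?P<region>\w+) "
    r"\[(?P<level>\w+)\] "
    r"(?P<message>.*)$"
)


def parse_log_line(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def errors_by_source(lines: Iterable[str]) -> Dict[str, int]:
    """Count error-level lines per source (e.g. 'proxy', 'health')."""
    counts: Dict[str, int] = {}
    for line in lines:
        parsed = parse_log_line(line)
        if parsed and parsed["level"] == "error":
            counts[parsed["source"]] = counts.get(parsed["source"], 0) + 1
    return counts
```

Feeding it the saved log output one line at a time makes it easier to tell the proxy timeouts apart from the health check messages.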

We’re looking into it. I’m not sure what’s happening yet, but it looks like services are sticking around after the machine is stopped and they shouldn’t.


Not sure if you’ve done anything on your end, but it all appears to be working as expected regarding machines waking up and processing requests :)

Only this funky error remains after the machine goes to sleep:

Health check on port 8080 has failed. Your app is not responding properly. Services exposed on ports [80, 443] will have intermittent failures until the health check passes.

That log is unrelated to the problems I believe. That’s a separate service on our nodes sending those in. I suppose they shouldn’t if the machine isn’t running!


Ok it looks like 2 edge nodes didn’t get the rollback, one in LHR and one in CHI.

I have now fixed that. Hopefully the issue doesn’t happen anymore.


Thanks Jerome!
