We recently experienced super slow load times of ~20s for requests that usually take less than ~100ms. Our server is in the EWR region and hasn’t been redeployed in over a day. In checking the status of our server, we noticed that it was restarted ~15m ago. Would love some additional info on what might have happened/caused our server to restart, and how best to handle an event like this in the future.
That restart column indicates that either the process crashed, or the health checks failed and the VM was restarted. If enough of those happen, we’ll actually replace the VM entirely.
The best way to check this is with flyctl logs -i <instance id>. Our log feature is somewhat rudimentary, but usually those restarts have a stack trace or something.
Also, if you run flyctl status --all you’ll see VMs that are no longer running. If they’re less than a few days old the logs command might still show you the last of their output.
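Putting those two together, a quick diagnostic pass looks roughly like this (the instance ID is a placeholder; substitute one from the status output):

```sh
# List every VM, including ones that have already been replaced,
# and note the ID and restart count of the failed instance.
flyctl status --all

# Pull the last output from that instance; crashes usually end
# with a stack trace or an error just before the restart.
flyctl logs -i <instance-id>
```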
Thanks Kurt! It looks like we had a failed instance with a total of 6 restarts. Looking at the logs, our instance went from a health check status of “passing” to “critical” a few times. Is there any way to diagnose what caused this to happen?
There’s not much beyond the app logs, unfortunately. When a check goes critical, it means the process isn’t responding to network connections (or to HTTP checks, if you have those configured in fly.toml). If the app just hung and didn’t log anything, there’s no real way for us to see why.
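For reference, an HTTP check in fly.toml looks something like the sketch below. The port and /healthz path are placeholders for whatever your app actually exposes, and the exact keys and defaults are in the fly.toml docs, so treat this as a sketch rather than a drop-in config:

```toml
[[services]]
  internal_port = 8080
  protocol = "tcp"

  # Hypothetical health endpoint; point this at a cheap route that
  # exercises the app without hitting external dependencies.
  [[services.http_checks]]
    interval = 10000      # ms between checks
    timeout = 2000        # ms before a check counts as failed
    grace_period = "5s"   # time to allow after a VM boots
    method = "get"
    path = "/healthz"
    protocol = "http"
```

If that path stops answering, the check flips from “passing” to “critical” the same way the TCP check did in your logs.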
If you’re worried about that happening again, it’s worth running flyctl scale set min=2 to make sure there are always 2 VMs running. When one fails health checks we’ll send all the requests to the other until it recovers or is replaced.
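Concretely, that looks like the following (the second command is just to confirm the change took effect):

```sh
# Keep at least two VMs running so one can take traffic while the
# other is failing checks or being replaced.
flyctl scale set min=2

# Confirm the new instance count and watch the health checks settle.
flyctl status
```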