This morning we noticed that requests to one of our projects were failing. We checked the deployment and it showed the maintenance icon.
We checked the logs, but there was nothing about the app exiting, just the entries for the previous request.
So we restarted it and it came back online.
Here are the logs: the last successful request, followed by the restart.
2023-01-10T15:25:28.762 app[47e967c0] gru [info] {"latencyInNs":228000000,"level":"info","message":"POST /token 200 228ms","method":"POST","statusCode":200,"url":"/token"}
2023-01-10T16:42:18.004 runner[47e967c0] gru [info] Starting instance
2023-01-10T16:42:32.307 runner[47e967c0] gru [info] Configuring virtual machine
2023-01-10T16:42:41.891 runner[47e967c0] gru [info] Pulling container image
2023-01-10T16:46:31.809 runner[47e967c0] gru [info] Unpacking image
2023-01-10T16:46:58.481 runner[47e967c0] gru [info] Preparing kernel init
2023-01-10T16:48:14.311 runner[47e967c0] gru [info] Configuring firecracker
2023-01-10T16:48:16.056 runner[47e967c0] gru [info] Starting virtual machine
2023-01-10T16:48:16.281 app[47e967c0] gru [info] Starting init (commit: f447594)...
2023-01-10T16:48:16.355 app[47e967c0] gru [info] Preparing to run: `docker-entrypoint.sh pnpm run start` as root
2023-01-10T16:48:16.389 app[47e967c0] gru [info] 2023/01/10 16:48:16 listening on [fdaa:0:3bd8:a7b:1f63:47e9:67c0:2]:22 (DNS: [fdaa::3]:53)
In Grafana it looks like the project was simply off for that whole period.
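To put a number on the outage, here is a minimal Python sketch that finds the largest silent gap between consecutive log entries. The timestamps are copied from the log excerpt above; the names `TIMESTAMPS`, `parse`, and `largest_gap` are only illustrative, and the script assumes all timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above, in order.
TIMESTAMPS = [
    "2023-01-10T15:25:28.762",  # last successful request (POST /token 200)
    "2023-01-10T16:42:18.004",  # runner: Starting instance
    "2023-01-10T16:42:32.307",
    "2023-01-10T16:42:41.891",
    "2023-01-10T16:46:31.809",
    "2023-01-10T16:46:58.481",
    "2023-01-10T16:48:14.311",
    "2023-01-10T16:48:16.056",
    "2023-01-10T16:48:16.281",
    "2023-01-10T16:48:16.355",
    "2023-01-10T16:48:16.389",  # app is listening again
]


def parse(ts: str) -> datetime:
    """Parse a timestamp in the format used by the log lines."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f")


def largest_gap(timestamps):
    """Return (start, end, seconds) for the longest gap between neighbours."""
    best = None
    for start, end in zip(timestamps, timestamps[1:]):
        seconds = (parse(end) - parse(start)).total_seconds()
        if best is None or seconds > best[2]:
            best = (start, end, seconds)
    return best


start, end, seconds = largest_gap(TIMESTAMPS)
print(f"Longest silent period: {start} -> {end} ({seconds / 60:.1f} minutes)")
```

For these logs the longest gap is the one between the last successful request and "Starting instance", roughly 77 minutes with no log output at all.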
My question is: why didn’t the deployment restart? If something failed, I would expect it to log the failure and restart. And what can I do so this doesn’t happen again?
Thanks for the help!