Normal for health check to fail and system to restart on auto-stopped machine?

paulrudy · August 28, 2023, 4:58pm

Yes, exactly, and the data population only takes about a 90-second GitHub Actions run.

I don’t think there was anything between them that one particular time. But now there is. Here’s the most recent log. It looks like typesense is restarting itself immediately, or was never killed:

2023-08-28T09:18:55.107 proxy <MACHINE_ID> <REGION> [info] Downscaling app typesense-nik9fiyd in region <REGION> from 1 machines to 0 machines. Automatically stopping machine MACHINE_ID
2023-08-28T09:18:55.112 app<MACHINE_ID> <REGION> [info] INFO Sending signal SIGINT to main child process w/ PID 255
2023-08-28T09:19:00.124 app<MACHINE_ID> <REGION> [info] INFO Sending signal SIGTERM to main child process w/ PID 255
2023-08-28T09:19:00.665 app<MACHINE_ID> <REGION> [info] INFO Main child exited with signal (with signal 'SIGTERM', core dumped? false)
2023-08-28T09:19:00.665 app<MACHINE_ID> <REGION> [info] INFO Starting clean up.
2023-08-28T09:19:00.666 app<MACHINE_ID> <REGION> [info] WARN hallpass exited, pid: 256, status: signal: 15 (SIGTERM)
2023-08-28T09:19:00.666 app<MACHINE_ID> <REGION> [info] 2023/08/28 09:19:00 listening on <IPV6_ADDRESS>:22 (DNS: [fdaa::3]:53)
2023-08-28T09:19:01.424 app<MACHINE_ID> <REGION> [info] I20230828 09:16:56.409983 348 raft_server.cpp:564] Term: 2, last_index index: 3, committed_index: 3, known_applied_index: 3, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 8
2023-08-28T09:19:01.424 app<MACHINE_ID> <REGION> [info] I20230828 09:16:56.410104 363 raft_server.h:60] Peer refresh succeeded!

... <similar typesense messages> ...

2023-08-28T09:19:01.424 app<MACHINE_ID> <REGION> [info] I20230828 09:18:56.423753 363 raft_server.h:60] Peer refresh succeeded!
2023-08-28T09:19:01.424 app<MACHINE_ID> <REGION> [info] I20230828 09:19:00.660526 261 typesense_server_utils.cpp:53] Stopping Typesense server...
2023-08-28T09:19:01.424 app<MACHINE_ID> <REGION> [info] I20230828 09:19:01.424450 348 typesense_server_utils.cpp:314] Typesense peering service is going to quit.
2023-08-28T09:19:01.424 app<MACHINE_ID> <REGION> [info] I20230828 09:19:01.424505 348 raft_server.cpp:829] Set shutting_down = true
2023-08-28T09:19:01.661 app<MACHINE_ID> <REGION> [info] [ 375.713067] reboot: Restarting system
2023-08-28T09:19:01.678 app<MACHINE_ID> <REGION> [info] I20
2023-08-28T09:19:09.756 health<MACHINE_ID> <REGION> [error] Health check on port 8108 has failed. Your app is not responding properly. Services exposed on ports [80, 443] will have intermittent failures until the health check passes.

I’m using the “gross” bash method of running multiple processes in order to run typesense plus the startup scripts. Maybe that’s preventing typesense from being killed properly? The startup script looks like this:

#!/bin/bash

set -m # turn on bash's job control

/opt/typesense-server --data-dir /data --api-key $<API_KEY> --enable-cors &

sleep 1 &&

<curl command to add API key to typesense> 
<curl command to add API key to typesense>
<curl command to run github workflow to populate typesense db>

fg %1

And fly.toml has kill_signal = "SIGINT" and kill_timeout = 5.
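
For comparison, here is a minimal sketch of the same start, set up, then wait flow that avoids `set -m` and `fg` by trapping the shutdown signal in the script and relaying it to the server. The names `run_forwarding_signals` and `post_start_setup` are hypothetical, and the sketch assumes the server shuts down cleanly on SIGTERM. It has not been tried against the Fly setup above.

```shell
#!/bin/bash

# Hypothetical hook: the sleep and curl setup steps would go here.
post_start_setup() {
  :
}

# Hypothetical helper: run "$@" in the background, run the setup hook,
# then wait for the child while relaying shutdown signals to it.
run_forwarding_signals() {
  "$@" &
  local child=$!
  local status=0

  # Without job control, bash starts background children with SIGINT
  # ignored, so relay SIGTERM for either signal (assumption: the
  # server handles SIGTERM as a clean shutdown request).
  trap "kill -TERM $child 2>/dev/null || true" INT TERM

  post_start_setup

  # A trapped signal makes `wait` return early, so loop until the
  # child has really exited.
  while :; do
    status=0
    wait "$child" || status=$?
    kill -0 "$child" 2>/dev/null || break
  done

  trap - INT TERM
  return "$status"
}
```

In the startup script, the last line would call `run_forwarding_signals` with the same typesense-server command line as above, in place of the `&` and `fg %1` pair. With kill_signal = "SIGINT", the script itself receives the signal and passes a SIGTERM on to the server rather than leaving it to the 5-second kill_timeout.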
