I have an existing Rails app that I’ve hosted on Fly for a while.
Recently I regenerated the Dockerfile and fly.toml, and everything works fine. However, when I try to add health checks, they don't work. This is what I tried in my fly.toml:
[http_service]
  internal_port = 3000
  force_https = true
  auto_stop_machines = false
  processes = ["app"]

  [[http_service.checks]]
    grace_period = "10s"
    interval = "30s"
    method = "GET"
    timeout = "5s"
    path = "/up"
Locally, /up works fine and shows the new Rails 7.1 health page.
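(For reference, /up is the health route that Rails 7.1 generates by default; this is the relevant excerpt from config/routes.rb, not a full routes file:)

```ruby
# config/routes.rb (excerpt of the Rails 7.1 generated file)
Rails.application.routes.draw do
  # Maps GET /up to Rails::HealthController#show, which returns 200 if the
  # app boots without raising an exception, and 500 otherwise.
  get "up" => "rails/health#show", as: :rails_health_check
end
```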
And this is what shows up in fly logs:
2023-11-14T06:58:20.814 app[9080576c5e3d08] fra [info] INFO [fly api proxy] listening at /.fly/api
2023-11-14T06:58:20.821 app[9080576c5e3d08] fra [info] 2023/11/14 06:58:20 listening on [fdaa:0:c61d:a7b:b9b7:a935:e5dd:2]:22 (DNS: [fdaa::3]:53)
2023-11-14T06:58:23.682 app[9080576c5e3d08] fra [info] 06:58:23 web.1 | started with pid 334
2023-11-14T06:58:23.682 app[9080576c5e3d08] fra [info] 06:58:23 sidekiq.1 | started with pid 335
2023-11-14T06:58:25.311 app[9080576c5e3d08] fra [info] 06:58:25 web.1 | => Booting Puma
2023-11-14T06:58:25.311 app[9080576c5e3d08] fra [info] 06:58:25 web.1 | => Rails 7.1.2 application starting in production
2023-11-14T06:58:25.311 app[9080576c5e3d08] fra [info] 06:58:25 web.1 | => Run `bin/rails server --help` for more startup options
2023-11-14T06:58:27.907 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] Puma starting in cluster mode...
2023-11-14T06:58:27.908 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] * Puma version: 6.4.0 (ruby 3.2.2-p53) ("The Eagle of Durango")
2023-11-14T06:58:27.908 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] * Min threads: 5
2023-11-14T06:58:27.908 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] * Max threads: 5
2023-11-14T06:58:27.909 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] * Environment: production
2023-11-14T06:58:27.909 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] * Master PID: 334
2023-11-14T06:58:27.909 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] * Workers: 4
2023-11-14T06:58:27.909 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] * Restarts: (✔) hot (✖) phased
2023-11-14T06:58:27.909 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] * Preloading application
2023-11-14T06:58:27.910 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] * Listening on http://0.0.0.0:3000
2023-11-14T06:58:27.914 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] Use Ctrl-C to stop
2023-11-14T06:58:27.950 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] - Worker 0 (PID: 353) booted in 0.03s, phase: 0
2023-11-14T06:58:27.950 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] - Worker 1 (PID: 354) booted in 0.02s, phase: 0
2023-11-14T06:58:27.950 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] - Worker 2 (PID: 359) booted in 0.02s, phase: 0
2023-11-14T06:58:27.953 app[9080576c5e3d08] fra [info] 06:58:27 web.1 | [334] - Worker 3 (PID: 360) booted in 0.02s, phase: 0
2023-11-14T06:58:28.061 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.060Z pid=335 tid=3rb INFO: Booted Rails 7.1.2 application in production environment
2023-11-14T06:58:28.061 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.060Z pid=335 tid=3rb INFO: Running in ruby 3.2.2 (2023-03-30 revision e51014f9c0) +YJIT [x86_64-linux]
2023-11-14T06:58:28.061 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.061Z pid=335 tid=3rb INFO: See LICENSE and the LGPL-3.0 for licensing details.
2023-11-14T06:58:28.061 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.061Z pid=335 tid=3rb INFO: Upgrade to Sidekiq Pro for more features and support: https://sidekiq.org
2023-11-14T06:58:28.062 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.061Z pid=335 tid=3rb INFO: Sidekiq 7.2.0 connecting to Redis with options {:size=>10, :pool_name=>"internal", :url=>"redis://:REDACTED@top1.nearest.of.visualizer-redis.internal:6379/1"}
2023-11-14T06:58:28.068 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.067Z pid=335 tid=3rb INFO: Sidekiq 7.2.0 connecting to Redis with options {:size=>5, :pool_name=>"default", :url=>"redis://:REDACTED@top1.nearest.of.visualizer-redis.internal:6379/1"}
2023-11-14T06:58:28.077 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.076Z pid=335 tid=3rb INFO: Loading Schedule
2023-11-14T06:58:28.077 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.076Z pid=335 tid=3rb INFO: Scheduling SharedShotCleanupJob {"every"=>"1 hour", "class"=>"SharedShotCleanupJob"}
2023-11-14T06:58:28.081 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.080Z pid=335 tid=3rb INFO: Scheduling FillAutocompleteValuesJob {"every"=>"1 hour", "class"=>"FillAutocompleteValuesJob"}
2023-11-14T06:58:28.084 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.084Z pid=335 tid=3rb INFO: Scheduling DuplicateStripeSubscriptionsJob {"every"=>["1 day", {"first_in"=>"5m"}], "class"=>"DuplicateStripeSubscriptionsJob"}
2023-11-14T06:58:28.087 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.087Z pid=335 tid=3rb INFO: Scheduling AirtableWebhookRefreshAllJob {"every"=>["6 day", {"first_in"=>"5m"}], "class"=>"AirtableWebhookRefreshAllJob"}
2023-11-14T06:58:28.090 app[9080576c5e3d08] fra [info] 06:58:28 sidekiq.1 | 2023-11-14T06:58:28.090Z pid=335 tid=3rb INFO: Schedules Loaded
could not find a good candidate within 90 attempts at load balancing. last error: no known healthy instances found for route tcp/443. (hint: is your app shut down? is there an ongoing deployment with a volume or are you using the 'immediate' strategy? have your app's instances all reached their hard limit?)
…
and the app isn't accessible from the outside.
I know the Dockerfile and everything else is fine, because without the http_service.checks block it all runs smoothly.
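In case it helps, here's my understanding of what the check effectively does, as a minimal Ruby sketch: a GET to the check path with a timeout, passing on any 2xx response. (The tiny stub server below just stands in for the Rails app on port 3000, so the script is runnable anywhere; it's an illustration of the check semantics, not Fly's actual proxy code.)

```ruby
require "net/http"
require "socket"

# Tiny stub server standing in for the Rails app's /up endpoint.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]
Thread.new do
  loop do
    client = server.accept
    client.gets # read the request line; headers ignored for brevity
    client.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok"
    client.close
  end
end

# Roughly what the health check does: GET the path, and treat a timeout,
# connection error, or non-2xx status as a failure.
def health_check(host, port, path: "/up", timeout: 5)
  http = Net::HTTP.new(host, port)
  http.open_timeout = timeout
  http.read_timeout = timeout
  http.get(path).is_a?(Net::HTTPSuccess)
rescue StandardError
  false
end

puts health_check("127.0.0.1", port)
```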
So what am I doing wrong?