Failing "services.http_checks"

I want to implement HTTP health checks with this block in one of my apps:

[[services.http_checks]]
    interval = 10000
    grace_period = "6s"
    method = "get"
    path = "/healthz"
    protocol = "http"
    timeout = 2000
    tls_skip_verify = false
    [services.http_checks.headers]

According to the logs, the app starts normally, but it doesn’t log the health check request like it should. Here is a glimpse into the logs:

2021-05-13T09:55:37.995Z app[b28aad53] lhr [info] Starting init (commit: cc4f071)...
2021-05-13T09:55:38.010Z app[b28aad53] lhr [info] Running: `/bin/engine` as root
2021-05-13T09:55:38.016Z app[b28aad53] lhr [info] 2021/05/13 09:55:38 listening on [fdaa:0:1af2:a7b:a98:b28a:ad53:2]:22 (DNS: [fdaa::3]:53)
2021-05-13T09:55:38.050Z app[b28aad53] lhr [info] time="2021-05-13T09:55:38Z" level=info msg="opening database connection"
2021-05-13T09:55:40.119Z app[b28aad53] lhr [info] time="2021-05-13T09:55:40Z" level=info msg="applied '0' new migration(s)"
2021-05-13T09:55:42.195Z proxy[b28aad53] lhr [warn] Health check status changed 'passing' => 'warning'
2021-05-13T09:55:45.642Z app[b28aad53] lhr [info] time="2021-05-13T09:55:45Z" level=info msg="starting the http server" port=8080
2021-05-13T09:55:45.644Z app[b28aad53] lhr [info] time="2021-05-13T09:55:45Z" level=info msg="starting event processor"
2021-05-13T09:55:46.311Z proxy[b28aad53] lhr [error] Health check status changed 'warning' => 'critical'
2021-05-13T09:56:34.816Z runner[b28aad53] lhr [info] Shutting down virtual machine
2021-05-13T09:56:34.906Z app[b28aad53] lhr [info] Sending signal SIGINT to main child process w/ PID 507
2021-05-13T09:56:34.909Z app[b28aad53] lhr [info] Main child exited normally with code: 0
2021-05-13T09:56:34.910Z app[b28aad53] lhr [info] Starting clean up.
2021-05-13T09:56:34.914Z app[b28aad53] lhr [info] time="2021-05-13T09:56:34Z" level=info msg="got SIGINT..."
2021-05-13T09:56:39.904Z proxy[b28aad53] lhr [info] Health check status changed 'critical' => 'passing'

Just to be sure, are you listening on 0.0.0.0 for that port?
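
For reference, here is a minimal Go sketch of what listening on 0.0.0.0 might look like. It is not your app's actual code: the helper listenAddr, the handler healthz, and the use of a PORT environment variable are assumptions for illustration. Only the /healthz path and port 8080 come from the config and logs in this thread.

```go
package main

import (
	"log"
	"net"
	"net/http"
	"os"
)

// listenAddr (hypothetical helper) builds an address that binds to all
// IPv4 interfaces. Binding to "127.0.0.1" or "localhost" instead would
// accept connections only from inside the VM itself.
func listenAddr(port string) string {
	return net.JoinHostPort("0.0.0.0", port)
}

// healthz (hypothetical handler) logs each probe so that health check
// requests become visible in the app logs, then replies 200 OK.
func healthz(w http.ResponseWriter, r *http.Request) {
	log.Printf("health check: %s %s", r.Method, r.URL.Path)
	w.WriteHeader(http.StatusOK)
}

func main() {
	// Assumption: the port is read from PORT, falling back to 8080,
	// the port shown in the logs above.
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
	}
	http.HandleFunc("/healthz", healthz)
	log.Fatal(http.ListenAndServe(listenAddr(port), nil))
}
```

If the server is bound to a loopback address instead, the process can start cleanly and still never see a health check request, which would match the symptoms in the logs.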

The health check now works but I can’t get external traffic. I can’t even ping the app.
Here is my fly.toml:

app = "my_app"

kill_signal = "SIGINT"
kill_timeout = 5

[env]
  LOG_LEVEL = "debug"
  PORT = "8080"
  PRIMARY_REGION = "lhr"

[experimental]
  auto_rollback = true
  private_network = true

[[services]]
  internal_port = 8080
  protocol = "tcp"

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.http_checks]]
    interval = 10000
    grace_period = "2s"
    method = "get"
    path = "/healthz"
    protocol = "http"
    timeout = 2000
    restart_limit = 6
    [services.http_checks.headers]


[metrics]
  port = 8080
  path = "/metrics" # default for most prometheus clients

We mistakenly assigned a “bad” IP to your app (an IP reserved for broadcast). Sorry about that.

I’ve assigned a new IP and removed the bad one.

Unfortunately the TTL on the .fly.dev record is a bit long. It won’t be updated for a little bit. You can access your site directly via your IP in the meantime (find it with flyctl ips list).
