Hoping someone can help me get to the bottom of something that is confusing me.
I am deploying an app based on the Remix Blues Stack template.
The app is running fine, but it is only utilising one of the two machines. When I start up the app, I get the same startup messages on both machines:
2024-02-15T08:40:58Z runner[d8d9e13a090d18] lhr [info]Machine started in 568ms
2024-02-15T08:40:59Z app[d8d9e13a090d18] lhr [info]> start
2024-02-15T08:40:59Z app[d8d9e13a090d18] lhr [info]> cross-env NODE_ENV=production node ./build/server.js
2024-02-15T08:41:02Z app[d8d9e13a090d18] lhr [info]✅ app ready: http://localhost:8080
2024-02-15T08:41:02Z app[d8d9e13a090d18] lhr [info]✅ metrics ready: http://localhost:8081/metrics
2024-02-15T08:41:04Z app[d8d9e13a090d18] lhr [info]HEAD / 200 - - 80.845 ms
2024-02-15T08:41:04Z app[d8d9e13a090d18] lhr [info]GET /healthcheck 200 - - 111.974 ms
2024-02-15T08:41:04Z app[e286033c6e39d8] lhr [info]HEAD / 200 - - 22.307 ms
2024-02-15T08:41:04Z app[e286033c6e39d8] lhr [info]GET /healthcheck 200 - - 26.911 ms
2024-02-15T08:41:14Z app[d8d9e13a090d18] lhr [info]HEAD / 200 - - 42.826 ms
2024-02-15T08:41:14Z app[d8d9e13a090d18] lhr [info]GET /healthcheck 200 - - 54.623 ms
2024-02-15T08:41:14Z app[e286033c6e39d8] lhr [info]HEAD / 200 - - 25.254 ms
2024-02-15T08:41:14Z app[e286033c6e39d8] lhr [info]GET /healthcheck 200 - - 28.525 ms
2024-02-15T08:41:18Z app[e286033c6e39d8] lhr [info] INFO Sending signal SIGINT to main child process w/ PID 306
2024-02-15T08:41:23Z app[e286033c6e39d8] lhr [info] INFO Sending signal SIGTERM to main child process w/ PID 306
2024-02-15T08:41:24Z app[d8d9e13a090d18] lhr [info]HEAD / 200 - - 49.803 ms
2024-02-15T08:41:24Z app[d8d9e13a090d18] lhr [info]GET /healthcheck 200 - - 59.357 ms
2024-02-15T08:41:24Z app[e286033c6e39d8] lhr [info]HEAD / 200 - - 23.789 ms
2024-02-15T08:41:24Z app[e286033c6e39d8] lhr [info]GET /healthcheck 200 - - 29.048 ms
2024-02-15T08:41:28Z app[e286033c6e39d8] lhr [warn]Virtual machine exited abruptly
2024-02-15T08:41:29Z app[e286033c6e39d8] lhr [info][ 0.057523] PCI: Fatal: No config space access function found
2024-02-15T08:41:29Z app[e286033c6e39d8] lhr [info] INFO Starting init (commit: bfa79be)...
2024-02-15T08:41:29Z app[e286033c6e39d8] lhr [info] INFO Preparing to run: `docker-entrypoint.sh npm start` as root
2024-02-15T08:41:29Z app[e286033c6e39d8] lhr [info] INFO [fly api proxy] listening at /.fly/api
2024-02-15T08:41:29Z app[e286033c6e39d8] lhr [info]2024/02/15 08:41:29 listening on [fdaa:3:e4fa:a7b:be65:79c8:89d7:2]:22 (DNS: [fdaa::3]:53)
2024-02-15T08:41:29Z runner[e286033c6e39d8] lhr [info]Machine started in 627ms
2024-02-15T08:41:30Z app[e286033c6e39d8] lhr [info]> start
2024-02-15T08:41:30Z app[e286033c6e39d8] lhr [info]> cross-env NODE_ENV=production node ./build/server.js
2024-02-15T08:41:33Z app[e286033c6e39d8] lhr [info]✅ app ready: http://localhost:8080
2024-02-15T08:41:33Z app[e286033c6e39d8] lhr [info]✅ metrics ready: http://localhost:8081/metrics
2024-02-15T08:41:34Z app[d8d9e13a090d18] lhr [info]HEAD / 200 - - 34.636 ms
2024-02-15T08:41:34Z app[d8d9e13a090d18] lhr [info]GET /healthcheck 200 - - 42.649 ms
2024-02-15T08:41:34Z app[e286033c6e39d8] lhr [info]HEAD / 200 - - 86.321 ms
2024-02-15T08:41:34Z app[e286033c6e39d8] lhr [info]GET /healthcheck 200 - - 129.445 ms
2024-02-15T08:41:44Z app[d8d9e13a090d18] lhr [info]HEAD / 200 - - 38.961 ms
But when I make a request, I get this error message:
2024-02-15T08:41:59Z proxy[e286033c6e39d8] lhr [error]instance refused connection. is your app listening on 0.0.0.0:3000? make sure it is not only listening on 127.0.0.1 (hint: look at your startup logs, servers often print the address they are listening on)
Looking at the collected metrics, all requests appear to be handled by one instance (d8d9e13a090d18 in this case). The very odd thing is that the healthchecks are working fine on both machines. I guess these must come from within the Fly internal network, so it seems likely to me that this is a routing issue, but tbh I'm completely stuck.
My fly.toml file is below for reference:
# fly.toml app configuration file generated for ticketing-remix-fbe1 on 2023-12-14T15:19:16Z
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#
app = "ticketing-remix-fbe1"
primary_region = "lhr"
kill_signal = "SIGINT"
kill_timeout = "5s"
[experimental]
auto_rollback = true
[build]
[deploy]
release_command = "bash ./scripts/migrate.sh"
[env]
METRICS_PORT = "8081"
PORT = "8080"
[http_service]
internal_port = 3000
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 0
processes = ["app"]
[[services]]
protocol = "tcp"
internal_port = 8080
processes = ["app"]
[[services.ports]]
port = 80
handlers = ["http"]
force_https = true
[[services.ports]]
port = 443
handlers = ["tls", "http"]
[services.concurrency]
type = "connections"
hard_limit = 25
soft_limit = 20
[[services.tcp_checks]]
interval = "15s"
timeout = "2s"
grace_period = "1s"
[[services.http_checks]]
interval = "10s"
timeout = "2s"
grace_period = "5s"
method = "get"
path = "/healthcheck"
protocol = "http"
tls_skip_verify = false
[[vm]]
cpu_kind = "shared"
cpus = 1
memory_mb = 1024
[[metrics]]
port = 8081
path = "/metrics"