I have a simple Rails app configured to scale to zero on a single machine. When a request starts the machine, that first request always fails with a 502, seemingly because the app doesn’t start fast enough. It’s a simple Rails app that doesn’t do anything special on startup, yet in the logs below Puma only starts listening about 10 seconds after the machine is started. Subsequent requests succeed, since the machine is already running.
Is there anything I can do to prevent this? For example, would it be possible to increase the maximum number of attempts at load balancing? Any help will be greatly appreciated!
Here are the logs for when this happens:
2025-04-21T07:21:00Z proxy[e825104f015398] mad [info]Starting machine
2025-04-21T07:21:01Z app[e825104f015398] mad [info]2025-04-21T07:21:01.056784460 [01JSBJHRQKNHK885M9XD9NEMCN:main] Running Firecracker v1.7.0
2025-04-21T07:21:01Z health[e825104f015398] mad [warn]Health check on port 3000 is in a 'warning' state. Your app may not be responding properly.
2025-04-21T07:21:01Z app[e825104f015398] mad [info] INFO Starting init (commit: d15e62a13)...
2025-04-21T07:21:01Z app[e825104f015398] mad [info] INFO Checking filesystem on /data
2025-04-21T07:21:01Z app[e825104f015398] mad [info]/dev/vdc: clean, 12/64512 files, 8866/258048 blocks
2025-04-21T07:21:01Z app[e825104f015398] mad [info] INFO Mounting /dev/vdc at /data w/ uid: 1000, gid: 1000 and chmod 0755
2025-04-21T07:21:01Z app[e825104f015398] mad [info] INFO Resized /data to 1056964608 bytes
2025-04-21T07:21:01Z app[e825104f015398] mad [info] INFO starting statics vsock server
2025-04-21T07:21:01Z app[e825104f015398] mad [info] INFO Preparing to run: `/rails/bin/docker-entrypoint ./bin/rails server` as 1000
2025-04-21T07:21:01Z app[e825104f015398] mad [info] INFO [fly api proxy] listening at /.fly/api
2025-04-21T07:21:01Z runner[e825104f015398] mad [info]Machine started in 1.034s
2025-04-21T07:21:01Z proxy[e825104f015398] mad [info]machine started in 1.049074467s
2025-04-21T07:21:01Z proxy[e825104f015398] mad [info]machine became reachable in 7.569973ms
2025-04-21T07:21:02Z proxy[e825104f015398] mad [error][PC01] instance refused connection. is your app listening on 0.0.0.0:3000? make sure it is not only listening on 127.0.0.1 (hint: look at your startup logs, servers often print the address they are listening on)
2025-04-21T07:21:02Z app[e825104f015398] mad [info]2025/04/21 07:21:02 INFO SSH listening listen_address=[fdaa:a:54e5:a7b:49:367c:cc78:2]:22
2025-04-21T07:21:02Z health[e825104f015398] mad [error]Health check on port 3000 has failed. Your app is not responding properly.
2025-04-21T07:21:08Z app[e825104f015398] mad [info]=> Booting Puma
2025-04-21T07:21:08Z app[e825104f015398] mad [info]=> Rails 7.2.2.1 application starting in production
2025-04-21T07:21:08Z app[e825104f015398] mad [info]=> Run `bin/rails server --help` for more startup options
2025-04-21T07:21:10Z app[e825104f015398] mad [info]Puma starting in single mode...
2025-04-21T07:21:10Z app[e825104f015398] mad [info]* Puma version: 6.4.3 (ruby 3.3.5-p100) ("The Eagle of Durango")
2025-04-21T07:21:10Z app[e825104f015398] mad [info]* Min threads: 3
2025-04-21T07:21:10Z app[e825104f015398] mad [info]* Max threads: 3
2025-04-21T07:21:10Z app[e825104f015398] mad [info]* Environment: production
2025-04-21T07:21:10Z app[e825104f015398] mad [info]* PID: 668
2025-04-21T07:21:10Z app[e825104f015398] mad [info]* Listening on http://0.0.0.0:3000
2025-04-21T07:21:10Z app[e825104f015398] mad [info]Use Ctrl-C to stop
2025-04-21T07:21:12Z app[e825104f015398] mad [info]I, [2025-04-21T07:21:12.642042 #668] INFO -- : [71bf9302-ce04-4972-b86c-f5a66bd39260] Started GET "/up" for 172.19.25.177 at 2025-04-21 07:21:12 +0000
2025-04-21T07:21:12Z app[e825104f015398] mad [info]I, [2025-04-21T07:21:12.644409 #668] INFO -- : [71bf9302-ce04-4972-b86c-f5a66bd39260] Processing by Rails::HealthController#show as HTML
2025-04-21T07:21:12Z app[e825104f015398] mad [info]I, [2025-04-21T07:21:12.645447 #668] INFO -- : [71bf9302-ce04-4972-b86c-f5a66bd39260] Completed 200 OK in 1ms (Views: 0.3ms | ActiveRecord: 0.0ms (0 queries, 0 cached) | GC: 0.0ms)
2025-04-21T07:21:13Z health[e825104f015398] mad [info]Health check on port 3000 is now passing.
2025-04-21T07:21:16Z proxy[e825104f015398] mad [error][PR04] could not find a good candidate within 20 attempts at load balancing
2025-04-21T07:21:19Z app[e825104f015398] mad [info]I, [2025-04-21T07:21:19.037429 #668] INFO -- : [05d11e95-b94e-4c51-bb69-0bbfda6ab7f5] Started GET "/up" for 172.19.25.177 at 2025-04-21 07:21:19 +0000
2025-04-21T07:21:19Z app[e825104f015398] mad [info]I, [2025-04-21T07:21:19.038500 #668] INFO -- : [05d11e95-b94e-4c51-bb69-0bbfda6ab7f5] Processing by Rails::HealthController#show as HTML
2025-04-21T07:21:19Z app[e825104f015398] mad [info]I, [2025-04-21T07:21:19.039100 #668] INFO -- : [05d11e95-b94e-4c51-bb69-0bbfda6ab7f5] Completed 200 OK in 0ms (Views: 0.2ms | ActiveRecord: 0.0ms (0 queries, 0 cached) | GC: 0.0ms)
2025-04-21T07:21:19Z health[e825104f015398] mad [info]Health check on port 3000 is now passing.
(Although the health check eventually passes, the client has already received a 502 by then.)
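To make the timing in these logs easier to read, here is a rough Python sketch that computes how many seconds after "Starting machine" each key event is logged. The log lines are copied verbatim from above; the event labels, the `MARKERS` table, and the `offsets` helper are names I made up for illustration. The prefix timestamps only have one-second resolution, so the offsets are approximate.

```python
from datetime import datetime

# Key lines copied verbatim from the logs above.
LOG = """\
2025-04-21T07:21:00Z proxy[e825104f015398] mad [info]Starting machine
2025-04-21T07:21:02Z proxy[e825104f015398] mad [error][PC01] instance refused connection. is your app listening on 0.0.0.0:3000? make sure it is not only listening on 127.0.0.1 (hint: look at your startup logs, servers often print the address they are listening on)
2025-04-21T07:21:10Z app[e825104f015398] mad [info]* Listening on http://0.0.0.0:3000
2025-04-21T07:21:13Z health[e825104f015398] mad [info]Health check on port 3000 is now passing.
2025-04-21T07:21:16Z proxy[e825104f015398] mad [error][PR04] could not find a good candidate within 20 attempts at load balancing
"""

# Substrings identifying the events of interest (labels are hypothetical, chosen for readability).
MARKERS = {
    "first refused connection": "[PC01]",
    "Puma listening": "* Listening on",
    "health check passing": "is now passing",
    "proxy gives up": "[PR04]",
}


def parse_ts(line):
    """Parse the leading UTC timestamp of a log line."""
    return datetime.strptime(line.split(" ", 1)[0], "%Y-%m-%dT%H:%M:%SZ")


def offsets(log, markers, start_marker="Starting machine"):
    """Return seconds between the start marker and the first line matching each marker."""
    lines = [l for l in log.splitlines() if l.strip()]
    start = next(parse_ts(l) for l in lines if start_marker in l)
    result = {}
    for label, needle in markers.items():
        for line in lines:
            if needle in line:
                result[label] = int((parse_ts(line) - start).total_seconds())
                break
    return result


if __name__ == "__main__":
    for label, secs in offsets(LOG, MARKERS).items():
        print(f"{label}: +{secs}s")
```

Read this way, the first connection is refused roughly 2 seconds after the machine starts, Puma reports it is listening at roughly 10 seconds, and the PR04 error is logged at roughly 16 seconds, which is after Puma is up and after the first passing health check.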