Slow start with failure to connect errors

I moved my Rails app from Heroku to fly.io a few days ago and all seems well except for the startup which takes about 13 seconds to complete. Below are the logs. You will notice two refused connections at 57.41 and 57.43 and finally it will connect. The longest wait period is between 57.45 and 57.53 when it finally kicks in. My app load times in this case are views 41.6ms and ActiveRecord 14.3ms. Please request further info if needed. Any help will be most appreciated.

2024-04-12T12:44:37Z runner[4d89d026b096d8] syd [info]machine exited with exit code 0, not restarting

2024-04-12T21:57:39Z proxy[3d8d95e7c053e8] syd [info]Starting machine

2024-04-12T21:57:40Z app[3d8d95e7c053e8] syd [info][ 0.041364] PCI: Fatal: No config space access function found

2024-04-12T21:57:40Z app[3d8d95e7c053e8] syd [info] INFO Starting init (commit: 5b8fb02)...

2024-04-12T21:57:40Z app[3d8d95e7c053e8] syd [info] INFO starting statics vsock server

2024-04-12T21:57:40Z app[3d8d95e7c053e8] syd [info] INFO Preparing to run: `/rails/bin/docker-entrypoint ./bin/rails server` as 1000

2024-04-12T21:57:40Z app[3d8d95e7c053e8] syd [info] INFO [fly api proxy] listening at /.fly/api

2024-04-12T21:57:40Z runner[3d8d95e7c053e8] syd [info]Machine started in 590ms

2024-04-12T21:57:40Z app[3d8d95e7c053e8] syd [info]2024/04/12 21:57:40 listening on [fdaa:9:141c:a7b:19a:18b6:5874:2]:22 (DNS: [fdaa::3]:53)

2024-04-12T21:57:41Z proxy[3d8d95e7c053e8] syd [info]machine started in 1.596390061s

2024-04-12T21:57:41Z proxy[3d8d95e7c053e8] syd [info]machine became reachable in 6.450255ms

2024-04-12T21:57:41Z proxy[3d8d95e7c053e8] syd **[error]instance refused connection. is your app listening on 0.0.0.0:3000? make sure it is not only listening on 127.0.0.1 (hint: look at your startup logs, servers often print the address they are listening on)**

2024-04-12T21:57:41Z proxy[4d89d026b096d8] syd [info]Starting machine

2024-04-12T21:57:41Z app[4d89d026b096d8] syd [info][ 0.038213] Spectre V2 : WARNING: Unprivileged eBPF is enabled with eIBRS on, data leaks possible via Spectre v2 BHB attacks!

2024-04-12T21:57:41Z app[4d89d026b096d8] syd [info][ 0.050300] PCI: Fatal: No config space access function found

2024-04-12T21:57:42Z app[4d89d026b096d8] syd [info] INFO Starting init (commit: 5b8fb02)...

2024-04-12T21:57:42Z app[4d89d026b096d8] syd [info] INFO starting statics vsock server

2024-04-12T21:57:42Z app[4d89d026b096d8] syd [info] INFO Preparing to run: `/rails/bin/docker-entrypoint ./bin/rails server` as 1000

2024-04-12T21:57:42Z app[4d89d026b096d8] syd [info] INFO [fly api proxy] listening at /.fly/api

2024-04-12T21:57:42Z app[4d89d026b096d8] syd [info]2024/04/12 21:57:42 listening on [fdaa:9:141c:a7b:232:ff3f:e05b:2]:22 (DNS: [fdaa::3]:53)

2024-04-12T21:57:42Z runner[4d89d026b096d8] syd [info]Machine started in 750ms

2024-04-12T21:57:42Z app[3d8d95e7c053e8] syd [info]=> Booting Puma

2024-04-12T21:57:42Z app[3d8d95e7c053e8] syd [info]=> Rails 7.1.2 application starting in production

2024-04-12T21:57:42Z app[3d8d95e7c053e8] syd [info]=> Run `bin/rails server --help` for more startup options

2024-04-12T21:57:43Z proxy[4d89d026b096d8] syd [info]machine started in 1.762273764s

2024-04-12T21:57:43Z proxy[4d89d026b096d8] syd [info]machine became reachable in 6.714423ms

2024-04-12T21:57:43Z proxy[4d89d026b096d8] syd **[error]instance refused connection. is your app listening on 0.0.0.0:3000? make sure it is not only listening on 127.0.0.1 (hint: look at your startup logs, servers often print the address they are listening on)**

2024-04-12T21:57:43Z app[3d8d95e7c053e8] syd [info]Puma starting in single mode...

2024-04-12T21:57:43Z app[3d8d95e7c053e8] syd [info]* Puma version: 6.4.2 (ruby 3.2.2-p53) ("The Eagle of Durango")

2024-04-12T21:57:43Z app[3d8d95e7c053e8] syd [info]* Min threads: 5

2024-04-12T21:57:43Z app[3d8d95e7c053e8] syd [info]* Max threads: 5

2024-04-12T21:57:43Z app[3d8d95e7c053e8] syd [info]* Environment: production

2024-04-12T21:57:43Z app[3d8d95e7c053e8] syd [info]* PID: 306

2024-04-12T21:57:43Z app[3d8d95e7c053e8] syd [info]* Listening on http://0.0.0.0:3000

2024-04-12T21:57:43Z app[3d8d95e7c053e8] syd [info]Use Ctrl-C to stop

2024-04-12T21:57:44Z app[4d89d026b096d8] syd [info]=> Booting Puma

2024-04-12T21:57:44Z app[4d89d026b096d8] syd [info]=> Rails 7.1.2 application starting in production

2024-04-12T21:57:44Z app[4d89d026b096d8] syd [info]=> Run `bin/rails server --help` for more startup options

2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]Puma starting in single mode...

2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* Puma version: 6.4.2 (ruby 3.2.2-p53) ("The Eagle of Durango")

2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* Min threads: 5

2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* Max threads: 5

2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* Environment: production

2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* PID: 306

2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* Listening on http://0.0.0.0:3000

2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]Use Ctrl-C to stop

2024-04-12T21:57:53Z app[3d8d95e7c053e8] syd [info]I, [2024-04-12T21:57:53.452342 #306] INFO -- : [71768bc3-f328-48a3-a7f0-9db24632c193] Started GET "/" for 2a09:8280:1::30:ca5e:0 at 2024-04-12 21:57:53 +0000

2024-04-12T21:57:53Z app[3d8d95e7c053e8] syd [info]I, [2024-04-12T21:57:53.453589 #306] INFO -- : [71768bc3-f328-48a3-a7f0-9db24632c193] Processing by PagesController#home as HTML

2024-04-12T21:57:53Z app[3d8d95e7c053e8] syd [info]I, [2024-04-12T21:57:53.511333 #306] INFO -- : [71768bc3-f328-48a3-a7f0-9db24632c193] Rendered layout layouts/application.html.erb (Duration: 54.2ms | Allocations: 30477)

2024-04-12T21:57:53Z app[3d8d95e7c053e8] syd [info]I, [2024-04-12T21:57:53.511613 #306] INFO -- : [71768bc3-f328-48a3-a7f0-9db24632c193] Completed 200 OK in 58ms (Views: 41.6ms | ActiveRecord: 14.3ms | Allocations: 32015)

I just ran my production app locally and confirmed that it is listening on Port 3000

*  Environment: production
10:45:42 web.1  | *          PID: 28313
10:45:42 web.1  | * Listening on http://0.0.0.0:3000

I guess I need to restate my questions:

1. Why would the system think my app may be 'listening on 127.0.0.1' instead of 0.0.0.0:3000 and why report an ERROR?
2. Why is the startup taking 13 seconds with most between: 

2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* Listening on http://0.0.0.0:3000

and

2024-04-12T21:57:53Z app[3d8d95e7c053e8] syd [info]I, [2024-04-12T21:57:53.452342 #306]  INFO -- : [71768bc3-f328-48a3-a7f0-9db24632c193] Started GET "/" for 2a09:8280:1::30:ca5e:0 at 2024-04-12 21:57:53 +0000

i.e between Listening on http://0.0.0.0:3000 and Started GET “/” for . . . ?

Are you using postgres?

Focusing on one machine:

2024-04-12T12:44:37Z runner[4d89d026b096d8] syd [info]machine exited with exit code 0, not restarting
2024-04-12T21:57:41Z proxy[4d89d026b096d8] syd [info]Starting machine
2024-04-12T21:57:41Z app[4d89d026b096d8] syd [info][ 0.038213] Spectre V2 : WARNING: Unprivileged eBPF is enabled with eIBRS on, data leaks possible via Spectre v2 BHB attacks!
2024-04-12T21:57:41Z app[4d89d026b096d8] syd [info][ 0.050300] PCI: Fatal: No config space access function found
2024-04-12T21:57:42Z app[4d89d026b096d8] syd [info] INFO Starting init (commit: 5b8fb02)...
2024-04-12T21:57:42Z app[4d89d026b096d8] syd [info] INFO starting statics vsock server
2024-04-12T21:57:42Z app[4d89d026b096d8] syd [info] INFO Preparing to run: `/rails/bin/docker-entrypoint ./bin/rails server` as 1000
2024-04-12T21:57:42Z app[4d89d026b096d8] syd [info] INFO [fly api proxy] listening at /.fly/api
2024-04-12T21:57:42Z app[4d89d026b096d8] syd [info]2024/04/12 21:57:42 listening on [fdaa:9:141c:a7b:232:ff3f:e05b:2]:22 (DNS: [fdaa::3]:53)
2024-04-12T21:57:42Z runner[4d89d026b096d8] syd [info]Machine started in 750ms
2024-04-12T21:57:43Z proxy[4d89d026b096d8] syd [info]machine started in 1.762273764s
2024-04-12T21:57:43Z proxy[4d89d026b096d8] syd [info]machine became reachable in 6.714423ms
2024-04-12T21:57:43Z proxy[4d89d026b096d8] syd **[error]instance refused connection. is your app listening on 0.0.0.0:3000? make sure it is not only listening on 127.0.0.1 (hint: look at your startup logs, servers often print the address they are listening on)**
2024-04-12T21:57:44Z app[4d89d026b096d8] syd [info]=> Booting Puma
2024-04-12T21:57:44Z app[4d89d026b096d8] syd [info]=> Rails 7.1.2 application starting in production
2024-04-12T21:57:44Z app[4d89d026b096d8] syd [info]=> Run `bin/rails server --help` for more startup options
2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]Puma starting in single mode...
2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* Puma version: 6.4.2 (ruby 3.2.2-p53) ("The Eagle of Durango")
2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* Min threads: 5
2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* Max threads: 5
2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* Environment: production
2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* PID: 306
2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]* Listening on http://0.0.0.0:3000
2024-04-12T21:57:45Z app[4d89d026b096d8] syd [info]Use Ctrl-C to stop

You mentioned 13 seconds. Much of that is between the previous machine exiting and the new one starting. That’s unusual. I run a Rails app in a number of regions, including syd, and I’m not seeing that. I can’t explain it. Perhaps you have an unusually large image size? If that can’t be avoided, you might want to look into Picking a deployment strategy to minimize down time.

Now as to the 2 second delay, the Dockerfile and bin/docker-entrypoint provided with Rails 7 will attempt to run migrations before running puma. This means that migrations will be run on each machine. A better approach is to only run migrations once, on an ephemeral VM, using a release_command. In many cases, running the following command will make a number of recommendations and changes for you:

bin/rails generate dockerfile

Thank you. Yes, I am using Postgres.
My app is just a hobby thing used by a few family members to share family history. It is a retirement project/hobby for me. I still have a lot to learn but I am already happier with fly.io than I was with Heroku.
I don’t yet understand your suggestions but will certainly look into them. I assume that the bin/rails generate dockerfile command will modily my existing dockerfile?
What is the deal with the ‘instance refused connection’ error? I added a second region to see if that helped but it just increased the amount of times the error occurred.

I forgot to add that yesterday I had a ‘ran out of memory’ crash and a recommendation from fly.io to add more RAM. There was no activity at the time. Don’t know if this is related to the other issues or not.

2024-04-12T23:54:25Z runner[3d8d7de3b3e398] syd [info]Machine created and started in 6.043s
2024-04-12T23:54:27Z app[3d8d7de3b3e398] syd [info]2024/04/12 23:54:27 New SSH Session - ronirish@exemail.com.au
2024-04-13T00:46:00Z app[3d8d7de3b3e398] syd [info][ 3095.159055] Out of memory: Killed process 310 (ruby) total-vm:337212kB, anon-rss:159456kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:540kB oom_score_adj:0

It will prompt you on how to proceed. You can view the diffs, accept, or reject the changes.

The important changes are:

--- /Users/rubys/tmp/demo/bin/docker-entrypoint	2024-04-13 17:53:03
+++ /Users/rubys/tmp/demo/bin/docker-entrypoint20240413-93403-afclh8	2024-04-13 17:58:55
@@ -1,8 +1,5 @@
 #!/bin/bash -e
 
-# If running the rails server then create or migrate existing database
-if [ "${1}" == "./bin/rails" ] && [ "${2}" == "server" ]; then
-  ./bin/rails db:prepare
-fi
+# Add any container initialization steps here

And:

--- /Users/rubys/tmp/demo/fly.toml	2024-04-13 17:55:29
+++ /Users/rubys/tmp/demo/fly.toml20240413-93403-vskk5m	2024-04-13 17:59:45
@@ -10,7 +10,7 @@
 [build]
 
 [deploy]
+  release_command = "./bin/rails db:prepare"

Fly starts your machine, then tries to connect to it before it is ready. That’s because your machine is running a migration; and will do so each time the machine restarts. You really only need to run the migration once.

Before I proceed and because I am unsure, my current /bin/docker-entrypoint file is:

# Add any container initialization steps here

exec "${@}"

and my fly.toml already contains:

[deploy]
  release_command = './bin/rails db:prepare'

This doesn’t appear to line up with what you put above. I have made no changes to these files. Should I proceed?

I ran the bin/rails generate dockerfile command and the only change was in fly.toml where single quotes were changed to double quotes - so no go.

I mentioned earlier (end of message 4) about memory. Could this cause the slow startup?

If memory is real memory, then you either have enough or you don’t. If it is swap, that could contribute to delays.

I’m curious about your fly.toml, and your rootfs sizes. The latter can be found on the dashboard for your application. Click on Overvew, and under Application Information you will find RootFS Sizes.

Two seconds is not out of the ordinary for starting a Rails application, and normally the proxy will wait for the Rails app to start before dispatching requests to the new machine.

Which means that at the moment, I can’t explain either the 13 second delay, or why you are seeing [error]instance refused connection.

I’m a failed retiree with a hobby Rails application myself, and my app potentially has a significantly longer delay than yours as I start several services and migrate multiple databases. If it helps, I can talk you through how I display a “upgrading” message while this takes place.

rootfs is 1 GB 6 machines. fly.toml is 638 bytes and follows.

# fly.toml app configuration file generated for holmes-genealogy on 2024-04-07T15:18:55+10:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#

app = 'holmes-genealogy'
primary_region = 'syd'
console_command = '/rails/bin/rails console'

[build]

[deploy]
  release_command = "./bin/rails db:prepare"

[http_service]
  internal_port = 3000
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 0
  processes = ['app']

[[vm]]
  memory = '1gb'
  cpu_kind = 'shared'
  cpus = 1

[[statics]]
  guest_path = '/rails/public'
  url_prefix = '/'

Each time the system is started it works fine but of course that doesn’t last. It is very fast to load in development but I guess there is a lot more happening in production. I am a bit bemused about migrations running every time a machine starts and would like to know how to manage migrations better.

A failed retiree is not a bad thing. You still have a lot to offer.

You are set up correctly. The default Dockerfile that Rails 7.1 provides would run migrations on every start; your docker-entrypoint does not have the call to run migrations, and your fly.toml has instructions to run migrations exactly once per deploy.

Thank you, at this stage, I am going to have to accept that the cause is something to do with my application. I have already been through my gemfile and routes but will keep persevering until I get an answer. Any suggestions would be appreciated.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.