New Rails 7 app won't start - possibly OOM?

I created a new Ruby on Rails app (latest version of Rails 7). The port number in fly.toml is correct and the app is listening on 0.0.0.0. Here’s the output from my command line:

Release v2 created
Monitoring Deployment

1 desired, 1 placed, 0 healthy, 1 unhealthy [health checks: 1 total, 1 critical]
v2 failed - Failed due to unhealthy allocations - no stable job version to auto revert to
Failed Instances

==> Failure #1

Instance
  ID            = ebca6ffa
  Process       =
  Version       = 2
  Region        = ord
  Desired       = run
  Status        = running
  Health Checks = 1 total, 1 critical
  Restarts      = 0
  Created       = 5m21s ago

Recent Events
TIMESTAMP            TYPE       MESSAGE
2022-03-15T02:03:22Z Received   Task received by client
2022-03-15T02:03:48Z Task Setup Building Task Directory
2022-03-15T02:03:51Z Started    Task started by client

Recent Logs
2022-03-15T02:03:51.000 [info] Configuring firecracker
2022-03-15T02:03:51.000 [info] Starting virtual machine
2022-03-15T02:03:51.000 [info] Starting init (commit: 0c50bff)...
2022-03-15T02:03:51.000 [info] Preparing to run: `/bin/sh -c bundle exec bin/rails s -b 0.0.0.0` as root
2022-03-15T02:03:51.000 [info] 2022/03/15 02:03:51 listening on [fdaa:0:534d:a7b:81:ebca:6ffa:2]:22 (DNS: [fdaa::3]:53)
2022-03-15T02:03:52.000 [info] Calling `DidYouMean::SPELL_CHECKERS.merge!(error_name => spell_checker)' has been deprecated. Please call `DidYouMean.correct_error(error_name, spell_checker)' instead.
2022-03-15T02:04:00.000 [info] => Booting Puma
2022-03-15T02:04:00.000 [info] => Rails 7.0.2.3 application starting in development
2022-03-15T02:04:00.000 [info] => Run `bin/rails server --help` for more startup options
***v2 failed - Failed due to unhealthy allocations - no stable job version to auto revert to and deploying as v3

Troubleshooting guide at https://fly.io/docs/getting-started/troubleshooting/
Error abort

Here’s the output from the server:

2022-03-15T02:03:37Z runner[cd626286] ord [info]Shutting down virtual machine
2022-03-15T02:03:37Z app[cd626286] ord [info]Sending signal SIGINT to main child process w/ PID 514
error.message="problem connecting to app instance" 2022-03-15T02:03:37Z proxy[cd626286] fra [error]error.code=2000 request.method="GET" request.url="/" request.id="01FY5MZ5X582NCXWHHD4W5TKYY-fra" response.status=502
2022-03-15T02:03:49Z runner[ebca6ffa] ord [info]Starting instance
2022-03-15T02:03:49Z runner[ebca6ffa] ord [info]Configuring virtual machine
2022-03-15T02:03:49Z runner[ebca6ffa] ord [info]Pulling container image
2022-03-15T02:03:49Z runner[ebca6ffa] ord [info]Unpacking image
2022-03-15T02:03:49Z runner[ebca6ffa] ord [info]Preparing kernel init
2022-03-15T02:03:51Z runner[ebca6ffa] ord [info]Configuring firecracker
2022-03-15T02:03:51Z runner[ebca6ffa] ord [info]Starting virtual machine
2022-03-15T02:03:51Z app[ebca6ffa] ord [info]Starting init (commit: 0c50bff)...
2022-03-15T02:03:51Z app[ebca6ffa] ord [info]Preparing to run: `/bin/sh -c bundle exec bin/rails s -b 0.0.0.0` as root
2022-03-15T02:03:51Z app[ebca6ffa] ord [info]2022/03/15 02:03:51 listening on [fdaa:0:534d:a7b:81:ebca:6ffa:2]:22 (DNS: [fdaa::3]:53)
2022-03-15T02:03:52Z app[ebca6ffa] ord [info]Calling `DidYouMean::SPELL_CHECKERS.merge!(error_name => spell_checker)' has been deprecated. Please call `DidYouMean.correct_error(error_name, spell_checker)' instead.
2022-03-15T02:04:00Z app[ebca6ffa] ord [info]=> Booting Puma
2022-03-15T02:04:00Z app[ebca6ffa] ord [info]=> Rails 7.0.2.3 application starting in development
2022-03-15T02:04:00Z app[ebca6ffa] ord [info]=> Run `bin/rails server --help` for more startup options

I don’t see any error messages in the server logs.

Interestingly, even though the instance went critical, it’s still running according to fly status --all:

Deployment Status
  ID          = 66c74e0d-1746-b3a9-f5bb-35c000bab407
  Version     = v2
  Status      = failed
  Description = Failed due to unhealthy allocations - no stable job version to auto revert to
  Instances   = 1 desired, 1 placed, 0 healthy, 1 unhealthy

Instances
ID      	PROCESS	VERSION	REGION	DESIRED	STATUS  	HEALTH CHECKS      	RESTARTS	CREATED
ebca6ffa	app    	2 ⇡    	ord   	run    	running 	1 total, 1 critical	1       	16m7s ago
cd626286	app    	1      	ord   	stop   	complete	1 total, 1 critical	2       	31m52s ago
a27290fa	app    	0      	ord   	run    	failed  	1 total, 1 critical	14      	2022-03-14T01:21:36Z

And the VM status shows a 137 exit code, which from Googling looks like an out-of-memory kill:

Instance
  ID            = ebca6ffa
  Process       =
  Version       = 2
  Region        = ord
  Desired       = run
  Status        = running
  Health Checks = 1 total, 1 critical
  Restarts      = 1
  Created       = 16m48s ago

Recent Events
TIMESTAMP            TYPE            MESSAGE
2022-03-15T02:03:22Z Received        Task received by client
2022-03-15T02:03:48Z Task Setup      Building Task Directory
2022-03-15T02:03:51Z Started         Task started by client
2022-03-15T02:08:48Z Alloc Unhealthy Task not running for min_healthy_time of 10s by deadline
2022-03-15T02:18:59Z Terminated      Exit Code: 137
2022-03-15T02:18:59Z Restarting      Task restarting in 1.038181586s
2022-03-15T02:19:07Z Started         Task started by client

Checks
ID                               SERVICE  STATE    OUTPUT
2c049ace173aa212ee0332a9b0a966d5 tcp-3000 critical dial tcp 172.19.5.18:3000: connect: connection refused

Recent Logs

But I’m not sure how to debug this. I don’t think a Rails app should have trouble starting up with 256MB, since it’s not even getting to any of my controller code yet. Any suggestions?
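
For anyone decoding that exit code: by the usual shell convention, a status above 128 means the process was killed by signal (status - 128), so 137 is signal 9, SIGKILL. Here is a minimal Ruby sketch of that arithmetic. The helper name `describe_exit_status` is made up for illustration. Note that SIGKILL is what the kernel OOM killer sends, but it is not proof of an OOM on its own, since anything that force-stops the process produces the same code.

```ruby
# Hypothetical helper: decode a shell-style exit status.
# Statuses above 128 conventionally mean "killed by signal (status - 128)".
def describe_exit_status(status)
  return "exited with status #{status}" if status <= 128

  signo = status - 128
  name = Signal.signame(signo) # "KILL" for 9; nil if the number is unknown
  "killed by signal #{signo} (SIG#{name || '?'})"
end

puts describe_exit_status(137)
```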

I was worried about this today. An easy way to check memory usage is to look at the per-app metrics, on the graph labelled “Firecracker memory usage”. It’s possible to run a Rails 7 app in less than 256MB, although mine (with around 100 gems) is right up against the edge and using 225MB.

https://fly.io/apps/<appname>/metrics

An easy way to verify your hypothesis would be to give your VM 1GB and see if that fixes things:

fly scale memory 1024
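
If you would rather check from inside the VM (or a local container) than from the metrics page, here is a rough Ruby sketch that reads the process's resident memory from `/proc/self/status` on Linux. The helper name `rss_mb` is made up for illustration, and this only measures the Ruby process itself, so it will read lower than the VM-wide graph.

```ruby
# Hypothetical helper: pull the VmRSS line out of a /proc/<pid>/status dump
# and return resident memory in MB, or nil if the line is missing.
def rss_mb(status_text)
  match = status_text.match(/^VmRSS:\s+(\d+)\s+kB/)
  return nil unless match

  (match[1].to_i / 1024.0).round(1)
end

# Only available on Linux; skipped silently elsewhere.
if File.readable?("/proc/self/status")
  puts "resident memory: #{rss_mb(File.read('/proc/self/status'))} MB"
end
```

Running this from a Rails console or an initializer gives a quick idea of how close the process is to the VM's limit.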

Thanks, I took the inverse approach actually and applied the lower limit to the docker container locally. I found I could avoid the memory limit by setting RAILS_ENV=production which I hadn’t set before (I guess I’m too spoiled by Heroku which does it for you). Now it boots up fine with no memory issues! And of course I have another error :rofl: Error during failsafe response: The asset "application.js" is not present in the asset pipeline. — I will try to debug and start a separate thread for this if I get stuck.
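
For context on why the logs above say “application starting in development”: when neither RAILS_ENV nor RACK_ENV is set, Rails falls back to development. The Ruby below is a sketch of that lookup as I understand it, not Rails’s actual code, and `effective_rails_env` is a made-up name.

```ruby
# Hypothetical helper mirroring the fallback order:
# RAILS_ENV, then RACK_ENV, then "development". Empty strings count as unset.
def effective_rails_env(env = ENV)
  [env["RAILS_ENV"], env["RACK_ENV"]].find { |value| value && !value.empty? } || "development"
end

puts effective_rails_env
```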


This is almost definitely an OOM. If you upgrade your RAM it should work. Rails can work on 256MB of RAM but it’s pretty common to see OOMs at that level.

Good catch! We updated our Rails launcher to set RAILS_ENV at deploy time.


what was the fix for
Error during failsafe response: The asset "application.js" is not present in the asset pipeline
as i now have that error too…

I had to configure Fly.io to serve the public assets, since Rails doesn’t serve them directly. Basically I just added a statics section to fly.toml, as described in the docs: App Configuration (fly.toml)


Thanks @getagb.
For those who come after: you’ll need the following somewhere in your Dockerfile

# in Dockerfile
COPY . /app
RUN bundle exec rails assets:precompile

and

# below in your fly.toml
[env]
  PORT = "8080"

[[statics]]
  guest_path = "/app/public"
  url_prefix = "/"

Also, flyctl launch now detects Rails apps and drops a decent Dockerfile in your project.

Maybe consider defaulting to the Ruby 3.1 series instead of 2.7.3? The 2.7 series shipped in 2019, and 2.7.3 is three CVE fixes behind.