New Rails 7 app won't start - possibly OOM?

getagb · March 15, 2022, 2:22am

I created a new Ruby on Rails app (latest version of Rails 7). The port number in fly.toml is correct and it’s listening on 0.0.0.0. Here’s the output from my command line:

Release v2 created
Monitoring Deployment

1 desired, 1 placed, 0 healthy, 1 unhealthy [health checks: 1 total, 1 critical]
v2 failed - Failed due to unhealthy allocations - no stable job version to auto revert to
Failed Instances

==> Failure #1

Instance
  ID            = ebca6ffa
  Process       =
  Version       = 2
  Region        = ord
  Desired       = run
  Status        = running
  Health Checks = 1 total, 1 critical
  Restarts      = 0
  Created       = 5m21s ago

Recent Events
TIMESTAMP            TYPE       MESSAGE
2022-03-15T02:03:22Z Received   Task received by client
2022-03-15T02:03:48Z Task Setup Building Task Directory
2022-03-15T02:03:51Z Started    Task started by client

Recent Logs
2022-03-15T02:03:51.000 [info] Configuring firecracker
2022-03-15T02:03:51.000 [info] Starting virtual machine
2022-03-15T02:03:51.000 [info] Starting init (commit: 0c50bff)...
2022-03-15T02:03:51.000 [info] Preparing to run: `/bin/sh -c bundle exec bin/rails s -b 0.0.0.0` as root
2022-03-15T02:03:51.000 [info] 2022/03/15 02:03:51 listening on [fdaa:0:534d:a7b:81:ebca:6ffa:2]:22 (DNS: [fdaa::3]:53)
2022-03-15T02:03:52.000 [info] Calling `DidYouMean::SPELL_CHECKERS.merge!(error_name => spell_checker)' has been deprecated. Please call `DidYouMean.correct_error(error_name, spell_checker)' instead.
2022-03-15T02:04:00.000 [info] => Booting Puma
2022-03-15T02:04:00.000 [info] => Rails 7.0.2.3 application starting in development
2022-03-15T02:04:00.000 [info] => Run `bin/rails server --help` for more startup options
***v2 failed - Failed due to unhealthy allocations - no stable job version to auto revert to and deploying as v3

Troubleshooting guide at https://fly.io/docs/getting-started/troubleshooting/
Error abort

Here’s the output from the server:

2022-03-15T02:03:37Z runner[cd626286] ord [info]Shutting down virtual machine
2022-03-15T02:03:37Z app[cd626286] ord [info]Sending signal SIGINT to main child process w/ PID 514
2022-03-15T02:03:37Z proxy[cd626286] fra [error]error.code=2000 error.message="problem connecting to app instance" request.method="GET" request.url="/" request.id="01FY5MZ5X582NCXWHHD4W5TKYY-fra" response.status=502
2022-03-15T02:03:49Z runner[ebca6ffa] ord [info]Starting instance
2022-03-15T02:03:49Z runner[ebca6ffa] ord [info]Configuring virtual machine
2022-03-15T02:03:49Z runner[ebca6ffa] ord [info]Pulling container image
2022-03-15T02:03:49Z runner[ebca6ffa] ord [info]Unpacking image
2022-03-15T02:03:49Z runner[ebca6ffa] ord [info]Preparing kernel init
2022-03-15T02:03:51Z runner[ebca6ffa] ord [info]Configuring firecracker
2022-03-15T02:03:51Z runner[ebca6ffa] ord [info]Starting virtual machine
2022-03-15T02:03:51Z app[ebca6ffa] ord [info]Starting init (commit: 0c50bff)...
2022-03-15T02:03:51Z app[ebca6ffa] ord [info]Preparing to run: `/bin/sh -c bundle exec bin/rails s -b 0.0.0.0` as root
2022-03-15T02:03:51Z app[ebca6ffa] ord [info]2022/03/15 02:03:51 listening on [fdaa:0:534d:a7b:81:ebca:6ffa:2]:22 (DNS: [fdaa::3]:53)
2022-03-15T02:03:52Z app[ebca6ffa] ord [info]Calling `DidYouMean::SPELL_CHECKERS.merge!(error_name => spell_checker)' has been deprecated. Please call `DidYouMean.correct_error(error_name, spell_checker)' instead.
2022-03-15T02:04:00Z app[ebca6ffa] ord [info]=> Booting Puma
2022-03-15T02:04:00Z app[ebca6ffa] ord [info]=> Rails 7.0.2.3 application starting in development
2022-03-15T02:04:00Z app[ebca6ffa] ord [info]=> Run `bin/rails server --help` for more startup options

I don’t see any error messages in the server logs.

Interestingly, even though the instance went critical, it’s still running according to fly status --all:

Deployment Status
  ID          = 66c74e0d-1746-b3a9-f5bb-35c000bab407
  Version     = v2
  Status      = failed
  Description = Failed due to unhealthy allocations - no stable job version to auto revert to
  Instances   = 1 desired, 1 placed, 0 healthy, 1 unhealthy

Instances
ID      	PROCESS	VERSION	REGION	DESIRED	STATUS  	HEALTH CHECKS      	RESTARTS	CREATED
ebca6ffa	app    	2 ⇡    	ord   	run    	running 	1 total, 1 critical	1       	16m7s ago
cd626286	app    	1      	ord   	stop   	complete	1 total, 1 critical	2       	31m52s ago
a27290fa	app    	0      	ord   	run    	failed  	1 total, 1 critical	14      	2022-03-14T01:21:36Z

And the VM status shows a 137 exit code, which from Googling looks like an out-of-memory kill:

Instance
  ID            = ebca6ffa
  Process       =
  Version       = 2
  Region        = ord
  Desired       = run
  Status        = running
  Health Checks = 1 total, 1 critical
  Restarts      = 1
  Created       = 16m48s ago

Recent Events
TIMESTAMP            TYPE            MESSAGE
2022-03-15T02:03:22Z Received        Task received by client
2022-03-15T02:03:48Z Task Setup      Building Task Directory
2022-03-15T02:03:51Z Started         Task started by client
2022-03-15T02:08:48Z Alloc Unhealthy Task not running for min_healthy_time of 10s by deadline
2022-03-15T02:18:59Z Terminated      Exit Code: 137
2022-03-15T02:18:59Z Restarting      Task restarting in 1.038181586s
2022-03-15T02:19:07Z Started         Task started by client

Checks
ID                               SERVICE  STATE    OUTPUT
2c049ace173aa212ee0332a9b0a966d5 tcp-3000 critical dial tcp 172.19.5.18:3000: connect: connection refused

Recent Logs

But I’m not sure how to debug this. I don’t think a Rails app should have trouble starting up with 256MB, since it’s not even getting to any of my controller code yet. Any suggestions?
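
A rough sketch of how the 137 exit code can be read: by shell convention, an exit status above 128 usually means the process was terminated by a signal, with the signal number being the status minus 128. The helper name `signal_for_exit_code` is made up for illustration. 137 - 128 = 9, which is SIGKILL, the signal the Linux OOM killer sends. Other things can send SIGKILL too, so this is a hint rather than proof of an OOM.

```ruby
# Hypothetical helper: map a shell-style exit code to the name of the
# signal that likely terminated the process (128 + signal number).
# Returns nil for codes that do not follow that convention.
def signal_for_exit_code(code)
  return nil unless code > 128

  Signal.signame(code - 128)
end

puts signal_for_exit_code(137).inspect
puts signal_for_exit_code(1).inspect
```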

indirect · March 15, 2022, 5:00am

I was worried about this today. An easy way to check memory usage is to look at the per-app metrics, on the graph labelled “Firecracker memory usage”. It’s possible to run a Rails 7 app in less than 225MB, although mine (with around 100 gems) is right up against the edge and using 220MB.

https://fly.io/apps/<appname>/metrics
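
As a complement to the metrics graph, here is a minimal sketch for checking the resident memory of a single Ruby process from the inside, assuming a Linux environment where `/proc/self/status` exists. The helper name `rss_mb` is hypothetical. This measures only one process, so it will not match the whole-VM figure on the Firecracker graph exactly.

```ruby
# Hypothetical helper: extract the resident set size (VmRSS) in MB from
# the text of /proc/<pid>/status, where Linux reports the value in kB.
# Returns nil if no VmRSS line is found.
def rss_mb(status_text)
  match = status_text.match(/^VmRSS:\s+(\d+)\s+kB/)
  match && (match[1].to_i / 1024.0).round(1)
end

# Only Linux exposes /proc; on other systems this prints nothing.
status_path = "/proc/self/status"
puts "RSS: #{rss_mb(File.read(status_path))} MB" if File.readable?(status_path)
```

This could be run from `bin/rails runner` or a console session on the VM to see how much the booted app itself is holding.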

indirect · March 15, 2022, 5:03am

An easy way to verify your hypothesis would be to give your VM 1GB and see if that fixes things:

fly scale memory 1024

getagb · March 15, 2022, 2:53pm

Thanks, I actually took the inverse approach and applied the lower memory limit to the Docker container locally. I found I could avoid hitting the limit by setting RAILS_ENV=production, which I hadn’t set before (I guess I’m too spoiled by Heroku, which does it for you). Now it boots up fine with no memory issues! And of course I now have another error: Error during failsafe response: The asset "application.js" is not present in the asset pipeline. I will try to debug it and start a separate thread if I get stuck.
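
For context on why the logs above say "application starting in development": Rails picks its environment from RAILS_ENV, then RACK_ENV, and falls back to "development" when neither is set. The sketch below mimics that fallback order in plain Ruby. The function name `resolve_rails_env` is made up for illustration and is not part of Rails.

```ruby
# Sketch of the fallback order Rails uses to choose its environment:
# RAILS_ENV first, then RACK_ENV, then "development".
# Blank values are treated the same as unset ones.
def resolve_rails_env(env = ENV)
  candidates = [env["RAILS_ENV"], env["RACK_ENV"]]
  candidates.find { |value| value && !value.strip.empty? } || "development"
end

puts resolve_rails_env({})
puts resolve_rails_env({ "RAILS_ENV" => "production" })
```

So on a host that does not set the variable for you, the app boots in development mode unless RAILS_ENV is set explicitly.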

kurt · March 15, 2022, 3:21pm

This is almost definitely an OOM. If you upgrade your RAM it should work. Rails can work on 256MB of RAM but it’s pretty common to see OOMs at that level.

jsierles · March 15, 2022, 3:31pm

Good catch! We updated our Rails launcher to set RAILS_ENV at deploy time.

matoni109 · May 16, 2022, 9:53am

What was the fix for
Error during failsafe response: The asset "application.js" is not present in the asset pipeline
as I now have that error too…

getagb · May 16, 2022, 12:12pm

I had to configure Fly.io to serve the public assets, since Rails doesn’t serve them directly. Basically I just added the section described here to fly.toml: App Configuration (fly.toml)

matoni109 · May 16, 2022, 11:41pm

thanks @getagb
For those that come after, you’ll need the below in your Dockerfile at some point:

# in Dockerfile
COPY . /app
RUN bundle exec rails assets:precompile

and

# below in your fly.toml
[env]
  PORT = "8080"

[[statics]]
  guest_path = "/app/public"
  url_prefix = "/"

jsierles · May 17, 2022, 7:18am

Also, flyctl launch now detects Rails apps and drops a decent Dockerfile in your project.

indirect · May 20, 2022, 6:57pm

Maybe consider defaulting to the Ruby 3.1 series instead of 2.7.3? The 2.7 series shipped in 2019, and 2.7.3 is three CVE fixes behind.
