Rails app stuck on Running release task (pending)

Hi,

Over the last couple of days when trying to deploy my Rails apps, they have all been getting stuck on: Running release task (pending).

A lot of the time after about 10 minutes it will finally complete the task, but sometimes it just times out. Does anyone know what could be causing this? I have tried deleting the builder but it didn’t help.

In my builder logs, when this happens it is just stuck in a loop of this:


2023-03-13T16:18:26.490 app[9185900a065383] lhr [info] time="2023-03-13T16:18:26.490154662Z" level=debug msg="checking docker activity"

2023-03-13T16:18:26.490 app[9185900a065383] lhr [info] time="2023-03-13T16:18:26.490456763Z" level=debug msg="Calling GET /v1.41/containers/json?filters=%7B%22status%22%3A%7B%22running%22%3Atrue%7D%7D&limit=0"

This could mean a few things:

  1. The release command is running forever and not outputting anything. This is separate from the builder; you should see release command output in fly logs -a <name-of-app>.
  2. The release command is crashing before it logs output
  3. It’s taking a while to schedule the release command, for example if we’re experiencing issues scheduling.

If it completes sometimes after 10 min, I’d lean towards the first problem.
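
One way to tell the first two cases apart is to run the release command by hand with a time limit and a timestamp on every line of output. Here is a minimal Ruby sketch of that idea. The helper name run_with_timing and its limit_seconds parameter are made up for illustration (they are not part of flyctl or Rails), and a local run will not reproduce everything about the Fly environment.

```ruby
require "open3"
require "timeout"

# Hypothetical helper: runs a command and echoes each line of its combined
# stdout/stderr, prefixed with the seconds elapsed since the command started.
# Returns [exit_status, lines]; exit_status is nil if the time limit was hit.
def run_with_timing(cmd, limit_seconds:, out: $stdout)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  lines = []
  Open3.popen2e(*cmd) do |stdin, output, wait_thread|
    stdin.close
    begin
      Timeout.timeout(limit_seconds) do
        output.each_line do |line|
          elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
          out.puts format("[%6.1fs] %s", elapsed, line.chomp)
          lines << line.chomp
        end
        return [wait_thread.value.exitstatus, lines]
      end
    rescue Timeout::Error
      begin
        # Stop the command once the time limit has passed.
        Process.kill("KILL", wait_thread.pid)
      rescue Errno::ESRCH
        # The process already exited on its own.
      end
      return [nil, lines]
    end
  end
end
```

You would call it with whatever your release_command is, for example run_with_timing(%w[bin/rails db:migrate], limit_seconds: 600). A nil status with little or no output looks like the first case; a non-zero status after only a line or two looks more like the second.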

My app is now offline, as it seems to be stuck in a pending state. Do you know how this can be fixed? I’ve tried fly scale count 1 as well as deploying again, but it didn’t help. When I tried to deploy, I got the error ‘No deployment available to monitor’.

➜  bishbashbooked git:(master) fly status --all
App
  Name     = bishbashbooked
  Owner    = personal
  Version  = 64
  Status   = pending
  Hostname = bishbashbooked.fly.dev
  Platform = nomad

Instances
ID      	PROCESS	VERSION	REGION	DESIRED	STATUS	HEALTH CHECKS      	RESTARTS	CREATED
4550ed7a	web    	64 ⇡   	lhr   	run    	failed	1 total            	2       	12m27s ago
60ed1144	web    	62     	lhr   	stop   	failed	1 total            	2       	28m26s ago
7fccf5eb	web    	62     	lhr   	stop   	failed	1 total            	2       	44m39s ago
24f97e6d	web    	62     	lhr   	stop   	failed	1 total, 1 critical	2       	1h0m ago
39c389e3	web    	62     	lhr   	stop   	failed	1 total, 1 critical	2       	1h16m ago
8694d2e2	web    	62     	lhr   	stop   	failed	1 total, 1 critical	2       	1h31m ago
b8ab3468	web    	62     	lhr   	stop   	failed	1 total, 1 critical	2       	1h47m ago
63209094	web    	62     	lhr   	stop   	failed	1 total            	2       	2h2m ago
37ee677f	web    	62     	lhr   	stop   	failed	1 total            	2       	2h18m ago
a33d4d7e	web    	62     	lhr   	stop   	failed	1 total, 1 critical	2       	2h34m ago
3191ff2c	web    	62     	lhr   	stop   	failed	1 total            	2       	2h50m ago
4ce0c3e2	web    	62     	lhr   	stop   	failed	1 total, 1 critical	2       	3h5m ago
e9a480a4	web    	62     	lhr   	stop   	failed	1 total, 1 critical	2       	3h21m ago
a2b9632f	web    	62     	lhr   	stop   	failed	1 total            	2       	3h37m ago
79960e11	web    	62     	lhr   	stop   	failed	1 total, 1 critical	2       	3h53m ago
3233542f	web    	62     	lhr   	stop   	failed	1 total            	2       	4h9m ago
9080f377	web    	62     	lhr   	stop   	failed	1 total            	2       	4h24m ago
a17554b7	web    	62     	lhr   	stop   	failed	1 total            	2       	4h40m ago
a361b3c2	web    	62     	lhr   	stop   	failed	1 total            	2       	4h56m ago
72274f1a	web    	62     	lhr   	stop   	failed	1 total, 1 critical	2       	5h12m ago
7034b05c	web    	62     	lhr   	stop   	failed	1 total            	2       	5h27m ago
3ec0393b	web    	62     	lhr   	stop   	failed	1 total            	2       	5h43m ago
cc86e721	web    	62     	lhr   	stop   	failed	1 total            	2       	5h59m ago
93a6b789	web    	62     	lhr   	stop   	failed	1 total            	2       	6h14m ago
a6b1a214	web    	62     	lhr   	stop   	failed	1 total            	2       	6h30m ago
6a4c2918	web    	62     	lhr   	stop   	failed	1 total, 1 critical	2       	6h46m ago
08db3f3f	web    	62     	lhr   	stop   	failed	1 total            	2       	7h1m ago
51501e37	web    	62     	lhr   	stop   	failed	1 total, 1 critical	2       	7h10m ago
2006f5e9	web    	62     	lhr   	stop   	failed	1 total            	2       	7h17m ago
2298ed52	web    	62     	lhr   	stop   	failed	1 total            	2       	7h21m ago
442ab264	web    	62     	lhr   	stop   	failed	1 total            	2       	7h23m ago
86f8c946	web    	62     	lhr   	stop   	failed	1 total            	2       	7h26m ago
7ad4b4fe	web    	62     	lhr   	stop   	failed	1 total            	2       	7h40m ago
788a4b0f	web    	62     	lhr   	stop   	failed	1 total            	2       	7h45m ago
e012c6fa	web    	62     	lhr   	stop   	failed	                   	0       	8h34m ago
7f87ea74	web    	62     	lhr   	stop   	failed	1 total            	2       	8h38m ago
0563a509	web    	62     	lhr   	stop   	failed	1 total            	2       	8h46m ago
66cfa911	web    	62     	lhr   	stop   	failed	1 total            	2       	8h57m ago
8ba0d7b3	web    	62     	lhr   	stop   	failed	1 total            	2       	9h0m ago
2fe9dcdf	web    	62     	lhr   	stop   	failed	1 total            	2       	9h2m ago
0cf12c5e	web    	62     	lhr   	stop   	failed	1 total            	2       	9h4m ago
b09387f3	web    	62     	lhr   	stop   	failed	                   	2       	10h30m ago
0cf19dcc	web    	61     	lhr   	run    	failed	1 total            	2       	16h18m ago
38dea705	web    	60     	lhr   	stop   	failed	1 total            	2       	16h26m ago
06cc7f41	web    	59     	lhr   	run    	failed	1 total            	2       	21h7m ago
0a5d9d0d	web    	59     	lhr   	run    	failed	1 total            	2       	22h43m ago
f0c6f125	app    	57     	lhr   	stop   	failed	1 total            	2       	23h8m ago

STATUS: failed probably means the app process is crashing.

You can see more specifics about an individual VM by running fly vm status <id>, for example fly vm status 60ed1144.

It looks like it’s failing due to port 8080:

Checks
ID                              	SERVICE 	STATE  	OUTPUT
3df2415693844068640885b45074b954	tcp-8080	warning

Do you know why this might be failing? This is my fly.toml, which is unchanged since the last successful deploy:

app = "bishbashbooked"
kill_signal = "SIGINT"
kill_timeout = 5
processes = []

[build]
  [build.args]
    BUILD_COMMAND = "bin/rails fly:build"
    SERVER_COMMAND = "bin/rails fly:server"

[deploy]
  release_command = "bin/rails fly:release"

[env]
  PORT = "8080"

[processes]
  web = "bin/rails fly:server"

[experimental]
  allowed_public_ports = []
  auto_rollback = true

[[services]]
  http_checks = []
  internal_port = 8080
  processes = ["web"]
  protocol = "tcp"
  script_checks = []
  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    force_https = true
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

[[statics]]
  guest_path = "/app/public"
  url_prefix = "/"

Hi @harold,

Yesterday, I had a similar issue in my Rails app:

2023-03-13T20:05:13Z app[232499a0] mad [info]2023-03-13T20:05:13.262Z pid=520 tid=k4c ERROR: heartbeat: Connection timed out - user specified timeout
2023-03-14T04:42:48Z health[f48e233d] mad [error]Health check on port 8080 has failed. Your app is not responding properly. Services exposed on ports [80, 443] will have intermittent failures until the health check passes.
2023-03-14T04:43:00Z health[f48e233d] mad [info]Health check on port 8080 is now passing.
2023-03-14T04:55:03Z health[f48e233d] mad [error]Health check on port 8080 has failed. Your app is not responding properly. Services exposed on ports [80, 443] will have intermittent failures until the health check passes.
2023-03-14T04:55:18Z health[f48e233d] mad [info]Health check on port 8080 is now passing.

I had to stop my VM and deploy again; after that, everything worked correctly.

Let me know how it goes,
Sergio Turpín

If vm status shows exit codes that are anything but zero, it means the process is crashing.

A health check in the warning state most likely means it never passed.
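
As far as I understand it, a TCP check only asks whether something accepts a connection on the port within the timeout. Here is a rough Ruby sketch of that kind of probe, which you could run against a locally started server. The method name port_open? is hypothetical, the 2-second default mirrors the timeout in the fly.toml above, and this is not the code Fly itself runs.

```ruby
require "socket"

# Hypothetical helper: returns true if a TCP connection to host:port can be
# opened within `timeout` seconds, false if it is refused or times out.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end
```

With the app running locally, port_open?("127.0.0.1", 8080) returning false would suggest the server never started listening on the port the check expects.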

Things to try when you are really stuck:

  1. Replace your Dockerfile with the first example on this page: Minimal Rails application · Fly Docs. Also delete or comment out the [build], [deploy], [env], and [processes] sections. Finally, change internal_port to 3000. If this fails to deploy, the problem is not your application.

  2. If the previous step succeeds and you have not significantly customized your Dockerfile, try replacing it with what we are currently providing by running the following commands:

    bundle add dockerfile-rails --optimistic --group development
    bin/rails generate dockerfile
    bundle lock --add-platform x86_64-linux
    

    Continue with the fly.toml from the previous step. Additionally change /app to /rails in the [[statics]] section.

  3. If the previous step fails, and you have Docker installed locally, try launching your application locally:

    bin/rails generate dockerfile --compose
    export RAILS_MASTER_KEY=$(cat config/master.key)
    docker compose build
    docker compose up
    

    PowerShell users will want to use the following command instead of export:

    $Env:RAILS_MASTER_KEY = Get-Content 'config\master.key'
    

Thanks for the reply. I tried deploying with the minimal Dockerfile and the modified fly.toml, and still got this error:

--> v72 failed - Failed due to unhealthy allocations - rolling back to job version 71 and deploying as v73

Any suggestions?

Update -

I reverted to the Dockerfile and fly.toml from the last successful deploy and tried again. It worked and the app is back online, but I don’t know why it worked this time when it didn’t earlier today.