"fly deploy --image=edgedb/edgedb ..." command fails with "sh: 0: Can't open start.sh"

I’m attempting to follow the EdgeDB guide for deploying to Fly.io but the deployment keeps failing for the same reason. Here’s the console output:

% flyctl deploy --image=edgedb/edgedb --remote-only --app $EDB_APP
==> Verifying app config
--> Verified app config
==> Building image
Searching for image 'edgedb/edgedb' remotely...
image found: img_0lq7478rl6yp6x35
==> Creating release
--> release v8 created

--> You can detach the terminal anytime without stopping the deployment
==> Monitoring deployment

 1 desired, 1 placed, 0 healthy, 1 unhealthy [restarts: 2] [health checks: 2 total]
Failed Instances

Failure #1

Instance
ID      	PROCESS	VERSION	REGION	DESIRED	STATUS	HEALTH CHECKS	RESTARTS	CREATED 
85aba1c2	       	8      	ord   	run    	failed	2 total      	2       	19s ago	

Recent Events
TIMESTAMP           	TYPE           	MESSAGE                                                         
2022-05-30T18:57:43Z	Received       	Task received by client                                        	
2022-05-30T18:57:43Z	Task Setup     	Building Task Directory                                        	
2022-05-30T18:57:46Z	Started        	Task started by client                                         	
2022-05-30T18:57:48Z	Terminated     	Exit Code: 127                                                 	
2022-05-30T18:57:48Z	Restarting     	Task restarting in 1.199537727s                                	
2022-05-30T18:57:56Z	Started        	Task started by client                                         	
2022-05-30T18:57:58Z	Terminated     	Exit Code: 127                                                 	
2022-05-30T18:57:58Z	Restarting     	Task restarting in 1.004968936s                                	
2022-05-30T18:58:05Z	Started        	Task started by client                                         	
2022-05-30T18:58:07Z	Terminated     	Exit Code: 127                                                 	
2022-05-30T18:58:07Z	Not Restarting 	Exceeded allowed attempts 2 in interval 5m0s and mode is "fail"	
2022-05-30T18:58:07Z	Alloc Unhealthy	Unhealthy because of failed task                               	

2022-05-30T18:57:46Z   [info]2022/05/30 18:57:46 listening on [fdaa:0:44cb:a7b:9ad9:85ab:a1c2:2]:22 (DNS: [fdaa::3]:53)
2022-05-30T18:57:46Z   [info]sh: 0: Can't open start.sh
2022-05-30T18:57:47Z   [info]Main child exited normally with code: 127
2022-05-30T18:57:47Z   [info]Starting clean up.
2022-05-30T18:57:53Z   [info]Starting instance
2022-05-30T18:57:53Z   [info]Configuring virtual machine
2022-05-30T18:57:53Z   [info]Pulling container image
2022-05-30T18:57:53Z   [info]Unpacking image
2022-05-30T18:57:53Z   [info]Preparing kernel init
2022-05-30T18:57:55Z   [info]Configuring firecracker
2022-05-30T18:57:55Z   [info]Starting virtual machine
2022-05-30T18:57:56Z   [info]Starting init (commit: aa54f7d)...
2022-05-30T18:57:56Z   [info]Preparing to run: `sh start.sh` as root
2022-05-30T18:57:56Z   [info]2022/05/30 18:57:56 listening on [fdaa:0:44cb:a7b:9ad9:85ab:a1c2:2]:22 (DNS: [fdaa::3]:53)
2022-05-30T18:57:56Z   [info]sh: 0: Can't open start.sh
2022-05-30T18:57:57Z   [info]Main child exited normally with code: 127
2022-05-30T18:57:57Z   [info]Starting clean up.
2022-05-30T18:58:03Z   [info]Starting instance
2022-05-30T18:58:03Z   [info]Configuring virtual machine
2022-05-30T18:58:03Z   [info]Pulling container image
2022-05-30T18:58:03Z   [info]Unpacking image
2022-05-30T18:58:03Z   [info]Preparing kernel init
2022-05-30T18:58:04Z   [info]Configuring firecracker
2022-05-30T18:58:04Z   [info]Starting virtual machine
2022-05-30T18:58:05Z   [info]Starting init (commit: aa54f7d)...
2022-05-30T18:58:05Z   [info]Preparing to run: `sh start.sh` as root
2022-05-30T18:58:05Z   [info]2022/05/30 18:58:05 listening on [fdaa:0:44cb:a7b:9ad9:85ab:a1c2:2]:22 (DNS: [fdaa::3]:53)
2022-05-30T18:58:05Z   [info]sh: 0: Can't open start.sh
2022-05-30T18:58:06Z   [info]Main child exited normally with code: 127
2022-05-30T18:58:06Z   [info]Starting clean up.
--> v8 failed - Failed due to unhealthy allocations - no stable job version to auto revert to and deploying as v9 

--> Troubleshooting guide at https://fly.io/docs/getting-started/troubleshooting/
Error abort

Any idea what the issue might be?

Hi @ianduvall, just to get a better understanding of what the issue may be, can you:

  • run fly doctor and paste the output
  • run fly version update and try another deploy.

Also what region are you attempting to deploy to?

% fly doctor        
Testing authentication token... PASSED
Testing flyctl agent... PASSED
Testing local Docker instance... PASSED
Pinging WireGuard gateway (give us a sec)... PASSED
% fly version update
Error no available update

No luck on redeploying. Trying to deploy to ORD.

Can you share your fly.toml? Specifically, the experimental section, if it exists.

I suspect something in there is overriding the edgedb/edgedb Docker entrypoint.

Here’s my fly.toml:

# fly.toml file generated for st-app on 2022-05-30T08:47:53-05:00

app = "st-app"

kill_signal = "SIGINT"
kill_timeout = 5
processes = []

[experimental]
  allowed_public_ports = []
  auto_rollback = true
  cmd = "start.sh"
  entrypoint = "sh"

[[services]]
  internal_port = 8080
  processes = ["app"]
  protocol = "tcp"
  script_checks = []

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.http_checks]]
    grace_period = "5s"
    interval = 10000
    method = "get"
    path = "/healthcheck"
    protocol = "http"
    timeout = 2000
    tls_skip_verify = false

    [services.http_checks.headers]

  [[services.ports]]
    force_https = true
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

Will this .toml config apply to all of my apps? I assumed it would only apply to my “st-app” app.

It might apply to another app only if you override the app name (with fly deploy -a <app-name>) while in the same directory as that fly.toml. That is likely what’s happening here, and I assume you don’t want this behaviour.

Is there a start.sh present in the WORKDIR? I think this is the issue. If you’re running fly deploy --image=edgedb/edgedb, then it won’t pick up anything from your filesystem.
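For anyone hitting the same thing, here is a rough pre-flight check you could run before deploying a prebuilt image. It only reports whether the fly.toml in the current directory sets cmd or entrypoint, as the config in this thread does. The function name has_entrypoint_override is made up for this sketch, and the grep is a simple line match, not a real TOML parser.

```shell
#!/bin/sh
# Hypothetical helper (not part of flyctl): succeed if the given fly.toml
# sets a `cmd` or `entrypoint` key, which would replace the image defaults.
has_entrypoint_override() {
    config="${1:-fly.toml}"
    # No config file means nothing can override the image's entrypoint.
    [ -f "$config" ] || return 1
    # Match uncommented, optionally indented `cmd =` or `entrypoint =` lines.
    grep -Eq '^[[:space:]]*(cmd|entrypoint)[[:space:]]*=' "$config"
}

if has_entrypoint_override fly.toml; then
    echo "fly.toml sets cmd/entrypoint; the image's own entrypoint may not run" >&2
fi
```

If the check fires in the directory you deploy edgedb/edgedb from, that config is a likely source of the `sh start.sh` line in the logs.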

What’s in your start.sh? If you need to run scripts before the EdgeDB entrypoint, you can put them in /docker-entrypoint.d (their entrypoint uses run-parts to run anything in there).

Ok so here’s the full command I’m running: flyctl deploy --image=edgedb/edgedb --remote-only --app $EDB_APP. I’m following the Remix starter.

I think the start.sh is meant for starting the Remix app, not the EdgeDB app. Its contents are:

#!/bin/sh

set -ex
edgedb migrate
npm run start

That sounds right. Can you try your initial flyctl deploy --image=edgedb/edgedb --remote-only --app $EDB_APP from a directory where fly.toml isn’t present?

This is a bit confusing, but it’s a feature if you need to launch multiple apps with the same config.

Same error, unfortunately. I think I’m going to nuke these apps and restart the guide from the beginning, knowing that I need to create the apps from separate directories.

Ah yes it’s probably stuck with the old configuration.

Your EdgeDB Remix app is a different app from your EdgeDB app. Make sure you do not have the same EDB_APP set for both of them. I believe this is why --app $EDB_APP ends up picking up the Remix app’s fly.toml, which sure enough runs start.sh, a file the edgedb/edgedb image doesn’t contain.

On your laptop, it’s better to explicitly pass --app <app-name-here> --config <fly-config.toml> rather than rely on env globals like $EDB_APP.
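A minimal sketch of that advice: a small helper that prints the deploy command with the app name and config path spelled out, so nothing depends on an exported variable. The function name explicit_deploy_cmd, the app name my-edgedb-app and the path ./edgedb/fly.toml are placeholders invented for illustration; the other flags are the ones already used in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the deploy command with every input explicit.
# Both arguments are placeholders; nothing is read from the environment.
explicit_deploy_cmd() {
    app="$1"
    config="$2"
    if [ -z "$app" ] || [ -z "$config" ]; then
        echo "usage: explicit_deploy_cmd <app-name> <path-to-fly.toml>" >&2
        return 2
    fi
    printf 'flyctl deploy --image=edgedb/edgedb --remote-only --app %s --config %s\n' \
        "$app" "$config"
}

# Print the command for review, then run it by hand once it looks right.
explicit_deploy_cmd my-edgedb-app ./edgedb/fly.toml
```

Printing instead of executing is a deliberate choice here: it lets you confirm which app and which config file are about to be used before anything is deployed.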