I’m getting recurring failed deployments with this error message:
2023-08-08T12:28:30.285 runner[4d89151f096987] sin [error] Unable to pull image, not found, canceling deploy
I have tried fly deploy --no-cache. flyctl reports no errors. No issues showing on https://status.flyio.net/
Misiek
August 8, 2023, 12:55pm
2
I have the same problem. For me it started around 5 hours ago.
Misiek
August 8, 2023, 12:59pm
3
Moreover, fly deploy misleadingly reports that it is waiting for “something”:
Waiting for all green machines to start
Machine 9080edea271587 [app] - created
when in reality (as LOG_LEVEL=debug shows) the API is returning "error": "not_found: machine not found":
Machine 5683dedf43d68e [app] - created
DEBUG <-- 404 https://api.machines.dev/v1/apps/meerkat/machines/5683dedf43d68e/wait?instance_id=01H7AK7Q53W3RCJ6EZFF2GEDJ2&state=started&timeout=60 (396.93ms)
DEBUG {
"error": "not_found: machine not found"
}
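For anyone who wants to check whether they are hitting the same 404, here is a minimal sketch that filters the debug output down to failed API responses. It assumes the response lines keep the `DEBUG <-- <status> <url>` shape quoted above; `failed_api_calls` is a made-up helper name, not part of flyctl.

```shell
# Keep only API responses with a 4xx or 5xx status from flyctl debug output.
# Assumes lines shaped like the one quoted above:
#   DEBUG <-- 404 https://api.machines.dev/... (396.93ms)
failed_api_calls() {
  grep -E 'DEBUG <-- [45][0-9]{2} '
}

# Usage (not executed here; 2>&1 is an assumption in case the debug
# lines are written to stderr):
#   LOG_LEVEL=debug fly deploy 2>&1 | failed_api_calls
```

If nothing matches, grep exits non-zero and prints nothing, which would suggest the deploy is failing somewhere other than the Machines API calls.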
Same problem: Can't deploy the app
I have spent 3 hours on this and still can’t deploy the app.
How did you guys fix this @madawei2699 @Misiek @craigdrayton ?
I have been facing this issue since last night. It has been more than 15 hours with no resolution.
Gwaggli
September 23, 2023, 12:09pm
6
Same here. I have the same app deployed twice (test and prod); test works just fine, but I can’t deploy to my prod environment. Any solutions?
Can you try running against our secondary registry and see if it works? It’ll help narrow down the issue:
FLY_REGISTRY_HOST=registry2.fly.io fly deploy
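If you deploy from a script, the same check can be wrapped as a one-time retry. This is only a sketch: `deploy_with_fallback` is a hypothetical helper name, and it assumes the first failure is registry-related, which the thread shows is not always the case.

```shell
# Try the default registry first; if the deploy fails, retry once
# against the secondary registry mentioned above.
# deploy_with_fallback is a hypothetical helper, not a flyctl command.
deploy_with_fallback() {
  fly deploy "$@" && return 0
  echo "deploy failed, retrying with FLY_REGISTRY_HOST=registry2.fly.io" >&2
  FLY_REGISTRY_HOST=registry2.fly.io fly deploy "$@"
}

# Usage (not executed here):
#   deploy_with_fallback --no-cache
```

Note that the retry fires on any failure, including build errors, so it is better suited to narrowing down the problem than to permanent use.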
Using registry2.fly.io fixed the issue for me
@sniper2804 for me, the issue resolved itself after some time.
I ended up moving away from fly.io. I encountered too many unreported intermittent issues to place my trust in it.
Heath
December 5, 2025, 3:59am
11
Hi! I started randomly getting this issue today. Oddly, I have a worker group and an app group, each with two machines, and I’m able to deploy to three out of four of them. One machine, always the same one, gets this error.
Here’s what my config file looks like:
app = 'appname'
primary_region = 'ewr'
kill_signal = 'SIGINT'
kill_timeout = '5s'

[experimental]
  auto_rollback = true

[build]
  dockerfile = 'Dockerfile.web'

[deploy]
  release_command = 'python manage.py migrate --noinput'

[env]
  DJANGO_SETTINGS_MODULE = 'appname.settings_production'
  PORT = '8080'

[processes]
  app = 'gunicorn --bind 0.0.0.0:8080 --workers 1 --threads 8 --timeout 0 appname.wsgi:application'
  worker = 'celery -A appname worker -l INFO --beat --concurrency=2'

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = 'stop'
  auto_start_machines = true
  min_machines_running = 1
  processes = ['app']

  [[http_service.checks]]
    interval = '10s'
    timeout = '2s'
    grace_period = '5s'
    method = 'GET'
    path = '/healthcheck/'

    [http_service.checks.headers]
      Host = 'localhost:8000'
      X-Forwarded-Proto = 'https'

[[vm]]
  memory = '2gb'
  cpu_kind = 'shared'
  cpus = 1
  processes = ['worker', 'app']
I tried switching the registry, but that didn’t help. I also restarted the borked machine, but no dice. I’m going to see if I can figure out how to get just that machine to update.
Any suggestions are welcome; I’m not sure what to look at.
Here’s the fail:
Same deployment on other machines:
Hi… Since you don’t have volumes, I would clone the Machine that does work and then destroy the malfunctioning Machine—once that new node has booted up successfully. (Cloning from within the same process group, of course.)
Occasionally individual physical host machines have network problems, etc.
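A rough sketch of the clone-then-destroy sequence, for anyone landing here later. The machine IDs are placeholders to fill in from `fly machine list`, and `run`, `clone_machine`, and `destroy_machine` are made-up helper names. By default the helpers only print the commands; set DRY_RUN=0 to actually run them.

```shell
# Print each command; execute it only when DRY_RUN=0 is set.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Step 1: clone a healthy Machine from the same process group.
# usage: clone_machine <app> <healthy-machine-id>
clone_machine() {
  run fly machine clone "$2" --app "$1"
}

# Step 2: destroy the broken Machine. Run this only after the clone
# shows up as started in `fly machine list`.
# --force is used on the assumption that the Machine is not cleanly stopped.
# usage: destroy_machine <app> <broken-machine-id>
destroy_machine() {
  run fly machine destroy "$2" --app "$1" --force
}

# Usage (not executed here; IDs are placeholders):
#   clone_machine appname <healthy-machine-id>
#   fly machine list --app appname
#   destroy_machine appname <broken-machine-id>
```

Keeping the two steps as separate calls leaves room to confirm the new Machine booted before removing the old one, which is the order suggested above.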
Heath
December 5, 2025, 4:20am
13
Hey thanks, that was a nice simple fix!