Uploading build image to Fly registry times out

Lately, my deploys have been taking longer because of retries, or timing out outright. The image upload step seems to hang after the image has been uploaded: it gets stuck on the full progress bar, at the point where it would normally say “pushed”.

I’m running on a remote builder, and the relevant image layer is only about 40MB, so I don’t think it’s size-related. Does it have something to do with the builder’s region?

We’re looking into registry performance and should have it resolved soon. I don’t think it has anything to do with the region at this point, but I will tag you if that changes.

Was “lately” more than 4 hours ago? We had some overnight (US time) capacity problems with our registry. Things should have been fine for the last several hours.

Can you paste your logs if this happens again?

Yeah, over the last couple of days, usually around evening in EU time.

The status page (https://status.flyio.net/) should also indicate any issues, so that’s a faster way to check what’s going on.

Hmm, to be honest, pushing the image from a remote builder to the registry has always taken a while for me; only recently has it started taking so long that it times out and goes into a retry loop. I’m not sure how logs could help here. It’s just the Docker image push output, which sits at 40MB/40MB for a minute or so before either reporting “pushed” or “retrying in X seconds”.

The fact that it goes to 100% completion and then hangs is interesting information, thanks. Will keep that in mind. It seems clear that the problem isn’t in pushing the actual layer objects themselves, but in one of the finishing-up steps.

I’m also getting this issue, although the size of my layer is ~350MB. It hangs once the layer is uploaded.

I’ve just started using Fly and I am getting the same issue, despite using --remote-only:

fly deploy --remote-only
==> Verifying app config
--> Verified app config
==> Building image
Remote builder fly-builder-shy-snowflake-577 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
[+] Building 31.1s (0/1)
[+] Building 3.8s (12/12) FINISHED
[...]
--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/pd-coupling]
29d2e4731077: Pushing [==================================================>]  64.65MB
5bbc84a2aad9: Layer already exists 
c6689efc6ce8: Layer already exists 
df24978ae806: Layer already exists 
b1c3d2c82c14: Layer already exists 
6f6e69c2c592: Layer already exists 
53b8bfee7a0a: Layer already exists 
5b3f1ed98915: Layer already exists 
6b183c62e3d7: Layer already exists 
882fd36bfd35: Layer already exists 
d1dec9917839: Layer already exists 
d38adf39e1dd: Layer already exists 
4ed121b04368: Layer already exists 
d9d07d703dd5: Layer already exists 
Error failed to fetch an image or build from source: error rendering push status stream: EOF

Trying to re-deploy multiple times sometimes randomly fixes it.
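
Since re-running the deploy is the only thing reported to help here, the manual retries could be wrapped in a small shell helper. This is a minimal sketch, not a fix for the underlying registry problem: `retry` is a hypothetical function name (it is not part of flyctl), and the attempt count and delay in the usage line are arbitrary assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of flyctl): run a command until it succeeds,
# up to MAX attempts, sleeping DELAY seconds between attempts.
# Usage: retry MAX DELAY command [args...]
retry() {
  local max="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    # Run the command in the current shell; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example (assumes flyctl is installed and you are logged in):
# retry 5 30 fly deploy --remote-only
```

Layers that did get through on an earlier attempt show up as “Layer already exists” on the next one, so each retry only has to push what is still missing.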

I am getting this issue constantly. It seems to go like this:

  • Get to ~433MB uploaded
  • Pause for about 10-20 seconds
  • Continue uploading till it gets to 519MB, then says “Pushing 519MB”
  • Timeout and retry
  • Repeat

Curiously, this only happens with an Elixir umbrella app. Another, non-umbrella Elixir app deploys just fine. I’m not sure why that is.

I’m only at the point of investigating whether we can use Fly.io for our hosting, but the Elixir umbrella app support is a bit of a roadblock for us.

FYI, config files here:

fly.toml

# fly.toml file generated for elbaite-api on 2022-12-15T12:42:06+11:00

app = "elbaite-api"
kill_signal = "SIGINT"
kill_timeout = 5
processes = []

[build]
  builder = "heroku/buildpacks:20"
  buildpacks = ["https://cnb-shim.herokuapp.com/v1/hashnuke/elixir"]

[env]
  PORT = "8080"

[experimental]
  allowed_public_ports = []
  auto_rollback = true

[[statics]]
  guest_path = "apps/api_web/priv/static"
  url_prefix = "/static"

[[services]]
  http_checks = []
  internal_port = 8080
  processes = ["app"]
  protocol = "tcp"
  script_checks = []
  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    force_https = true
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

elixir_buildpack.config:

elixir_version=1.13.4
erlang_version=24.0

No Dockerfile yet, because apparently flyctl doesn’t generate one when it’s an umbrella app. Not sure why that is.

Is there a canonical Dockerfile for umbrella apps?

I don’t seem to be able to find a good intro to deploying an Elixir umbrella app to fly.io. Any hints?

Thanks in advance.

So, the solution eventually came out like this:

  • run fly launch
  • set up Postgres, copying the DATABASE_URL
  • cancel out of deploying
  • run fly secrets set DATABASE_URL=<url from above>
  • run:
export SECRET_KEY_BASE=$(mix phx.gen.secret)
fly secrets set SECRET_KEY_BASE=$SECRET_KEY_BASE

THEN run fly deploy.
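
The secrets-then-deploy part of the steps above can be sketched as one shell function. This is a sketch under stated assumptions: `set_secrets_and_deploy` is a hypothetical name, it assumes `fly launch` has already been run with the deploy step declined, and it assumes `mix phx.gen.secret` works from the directory you run it in. Note that `fly secrets set` expects NAME=VALUE pairs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set the two secrets the release needs, then deploy.
# Assumes `fly launch` was already run (deploy declined) and that the
# argument is the DATABASE_URL printed during Postgres setup.
set_secrets_and_deploy() {
  local database_url="$1"
  if [ -z "$database_url" ]; then
    echo "usage: set_secrets_and_deploy <database-url>" >&2
    return 1
  fi

  # Generate a fresh secret key base with Phoenix's generator.
  local secret_key_base
  secret_key_base="$(mix phx.gen.secret)" || return 1

  # fly secrets set takes NAME=VALUE pairs; several can be set at once.
  fly secrets set \
    "DATABASE_URL=$database_url" \
    "SECRET_KEY_BASE=$secret_key_base" || return 1

  fly deploy
}

# Example (the URL is a placeholder, not a real value):
# set_secrets_and_deploy "<url from above>"
```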

Dunno why this is so broken in Umbrella apps.

@chrismccord Any thoughts?

Happy to contribute code etc so others don’t have this drama. I’ve lost half a day figuring this out with a HelloWorld umbrella app.

I had the same problem. After about 1 hour of retrying, the 4 stubborn layers (40MB to 70MB each) trickled through. Nothing I did seemed to speed them up; I simply kept running fly deploy again and again after it inevitably timed out with:

Error failed to fetch an image or build from source: error rendering push status stream: EOF

Mine wasn’t big either; just a stock-standard Rails app.