Deploy stuck sending build context to Docker daemon

Hello,

Since today, I’ve been unable to deploy my application. The deploy gets stuck both on my local machine (macOS) and on my GitHub CI runner. It seems to be stuck at the “Sending build context to Docker daemon” step.

Output:

$ fly deploy --remote-only
==> Verifying app config
--> Verified app config
==> Building image
Remote builder fly-builder-wild-wood-3352 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
Sending build context to Docker daemon  1.049MB

I’ve let it run for about 30 mins so far with no response. How can I troubleshoot this further?

Builder/app is in the sea region.

A good next step might be to run fly doctor diag, which will generate a code that you can share to send us diagnostic info.

In this case, you might want to run the deploy command with LOG_LEVEL=debug to get more detail about the failing/hanging deployment.

You could also query the remote builder app for more information with something like fly logs -a fly-builder-wild-wood-3352 to inspect the activity of the builder itself.

Finally, you might be able to work around a remote builder issue by building your image locally: fly deploy --local-only will use the docker daemon on your machine for building.
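For convenience, the checks suggested above could be wrapped in a small shell helper. This is only a sketch: the function names run and fly_diagnose and the DRY_RUN variable are made up for illustration, and only the fly commands themselves come from this thread. With DRY_RUN set, the commands are printed but not executed.

```shell
# Print each command before running it; with DRY_RUN set, only print it.
run() {
  echo "+ $*"
  if [ -z "${DRY_RUN:-}" ]; then "$@"; fi
}

# Usage: fly_diagnose <builder-app-name>
# Hypothetical helper; runs the checks in the order suggested above.
fly_diagnose() {
  local builder="$1"
  # Prints a diagnostic code that can be shared with support.
  run fly doctor diag
  # Verbose output for the failing/hanging deploy.
  run env LOG_LEVEL=debug fly deploy --remote-only
  # The builder's own logs. This streams until interrupted.
  run fly logs -a "$builder"
}
```

Something like DRY_RUN=1 fly_diagnose fly-builder-wild-wood-3352 would show the commands without running them. If the deploy hangs, the logs step is never reached, so it may be easier to run fly logs in a second terminal.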

Thanks for your reply.

Diagnostic code: purple-star-cool-pine-polished-smoke-4048

Debug deploy logs:

DEBUG <-- 200 https://api.fly.io/graphql (122.62ms) {"data":{"validateWireGuardPeers":{"invalidPeerIps":[]}}}
Waiting for remote builder fly-builder-wild-wood-3352... ⢿ DEBUG Remote builder available, but pinging again in 50ms to be sure
Waiting for remote builder fly-builder-wild-wood-3352... ⡿ DEBUG Remote builder available, but pinging again in 50ms to be sure
DEBUG Remote builder available, but pinging again in 50ms to be sure
DEBUG Remote builder available, but pinging again in 50ms to be sure
Waiting for remote builder fly-builder-wild-wood-3352... ⣟ DEBUG Remote builder available, but pinging again in 50ms to be sure
DEBUG Remote builder available, but pinging again in 50ms to be sure
DEBUG Remote builder available, but pinging again in 50ms to be sure
Waiting for remote builder fly-builder-wild-wood-3352... ⣯ DEBUG Remote builder available, but pinging again in 50ms to be sure
DEBUG Remote builder available, but pinging again in 50ms to be sure
DEBUG Remote builder available, but pinging again in 50ms to be sure
Waiting for remote builder fly-builder-wild-wood-3352... ⣷ DEBUG Remote builder available, but pinging again in 50ms to be sure
DEBUG Remote builder available, but pinging again in 50ms to be sure
DEBUG Remote builder available, but pinging again in 50ms to be sure
Waiting for remote builder fly-builder-wild-wood-3352... ⣾ DEBUG Remote builder is ready to build!
Remote builder fly-builder-wild-wood-3352 ready
==> Creating build context
--> Creating build context done
DEBUG fetching docker server info
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
Sending build context to Docker daemon  1.049MB
Sending build context to Docker daemon  524.4kB

It hangs at this point with no more log output.

(Seemingly) relevant builder logs:

2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.474171915Z" level=debug msg="Calling HEAD /_ping"
2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.550337537Z" level=debug msg="Calling HEAD /_ping"
2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.573381509Z" level=debug msg="Calling HEAD /_ping"
2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.595837377Z" level=debug msg="Calling GET /v1.41/info"
2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.638423462Z" level=debug msg="Calling HEAD /_ping"
2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.679059902Z" level=debug msg="Calling POST /v1.41/build?buildargs=%7B%7D&buildid=f04de65e7239643b84a9b53aa7e290340755f4c250b92ea4676404f911e04805&cachefrom=null&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile&labels=null&memory=0&memswap=0&networkmode=&platform=linux%2Famd64&remote=upload-request&rm=0&session=lkj6chv6ekw0qlcmww4r4ydm4&shmsize=0&t=registry.fly.io%2Fdealr%3Adeployment-1652749419&target=&ulimits=null&version=2"
2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.702054290Z" level=debug msg="Calling POST /v1.41/build?buildargs=null&buildid=upload-request%3Af04de65e7239643b84a9b53aa7e290340755f4c250b92ea4676404f911e04805&cachefrom=null&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=&labels=null&memory=0&memswap=0&networkmode=&rm=0&shmsize=0&target=&ulimits=null&version=2"
2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.702512319Z" level=debug msg="Calling POST /session"
2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.703252567Z" level=info msg="parsed scheme: \"\"" module=grpc
2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.703308051Z" level=info msg="scheme \"\" not registered, fallback to default scheme" module=grpc
2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.703332407Z" level=info msg="ccResolverWrapper: sending update to cc: {[{localhost  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
2022-05-17T01:03:41Z app[dddebe39] sjc [info]time="2022-05-17T01:03:41.703342275Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc

Deploying locally did work, so that’s a temporary workaround, but eventually I’ll need to get the CI runner working again as well.
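Until the remote builder behaves again, one way to keep a CI job from hanging for 30 minutes is to put a time limit on the deploy. The sketch below assumes GNU coreutils timeout is available on the runner (it usually is on Linux, but not on macOS by default). The function name run_with_limit and the 10-minute limit in the usage note are illustrative choices, not anything from flyctl.

```shell
# Run a command with a time limit.
# Returns the command's own exit status, or 124 if `timeout` had to stop it.
# Assumes GNU coreutils `timeout` is on PATH.
run_with_limit() {
  local limit="$1"
  shift
  timeout "$limit" "$@"
  local status=$?
  if [ "$status" -eq 124 ]; then
    echo "timed out after ${limit}: $*" >&2
  fi
  return "$status"
}
```

In a CI step this could be called as run_with_limit 10m fly deploy --remote-only, so a stuck "Sending build context" step fails the job instead of blocking it.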

I think there were (are?) some disruptions. Yesterday every deployment that used the remote builder (i.e. with --remote-only, which is especially common in CI/CD) failed for me.
A quick workaround was to deploy from my dev computer without that option.
I also noticed a new dashboard UI, so maybe they are rolling out updates.
So let’s hope for the best from now on!

Though, I want to rant a little bit…
@eli, proposing a solution and giving some explanations for further exploration is very helpful. But to be fair, since yesterday, and judging by other posts in the forum, it’s obvious that there is a problem on the fly.io side, and asking for diagnostic data in that context is not a good play imho. Sorry, I really don’t want to be harsh. But it’s frustrating to deal with problems while in production and to spend time looking for solutions in the belief that the problem is on our side.
/rant-off

Have a nice day!

Is there an easy way to recreate my builder instance in case it’s a problem there?

Another clue that something may be up with my builder: fly list apps shows the builder in a pending status:

 % fly list apps
Update available 0.0.325 -> v0.0.327.
Run "flyctl version update" to upgrade.
  NAME                       | STATUS  | ORG      | DEPLOYED
-----------------------------*---------*----------*-----------------
  dealr                      | running | personal | 42 seconds ago
  dealr-dev-db               | running | personal | 1 week ago
  fly-builder-wild-wood-3352 | pending | personal |
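A quick way to spot a builder in this state is to filter the table for fly-builder-* rows whose status is pending. The following is a minimal sketch that assumes the pipe-separated layout shown above; the function name pending_builders is made up for illustration, and the table format may differ in other flyctl versions.

```shell
# Print the names of fly-builder-* apps whose STATUS column is "pending".
# Reads the pipe-separated table printed by `fly list apps` on stdin.
pending_builders() {
  awk -F'|' '
    {
      name = $1
      status = $2
      # Trim the padding around each column.
      gsub(/^[ \t]+|[ \t]+$/, "", name)
      gsub(/^[ \t]+|[ \t]+$/, "", status)
      if (name ~ /^fly-builder-/ && status == "pending") print name
    }'
}
```

It could be used as fly list apps | pending_builders. Header, separator, and update-notice lines are ignored because they do not start with fly-builder-.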

The builder logs are looping the following every second:

2022-05-18T02:50:35Z app[dddebe39] sjc [info]time="2022-05-18T02:50:35.827290907Z" level=info msg="containers active, keepalive"
2022-05-18T02:50:35Z app[dddebe39] sjc [info]time="2022-05-18T02:50:35.827297739Z" level=debug msg="liveness loop caused deadline reset"
2022-05-18T02:50:36Z app[dddebe39] sjc [info]time="2022-05-18T02:50:36.827417169Z" level=debug msg="checking docker activity"
2022-05-18T02:50:36Z app[dddebe39] sjc [info]time="2022-05-18T02:50:36.827722041Z" level=debug msg="Calling GET /v1.41/containers/json?filters=%7B%22status%22%3A%7B%22running%22%3Atrue%7D%7D&limit=0"
2022-05-18T02:50:36Z app[dddebe39] sjc [info]time="2022-05-18T02:50:36.829151822Z" level=debug msg="found runc process"

I am having the same issue. I always used --remote-only because pushing to the registry from my local machine was painfully slow and this was suggested somewhere to speed things up. Now it is the other way around.

@minism you can log into the fly dashboard and find the builder listed under “apps”. You can just delete it, and a new builder will be created the next time you try to deploy. Unfortunately this did not solve the issue for me, but deploying locally (without --remote-only, or with --local-only) works and is sufficiently fast for me now.

@jascha Thanks for the tip. For whatever reason, recreating the builder seems to have resolved the issue for me, for now.

Hi, this morning (around 09:30 UTC), I’m seeing the same problem when trying to redeploy two applications in FRA which had rebuilt OK remotely up to yesterday evening:

$ fly deploy
==> Verifying app config
--> Verified app config
==> Building image
Error failed to fetch an image or build from source: error connecting to docker: failed building options: failed probing "personal": context deadline exceeded

Then I updated flyctl to the latest version and deleted the build host, but I still get the same error, while fly doctor diag does not complain:

$ fly version
flyctl v0.0.327 linux/amd64 Commit: 2e5a176 BuildDate: 2022-05-17T21:26:08Z

$ fly doctor diag
Collecting fly.toml... ok
Collecting config.yml... ok
Collecting Dockerfile... ok
Collecting fly agent logs... ok
Collecting local diagnostics... ok

Your Diagnostic Code (safe to share): ancient-wood-small-field-withered-dream-4616

Interestingly, the shorter form, fly doctor, complains that the UDP port used by WireGuard is blocked on egress. That is not the case on my side, on either IPv4 or IPv6, and connectivity from my location to FRA is fine as well:

$ fly doctor 
Testing authentication token... PASSED
Testing flyctl agent... PASSED
Testing local Docker instance... Nope
Pinging WireGuard gateway (give us a sec)... FAILED
(Error: ping gateway: no response from gateway received)

We can't establish connectivity with WireGuard for your personal organization.

WireGuard runs on 51820/udp, which your local network may block.

If this is the first time you've ever used 'flyctl' on this machine, you
can try running 'flyctl doctor' again.

Finally:

$ LOG_LEVEL=debug fly deploy
...
Waiting for remote builder fly-builder-bold-paper-1362... ⣻ DEBUG <-- 200 https://api.fly.io/graphql (154.31ms) {"data":{"validateWireGuardPeers":{"invalidPeerIps":[]}}}
DEBUG result image:<nil> error:error connecting to docker: failed building options: failed probing "personal": context deadline exceeded
Error failed to fetch an image or build from source: error connecting to docker: failed building options: failed probing "personal": context deadline exceeded
$

Thanks.
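For CI, it can help to tell this kind of builder connectivity failure apart from an ordinary build error. The sketch below simply searches captured deploy output for the two error strings quoted above. The function name builder_unreachable is hypothetical, and matching on message text is an assumption that may break if flyctl changes its wording.

```shell
# Return 0 if the deploy log on stdin contains one of the
# builder-connection errors quoted in this thread, 1 otherwise.
builder_unreachable() {
  grep -q -e 'error connecting to docker' -e 'context deadline exceeded'
}

# Example use (deploy.log is an arbitrary file name):
#   LOG_LEVEL=debug fly deploy > deploy.log 2>&1
#   if builder_unreachable < deploy.log; then
#     echo "remote builder unreachable; consider fly deploy --local-only"
#   fi
```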

Thanks for this output! Quick question: you mentioned initially that the failed deploys were for applications in the FRA region, but that connectivity to FRA is working.

Were you trying to redeploy to London, by chance? If so, it’s possible that you ran into an incident from earlier today, and a redeploy might work.