Remote builder connection timeout

Saw some other posts about remote builders having issues, and it sounds like some bugs have been squashed there, but I still haven’t been able to get a remote build past the connecting phase. Some logs:

$ LOG_LEVEL=debug fly agent daemon-start

DEBUG Loaded flyctl config from /home/user/.fly/config.yml
DEBUG determined hostname: "mgmt"                                                                             
DEBUG determined working directory: "/home/user"                                                              
DEBUG determined user home directory: "/home/user"      
DEBUG determined config directory: "/home/user/.fly"                                                          
DEBUG ensured config directory exists.
DEBUG ensured config directory perms. 
DEBUG cache loaded.                        
DEBUG config initialized.                                                               
DEBUG initialized task manager.                                                                 
DEBUG skipped querying for new release                                                                        
DEBUG client initialized.
2022/01/29 09:43:36.715478 srv OK 15783                                                                       
DEBUG --> POST https://api.fly.io/graphql {{"query":"mutation($input: ValidateWireGuardPeersInput!) { validateWireGuardPeers(input: $input) { invalidPeerIps } }","variables":{"input":{"peerIps":["fdaa:0:48a8:a7b:1537:0:a:102"]}}}
}
DEBUG <-- 200 https://api.fly.io/graphql (243.81ms) {"data":{"validateWireGuardPeers":{"invalidPeerIps":[]}}}
2022/01/29 09:45:37.030996 srv validated wireguard peers                                                     
2022/01/29 09:46:39.599999 #1 connected ...             
2022/01/29 09:46:39.600127 srv config change at: 2022-01-29 09:46:39.598207552 -0800 PST                      
2022/01/29 09:46:39.600149 #1 <- (    4) "ping"
2022/01/29 09:46:39.600215 #1 -> (   58) "8\x00ok {\"PID\":15783,\"Version\":\"0.0.292\",\"Background\":false$
\n"                                        
2022/01/29 09:46:39.600231 #1 dropped.                                                  
2022/01/29 09:46:39.600425 #2 connected ...                                                     
2022/01/29 09:46:39.600460 #2 <- (   18) "establish personal"                                                
DEBUG --> POST https://api.fly.io/graphql {{"query":"query($orgType: OrganizationType) { organizations(type: $orgType) { nodes { id slug name type } } }"

...

2022/01/29 09:54:57.583848 #14 dropped.
2022/01/29 09:54:57.584492 #15 connected ...
2022/01/29 09:54:57.584747 #15 <- (   53) "connect personal [fdaa:0:48a8:a7b:87:0:847e:2]:2375 0"
DEBUG --> POST https://api.fly.io/graphql {{"query":"mutation($input: ValidateWireGuardPeersInput!) { validateWireGuardPeers(input: $input) { invalidPeerIps } }","variables":{"input":{"peerIps":["fdaa:0:48a8:a7b:1537:0:a:102"]}}}
}
DEBUG <-- 200 https://api.fly.io/graphql (187.55ms) {"data":{"validateWireGuardPeers":{"invalidPeerIps":[]}}}
2022/01/29 09:55:38.343591 srv validated wireguard peers

$ LOG_LEVEL=debug fly deploy --remote-only

DEBUG client initialized.
DEBUG app config loaded from /home/user/flyio/fly.toml
==> Verifying app config
--> Verified app config
==> Building image
DEBUG trying remote docker daemon
DEBUG Reporting build
DEBUG --> POST https://api.fly.io/graphql {{"query":"mutation($input: StartSourceBuildInput!) { startSourceBuild(input: $input) { sourceBuild { id } } }","variables":{"input":{"appId":"REDACTED"}}}
}
DEBUG <-- 200 https://api.fly.io/graphql (526.89ms) {"errors":[{"message":"StartSourceBuildInput isn't a defined input type (on $input)","locations":[{"line":1,"column":10}],"path":["mutation"],"extensions":{"code":"variableRequiresValidType","typeName":"StartSourceBuildInput","variableName":"input"}},{"message":"Field 'startSourceBuild' doesn't exist on type 'Mutations'","locations":[{"line":1,"column":44}],"path":["mutation","startSourceBuild"],"extensions":{"code":"undefinedField","typeName":"Mutations","fieldName":"startSourceBuild"}},{"message":"Variable $input is declared by anonymous mutation but not used","locations":[{"line":1,"column":1}],"path":["mutation"],"extensions":{"code":"variableNotUsed","variableName":"input"}}]}
DEBUG Failed storing build
DEBUG Trying 'Buildpacks' strategy
DEBUG no buildpack builder configured, skipping
DEBUG result image:<nil> error:<nil>
DEBUG Trying 'Dockerfile' strategy
DEBUG --> POST https://api.fly.io/graphql {{"query":"mutation($input: EnsureMachineRemoteBuilderInput!) { ensureMachineRemoteBuilder(input: $input) { machine { id state ips { nodes { family kind ip } } }, app { name organization { slug } } } }","variables":{"input":{"appName":"REDACTED","organizationId":null}}}
}
DEBUG <-- 200 https://api.fly.io/graphql (194.2ms) {"data":{"ensureMachineRemoteBuilder":{"machine":{"id":"REDACTED","state":"starting","ips":{"nodes":[{"family":"v6","kind":"privatenet","ip":"fdaa:0:48a8:a7b:87:0:847e:2"},{"family":"v6","kind":"public","ip":"2604:1380:4510:401::847e:3"},{"family":"v4","kind":"private","ip":"172.19.7.26"}]}},"app":{"name":"fly-builder-REDACTED-9008","organization":{"slug":"personal"}}}}}
DEBUG checking ip &{Family:v6 Kind:privatenet IP:fdaa:0:48a8:a7b:87:0:847e:2 MaskSize:0}
Waiting for remote builder fly-builder-REDACTED-9008... connecting ⣾ DEBUG --> POST https://api.fly.io/graphql {{"query":"query ($appName: String!) { app(name: $appName) { id name hostname deployed status version appUrl config { definition } organization { id slug } services { description protocol internalPort ports { port handlers } } ipAddresses { nodes { id address type createdAt } } imageDetails { repository version } } }","variables":{"appName":"REDACTED"}}
}
Waiting for remote builder fly-builder-REDACTED-9008... connecting
...
DEBUG --> POST https://api.fly.io/graphql {{"query":"mutation($input: ValidateWireGuardPeersInput!) { validateWireGuardPeers(input: $input) { invalidPeerIps } }","variables":{"input":{"peerIps":["fdaa:0:48a8:a7b:1537:0:a:102"]}}}
}
Waiting for remote builder fly-builder-REDACTED-9008... connecting ⡿ DEBUG <-- 200 https://api.fly.io/graphql (241.12ms) {"data":{"validateWireGuardPeers":{"invalidPeerIps":[]}}}

...

Waiting for remote builder fly-builder-REDACTED-9008... connecting ⣻ DEBUG Remote builder unavailable, retrying in 200ms (err: error during connect: Get "http://[fdaa:0:48a8:a7b:87:0:847e:2]:2375/_ping": connect tcp [fdaa:0:48a8:a7b:87:0:847e:2]:2375: operation timed out)
WARN Remote builder did not start on time. Check remote builder logs with `flyctl logs -a fly-builder-REDACTED-9008`
DEBUG result image:<nil> error:error connecting to docker: remote builder app unavailable
Error failed to fetch an image or build from source: error connecting to docker: remote builder app unavailable

Status of wireguard tunnels:

$ flyctl wireguard list                                                                                       
Automatically selected personal organization: REDACTED                                                        
+-----------------------------------+--------+------------------------------+                                 
|               NAME                | REGION |           PEER IP            |                                 
+-----------------------------------+--------+------------------------------+                                 
| interactive-linux--REDACTED-428   | lax    | fdaa:0:48a8:xxx:xxxx:0:a:102 |                                 
| interactive-linux--REDACTED-572   | lax    | fdaa:0:48a8:xxx:xxxx:0:a:2   |                                 
+-----------------------------------+--------+------------------------------+ 

Have tried deleting .fly/config.yml, re-auth’ing, and starting from scratch, but no luck. Happy to troubleshoot further.

WARN Remote builder did not start on time. 
Check remote builder logs with `flyctl logs -a fly-builder-REDACTED-9008`

Do you see anything interesting in the suggested command’s output?

I’m seeing something similar…

$ fly deploy -c fly.production.toml -a production
==> Verifying app config
--> Verified app config
==> Building image
Waiting for remote builder fly-builder-withered-fog-8661... ⣻ WARN The running flyctl background agent (v0.0.295) is older than the current flyctl (v0.0.294).
WARN The out-of-date agent will be shut down along with existing wireguard connections. The new agent will start automatically as needed.
WARN Remote builder did not start in time. Check remote builder logs with `flyctl logs -a fly-builder-withered-fog-8661`
Error failed to fetch an image or build from source: error connecting to docker: remote builder app unavailable
$ fly deploy -c fly.production.toml -a production
==> Verifying app config
--> Verified app config
==> Building image
WARN Remote builder did not start in time. Check remote builder logs with `flyctl logs -a fly-builder-withered-fog-8661`
Error failed to fetch an image or build from source: error connecting to docker: remote builder app unavailable
$ flyctl logs -a fly-builder-withered-fog-8661
2022-02-03T21:05:41Z app[82d57f42] sjc [info]net.ipv4.tcp_rmem = 8192 87380 33554432
2022-02-03T21:05:41Z app[82d57f42] sjc [info]net.ipv4.udp_rmem_min = 16384
2022-02-03T21:05:41Z app[82d57f42] sjc [info]net.ipv4.tcp_wmem = 8192 65536 33554432
2022-02-03T21:05:41Z app[82d57f42] sjc [info]net.ipv4.udp_wmem_min = 16384
2022-02-03T21:05:41Z app[82d57f42] sjc [info]Running /dockerproxy
2022-02-03T21:05:41Z app[82d57f42] sjc [info]time="2022-02-03T21:05:41.992219418Z" level=info msg="Build SHA:d8136858bf5972cdc4373d1c30cb8e5c6176858f Time:2021-12-27T20:34:34Z"
2022-02-03T21:05:42Z app[82d57f42] sjc [info]time="2022-02-03T21:05:42.279362887Z" level=info msg="Starting up"
2022-02-03T21:05:42Z app[82d57f42] sjc [info]time="2022-02-03T21:05:42.283235057Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
2022-02-03T21:05:42Z app[82d57f42] sjc [info]time="2022-02-03T21:05:42.283477692Z" level=debug msg="Listener created for HTTP on unix (/var/run/docker.sock)"
2022-02-03T21:05:42Z app[82d57f42] sjc [info]time="2022-02-03T21:05:42.283536863Z" level=warning msg="Binding to IP address without --tlsverify is insecure and gives root access on this machine to everyone who has access to your network." host="tcp://0.0.0.0:2375"
2022-02-03T21:05:42Z app[82d57f42] sjc [info]time="2022-02-03T21:05:42.283569465Z" level=warning msg="Binding to an IP address, even on localhost, can also give access to scripts run in a browser. Be safe out there!" host="tcp://0.0.0.0:2375"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.284086540Z" level=debug msg="Listener created for HTTP on tcp (0.0.0.0:2375)"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.284129871Z" level=debug msg="Containerd not running, starting daemon managed containerd"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.287287482Z" level=info msg="libcontainerd: started new containerd process" pid=620
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.288030556Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.288048790Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.288081792Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.288090468Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.373443979Z" level=info msg="starting containerd" revision=7b11cfaabd73bb80907dd23182b9347b4245eb5d version=v1.4.12
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.397619983Z" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.397966272Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.401253206Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.aufs\"..." error="aufs is not supported (modprobe aufs failed: exit status 1 \"ip: can't find device 'aufs'\\nmodprobe: can't change directory to '/lib/modules': No such file or directory\\n\"): skip plugin" type=io.containerd.snapshotter.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.401280928Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." type=io.containerd.snapshotter.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.401444014Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." error="path /data/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs (ext4) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.401456457Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.devmapper\"..." type=io.containerd.snapshotter.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.402078634Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.402104753Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.402214879Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.402442857Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.402614128Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.zfs\"..." error="path /data/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.402661617Z" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.402703766Z" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.402739232Z" level=info msg="metadata content store policy set" policy=shared
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.403753695Z" level=info msg="loading plugin \"io.containerd.differ.v1.walking\"..." type=io.containerd.differ.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.403794972Z" level=info msg="loading plugin \"io.containerd.gc.v1.scheduler\"..." type=io.containerd.gc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.404876420Z" level=info msg="loading plugin \"io.containerd.service.v1.introspection-service\"..." type=io.containerd.service.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.404926774Z" level=info msg="loading plugin \"io.containerd.service.v1.containers-service\"..." type=io.containerd.service.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.404957011Z" level=info msg="loading plugin \"io.containerd.service.v1.content-service\"..." type=io.containerd.service.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.405027934Z" level=info msg="loading plugin \"io.containerd.service.v1.diff-service\"..." type=io.containerd.service.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.405101041Z" level=info msg="loading plugin \"io.containerd.service.v1.images-service\"..." type=io.containerd.service.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.405138752Z" level=info msg="loading plugin \"io.containerd.service.v1.leases-service\"..." type=io.containerd.service.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.405168217Z" level=info msg="loading plugin \"io.containerd.service.v1.namespaces-service\"..." type=io.containerd.service.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.405197081Z" level=info msg="loading plugin \"io.containerd.service.v1.snapshots-service\"..." type=io.containerd.service.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.405240132Z" level=info msg="loading plugin \"io.containerd.runtime.v1.linux\"..." type=io.containerd.runtime.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.405495160Z" level=info msg="loading plugin \"io.containerd.runtime.v2.task\"..." type=io.containerd.runtime.v2
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.405658977Z" level=info msg="loading plugin \"io.containerd.monitor.v1.cgroups\"..." type=io.containerd.monitor.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.407085452Z" level=info msg="loading plugin \"io.containerd.service.v1.tasks-service\"..." type=io.containerd.service.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.407136989Z" level=info msg="loading plugin \"io.containerd.internal.v1.restart\"..." type=io.containerd.internal.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.407602502Z" level=info msg="loading plugin \"io.containerd.grpc.v1.containers\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.407665770Z" level=info msg="loading plugin \"io.containerd.grpc.v1.content\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.407691749Z" level=info msg="loading plugin \"io.containerd.grpc.v1.diff\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.407703411Z" level=info msg="loading plugin \"io.containerd.grpc.v1.events\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.407717417Z" level=info msg="loading plugin \"io.containerd.grpc.v1.healthcheck\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.407731183Z" level=info msg="loading plugin \"io.containerd.grpc.v1.images\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.407760057Z" level=info msg="loading plugin \"io.containerd.grpc.v1.leases\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.407807857Z" level=info msg="loading plugin \"io.containerd.grpc.v1.namespaces\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.407825420Z" level=info msg="loading plugin \"io.containerd.internal.v1.opt\"..." type=io.containerd.internal.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.408171589Z" level=info msg="loading plugin \"io.containerd.grpc.v1.snapshots\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.408199010Z" level=info msg="loading plugin \"io.containerd.grpc.v1.tasks\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.408214209Z" level=info msg="loading plugin \"io.containerd.grpc.v1.version\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.408225961Z" level=info msg="loading plugin \"io.containerd.grpc.v1.introspection\"..." type=io.containerd.grpc.v1
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.408596666Z" level=info msg=serving... address=/var/run/docker/containerd/containerd-debug.sock
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.408709748Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock.ttrpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.408759071Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.408779158Z" level=debug msg="sd notification" error="<nil>" notified=false state="READY=1"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.408793205Z" level=info msg="containerd successfully booted in 0.038945s"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.414451875Z" level=debug msg="Created containerd monitoring client" address=/var/run/docker/containerd/containerd.sock
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.416242864Z" level=debug msg="Started daemon managed containerd"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.418356818Z" level=debug msg="Golang's threads limit set to 57240"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.421081277Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.421099190Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.421114940Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.421123456Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.421457733Z" level=debug msg="metrics API listening on /var/run/docker/metrics.sock"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.422856485Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.422874539Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.422887524Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.422906149Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.423637480Z" level=debug msg="Using default logging driver json-file"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.424957445Z" level=debug msg="[graphdriver] priority list: [btrfs zfs overlay2 fuse-overlayfs aufs overlay devicemapper vfs]"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.425395977Z" level=debug msg="processing event stream" module=libcontainerd namespace=plugins.moby
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.427335073Z" level=error msg="mkdir /data/docker/overlay2/check-overlayfs-support471292860/lower1: no space left on device" storage-driver=overlay2
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.427357566Z" level=error msg="[graphdriver] prior storage driver overlay2 failed: driver not supported"
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.427539326Z" level=debug msg="Cleaning up old mountid : start."
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.427658249Z" level=info msg="stopping healthcheck following graceful shutdown" module=libcontainerd
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.427677936Z" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.427778886Z" level=debug msg="received signal" signal=terminated
2022-02-03T21:05:43Z app[82d57f42] sjc [info]time="2022-02-03T21:05:43.427862873Z" level=debug msg="sd notification" error="<nil>" notified=false state="STOPPING=1"
2022-02-03T21:05:44Z app[82d57f42] sjc [info]time="2022-02-03T21:05:44.428955929Z" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial unix:///var/run/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
2022-02-03T21:05:44Z app[82d57f42] sjc [info]failed to start daemon: error initializing graphdriver: driver not supported
2022-02-03T21:05:44Z app[82d57f42] sjc [info]Name:   default
2022-02-03T21:05:44Z app[82d57f42] sjc [info]Driver: docker-container
2022-02-03T21:05:44Z app[82d57f42] sjc [info]Nodes:
2022-02-03T21:05:44Z app[82d57f42] sjc [info]Name:     default
2022-02-03T21:05:44Z app[82d57f42] sjc [info]Endpoint: default
2022-02-03T21:05:44Z app[82d57f42] sjc [info]Error:    Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
2022-02-03T21:05:44Z app[82d57f42] sjc [info]time="2022-02-03T21:05:44.447478141Z" level=debug msg="Checking dockerd healthyness"
2022-02-03T21:05:44Z app[82d57f42] sjc [info]time="2022-02-03T21:05:44.447616931Z" level=error msg="error waiting on docker: exit status 1"
2022-02-03T21:05:44Z app[82d57f42] sjc [info]time="2022-02-03T21:05:44.451570735Z" level=info msg="dockerd has exited"
2022-02-03T21:05:44Z app[82d57f42] sjc [info]time="2022-02-03T21:05:44.451581796Z" level=fatal msg="dockerd exited before we could ascertain its healthyness"

Bumping this thread because I’m still seeing this. @kurt or anyone else at Fly - any ideas? See logs above.

Hey! It looks like the builder ran out of space, which means the Docker daemon can’t start on it. Are you able to log in with `fly ssh console -a your-fly-builder-app`?

A quick fix for the space issue is to run `fly apps destroy your-builder-app`.
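A sketch of the full recovery, with the actual commands commented out so nothing is destroyed by accident. The builder app name below is just an example; find yours with `fly apps list` (it starts with `fly-builder-`):

```shell
# Example builder name; yours will differ (see `fly apps list`)
BUILDER_APP="fly-builder-withered-fog-8661"

# Optionally confirm the data volume is full before destroying anything:
#   fly ssh console -a "$BUILDER_APP" -C "df -h /data"

# Destroy the wedged builder; a fresh one is provisioned on the next deploy:
#   fly apps destroy "$BUILDER_APP" --yes
#   fly deploy --remote-only
```

Destroying the builder is safe: it holds only build caches, not your app or its volumes.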

Thanks, that seemed to fix it. Are there any plans to improve the experience in the future?

Yes! We’re looking into it.

Here’s a technical explanation, for your peace of mind and for ours!

Currently, builders store Docker’s data on a persistent volume. If that volume fills up, the Docker daemon can’t start and the VM exits.

Normally we run `docker prune` to clean up unused objects when the disk is 90% full, but a few things can prevent that from triggering.

One is heavy usage of the builder, either by retaining large build caches or by pushing many deploys in a short period of time. Builders normally idle out after 10 minutes of inactivity, so pruning would happen on the next boot.
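The threshold check described above amounts to something like this (an illustrative sketch, not Fly’s actual code; the helper name and exact policy are assumptions):

```python
PRUNE_THRESHOLD = 0.90  # prune once the data volume is 90% full

def should_prune(total_bytes: int, used_bytes: int,
                 threshold: float = PRUNE_THRESHOLD) -> bool:
    """Return True when disk usage crosses the prune threshold (hypothetical helper)."""
    return used_bytes / total_bytes >= threshold

# A builder retaining a 9.5 GB build cache on a 10 GB volume crosses the line:
print(should_prune(total_bytes=10_000_000_000, used_bytes=9_500_000_000))  # True
```

The failure mode in this thread is the gap in that policy: if the volume fills up while the builder is stopped, dockerd can’t initialize its storage driver on the next boot, so the prune-on-boot never gets a chance to run.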


I think I ran into something similar? My builder couldn’t even pull the Node image before timing out. So this would have to do with the volume filling up? Took me a while to figure out what was going on :sweat_smile:

Luckily, deleting my builder app fixed the problem, but for the longest time I couldn’t figure out why my staging branch built while my main branch wouldn’t, haha

==> Building image
Waiting for remote builder fly-builder-falling-silence-6046...
Remote builder fly-builder-falling-silence-6046 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
Sending build context to Docker daemon  1.418MB

#1 [internal] load remote build context
#1 DONE 0.1s

#1 [internal] load remote build context
#1 DONE 0.0s

#2 copy /context /
#2 DONE 0.2s

#3 [internal] load metadata for docker.io/library/node:14
#3 ERROR: failed to do request: Head "https://registry-1.docker.io/v2/library/node/manifests/14": dial tcp 52.72.252.48:443: i/o timeout