Hello!
Is your builder still reporting a large amount of usage?
If so, can you check out https://fly-metrics.net/ , select your builder app, and then expand the Volumes graph to see the usage reported there?
Thanks! I’m super curious about that growing usage. I may have some extra steps to take if that’s still going on and if it’s actually growing that large. In theory the volumes are capped at 50 GB.
Same here today, with the same logs that @kharri1073 was getting.
Earlier in the day I was having a somewhat similar problem when deploying a trivial Node app with a buildpack (it stalled after “downloaded newer image…” with the same logs); switching to a Dockerfile allowed me to deploy at first, but that no longer works either.
Nothing interesting with LOG_LEVEL=debug. No errors reported when cancelling the build with Ctrl-C. I can destroy the builder but it doesn’t change matters. fly-metrics dashboard for the builder shows it idle – no IO, CPU < 1%.
Edit: On my side this has been using flyctl 0.0.405 & 0.0.406.
I’m seeing identical behavior to what @kharri1073 and @robjwells are describing as well. And as @hpx7 mentions, my builder is also consistently reverting to a “Suspended” state after it is created.
It was working ~12 hours ago, but now for some reason it hangs on Waiting for remote builder...
I’ve tried the following:
Reinstalled the CLI (fly v0.0.406 darwin/arm64 Commit: ba78bd6f BuildDate: 2022-10-07T23:28:22Z)
Deploys with remote builders are working for me again (with flyctl 0.0.409).
What is interesting is that on my first attempt today I got this error message, which I didn’t before:
Error error connecting to docker: failed building options: agent: failed to start
The agent failed to start with the following error log:
2022/10/11 13:21:21.054597 srv another instance of the agent is already running
I killed the existing agent process (with kill -9) and restarted the agent with flyctl agent restart and now things appear to be fine. Maybe it was a problem with the agent all along?
I have a lot of symptoms that are described here. fly deploy is stuck, builder in suspended state, but with a twist.
When accessing the “Machines” tab on the builder, the “Scale” tab appears. Desperate to find a solution, I tried to scale the builder to a dedicated CPU.
Since then the builder logs “Pulling container image” every 2s.
Deleting the builder and creating a new one (without rescaling it) doesn’t solve the problem.
After resetting WireGuard with fly wireguard reset, the last mutation has a payload of two IPv6 addresses in peerIps. The response is still the same (no invalid peer IP), but now, after the last mutation, I get a loop of:
DEBUG Remote builder unavailable, retrying in xms (err: Get "http://[<ipv6>]:2375/_ping": context deadline exceeded)
DEBUG Remote builder unavailable, retrying in xms (err: Get "http://[<ipv6>]:2375/_ping": context deadline exceeded)
DEBUG Remote builder unavailable, retrying in xms (err: Get "http://[<ipv6>]:2375/_ping": context deadline exceeded)
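The URL in those debug lines is the Docker daemon's /_ping endpoint on port 2375, so one way to narrow things down is to request it directly, outside of flyctl. Below is a rough sketch, assuming curl is installed. builder_ping_url and ping_builder are hypothetical helper names, and the address is whatever shows up in your own debug output. As far as I understand, this only works if your machine itself has a route to the private network (for example an active WireGuard peer), since the flyctl agent's own tunnel is not visible to other programs.

```shell
# Build the ping URL for a builder's IPv6 address.
# IPv6 literals must be wrapped in square brackets inside a URL.
builder_ping_url() {
  printf 'http://[%s]:2375/_ping\n' "$1"
}

# Request the endpoint with a short timeout.
# -g turns off curl's URL globbing so the brackets are taken literally;
# -sS hides the progress meter but still prints errors.
ping_builder() {
  curl -g -sS --max-time 5 "$(builder_ping_url "$1")"
}
```

Calling ping_builder with the builder's address should print OK if a Docker daemon is reachable there; a timeout would point at the network path rather than at flyctl.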
Killing the process and the machines and restarting everything didn’t work for me. What did work is the following (I’m not sure which part fixed the issue, the reset or issuing the certificate):
I restarted WireGuard, reactivated it manually, and then ran the following commands:
fly wireguard reset
flyctl ssh issue --agent
fly deploy
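If you want to find out which step matters, the three commands above can be chained so that the sequence stops at the first failure. This is only a sketch: recover_and_deploy and the FLY_BIN variable are hypothetical names introduced here (FLY_BIN just lets you switch between fly and flyctl), and the commands may still prompt you interactively.

```shell
# Hypothetical override for the CLI name; defaults to `fly`.
FLY_BIN="${FLY_BIN:-fly}"

# Run the three steps in order and stop at the first one that fails,
# so the failing step is the last thing printed.
recover_and_deploy() {
  "$FLY_BIN" wireguard reset &&
  "$FLY_BIN" ssh issue --agent &&
  "$FLY_BIN" deploy
}
```

Run recover_and_deploy from the app directory; if it stops early, the step that printed the error is the one to look at.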