These deploys have worked for weeks, but they have suddenly stopped working:
==> Verifying app config
Validating fly.toml
✓ Configuration is valid
--> Verified app config
==> Building image
Waiting for depot builder...
WARN failed to finish build in graphql: Post "https://api.fly.io/graphql": context canceled
The last line shows after I ctrl-c the build.
Edit:
Here is the error that eventually pops up:
Error: failed to fetch an image or build from source: error building: timed out connecting to machine: failed to list workers: Unavailable: connection error: desc = "transport: authentication handshake failed: EOF"
Appreciate it; that override seems to be working for now.
For what it’s worth, I also tried the “Reset” button in the dashboard on the builder, but that didn’t do anything as far as I could tell. No alerts or changes to the UI letting me know it worked or anything.
Getting the same thing. Are there any advantages to using Depot as a builder? This is the 2nd issue in the last week and our builds are running slower.
There were several listed in the original announcement, but the main reason to acclimate your own process to them at this point is simply that the others will be going away in the near(-ish) future.
Another option is to figure out how to deploy your own instances of the legacy builders. (These are available as an open-source repository, under the non-obvious name of RCHAB.) The neater Nix/Guix approach also has many vocal proponents here—although that way is not a small transition.
In all three scenarios, Fly.io no longer hands out freebies in the form of forgetting to bill for the huge performance vCPU class Machine, the large volume, …
Depot was down for me yesterday too, so definitely seems like this was an outage (second one in the two weeks I’ve been trying out this platform…).
It is discouraging that there’s no evidence of this issue reflected on the status page, particularly since someone from Fly noted they were looking into it so they were clearly aware something was going on.
Things seem fine again today, but it’s the lack of transparency that makes me uneasy about committing to this platform more than the presence of issues.
The Infrastructure Log is the place to go for that transparency—although it’s a retrospective write-up (typically coming out the week after the incident).
The current status page mostly only shows global outages (“[t]his page is for updates about global incidents”). In contrast, the implication above was that this particular one was only affecting a subset of users.
The gap in between is broadly acknowledged to be a flaw in the current setup, and Fly said during one of last year’s outages that they intend to fill it with automatically reported system-wide metrics (including the results of “synthetic alerts”, i.e. operations that Fly proactively attempts itself). A person who was having trouble with his own builds could just look at the graphs, see an accumulation of red boxes in iad, say, and then have some confidence that it wasn’t his own misconfiguration, etc.
Thanks, that does help to know. Agreed that it’s a pretty major gap in the meantime, though. I hope they’re working on that and ship it sooner rather than later.
Any update on this? My deploys were working fine and now all of a sudden I am getting this error as well.
I tried fly deploy --depot=false, but it created a new and very expensive Machine instead of using my existing one.
I tried the ‘reset’ button for the App Builder as well to no avail.
Any ideas?
Thanks!
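For anyone who wants to script the workaround in the meantime, here is a minimal sketch (not an official Fly.io tool) that tries a plain fly deploy first and retries with the --depot=false flag quoted above if that fails. Only the two commands come from this thread; the wrapper, its function name, and the assumption that flyctl exits with a non-zero status when the build fails are mine.

```python
import subprocess
from typing import Callable, List, Sequence

# The two commands are taken from this thread; everything else is hypothetical.
PRIMARY = ["fly", "deploy"]
FALLBACK = ["fly", "deploy", "--depot=false"]


def deploy_with_fallback(
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> List[str]:
    """Try the default (Depot) build, then the legacy builder.

    `run` executes a command and returns its exit status. It defaults to
    subprocess.call and can be swapped out for a fake when testing.
    Returns the command that succeeded; raises RuntimeError if both fail.
    Assumption: flyctl exits non-zero when a build or deploy fails.
    """
    for cmd in (PRIMARY, FALLBACK):
        if run(cmd) == 0:
            return cmd
    raise RuntimeError("both deploy attempts failed")
```

Calling deploy_with_fallback() with no arguments would run the real commands in the current directory. Keep in mind that, as described above, the fallback can spin up a separate builder Machine, and that a hung "Waiting for depot builder..." step has to time out on its own before the retry starts.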
If the new Machine’s name starts with fly-builder- then you won’t be charged for it. (That’s how the legacy builders have always worked.)
You can keep an eye on the billing preview in the dashboard, though, if you have doubts.
Aside: If there are two legacy builder Machines then I’m not sure of the policy, however. Personally, I’d probably destroy the older one, just to be on the safe side…
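If you would rather check this mechanically than eyeball the dashboard, here is a rough sketch of the naming rule above. It only filters a list of names by the fly-builder- prefix; the example names are made up, and how you obtain the list (dashboard, CLI output, and so on) is left to you.

```python
from typing import Iterable, List

# Prefix quoted in this thread for legacy builder apps.
BUILDER_PREFIX = "fly-builder-"


def legacy_builders(app_names: Iterable[str]) -> List[str]:
    """Return the names that follow the legacy builder naming rule, sorted."""
    return sorted(name for name in app_names if name.startswith(BUILDER_PREFIX))


# Hypothetical names, for illustration only.
names = ["my-app", "fly-builder-red-sun-1234", "fly-builder-blue-sky-5678"]
builders = legacy_builders(names)
if len(builders) > 1:
    print("More than one legacy builder found:", ", ".join(builders))
```

More than one match is the "two legacy builder Machines" case from the aside, where it may be worth destroying the older one.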