We’ve been running for months and this is the first time I’ve seen this particular issue. Is there a service issue at the moment with Consul in iad, or is anyone else experiencing this?
We just started seeing this issue this morning as well. Machines that have been running for months are unable to boot and connect to a consul URL. One of our database servers is in an endless reboot cycle.
The server behind the Consul URL appears to be closing the connection without sending any response (Chrome shows “consul-iad-5.fly-shared.net unexpectedly closed the connection.”). Our logs show “EOF”, too.
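For anyone who wants to check this outside a browser, here is a rough Python sketch that tells "server closed the connection with no data" apart from a timeout or a normal HTTP reply. The `probe` function, its return strings, and the default port 443 are my own assumptions for illustration, not anything from Fly's tooling.

```python
import http.client
import socket


def probe(host, port=443, use_tls=True, timeout=5.0):
    """Send GET / and classify how the endpoint responds (hypothetical helper)."""
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        resp = conn.getresponse()
        return f"http_{resp.status}"
    except http.client.RemoteDisconnected:
        # The peer closed the socket before sending a status line,
        # which is what tends to show up as "EOF" in client logs.
        return "closed_without_response"
    except (socket.timeout, TimeoutError):
        return "timeout"
    except http.client.HTTPException as exc:
        return f"bad_response: {exc.__class__.__name__}"
    except OSError as exc:
        # DNS failures, refused connections, TLS errors, resets, and so on.
        return f"error: {exc.__class__.__name__}"
    finally:
        conn.close()
```

Calling `probe("consul-iad-5.fly-shared.net")` would show which of these cases you are hitting. If the connection drops during the TLS handshake rather than after the request is sent, it will land in the `error:` branch instead.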
Well, glad it’s not just us then. Hopefully some fly.io folks notice the thread and can chime in. I’m reluctant to do any deploys at the moment since I don’t know if it’ll break our production systems.
Yeah, I wouldn’t. Right now I don’t believe any new servers can start (or restart) successfully.
The Consul server that is not responding is (presumably) a HashiCorp Consul instance that coordinates networking configuration. Apps and servers register with it when they come online so that traffic can be routed to them. If it’s offline… no registration, no routing.
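To make the registration step concrete, here is a minimal Python sketch. It assumes a plain Consul agent reachable over HTTP and uses Consul's `/v1/agent/service/register` endpoint. How Fly actually wires this up is not something I know, so treat `register_service`, `with_backoff`, and every parameter here as hypothetical. The backoff wrapper is only there to show one way a startup script could wait out an outage instead of crash-looping.

```python
import json
import time
import urllib.request


def register_service(consul_url, name, address, port, timeout=5.0):
    """Register one service with a Consul agent via its HTTP API (sketch)."""
    body = json.dumps({"Name": name, "Address": address, "Port": port}).encode()
    req = urllib.request.Request(
        f"{consul_url}/v1/agent/service/register",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status == 200


def with_backoff(action, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return action()
        except OSError:
            # Covers dropped connections, timeouts, and urllib's URLError.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Usage would look something like `with_backoff(lambda: register_service("http://127.0.0.1:8500", "web", "10.0.0.5", 8080))`, where the URL, service name, address, and port are placeholders.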
I restarted our dev servers and they came back up without any issues. We’ll do a few more test deploys before doing a prod release, but so far so good. Thanks!