Edge serves no TLS certificate for custom domain, *.fly.dev on same edge IP works

Since ~10:00 UTC today, one of my apps’ custom domains fails the TLS handshake with “no peer certificate available” / SSL_ERROR_SYSCALL. The app itself is healthy and
.fly.dev on the same edge IP responds normally.

Symptoms:

  • Custom domain (apex + subdomains, all resolving to the same Fly edge IP): TLS handshake hangs ~10s, no cert returned
  • .fly.dev on the same edge IP: HTTP 302 in ~0.15s ✅

App status: all machines started, health checks passing, region ams.

Certs: wildcard cert listed as Issued.

Reproduction:
openssl s_client -connect :443 -servername
→ no peer certificate available
→ SSL handshake has read 0 bytes
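
For anyone who wants to run the same comparison, here is a minimal sketch that sends two different SNI names to one edge IP and classifies what comes back. The function names (classify_handshake, probe_sni) and the EDGE_IP, CUSTOM_HOST and FLY_HOST variables are placeholders I made up, not values from this thread, and the 15s limit is an arbitrary choice that assumes GNU coreutils `timeout` is installed.

```shell
# Classify `openssl s_client` output read from stdin.
# "no peer certificate available" is the failure text quoted above;
# a PEM header means the server did present a certificate.
classify_handshake() {
  out=$(cat)
  case $out in
    *"no peer certificate available"*) echo "no-cert" ;;
    *"-----BEGIN CERTIFICATE-----"*)   echo "cert" ;;
    *)                                 echo "unknown" ;;  # e.g. killed by timeout
  esac
}

# probe_sni IP HOSTNAME
# Connects to IP:443 and sends HOSTNAME as the SNI name, so the same edge IP
# can be tested with both the custom domain and the .fly.dev name.
probe_sni() {
  timeout 15 openssl s_client -connect "$1:443" -servername "$2" \
    </dev/null 2>&1 | classify_handshake
}

# Usage (fill in your own values first):
#   probe_sni "$EDGE_IP" "$CUSTOM_HOST"
#   probe_sni "$EDGE_IP" "$FLY_HOST"
```

If the first call prints no-cert (or unknown after a hang) while the second prints cert, the edge is reachable and the problem is specific to the certificate for that hostname, which matches the symptoms above.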

Fly status page reports “All Systems Operational”. fly certs show for the hostname returns “certificate not found” — unclear if related.

Can you check the edge cert binding for this hostname? Happy to share app name and Request ID privately.

As a small side note… There’s an active incident for this now (appearing just a few minutes after OP created the thread), for those who might not have seen it yet:

https://status.flyio.net/incidents/4fxnx2qr9x1d

We are investigating an issue with the Vault server that stores TLS certificates. Provisioning new TLS certificates may fail, and connecting to domains whose existing certificate has not yet been cached may fail.

All our web apps are down; this is a major issue for us.

This is also causing a major impact for our fleet!

Glad this was resolved fast. I’d be interested in a brief postmortem, if possible.

We’ll have a more detailed postmortem up on Infra Log next week, but the tl;dr is that a large Vault cluster is painful to manage and we need to move to something better suited to our use case (likely our Petsem).