Since ~10:00 UTC today one of my apps’ custom domains fail TLS handshake with no peer certificate available / SSL_ERROR_SYSCALL. The app itself is healthy and
.fly.dev on the same edge IP responds normally.
Symptoms:
Custom domain (apex + subdomains, all resolving to the same Fly edge IP): TLS handshake hangs ~10s, no cert returned
.fly.dev on the same edge IP: HTTP 302 in ~0.15s
App status: all machines started, health checks passing, region ams.
Certs: wildcard cert listed as Issued.
Reproduction:
openssl s_client -connect :443 -servername
→ no peer certificate available
→ SSL handshake has read 0 bytes
Fly status page reports “All Systems Operational”. fly certs show for the hostname returns “certificate not found” — unclear if related.
Can you check the edge cert binding for this hostname? Happy to share app name and Request ID privately.
As a small side note… There’s an active incident for this now (appearing just a few minutes after OP created the thread), for those who might not have seen it yet:
We are investigating an issue with the Vault server that stores TLS certificates. Provisioning new TLS certificates may fail, and connecting to domains whose existing certificate has not yet been cached may fail.
we’ll have a more detailed postmortem up on Infra Log next week, but the tl;dr is a large Vault cluster is painful to manage and we need to move to something better suited for our use case (likely our Petsem).