Cloudflare 525 error randomly occurs

Yeah I’ve solved this for the next 15 years :rofl:

Thanks, @pier, this is an excellent guide!

Thanks for the awesome guidance.

I just tried following the steps and changed the SSL/TLS encryption mode to Full (strict). Now the page is returning Error 525.

I guess that after I set SSL_KEY in secrets, I need a function in my app that writes SSL_KEY to a file and places the file in the root? How do you accomplish this?

Or will the SSL_KEY env variable be automatically detected by fly.io as the cert key, so that I do not need to do anything on my side?

FYI, I am quite new to these things, and I am a bit confused. hehe

I guess that after I set SSL_KEY in secrets, I need a function in my app that writes SSL_KEY to a file and places the file in the root? How do you accomplish this?

I don’t think you need to save it to a file in order to use it.
You could directly read the env variable in your app code and use it from there.
Steps 2 & 3 from Use Cloudflare Certs to FlyIO - #8 by ignoramous show you how.

Or will the SSL_KEY env variable be automatically detected by fly.io as the cert key, so that I do not need to do anything on my side?

I’m afraid you have to handle TLS in your app :slight_smile:
Fly won’t auto-detect that env variable as a cert or do anything with it.

Got it! I will try that. Thanks again, FrequentSolver :gift_heart:

How you read the certs and enable HTTPS will depend on your application, framework, runtime, etc.

For example, in Fastify and other Node servers you simply pass the certs when configuring the server.

https://nodejs.org/dist/latest-v14.x/docs/api/https.html#https_https_createserver_options_requestlistener
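To make the Node case concrete, here is a minimal sketch of reading the key and cert straight from the environment and passing them to https.createServer. It assumes both secrets contain the PEM text as-is. SSL_KEY is the name used earlier in this thread; SSL_CERT, the helper names tlsOptionsFromEnv and startHttpsServer, and the port in the usage note are made up for illustration.

```typescript
import * as https from "node:https";

// Shape of process.env (or any stand-in object used for testing).
type Env = Record<string, string | undefined>;

// Assumption: SSL_KEY holds the PEM private key (name from this thread) and
// SSL_CERT holds the PEM origin certificate (hypothetical name).
export function tlsOptionsFromEnv(env: Env): { key: string; cert: string } {
  const key = env.SSL_KEY;
  const cert = env.SSL_CERT;
  if (!key || !cert) {
    // Fail at startup rather than serving without TLS.
    throw new Error("SSL_KEY and SSL_CERT must both be set");
  }
  return { key, cert };
}

// Creates an HTTPS server that terminates TLS inside the app itself.
// The request handler is a placeholder; plug in your own app here.
export function startHttpsServer(env: Env, port: number): https.Server {
  const server = https.createServer(tlsOptionsFromEnv(env), (_req, res) => {
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end("ok\n");
  });
  return server.listen(port);
}
```

You would call something like startHttpsServer(process.env, 8443) at startup. No file on disk is needed, since Node accepts the PEM strings directly. The port is arbitrary here and should be whatever internal port your Fly service forwards to.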

I don’t use Cloudflare, but is it really not possible to have them communicate with https://<app>.fly.dev? Because that’s kind of silly.

You can probably set up the CDN with a CNAME record pointing to a Fly subdomain instead of an A record.

My issue was about making requests from Cloudflare Workers to the Fly app while using an A record to the IPv4 of the Fly app.

Yep, if you set that app.fly.dev as a CNAME record, they will pull from that using https.

My issue was getting random 525s (when behind Cloudflare, which I wanted to do to get its geo-ip header). But it did work 99% of the time.

Not sure whether @Loonb is getting a 525 consistently or randomly, like me. I solved it by handling TLS in-app, but given all the changes to the proxy/routing since then, that may no longer be needed; I’ve not tried reverting.

Yes, randomly and rarely. For now I’m leaving it as it is, because I have no idea how to do TLS in-app using Django.

This was really helpful, thank you!

I’ve set up my app to handle SSL and it all works, but I’m still getting random 525 errors that last 5 - 10 minutes.

I’m clutching at straws trying to figure out what I’m missing to stop these errors.

This might be a n00b question, but if I’ve added the origin certs to my app, do I still need the Fly-created SSL certificate and the _acme-challenge DNS record? Could having those in place be causing my random 525 errors?

Thank you so much! :pray:

In step 5 I detailed how to tell Fly that your app will be handling SSL (instead of Fly).

If Cloudflare is trying to communicate with your app and it receives the Fly certs instead of the Cloudflare-issued cert, you will get 525 errors.

AFAIK the DNS records are just for verification purposes and shouldn’t interfere with SSL.

Perfect, thank you! I’ll delete the Fly SSL certificate and hopefully the 525 errors will stop! :crossed_fingers:

Is this happening randomly from the same browser for the same user?

It looks like we can’t fulfill #5 here: Community Tip - Fixing Error 525: SSL handshake failed - Tutorial - Cloudflare Community

(I know this was pointed out before)

Our ciphers are more limited than Cloudflare’s. This is because we’re using a different TLS library that only supports TLSv1.2 and TLSv1.3 with modern ciphers. Even with a shorter list of ciphers, we still support the vast majority of clients. Clients that don’t support any of these are considered insecure.

No, it was happening for two separate uptime checks and I checked via my laptop and phone.

For the record (I know this is an old comment), this would be a dealbreaker for us since we need broader SSL termination support than Fly offers (ref).

There are also a ton of other CF features (eg WAF stuff) that we can’t give up, are happy with, and which in any case I wouldn’t expect Fly to start offering.

There’s another thread which feels modestly related, where a commenter and I are looking for static count-per-region scaling.

Putting these together, Fly could support a “fewer bells-and-whistles” style of deploy, which feels like it would be a subset of the features already built into Fly, e.g.:

  • Want magic anycast routing? A & AAAA your domain to your fly anycast ip, we’ll handle SSL termination and do the rest.
  • Using Cloudflare or don’t want us doing geographic load balancing? CNAME your domain to yourapp.<region>.fly.dev and it’ll always route to your instances in that region.

(I would certainly respect that offering & maintaining more deployment styles would increase your support costs, but I have to think this kind of ‘operational parity’ would make it easier to win over existing customers of, e.g., Heroku. If that’s a goal…)

“Unsupported” for us just means we can’t do much to help. Cloudflare’s error pages seem more designed to deflect blame than to help people debug these types of problems. We need a lot more information than they provide to prevent these errors. We wouldn’t do anything to intentionally disable it, though.

You can do what you’re asking, though! fly ips allocate-v4 --region <region> will route the way you want. You should be able to point anything you want to those.

One other thought about Cloudflare 525s: that code can be a bit misleading. It may not actually be caused by an SSL issue. I recently got a random 525 error page, after not touching anything, and noticed a VM in the app had recently been replaced by Fly (maybe capacity, region load, hardware, etc.; you have to allow for those kinds of things). So Cloudflare would have been briefly unable to connect, and it reported that as a 525 rather than, e.g., a 500 or some other 50x code.

Yeah that makes sense! My best guess is there’s some timeout mismatch between their proxies and our proxies. If we time a connection out and they don’t, it might just be that they serve an error.

When a VM gets replaced in your app you shouldn’t get a connection error these days, but it’s possible!

There are also a ton of other CF features (eg WAF stuff) that we can’t give up, are happy with, and which in any case I wouldn’t expect Fly to start offering.

Why not? I expect Fly to offer WAF, particularly because they already terminate HTTP/S with Nginx (edit: or is it something else, based on Rust?), and permissively licensed open source WAF for Nginx is a thing.