Issue with new deployment, also how do I roll back?

Hey folks,

I deployed to nikola-receipt-receiver and now everything appears to be down. I tried to re-deploy and hit the same issue: I’m getting 404s and 401s. I’m still investigating. Is there a way to roll back to v303 while I investigate?

Thank you

example: https://subscribe.nikolaapp.com/30d_trial

In Safari I get “Unauthorized”

Via curl I get “Not found settings.”

Looks like the app is running; I’m seeing a lot of this in the logs:

9/23/21 19:52:53.976 Initiating DB reconnect for <DB.YieldingDBConnectionPool.DBQueryWorker object at 0x7fb4b846fb90>

Happy to rollback for you if you need!

@David doh I missed the subject, rolling back now

I have basically 0% confidence this isn’t my fault, but it does look like I’m seeing errors not generated by my code. If there isn’t an obvious issue in fly, we should probably roll it back, but I would def like to understand what’s going on.

I had the same issue too. I did a deploy, and my whole project was taken offline, and a bunch of requests from random referrers came into my CDN. I have deleted my app on fly.io, and pointed my DNS to my backup server for the time being. The same issue happened to me a month ago: Error logs saying "Internal problem" result in 502s

Edit: this happened between late 2 PM and early 3 PM EST. BunnyCDN logs have now slowed down, but I am monitoring…

I rolled back to v305, which was stable and healthy, but I’m still seeing the 4xx errors. We don’t generate any 4xx errors. Is there anything in your code or libraries that might print that “Not found settings.” message?

When did this happen to you? Can you share an app name?

Thanks! Sadly the issue still persists. I do not see “Not found settings.” in my code. This is my verbose curl. The 404 is interesting to me…

% curl -v https://subscribe.nikolaapp.com/30d_trial
*   Trying 77.83.142.8...
* TCP_NODELAY set
* Connected to subscribe.nikolaapp.com (77.83.142.8) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-ECDSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=subscribe.nikolaapp.com
*  start date: Jul 21 11:29:01 2021 GMT
*  expire date: Oct 19 11:28:59 2021 GMT
*  subjectAltName: host "subscribe.nikolaapp.com" matched cert's "subscribe.nikolaapp.com"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fac8280d600)
> GET /30d_trial HTTP/2
> Host: subscribe.nikolaapp.com
> User-Agent: curl/7.64.1
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 4294967295)!
< HTTP/2 404 
< date: Thu, 23 Sep 2021 20:01:24 GMT
< server: Fly/4b6d89a (2021-09-22)
< via: 2 fly.io
< fly-request-id: 01FGA3Z2MATVDCK389BEZF20AS
< 
* Connection #0 to host subscribe.nikolaapp.com left intact
Not found settings.* Closing connection 0
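
For anyone comparing traces like the one above, here is a minimal Python sketch that pulls the status code and response headers out of `curl -v` output. The function names are made up for illustration, and the check relies only on the `server: Fly/...` and `fly-request-id` headers visible in this trace. It confirms the response passed through Fly's proxy, nothing more: it cannot tell whether the 404 body was produced by the app or by something in front of it.

```python
import re


def parse_curl_verbose(output: str) -> dict:
    """Extract the HTTP status and response headers from `curl -v` output."""
    status = None
    headers = {}
    for line in output.splitlines():
        if not line.startswith("<"):
            continue  # curl prefixes response header lines with "<"
        text = line[1:].strip()
        if not text:
            continue  # the bare "<" line marks the end of the headers
        match = re.match(r"HTTP/\S+\s+(\d{3})", text)
        if match:
            status = int(match.group(1))
        elif ":" in text:
            name, _, value = text.partition(":")
            headers[name.strip().lower()] = value.strip()
    return {"status": status, "headers": headers}


def passed_through_fly(parsed: dict) -> bool:
    """True if the response carries the Fly proxy headers seen in the trace."""
    headers = parsed["headers"]
    return headers.get("server", "").startswith("Fly/") and "fly-request-id" in headers


# Response header lines copied from the trace in this post.
sample = """\
< HTTP/2 404
< date: Thu, 23 Sep 2021 20:01:24 GMT
< server: Fly/4b6d89a (2021-09-22)
< via: 2 fly.io
< fly-request-id: 01FGA3Z2MATVDCK389BEZF20AS
<
"""

parsed = parse_curl_verbose(sample)
print(parsed["status"], passed_through_fly(parsed))
```

The `fly-request-id` value is probably the most useful thing to quote when asking support to trace a specific request.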

It’s dead now. I had to kill it as my CDN was getting bombarded with requests, but I PM’d you. Please don’t try to bring my app back from the dead. Something going on with Fly.io’s networking was redirecting thousands of requests to my CDN, and killing my app seems to have settled it.

@michael it appears from my logs that my instances are not getting any traffic. That DB reset log is expected; it’s just periodically resetting a MySQL connection.

The situation suggests to me that:

  1. something, perhaps some now-problematic part of my Dockerfile or fly config, is preventing traffic from getting to the server.
  2. some aspect of the config hasn’t been fully reverted, as traffic is still not getting through even after the revert.

Michael, FYI, the behaviour is the exact same as what went on with my deployment last month: Error logs saying "Internal problem" result in 502s

  1. App goes down immediately after a deploy
  2. A flood of traffic comes into my CDN (if my app cannot oauth, it’ll redirect itself to a static page that’s located on my CDN). Referrers are all random, e.g. Instagram, Pornhub, etc. etc.
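
To put numbers on the referrer flood, a rough Python sketch like the one below could tally requests per referrer hostname. It assumes the referrer values have already been extracted from the CDN log export; the function name and the sample values are hypothetical placeholders, not real log data or the BunnyCDN log format.

```python
from collections import Counter
from urllib.parse import urlparse


def tally_referrer_hosts(referrers):
    """Count requests per referrer hostname; blank or unparseable values go under "(none)"."""
    counts = Counter()
    for ref in referrers:
        host = urlparse(ref.strip()).hostname or "(none)"
        counts[host] += 1
    return counts


# Hypothetical referrer values, standing in for one column of a CDN log export.
sample = [
    "https://social.example.com/some/post",
    "https://social.example.com/another/post",
    "https://video.example.net/watch",
    "-",
    "",
]

# Print the busiest referrer hosts first.
for host, count in tally_referrer_hosts(sample).most_common():
    print(f"{count:5d}  {host}")
```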

I’d imagine something weird is going on with anycast.

@jake 502s with those log messages are a very different problem. Application 4xx errors aren’t something we generate, so this is most likely unrelated.

We would like to look at the flood of traffic you are getting with those referrers. That is strange, and also probably unrelated. Will you post a new topic with logs?

Hi kurt,

But the behaviour has been exactly the same as last time. I have just received a screenshot from a client of mine where my subdomain, which pointed to a fly.io server, redirected to a completely different server asking for some Stripe payment.

Do you have an email I can send my BunnyCDN logs to, along with this screenshot I received?

I’m also getting no response to a new deploy of an app, but no errors in the logs. I reverted to a previous working version and still have the same issue. Nothing seems to be hitting my app from the fly load balancers.

We’re looking into it.

You’re probably aware, but FYI I’m now getting 502s

I had a similar issue, but it seems to be fixed now, seconds after I posted.

Update: I don’t have a full read on everything yet, but it looks like my health-checks are working again. Also it looks like it didn’t revert (which is now fine)