Issue with new deployment, also how do I roll back?

davidhodge · September 23, 2021, 7:51pm

Hey folks,

I deployed to nikola-receipt-receiver and now everything appears to be down. Tried to re-deploy and am having the same issue… am getting 404s and 401s. I’m still investigating. Is there a way to roll back to v303 while I investigate?

Thank you

davidhodge · September 23, 2021, 7:52pm

example: https://subscribe.nikolaapp.com/30d_trial

In safari I get “Unauthorized”

Via curl I get “Not found settings.”

michael · September 23, 2021, 7:53pm

Looks like the app is running, seeing a lot of this in the logs

9/23/21 19:52:53.976 Initiating DB reconnect for <DB.YieldingDBConnectionPool.DBQueryWorker object at 0x7fb4b846fb90>

michael · September 23, 2021, 7:54pm

Happy to rollback for you if you need!

michael · September 23, 2021, 7:55pm

@davidhodge doh I missed the subject, rolling back now

davidhodge · September 23, 2021, 7:56pm

I have basically 0% confidence this isn’t my fault, but it does look like I’m seeing errors not generated by my code. If there isn’t an obvious issue in fly, we should probably roll it back, but I would def like to understand what’s going on.

jake · September 23, 2021, 7:57pm

I had the same issue too. I did a deploy, and my whole project was taken offline, and a bunch of requests from random referrers came into my CDN. I have deleted my app on fly.io, and pointed my DNS to my backup server for the time being. The same issue happened to me a month ago: Error logs saying "Internal problem" result in 502s

Edit: this happened late 2PM - early 3PM EST. BunnyCDN logs have now slowed down, but am monitoring…

michael · September 23, 2021, 7:59pm

I rolled back to v305 which was stable and healthy, but still seeing the 4xx error. We don’t generate any 4xx errors. Is there anything in your code or libraries that might print that “Not found settings.” message?

michael · September 23, 2021, 8:01pm

When did this happen to you? Can you share an app name?

davidhodge · September 23, 2021, 8:01pm

Thanks! Sadly the issue still persists. I do not see “Not found settings.” in my code. This is my verbose curl. The 404 is interesting to me…

% curl -v https://subscribe.nikolaapp.com/30d_trial
*   Trying 77.83.142.8...
* TCP_NODELAY set
* Connected to subscribe.nikolaapp.com (77.83.142.8) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-ECDSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=subscribe.nikolaapp.com
*  start date: Jul 21 11:29:01 2021 GMT
*  expire date: Oct 19 11:28:59 2021 GMT
*  subjectAltName: host "subscribe.nikolaapp.com" matched cert's "subscribe.nikolaapp.com"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fac8280d600)
> GET /30d_trial HTTP/2
> Host: subscribe.nikolaapp.com
> User-Agent: curl/7.64.1
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 4294967295)!
< HTTP/2 404 
< date: Thu, 23 Sep 2021 20:01:24 GMT
< server: Fly/4b6d89a (2021-09-22)
< via: 2 fly.io
< fly-request-id: 01FGA3Z2MATVDCK389BEZF20AS
< 
* Connection #0 to host subscribe.nikolaapp.com left intact
Not found settings.* Closing connection 0
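
For anyone comparing responses across retries, a rough sketch like the one below can pull the status code and response headers out of saved `curl -v` output, so the `server`, `via`, and `fly-request-id` values are easy to line up. This is only an illustration: `parse_curl_verbose` is a hypothetical helper name, and it assumes the usual `curl -v` convention that response header lines start with `< `.

```python
import re


def parse_curl_verbose(text):
    """Return (status_code, headers) parsed from `curl -v` output.

    Assumption: response header lines are prefixed with "< ", as in the
    verbose output pasted above. Header names are lowercased.
    """
    status = None
    headers = {}
    for line in text.splitlines():
        if not line.startswith("< "):
            continue  # skip TLS chatter, request lines, and the body
        body = line[2:].strip()
        if not body:
            continue  # blank "< " line marks the end of the headers
        match = re.match(r"HTTP/\S+\s+(\d{3})", body)
        if match:
            status = int(match.group(1))
        elif ":" in body:
            name, value = body.split(":", 1)
            headers[name.strip().lower()] = value.strip()
    return status, headers


# Response lines copied from the verbose curl output in this post.
sample = """\
< HTTP/2 404
< date: Thu, 23 Sep 2021 20:01:24 GMT
< server: Fly/4b6d89a (2021-09-22)
< via: 2 fly.io
< fly-request-id: 01FGA3Z2MATVDCK389BEZF20AS
<
"""

status, headers = parse_curl_verbose(sample)
print(status, headers.get("server"), headers.get("fly-request-id"))
```

Running it over several saved captures would show whether the status and request IDs change between attempts; it says nothing by itself about which layer produced the response body.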

jake · September 23, 2021, 8:05pm

It’s dead now – I had to kill it as my CDN was getting bombarded with requests, but I PM’d you. Please don’t try to bring my app back from death – something going on with Fly.io’s networking was redirecting thousands of requests to my CDN – and killing my app has seemed to settle it

davidhodge · September 23, 2021, 8:08pm

@michael it appears from my logs that my instances are not getting any traffic. That DB reset log is expected, it’s just periodically resetting a mysql connection.

The situation suggests to me that:

something, perhaps some now-problematic part of my Dockerfile or fly config, is making it so traffic isn’t getting to the server.
some aspect of the config hasn’t been fully reverted, as traffic is still not getting through even after the revert.

jake · September 23, 2021, 8:11pm

Michael, FYI, the behaviour is the exact same as what went on with my deployment last month: Error logs saying "Internal problem" result in 502s

App goes down immediately after a deploy
A flood of traffic comes into my CDN (if my app cannot oauth, it’ll redirect itself to a static page that’s located on my CDN). Referrers are all random, e.g. Instagram, Pornhub, etc. etc.

I’d imagine something weird is going on with anycast.

kurt · September 23, 2021, 8:13pm

@jake 502s with those log messages are a very different problem. Application 4xx errors aren’t something we generate, so this is most likely unrelated.

We would like to look at the flood of traffic you are getting with referrers, that is strange and also probably unrelated. Will you post a new topic with logs?

jake · September 23, 2021, 8:16pm

Hi kurt,

But the behaviour has been exactly the same as last time. I have just received a screenshot from a client of mine where my subdomain that pointed to a fly.io server redirected to a completely different server asking for some stripe payment.

Do you have an email I can send my BunnyCDN logs to, along with this screenshot I received?

bekit · September 23, 2021, 8:17pm

I’m also getting no response to a new deploy of an app, but no errors in the logs. I reverted to a previous working version and still have the same issue. Nothing seems to be hitting my app from the fly load balancers.

michael · September 23, 2021, 8:19pm

we’re looking into it

davidhodge · September 23, 2021, 8:21pm

You’re probably aware, but FYI I’m now getting 502s

accounts · September 23, 2021, 8:26pm

I had a similar issue, but seems like it’s fixed now, seconds after I posted.

davidhodge · September 23, 2021, 8:32pm

Update: I don’t have a full read on everything yet, but it looks like my health-checks are working again. Also it looks like it didn’t revert (which is now fine)
