Fly deploy seems to have failed connection at Proxy

Zane_Milakovic · February 14, 2022, 7:26pm

I am building a Go / KeyDB app.

Of 5 deploys today, only 1 worked. None of them fail the deployment or health check, and all worked locally. I am not convinced yet it’s my code. But they won’t receive requests.

https://showdown-labs.fly.dev

A few things -

The first thing my server does on a request is log the path. The logs don’t show any incoming requests.
The [statics] section works fine when the deploy works (a response is sent back). When the deployment finishes and reports healthy but there is no response, [statics] also breaks, even though those requests should not be hitting my app.
The only errors I am seeing are from the proxy.

A few minutes after a failed request, the log shows this error.

2022-02-14T18:29:19Z proxy[bc9c1c18] chi [error]Error 2: Internal problem

Notice the proxy in the message, as opposed to runner or app.

Can my deployment break the proxy? I am really confused.

I have also seen this error, but I am unsure if it was during a period of failure.

2022-02-14T11:59:34Z proxy[ffe08d52] mia [error]error.code=1 error.message="Undocumented" request.method="POST" request.url="/HNAP1/" request.id="01FVW1M3RM7ADXBC563MYCNT20" response.status=502

greg · February 14, 2022, 7:54pm

Could you post the fly.toml file to see if anything stands out in that? Probably not but doesn’t hurt.

I’m not very familiar with Go, but I assume the app listens on port 8080. That would need to match the internal_port in the fly.toml for requests to get to the app.

Given you say the first thing the server does is log the request, it sounds like requests are not getting from the proxy to your app, hence the internal problem showing in the logs. The small delay before that line appears shouldn’t be an issue: logs for me are near real time but can be slightly delayed. I assume the Fly proxy receives a request, can’t connect to your app for some reason to pass it along, and reports a problem, hence that line.
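To make the port point concrete, here is a minimal Go sketch of a server that takes its port from the PORT environment variable and logs the path before doing anything else. It is not the code from this thread: listenAddr and logPath are hypothetical helper names, and binding to 0.0.0.0 is my assumption about what lets the proxy reach the process inside the VM.

```go
package main

import (
	"log"
	"net/http"
	"os"
)

// listenAddr builds the listen address from a PORT value, falling back to
// 8080 so it lines up with internal_port in fly.toml.
// (Hypothetical helper, not taken from the original app.)
func listenAddr(port string) string {
	if port == "" {
		port = "8080"
	}
	// Assumption: bind to all interfaces rather than 127.0.0.1 so that
	// connections arriving from outside the process's loopback can be accepted.
	return "0.0.0.0:" + port
}

// logPath logs the method and path of every request before handing it to
// the wrapped handler, so a missing log line means the request never arrived.
func logPath(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log.Printf("request: %s %s", r.Method, r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok\n"))
	})

	addr := listenAddr(os.Getenv("PORT"))
	log.Printf("listening on %s", addr)
	log.Fatal(http.ListenAndServe(addr, logPath(mux)))
}
```

If the startup line appears in the logs after a deploy but no request lines ever follow, the request is most likely being lost before it reaches the process.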

Zane_Milakovic · February 14, 2022, 8:03pm

I can post the fly.toml, but that has not changed.

Yeah, the requests do not appear to be getting to my application. The logs are delayed, but when I do a fresh deploy, I can see those logs and any messages from the server starting.

# fly.toml file generated for showdown-labs on 2022-01-29T17:11:59-05:00

app = "showdown-labs"

kill_signal = "SIGINT"
kill_timeout = 5
processes = []

[build]
  dockerfile = "./Dockerfile"

  [build.args]
    BP_KEEP_FILES = "./goapp/public/*"

[deploy]
  strategy = "rolling"

[env]
  PORT = "8080"
  KEYDB_HOST = "showdown-labs-keydb-fly.internal"
  KEYDB_PAGE_CACHE_DB = 0
  KEYDB_PAGE_DATA_DB = 1
  KEYDB_STALE_TTL = 3600
  KEYDB_STORE_TTL = 86400

[experimental]
  allowed_public_ports = []
  auto_rollback = true

[[statics]]
  guest_path = "/goapp/public"
  url_prefix = "/"

[[services]]
  http_checks = []
  internal_port = 8080
  processes = ["app"]
  protocol = "tcp"
  script_checks = []

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

kurt · February 14, 2022, 8:06pm

This is probably related to our slow state propagation. Does this app have one instance?

The unknown errors are requests trying to hit VMs that have been shut down. The error could be better, but these will go away as we roll out better service discovery (which is happening this week, unless something goes wrong).

Zane_Milakovic · February 14, 2022, 8:20pm

Yeah, it does only have one instance. But it never recovers. An hour later, it still won’t receive traffic.

I never had this issue before today.

Even if it’s not routing to my app, should [statics] be impacted by this service discovery issue?

kurt · February 14, 2022, 8:23pm

Oh I misunderstood your post, sorry about that. Let me have a look at what might be happening here.

[statics] are impacted by the service discovery issue. Each new deploy registers statics the same way it does app instances, and those also need to propagate.

kurt · February 14, 2022, 9:54pm

Ok, we identified the issue. The physical host your app was landing on wasn’t accepting connections for new VMs. Your app should be working now. This was a failure mode we haven’t seen before, so we’re instrumenting it now so we can catch it next time.

Zane_Milakovic · February 14, 2022, 10:19pm

Yeah it totally is.

I spent HOURS trying to understand what I did wrong in my app to cause that. Happy it wasn’t me.

What red flags can I look for in the future to tell whether a problem is in my app or on Fly’s side?

Also, are there health checks that should have identified that this was not reachable? Or are the health checks lower level and internal, so proxy issues won’t be flagged by them?

kurt · February 14, 2022, 10:41pm

The health checks happen out of band from the proxy (right now), so the proxy wouldn’t have seen it.

If you have an app successfully deploy and it’s not accepting traffic, that’s a giant red flag. I didn’t read your initial post right or I’d have told you that immediately.

Zane_Milakovic · February 15, 2022, 12:05am

I am just new to Go, so I was worried I had screwed something up. =)

But fair enough, I just didn’t know if there was a way to check service discovery or anything.
