ECONNREFUSED to internal service

Hi there,

I’m getting an ECONNREFUSED error when trying to hit one app from another. I have a web app and a demo-services app, and the web app calls the demo-services app. The demo-services fly.toml file looks like this:

app = "demo-services"

kill_signal = "SIGINT"
kill_timeout = 5

[processes]
  worker = "yarn bull"
  graphql = "bash -c 'yarn prisma migrate deploy && yarn start:graphql'"
  webhook = "yarn start:webhook"

[deploy]
  strategy = "rolling"

[env]
  GRAPHQL_PORT = "8080"
  WEBHOOK_PORT = "8000"

[experimental]
  allowed_public_ports = []
  auto_rollback = true
  private_network = true

[[services]]
  http_checks = []
  internal_port = 8080
  processes = ["graphql"]
  protocol = "tcp"
  script_checks = []

  [services.concurrency]
    hard_limit = 50
    soft_limit = 25
    type = "connections"

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"


[[services]]
  http_checks = []
  internal_port = 8000
  processes = ["webhook"]
  protocol = "tcp"
  script_checks = []

  [services.concurrency]
    hard_limit = 50
    soft_limit = 25
    type = "connections"

  [[services.ports]]
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

You can see I have a couple of different processes running. The one I’m trying to hit is graphql, which is internal (not exposed to the public internet).

The url I’m using is:
http://demo-services.internal:8080/graphql

They are both node apps, and the graphql service is listening using the following code:

httpServer.listen({ port: process.env.GRAPHQL_PORT, host: '::', ipv6Only: true })

The weirdest part is:
a.) It actually works sometimes, perhaps it’s something region related, or when my machine is in a new location for the first time?
b.) I can always successfully hit the URL from my machine if I’m connected to my org’s WireGuard network, so it doesn’t seem to be an issue there…

So this is the actual error I occasionally get:

2022-05-17T15:17:04.118 app[5c8eb155] ewr [info] FetchError: request to http://demo-services.internal:8080/graphql failed, reason: connect ECONNREFUSED fdaa:0:4934:a7b:2c01:da56:6713:2:8080
2022-05-17T15:17:04.118 app[5c8eb155] ewr [info]     at ClientRequest.<anonymous> (/app/node_modules/node-fetch/lib/index.js:1491:11)
2022-05-17T15:17:04.118 app[5c8eb155] ewr [info]     at ClientRequest.emit (node:events:527:28)
2022-05-17T15:17:04.118 app[5c8eb155] ewr [info]     at Socket.socketErrorListener (node:_http_client:454:9)
2022-05-17T15:17:04.118 app[5c8eb155] ewr [info]     at Socket.emit (node:events:527:28)
2022-05-17T15:17:04.118 app[5c8eb155] ewr [info]     at emitErrorNT (node:internal/streams/destroy:164:8)
2022-05-17T15:17:04.118 app[5c8eb155] ewr [info]     at emitErrorCloseNT (node:internal/streams/destroy:129:3)
2022-05-17T15:17:04.118 app[5c8eb155] ewr [info]     at processTicksAndRejections (node:internal/process/task_queues:83:21) {
2022-05-17T15:17:04.118 app[5c8eb155] ewr [info]   type: 'system',
2022-05-17T15:17:04.118 app[5c8eb155] ewr [info]   errno: 'ECONNREFUSED',
2022-05-17T15:17:04.118 app[5c8eb155] ewr [info]   code: 'ECONNREFUSED'
2022-05-17T15:17:04.118 app[5c8eb155] ewr [info] }

Any help would be greatly appreciated!

PS – I followed this thread: (Node) How do I connect to my API app from my WEB app in Fly?, but that didn’t seem to do the trick for me…

You’ve done the things that should work (the IPv6/listen stuff, and calling the request using the internal port … both things that are easy to miss).

And you say it is working … sometimes.

So the question is why it sometimes doesn’t resolve. A few thoughts:

  1. Are you using a base image built on Alpine? I used to keep getting random DNS errors using that for Node, and so do other people. Nobody has figured out why, as far as I know. I switched to a -slim image and the DNS issues went away.
  2. Is it correct that you have two different processes listening on the same port and protocol? In case there is any conflict or race, I’d either change one of the ports or temporarily remove one process entirely. Get e.g. your /graphql endpoint working consistently first, then add the webhook back in. If that breaks it, that would prove it was the cause.
  3. If none of that helps, experiment with the other hostnames available privately (see Private Networking). Do any of those work any better? It shouldn’t be needed, but by this point it would be a guess.
  4. If still no luck, debug what DNS the app can resolve. My example uses Fastify but the Express version will be similar. Check the bit starting with import dns at the end of the long post: (Node) How do I connect to my API app from my WEB app in Fly? - #10 by greg. Call that from e.g. different regions to see whether your hunch about a new VM or a VM in a certain region is right. If the DNS can’t resolve, that would explain why it could not connect.
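
The DNS check in point 4 could look roughly like the sketch below. It is not the code from the linked post. The function names are hypothetical, and it assumes that the _apps.internal TXT record is a comma-separated list of app names and that <app>.internal answers with AAAA records.

```javascript
const dns = require('node:dns');

// Turn the TXT answer for _apps.internal into a flat list of app names.
// Node returns TXT records as string[][]; the comma-separated payload is an
// assumption about how the private DNS formats this record.
function parseAppsTxt(records) {
  return records
    .map((chunks) => chunks.join(''))
    .join(',')
    .split(',')
    .map((name) => name.trim())
    .filter(Boolean);
}

// Collect what the private DNS currently says about one app.
// `resolver` is injectable only so the function can be exercised offline.
async function inspectInternalDns(appName, resolver = dns.promises) {
  const hostname = `${appName}.internal`;
  const [addresses, txt] = await Promise.all([
    resolver.resolve6(hostname),
    resolver.resolveTxt('_apps.internal'),
  ]);
  return { hostname, addresses, apps: parseAppsTxt(txt) };
}
```

Calling inspectInternalDns('demo-services') from the web app (for example inside a temporary debug route) and logging the result in each region shows whether the hostname resolves at all, and to which addresses.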

Hi Greg,

Thanks for the response. After some more investigation I see that dns.promises.resolveTxt('_apps.internal') is returning all the correct apps. I also noticed that the instance ID in the error message, ECONNREFUSED fdaa:0:4934:a7b:2c00:e404:37fd:2:8080 (40437fd), is correct.

I think it may be a problem when the demo-services project is booting up a new VM and the old instance has a desired state of stop but is still running… Shouldn’t new requests be forwarded to the new instances automatically? Or is this something I need to handle on my own?

Thanks

Ahhh. Upon further review it appears to have made a request to the wrong process within the app.

It makes sense then to get the ECONNREFUSED because that particular process is listening on a different port.

Not sure how to solve this one…


Any tips would be great.

Hi @franzwarning

Long story short, internal requests inside Fly.io don’t play nice with multi-process apps.

Essentially the request is sent to one of the processes at random, and currently there is no way to control which process gets your internal request.

If you want this to work, you’ll need to go through the proxy layer by using the public DNS name.

I would recommend splitting your multi-process app into multiple single-process apps, since multi-process apps currently deploy separate VMs for each process anyway.
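
To see the random selection described above for yourself, something like the following sketch lists every address behind the internal hostname and checks which ones accept a connection on the graphql port. It is an illustration only, assuming <app>.internal returns one AAAA record per VM. The function names are hypothetical, and the /graphql path is taken from the original post.

```javascript
const dns = require('node:dns');
const net = require('node:net');

// Build a URL for a literal IPv6 address; the address must be bracketed.
function urlForAddress(address, port, path = '/') {
  return `http://[${address}]:${port}${path}`;
}

// Resolve true if a TCP connection to host:port is accepted, false otherwise
// (refused, unreachable, or no answer within timeoutMs).
function canConnect(host, port, timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (ok) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs, () => finish(false));
    socket.once('connect', () => finish(true));
    socket.once('error', () => finish(false));
  });
}

// List every address behind <app>.internal and whether it accepts the port.
async function probeApp(appName, port, resolver = dns.promises) {
  const addresses = await resolver.resolve6(`${appName}.internal`);
  return Promise.all(
    addresses.map(async (address) => ({
      address,
      url: urlForAddress(address, port, '/graphql'),
      accepts: await canConnect(address, port),
    })),
  );
}
```

Running probeApp('demo-services', 8080) from the web app should show the worker and webhook VMs refusing port 8080 while the graphql VM accepts it, which would match the intermittent ECONNREFUSED.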

Ahh ok, makes sense. In general then, what’s the purpose of the processes section? Is it only meant for apps that don’t expose anything to the public? Because once you have one port exposed to the public and one process that isn’t listening on that port, you’ll always get ECONNREFUSED…

This only happens when you make an internal request. If you give your graphql service a public port and make a request using the public domain for this app and the graphql port, then it should succeed, as it will go through the proxy and won’t be an internal request.
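
As a rough sketch of the difference between the two routes, the helper below builds either the internal URL or the public one. The serviceUrl name, the .fly.dev suffix, and the https scheme for the public route are assumptions, not something stated in this thread. The public port is whatever you configure in a [[services.ports]] entry for the graphql service.

```javascript
// Build the URL for calling another app either over the private network or
// through the public proxy. The `.fly.dev` suffix and the https scheme for
// the public route are assumptions; adjust them to your own setup.
function serviceUrl({ app, port, path = '/', viaProxy = false }) {
  const host = viaProxy ? `${app}.fly.dev` : `${app}.internal`;
  const scheme = viaProxy ? 'https' : 'http';
  return new URL(path, `${scheme}://${host}:${port}`).toString();
}
```

With viaProxy set to false this reproduces the URL from the original post. Setting it to true points the same request at the proxy instead.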

iiuc, it’s mostly for monorepo/single-codebase (single runc-image?) apps with multiple processes that have different scaling characteristics (i.e., each proc needs its own VM with varying RAM/CPU allocations).

More: Need Help Understanding Multi Process Environments

nb: This feature has been in preview / beta since forever.