"Could not find an instance to route to"

Hi there,

I’m trying to deploy a basic Phoenix app with a Postgres DB, but get:

2023-01-23T14:30:07Z proxy[f7066498] cdg [warn]request.method="GET" request.url="https://restless-dust-3654.fly.dev/favicon.ico" request.id="01GQFGHJ8DV4T6H0Q33BZDF9ZA-cdg"
error.message="instance refused connection"

2023-01-23T14:30:13Z proxy[f7066498] cdg [error]request.method="GET" request.id="01GQFGHJ8DV4T6H0Q33BZDF9ZA-cdg"
error.message="could not find an instance to route to"

Other people seem to have stumbled into this same issue (here, here and here).

It seems that the underlying vm f7066498 is unhealthy, but I can’t restart it. What can I do about it?


Hi,

Does that instance/vm show up when you run fly status? I ask because there was an issue yesterday where requests were being routed to old instances. Perhaps this is left over from that?

Or perhaps this is unrelated. If so, it may be worth adding a new vm, even temporarily. I think the command is fly scale count 2 (or whatever number you want). Of course, the more vms you have, the greater the redundancy, but also the greater the cost.

Hi @greg,

Thanks for your response. This is what fly status gave me:

App
  Name     = xxxx
  Owner    = personal
  Version  = 1
  Status   = running
  Hostname = xxxx.fly.dev
  Platform = nomad

Instances
ID          PROCESS VERSION REGION  DESIRED STATUS  HEALTH CHECKS       RESTARTS    CREATED
62ceeaa3    app     1       cdg     run     running 1 total, 1 critical 0           15h46m ago

I just did fly scale count 2, then fly deploy again, and got:

 2 desired, 2 placed, 0 healthy, 2 unhealthy
--> v3 failed - Failed due to unhealthy allocations - not rolling back to stable job version 3 as current job has same specification and deploying as v4

fly status:

App
  Name     = xxxx
  Owner    = personal
  Version  = 3
  Status   = running
  Hostname = xxxx.fly.dev
  Platform = nomad

Deployment Status
  ID          = 4b06ad20-3e21-2cb7-38ea-f26cc2db18a4
  Version     = v3
  Status      = failed
  Description = Failed due to unhealthy allocations - not rolling back to stable job version 3 as current job has same specification
  Instances   = 2 desired, 2 placed, 0 healthy, 2 unhealthy

Instances
ID      	PROCESS	VERSION	REGION	DESIRED	STATUS 	HEALTH CHECKS      	RESTARTS	CREATED
d41e5cb6	app    	3      	cdg   	run    	running	1 total, 1 critical	0       	1h10m ago
62ceeaa3	app    	3      	cdg   	run    	running	1 total, 1 critical	0       	16h57m ago

fly vm status d41e5cb6 now gives me:

Instance
  ID            = d41e5cb6
  Process       = app
  Version       = 3
  Region        = cdg
  Desired       = run
  Status        = running
  Health Checks = 1 total, 1 critical
  Restarts      = 0
  Created       = 1h11m ago

Events
TIMESTAMP           	TYPE           	MESSAGE
2023-01-24T08:46:28Z	Received       	Task received by client
2023-01-24T08:46:28Z	Task Setup     	Building Task Directory
2023-01-24T08:46:35Z	Started        	Task started by client
2023-01-24T08:52:02Z	Alloc Unhealthy	Task not running for min_healthy_time of 10s by deadline

Checks
ID                              	SERVICE 	STATE   	OUTPUT
3df2415693844068640885b45074b954	tcp-8080	critical	dial tcp 172.19.15.146:8080: connect: connection refused

Recent Logs

and fly vm status 62ceeaa3:

Instance
  ID            = 62ceeaa3
  Process       = app
  Version       = 3
  Region        = cdg
  Desired       = run
  Status        = running
  Health Checks = 1 total, 1 critical
  Restarts      = 0
  Created       = 16h59m ago

Events
TIMESTAMP           	TYPE           	MESSAGE
2023-01-23T16:59:23Z	Received       	Task received by client
2023-01-23T16:59:23Z	Task Setup     	Building Task Directory
2023-01-23T16:59:27Z	Started        	Task started by client
2023-01-23T17:04:23Z	Alloc Unhealthy	Task not running for min_healthy_time of 10s by deadline
2023-01-23T17:13:03Z	Alloc Unhealthy	Task not running for min_healthy_time of 10s by deadline
2023-01-24T08:52:02Z	Alloc Unhealthy	Task not running for min_healthy_time of 10s by deadline

Checks
ID                              	SERVICE 	STATE   	OUTPUT
3df2415693844068640885b45074b954	tcp-8080	critical	dial tcp 172.19.16.42:8080: connect: connection refused

Recent Logs

Strangely, fly apps list tells me that all is fine:

NAME              OWNER        	STATUS   	PLATFORM	LATEST DEPLOY
xxxx              personal     	running  	nomad   	1h13m ago
xxxx-db           personal     	deployed 	machines

Hi,

No problem. Ah, yep, that output shows the issue. It’s not the issue from a couple of days ago, since your vms were clearly created only hours ago. Instead, your vms are failing their health checks. The Fly proxy (essentially a load balancer) only routes requests to healthy vms, and as none of yours are healthy, it has nowhere to send the request.
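To make that routing rule concrete, here is a toy model in Python. It is only a sketch of the idea, not how the Fly proxy is actually implemented: the names Instance, NoHealthyInstance and pick_instance are made up for illustration, and the "take the first healthy one" policy is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    id: str
    healthy: bool  # True only if the instance passes its health checks


class NoHealthyInstance(Exception):
    """Raised when no instance is eligible to receive the request."""


def pick_instance(instances):
    # Only instances that pass their health checks are candidates.
    candidates = [i for i in instances if i.healthy]
    if not candidates:
        # Mirrors the wording of the error in the logs above.
        raise NoHealthyInstance("could not find an instance to route to")
    # Toy policy: take the first candidate (a real proxy would balance load).
    return candidates[0]
```

With both vms from the status output marked unhealthy, pick_instance raises. Adding more unhealthy vms does not change that, which is why scaling to 2 did not help on its own.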

So the fix is to make the healthcheck pass. Then Fly will see the vms as healthy, and so will have a vm to route the request to.

How to do that depends on your app. I don’t know its innards, but you can see that Fly is trying to open a TCP connection to port 8080 and failing. I would assume (total guess) that your app is not listening on 8080. Perhaps it is listening on another port? You can customise the healthcheck Fly runs in your fly.toml file. If your file is a default one, e.g. created by fly launch, you may need to edit its healthcheck section so that the path/port matches where your app actually listens. Some route/path/port needs to report healthy. You could, in theory, remove the healthcheck entirely, but that’s not ideal, as you do want your vms to be healthy.
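If you want to see what a check like tcp-8080 amounts to, a TCP check is roughly "can I open a connection to this host and port?". Below is a minimal sketch in Python using only the standard library. The function name tcp_check and the 2-second timeout are my own choices for illustration, not anything Fly defines.

```python
import socket


def tcp_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False
```

Run from the same machine as the app, tcp_check("127.0.0.1", 8080) returning False while another port returns True would suggest the app is listening somewhere other than where the check is pointed.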

And then you can scale back down to one vm (if you want to reduce cost).


Thanks again @greg, it’s all working now 🙂

Since I was using the default config provided in fly.toml, I didn’t even think of modifying its values. My bad. Also, I didn’t have a good mental model of an app running vs an app being deployed.

This being said, since a fresh Phoenix app listens on port 4000 by default, maybe the default value could be modified in the auto-generated fly.toml file…?
