fly proxy hanging (and other problems)

I’m unable to proxy my db connection. It just hangs without doing anything. If I quit the command, I get this:

Error: read unix ->/Users/user/.fly/fly-agent.sock: use of closed network connection

Same result when trying fly pg connect – just hangs.

Doctor’s report:

Testing authentication token… PASSED
Testing flyctl agent… PASSED
Testing local Docker instance… Nope
Pinging WireGuard gateway (give us a sec)…

WireGuard is taking forever (a sec is several minutes so far). (Going to let it run until it times out or returns something.)

(Error: ping gateway: pinger: websocket: failed to WebSocket dial: failed to send handshake request: Get "https://maa1.gateway.6pn.dev:443/": read tcp 192.168.0.102:49547->103.6.87.247:443: read: operation timed out)

Tried a WireGuard reset.

Error: upstream service is unavailable

Ok, websockets mode seems to work.

Seems things have been rather unstable lately…

Error: tunnel unavailable for organization {org name}: failed probing "{org name}": context deadline exceeded

This is STILL persisting. Can someone please help?

Could only get it to work by connecting to my VPN. Strange.

However I now cannot deploy. Turning VPN off and nothing responds again.

I don’t think this is an issue with my local system at all, as nothing else is broken.

But whenever I interact with flyctl, I feel like I’m fighting with a brick wall.

It said it couldn’t connect to the builder.

So I destroyed the builder.

Ran deploy again, and it’s doing the exact same thing.

WARN Remote builder did not start in time. Check remote builder logs with flyctl logs -a fly-builder-damp-fire-2944

WARN Failed to start remote builder heartbeat: remote builder app unavailable

Error: failed to fetch an image or build from source: error connecting to docker: server returned a non-200 status code: 504

I check the logs, and all it’s doing is waiting for activity. No errors. Nothing.

Can you rerun this with LOG_LEVEL=debug? That should have a request ID I can look into.

I can’t see any request IDs yet, but I am getting quite a bit of this so far:

DEBUG failed to connect metrics websocket: websocket.Dial wss://flyctl-metrics.fly.dev/socket: bad status

I don’t think that would cause the issues you’re seeing. Can you share the entire output?

Sure:

DEBUG Loaded flyctl config from /Users/?/.fly/config.yml
DEBUG determined hostname: "mike.local"
DEBUG determined working directory: "/Users/?/code/bot"
DEBUG determined user home directory: "/Users/?"
DEBUG determined config directory: "/Users/?/.fly"
DEBUG ensured config directory exists.
DEBUG ensured config directory perms.
DEBUG cache loaded.
DEBUG config initialized.
DEBUG skipped querying for new release
DEBUG client initialized.
DEBUG app config loaded from /Users/?/code/bot/fly.toml
DEBUG --> POST https://api.fly.io/graphql

DEBUG {
  "query": "query ($appName: String!) { appbasic:app(name: $appName) { id name platformVersion organization { id slug paidPlan } } }",
  "variables": {
    "appName": "?"
  }
}


DEBUG {}
DEBUG <-- 200 https://api.fly.io/graphql (534.81ms)

DEBUG {
  "data": {
    "appbasic": {
      "id": "?",
      "name": "?",
      "platformVersion": "machines",
      "organization": {
        "id": "yw2NK09lG62eytyo75jV2OwzbVT3O05qR",
        "slug": "{org name}",
        "paidPlan": false
      }
    }
  }
}

==> Verifying app config
DEBUG Starting task manager
DEBUG Config has metrics token

Validating /Users/?/code/bot/fly.toml
Platform: machines
✓ Configuration is valid
--> Verified app config
DEBUG --> POST https://api.fly.io/graphql

DEBUG {
  "query": "query ($appName: String!) { appcompact:app(name: $appName) { id name hostname deployed status appUrl platformVersion organization { id slug paidPlan } postgresAppRole: role { name } imageDetails { repository version } } }",
  "variables": {
    "appName": "?"
  }
}


DEBUG {}
DEBUG <-- 200 https://api.fly.io/graphql (594.12ms)

DEBUG {
  "data": {
    "appcompact": {
      "id": "?",
      "name": "?",
      "hostname": "?.fly.dev",
      "deployed": true,
      "status": "deployed",
      "appUrl": null,
      "platformVersion": "machines",
      "organization": {
        "id": "yw2NK09lG62eytyo75jV2OwzbVT3O05qR",
        "slug": "{org name}",
        "paidPlan": false
      },
      "postgresAppRole": null,
      "imageDetails": {
        "repository": "?",
        "version": null
      }
    }
  }
}

==> Building image
DEBUG trying remote docker daemon
DEBUG --> POST https://api.fly.io/graphql

DEBUG {
  "query": "mutation($input: EnsureMachineRemoteBuilderInput!) { ensureMachineRemoteBuilder(input: $input) { machine { id state ips { nodes { family kind ip } } }, app { name organization { id slug } } } }",
  "variables": {
    "input": {
      "appName": "?",
      "organizationId": null
    }
  }
}


DEBUG {}
DEBUG failed to connect metrics websocket: websocket.Dial wss://flyctl-metrics.fly.dev/socket: bad status

DEBUG <-- 200 https://api.fly.io/graphql (1m1.73s)

DEBUG {
  "data": {
    "ensureMachineRemoteBuilder": {
      "machine": {
        "id": "91850eea2e6e83",
        "state": "started",
        "ips": {
          "nodes": [
            {
              "family": "v6",
              "kind": "public",
              "ip": "2605:4c40:216:7da5:0:b71b:7c18:1"
            },
            {
              "family": "v4",
              "kind": "private",
              "ip": "172.19.129.202"
            },
            {
              "family": "v6",
              "kind": "privatenet",
              "ip": "fdaa:2:83f9:a7b:d6e9:b71b:7c18:2"
            }
          ]
        }
      },
      "app": {
        "name": "fly-builder-damp-fire-2944",
        "organization": {
          "id": "yw2NK09lG62eytyo75jV2OwzbVT3O05qR",
          "slug": "{org name}"
        }
      }
    }
  }
}

DEBUG checking ip &{Family:v6 Kind:public IP:2605:4c40:216:7da5:0:b71b:7c18:1 MaskSize:0}

DEBUG checking ip &{Family:v4 Kind:private IP:172.19.129.202 MaskSize:0}

DEBUG checking ip &{Family:v6 Kind:privatenet IP:fdaa:2:83f9:a7b:d6e9:b71b:7c18:2 MaskSize:0}

Waiting for remote builder fly-builder-damp-fire-2944... 🌍DEBUG --> POST https://api.fly.io/graphql

DEBUG {
  "query": "query ($appName: String!) { appbasic:app(name: $appName) { id name platformVersion organization { id slug paidPlan } } }",
  "variables": {
    "appName": "?"
  }
}


DEBUG {}
Waiting for remote builder fly-builder-damp-fire-2944... 🌎DEBUG <-- 200 https://api.fly.io/graphql (308.89ms)

DEBUG {
  "data": {
    "appbasic": {
      "id": "?",
      "name": "?",
      "platformVersion": "machines",
      "organization": {
        "id": "yw2NK09lG62eytyo75jV2OwzbVT3O05qR",
        "slug": "{org name}",
        "paidPlan": false
      }
    }
  }
}

DEBUG --> POST https://api.fly.io/graphql

DEBUG {
  "query": "mutation($input: ValidateWireGuardPeersInput!) { validateWireGuardPeers(input: $input) { invalidPeerIps } }",
  "variables": {
    "input": {
      "peerIps": [
        "fdaa:2:83f9:a7b:1bfe:0:a:2",
        "fdaa:1:33c4:a7b:1bfe:0:a:602",
        "fdaa:2:83fc:a7b:1bfe:0:a:2"
      ]
    }
  }
}


DEBUG {}
Waiting for remote builder fly-builder-damp-fire-2944... 🌏DEBUG <-- 200 https://api.fly.io/graphql (273.48ms)

DEBUG {
  "data": {
    "validateWireGuardPeers": {
      "invalidPeerIps": []
    }
  }
}

WARN Failed to start remote builder heartbeat: failed building options: failed probing "{org name}": context deadline exceeded

DEBUG Config has metrics token

DEBUG --> POST https://api.fly.io/graphql

DEBUG {
  "query": "\n# @genqlient\nmutation ResolverCreateBuild ($input: CreateBuildInput!) {\n\tcreateBuild(input: $input) {\n\t\tid\n\t\tstatus\n\t}\n}\n",
  "variables": {
    "input": {
      "appName": "?",
      "builderType": "remote",
      "clientMutationId": "",
      "imageOpts": {
        "buildArgs": {
          "NODE_ENV": "production"
        },
        "buildPacks": null,
        "builder": "",
        "builtIn": "",
        "builtInSettings": null,
        "dockerfilePath": "",
        "extraBuildArgs": null,
        "imageLabel": "",
        "imageRef": "",
        "noCache": false,
        "publish": true,
        "tag": "registry.fly.io/?:deployment-01H7B689RMC6FAW3W5P9AZP98X",
        "target": ""
      },
      "machineId": "",
      "strategiesAvailable": [
        "Buildpacks",
        "Dockerfile",
        "Builtin"
      ]
    }
  },
  "operationName": "ResolverCreateBuild"
}

DEBUG {0x14000aca570}
DEBUG <-- 200 https://api.fly.io/graphql (560.68ms)

DEBUG {
  "data": {
    "createBuild": {
      "id": "2954682",
      "status": "started"
    }
  }
}

DEBUG Trying 'Buildpacks' strategy

DEBUG no buildpack builder configured, skipping
DEBUG result image:<nil> error:<nil>

DEBUG Trying 'Dockerfile' strategy

DEBUG --> POST https://api.fly.io/graphql

DEBUG {
  "query": "mutation($input: EnsureMachineRemoteBuilderInput!) { ensureMachineRemoteBuilder(input: $input) { machine { id state ips { nodes { family kind ip } } }, app { name organization { id slug } } } }",
  "variables": {
    "input": {
      "appName": "?",
      "organizationId": null
    }
  }
}


DEBUG {}
DEBUG failed to connect metrics websocket: websocket.Dial wss://flyctl-metrics.fly.dev/socket: bad status

DEBUG Config has metrics token

DEBUG failed to connect metrics websocket: websocket.Dial wss://flyctl-metrics.fly.dev/socket: bad status

DEBUG <-- 504 https://api.fly.io/graphql (1m0.25s)

DEBUG <html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>

DEBUG result image:<nil> error:error connecting to docker: server returned a non-200 status code: 504

DEBUG --> POST https://api.fly.io/graphql

DEBUG Config has metrics token

DEBUG {
  "query": "\n# @genqlient\nmutation ResolverFinishBuild ($input: FinishBuildInput!) {\n\tfinishBuild(input: $input) {\n\t\tid\n\t\tstatus\n\t\twallclockTimeMs\n\t}\n}\n",
  "variables": {
    "input": {
      "appName": "?",
      "buildId": "2954682",
      "builderMeta": {
        "builderType": "",
        "buildkitEnabled": false,
        "dockerVersion": "",
        "platform": "",
        "remoteAppName": "",
        "remoteMachineId": ""
      },
      "clientMutationId": "",
      "finalImage": {
        "id": "",
        "sizeBytes": 0,
        "tag": ""
      },
      "logs": "error connecting to docker: server returned a non-200 status code: 504",
      "machineId": "",
      "status": "failed",
      "strategiesAttempted": [
        {
          "error": "",
          "note": "no buildpack builder configured, skipping",
          "result": "failed",
          "strategy": "Buildpacks"
        },
        {
          "error": "error connecting to docker: server returned a non-200 status code: 504",
          "note": "",
          "result": "failed",
          "strategy": "Dockerfile"
        }
      ],
      "timings": {
        "buildAndPushMs": 60255,
        "buildMs": 60255,
        "builderInitMs": 60255,
        "contextBuildMs": -1,
        "imageBuildMs": -1,
        "pushMs": -1
      }
    }
  },
  "operationName": "ResolverFinishBuild"
}

DEBUG {0x14000d228d0}
DEBUG <-- 200 https://api.fly.io/graphql (554.63ms)

DEBUG {
  "data": {
    "finishBuild": {
      "id": "2954682",
      "status": "failed",
      "wallclockTimeMs": 60827
    }
  }
}

DEBUG Task manager done
DEBUG failed to connect metrics websocket: websocket.Dial wss://flyctl-metrics.fly.dev/socket: bad status

DEBUG Config has metrics token

DEBUG Shutdown timed out, exiting
Error: failed to fetch an image or build from source: error connecting to docker: server returned a non-200 status code: 504

(redacted a touch, just names)

Could this be related to the JNB connectivity issues?

Finally managed to deploy. The first build got stuck at one of the yarn steps, so I cancelled it and destroyed the builder. Ran it again and it all went smoothly.

Going to chalk this up to network/gateway issues, which sounds plausible given that my fibre line was also hiccuping throughout the day.

I managed to proxy as well, but then restarted the db instance as it was failing CPU health checks quite regularly. Now I cannot proxy.

Error: tunnel unavailable for organization {org name}: failed probing "{org name}": context deadline exceeded

What does “context deadline exceeded” mean? It sounds like it has something to do with a timeout of sorts, but it doesn’t wait very long to show the error.

@ben-io – do you have any updates? All of these issues are persisting.

I’m unable to do anything at this point. Can’t reset WireGuard. Can’t use websockets. Uninstalling and reinstalling does not help.

When running the agent in the foreground, I see notes about dropped connections, unavailable upstream services, and context deadline exceeded.

No idea what else to try.

Please help.

WARN Failed to start remote builder heartbeat: failed building options: failed probing "personal": context deadline exceeded

Error: failed to fetch an image or build from source: failed building options: failed probing "personal": read tcp [fdaa:1:33c4:a7b:1bfe:0:a:600]:19746->[fdaa:1:33c4::3]:53: i/o timeout

DEBUG {
  "data": {
    "finishBuild": {
      "id": "2989109",
      "status": "failed",
      "wallclockTimeMs": 8442
    }
  }
}

Enabled websockets, disabled and restarted the agent twice, and it could then start the build. Everything was going smoothly, until it was loading the build context, which went very slowly, and eventually timed out with the following:

------
 > [internal] load build context:
------
Error: failed to fetch an image or build from source: error building: failed to solve: rpc error: code = Canceled desc = grpc: the client connection is closing

I cannot do a local build as Docker times out at random points, usually during an apt command (either getting a package or unpacking another – it’s completely random).

Hi Mike, if you’re in South Africa or working on apps/builders hosted in JNB, there appear to have been ongoing network congestion issues across many Internet Service Providers in the area, caused by a pair of undersea cable breaks that occurred on Sunday. We worked with our datacenter provider to apply routing changes to mitigate the network issues as best we can, but this is probably the source of the ongoing issues you’re experiencing.