Since (~10:30 UTC) 4 hours ago (of what I know), all Machines in app, udns
, have failed to respond to any TCP or HTTP request.
There have been no deployments since 15 hours or so back.
Here are the only logs from the service, right now:
proxy[0e286074a97686] syd [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
logs:
2022-09-20T11:55:28.712 proxy[0e286074a97686] syd [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:28.744 proxy[9e78424ebd5783] nrt [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:28.763 proxy[9e78424ebd5783] nrt [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:28.814 proxy[6e82559a296187] iad [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:29.119 proxy[73287114c44085] gru [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:29.314 proxy[6e82559a296187] iad [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:29.496 proxy[73287114c44085] gru [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:29.575 proxy[6e82559a296187] iad [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:29.613 proxy[5683991c16e28e] cdg [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:30.040 proxy[6e82975b296287] fra [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:30.721 proxy[6e82559a296187] iad [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:31.226 proxy[6e82975b296287] fra [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:33.718 proxy[e784375b066683] mad [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:33.770 proxy[73287114c44085] gru [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:34.569 proxy[6e82975b296287] fra [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:34.599 proxy[73287114c44085] gru [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:34.780 proxy[6e82975b296287] fra [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:35.058 proxy[e784375b066683] mad [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:35.173 proxy[6e82975b296287] fra [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:35.294 proxy[6e82559a296187] iad [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:35.420 proxy[0e286074a97686] syd [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:35.698 proxy[5683991c16e28e] cdg [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:36.098 proxy[73287114c44085] gru [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
2022-09-20T11:55:36.682 proxy[e784375b066683] mad [error] Error: failed to connect to fly machine: Supposedly started, and not stopped, but: connection refused by remote server
I couldn’t flyctl ssh
into any of the Machine VMs, too.
LOG_LEVEL=DEBUG fly ssh console -a udns
out:DEBUG Retrieving app info for udns
DEBUG --> POST https://api.fly.io/graphql
{
"query": "query ($appName: String!) { appcompact:app(name: $appName) { id name hostname deployed status appUrl platformVersion organization { id slug } postgresAppRole: role { name } imageDetails { repository version } } }",
"variables": {
"appName": "udns"
}
}
DEBUG {}
DEBUG querying for release resulted to v0.0.393
DEBUG <-- 200 https://api.fly.io/graphql (430.68ms)
{
"data": {
"appcompact": {
"id": "udns",
"name": "udns",
"hostname": "udns.fly.dev",
"deployed": true,
"status": "deployed",
"appUrl": "https://137.66.7.89",
"platformVersion": "machines",
"organization": {
"id": "ekQlpPKZYvQjGH4o4ge0Y82e0Oh8m5",
"slug": "celzero"
},
"postgresAppRole": null,
"imageDetails": {
"repository": "unknown",
"version": "unknown"
}
}
}
}
DEBUG --> POST https://api.fly.io/graphql
{
"query": "mutation($input: ValidateWireGuardPeersInput!) { validateWireGuardPeers(input: $input) { invalidPeerIps } }",
"variables": {
"input": {
"peerIps": [
"fdaa:0:35f3:a7b:1bfe:0:a:702"
]
}
}
}
DEBUG {}
DEBUG <-- 200 https://api.fly.io/graphql (2.03s)
{
"data": {
"validateWireGuardPeers": {
"invalidPeerIps": []
}
}
}
DEBUG --> POST https://api.fly.io/graphql
{
"query": "mutation($input: ValidateWireGuardPeersInput!) { validateWireGuardPeers(input: $input) { invalidPeerIps } }",
"variables": {
"input": {
"peerIps": [
"fdaa:0:35f3:a7b:1bfe:0:a:702"
]
}
}
}
DEBUG {}
DEBUG <-- 200 https://api.fly.io/graphql (270.6ms)
{
"data": {
"validateWireGuardPeers": {
"invalidPeerIps": []
}
}
}
DEBUG --> GET http://[fdaa:0:35f3::3]:4280/v1/apps/udns/machines
DEBUG <-- 200 http://[fdaa:0:35f3::3]:4280/v1/apps/udns/machines (1m0.03s)
[
{
"id": "23908066ec6875",
"name": "udns-vin",
"state": "started",
"region": "",
"instance_id": "",
"private_ip": "",
"config": null,
"created_at": "1970-01-01T00:00:00Z",
"updated_at": "0001-01-01T00:00:00Z"
},
{
"id": "73287340a36285",
"name": "udns-hkg",
"state": "started",
"region": "hkg",
"instance_id": "01GDBTRJ2T2NEMBTRZNVPCH5S4",
"private_ip": "fdaa:0:35f3:a7b:7f07:f613:60ba:2",
"config": {
"env": {
"CLOUD_PLATFORM": "fly",
"LOG_LEVEL": "info",
"MACHINES_TIMEOUT_SEC": "180",
"NODE_ENV": "production"
},
"init": {
"exec": null,
"entrypoint": null,
"cmd": null,
"tty": false
},
"image": "registry.fly.io/udns:deployment-01GDBJYPWZQT6N34AZ437M6C26",
"metadata": {
"process_group": "app"
},
"restart": {
"policy": ""
},
"services": [
{
"protocol": "tcp",
"internal_port": 8080,
"ports": [
{
"port": 443
},
{
"port": 8080
}
],
"concurrency": {
"type": "connections",
"hard_limit": 75,
"soft_limit": 60
}
},
{
"protocol": "tcp",
"internal_port": 10000,
"ports": [
{
"port": 853
},
{
"port": 10000
}
],
"concurrency": {
"type": "connections",
"hard_limit": 75,
"soft_limit": 60
}
}
],
"guest": {
"cpu_kind": "shared",
"cpus": 1,
"memory_mb": 256
},
"metrics": null
},
"image_ref": {
"registry": "registry.fly.io",
"repository": "udns",
"tag": "deployment-01GDBJYPWZQT6N34AZ437M6C26",
"digest": "sha256:baba8a10af4d87038621bee9cd5eec98e0884c23c69fdeb57c4a302d0d4e52ac",
"labels": {}
},
"created_at": "2022-09-18T00:43:11Z",
"updated_at": "2022-09-20T03:06:29Z",
"events": [
{
"type": "start",
"status": "started",
"source": "flyd",
"timestamp": 1663643189335
},
{
"type": "start",
"status": "starting",
"source": "user",
"timestamp": 1663643188841
},
{
"type": "exit",
"status": "stopped",
"request": {
"exit_event": {
"exit_code": 0,
"exited_at": 1663643169004,
"guest_exit_code": 0,
"guest_signal": -1,
"oom_killed": false,
"requested_stop": false,
"restarting": false,
"signal": -1
},
"restart_count": 0
},
"source": "flyd",
"timestamp": 1663643169328
},
{
"type": "start",
"status": "started",
"source": "flyd",
"timestamp": 1663642036547
},
{
"type": "start",
"status": "starting",
"source": "user",
"timestamp": 1663642036183
},
{
"type": "exit",
"status": "stopped",
"request": {
"exit_event": {
"exit_code": 0,
"exited_at": 1663642031471,
"guest_exit_code": 0,
"guest_signal": -1,
"oom_killed": false,
"requested_stop": false,
"restarting": false,
"signal": -1
},
"restart_count": 0
},
"source": "flyd",
"timestamp": 1663642032047
}
]
},
....
]
}
]
Oops, something went wrong! Could you try that again?
I am not sure what caused this behaviour as the Machines were working for 10 hours before the event, just fine. The only odd thing I notice is tcp_checks
are at their default (but apparently that’s an expected limitation of the current flyctl deploy -c ...
for Machines apps), and seem very aggressive (for udns
):
➜ fly config display -a udns
"tcp_checks": [
{
"grace_period": "1s",
"interval": "15s",
"restart_limit": 0,
"timeout": "2s"
}
]
fly m list -a udns
machines:23 machines have been retrieved.
udns
ID NAME STATE REGION IMAGE IP ADDRESS VOLUME CREATED LAST UPDATED
23908066ec6875 udns-vin started : 1970-01-01T00:00:00Z 0001-01-01T00:00:00Z
73287340a36285 udns-hkg started hkg udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:7f07:f613:60ba:2 2022-09-18T00:43:11Z 2022-09-20T03:06:29Z
3d8dd1efe40989 udns-ams started ams udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:23c5:eb5b:a06b:2 2022-09-19T19:25:21Z 2022-09-20T08:43:44Z
9e78424ebd5783 udns-nrt started nrt udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:aaa:206:68e0:2 2022-09-18T00:39:18Z 2022-09-20T05:26:40Z
e784375b066683 udns-mad started mad udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:2656:8569:420:2 2022-09-18T00:44:26Z 2022-09-20T03:51:24Z
1781994c9d9089 udns-ord started ord udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:b0bb:7e80:f1ad:2 2022-09-18T00:46:05Z 2022-09-20T06:20:00Z
06e82557bd2987 udns-dfw started dfw udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:2203:b916:9f80:2 2022-09-18T00:44:50Z 2022-09-20T03:00:31Z
6e82975b296287 udns-fra started fra udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:b9b8:a0ab:ad59:2 2022-09-18T00:43:51Z 2022-09-20T02:32:01Z
5683991c16e28e udns-cdg started cdg udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:aebf:737e:d205:2 2022-09-18T00:38:00Z 2022-09-20T02:49:07Z
3d8d991fe14289 udns-lax started lax udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:85:148c:bf5e:2 2022-09-18T00:42:46Z 2022-09-20T04:30:36Z
6e82559a296187 udns-iad started iad udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:21e0:9463:9892:2 2022-09-18T01:20:05Z 2022-09-20T03:33:17Z
73287119f67285 udns-sin started sin udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:232f:561:a9c6:2 2022-09-18T00:41:28Z 2022-09-20T02:03:24Z
39080554b5ed87 udns-sjc started sjc udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:b2e2:1878:ee0b:2 2022-09-18T00:40:34Z 2022-09-20T04:34:37Z
7328726b001085 udns-ewr started ewr udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:ab3:6adc:89ee:2 2022-09-17T23:30:21Z 2022-09-20T04:23:02Z
73287171b5d385 udns-maa started maa udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:14bf:c355:dc01:2 2022-09-18T01:20:24Z 2022-09-19T21:58:57Z
d5683061b4158e udns-mia started mia udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:2cc3:a2b6:8dce:2 2022-09-18T00:49:49Z 2022-09-20T02:42:59Z
3d8d340be44089 udns-lhr started lhr udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:aa9c:bc2a:6b28:2 2022-09-18T00:42:09Z 2022-09-20T01:16:19Z
73287114c44085 udns-gru started gru udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:1f60:f6e5:9160:2 2022-09-18T00:40:59Z 2022-09-20T08:40:37Z
0e286074a97686 udns-syd started syd udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:9e6b:360b:afc8:2 2022-09-18T00:40:00Z 2022-09-20T01:35:26Z
3d8dd59fe15689 udns-sea started sea udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:2dbb:d65b:d958:2 2022-09-17T23:29:33Z 2022-09-20T01:09:30Z
d568373eb628e9 udns-scl started scl udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:d33b:925:71f8:2 2022-09-18T00:52:55Z 2022-09-20T01:50:11Z
0e286035f43867 udns-yul started yul udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:b729:2dea:5e07:2 2022-09-18T00:56:11Z 2022-09-19T23:16:26Z
24d891dec4d687 udns-yyz started yyz udns:deployment-01GDBJYPWZQT6N34AZ437M6C26 fdaa:0:35f3:a7b:aa2:e626:2efe:2 2022-09-18T00:47:38Z 2022-09-20T02:21:56Z
Also notice (from the logs) that a Machine deployed to vin
has gone.