Fly machines list API acting odd: returning started instances with no image or region information

I have been seeing this sporadically. Roughly every 30-60 minutes, when I query the Fly Machines list API, one of the returned machines looks like this:

...
    {
        "id": "e7843d0ae34183",
        "name": "<name>",
        "state": "started",
        "region": "",
        "instance_id": "01H21FQ61ZBXSG4Q24735DTH8S",
        "private_ip": "",
        "config": {
            "init": {},
            "restart": {}
        },
        "image_ref": {
            "registry": "",
            "repository": "",
            "tag": "",
            "digest": "",
            "labels": null
        },
        "created_at": "1970-01-01T00:00:00Z",
        "updated_at": "0001-01-01T00:00:00Z"
    }
...
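For anyone who wants to spot these records client-side, here is a minimal sketch in Python. It assumes the list response has already been parsed into a list of dicts. The name `is_incomplete` is hypothetical, and the checks are a heuristic based only on the fields in the object above (empty `region`, empty image repository, placeholder `created_at`), not on any documented behavior.

```python
from typing import Any, Dict

# Placeholder timestamps observed in the incomplete record above.
PLACEHOLDER_TIMESTAMPS = {"1970-01-01T00:00:00Z", "0001-01-01T00:00:00Z"}


def is_incomplete(machine: Dict[str, Any]) -> bool:
    """Heuristic: True if a machine record looks like the empty one above."""
    image_ref = machine.get("image_ref") or {}
    return (
        not machine.get("region")              # region came back as ""
        or not image_ref.get("repository")     # image information is missing
        or machine.get("created_at") in PLACEHOLDER_TIMESTAMPS
    )
```

A caller could log the `id` of every record for which this returns `True`, which makes it easier to see how often the problem occurs.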

When this happens, the list API also takes much longer to respond. I am also unable to fetch the machine's details through the machine details API; the request just does not seem to return a response.

After I stop and delete this instance, the list API responds quickly again.

Please help. This has been happening a lot over the past 48-72 hours. My app creates ephemeral game servers and destroys them when they are no longer needed, so many machines get created and destroyed. The machine ID shown above is real, and I have since destroyed it.

After some additional debugging, I can see that when the same API call is made with different API tokens, one returns correctly while the other returns the incomplete machine details shown above. I am not sure whether this is reproducible, but it is happening in one case for me right now.

I have been observing this since yesterday. It does not seem to be related to the API token; it just happens sporadically. I just saw it happen again on one request, and then a minute later the same request returned complete data.
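Since the same request returned complete data a minute later, one possible client-side workaround is to retry the list call when a record comes back with an empty `region`. This is only a sketch under assumptions: `list_with_retry` is a hypothetical helper, and `fetch` stands for whatever function the caller already uses to call the list API and parse the JSON. It does nothing about the slow response itself.

```python
import time
from typing import Any, Callable, Dict, List

Machine = Dict[str, Any]


def list_with_retry(
    fetch: Callable[[], List[Machine]],
    attempts: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Machine]:
    """Call fetch() until every record has a region, or attempts run out."""
    machines: List[Machine] = []
    for attempt in range(attempts):
        machines = fetch()
        if all(m.get("region") for m in machines):
            return machines
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before asking again
    # Still incomplete: return the last response and let the caller decide.
    return machines
```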

In case it helps with debugging internally, here are the response headers:

fly-trace-id: 3266eead1beaa70982ac83ca3e515c36
date: Mon, 05 Jun 2023 08:27:50 GMT
x-envoy-upstream-service-time: 5006
server: Fly/bba2dac0 (2023-06-02)
transfer-encoding: chunked
content-encoding: gzip
via: 1.1 fly.io
fly-request-id: 01H25APPA849TTJZ122DP65YTX-maa

In the response, I got an incomplete object as follows:

    {
        "id": "4d890e9be7ee87",
        "name": "<name>",
        "state": "started",
        "region": "",
        "instance_id": "01H24XPAPWF4R4N9DK51DG65VB",
        "private_ip": "",
        "config": {
            "init": {},
            "restart": {}
        },
        "image_ref": {
            "registry": "",
            "repository": "",
            "tag": "",
            "digest": "",
            "labels": null
        },
        "created_at": "1970-01-01T00:00:00Z",
        "updated_at": "0001-01-01T00:00:00Z"
    }

I use the list call to figure out which servers are currently stopped or running, and route users to an appropriate one to host their game.
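Roughly, that routing step can be sketched as below, with the incomplete records set aside so that users are not sent to them. `split_by_state` is a hypothetical name, the input is assumed to be the parsed list response, and treating an empty `region` as the sign of a bad record is a heuristic taken from the objects above.

```python
from typing import Any, Dict, List, Tuple

Machine = Dict[str, Any]


def split_by_state(
    machines: List[Machine],
) -> Tuple[List[Machine], List[Machine], List[Machine]]:
    """Split a list response into (started, stopped, suspect) machines."""
    started: List[Machine] = []
    stopped: List[Machine] = []
    suspect: List[Machine] = []
    for m in machines:
        if not m.get("region"):
            # Looks like the incomplete record above; do not route to it.
            suspect.append(m)
        elif m.get("state") == "started":
            started.append(m)
        elif m.get("state") == "stopped":
            stopped.append(m)
        # Machines in any other state are ignored by this sketch.
    return started, stopped, suspect
```

The `suspect` list can then be logged or rechecked separately instead of being treated as running servers.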
