422 Unprocessable Entity, {"error":"failed to get org"} - Creating A New Machine API

For the past few days I have occasionally been getting the following error when I try to create a machine using the API:

422 Unprocessable Entity, {"error":"failed to get org"}

Usually the error disappears after a few seconds and a few retries. I originally thought it might be a temporary networking issue, but it has persisted since last week. I have yet to figure out the root cause. Nothing has changed in my code.

Has anyone else experienced a similar issue?
Have there been any changes to the Fly Machines API?

I did some troubleshooting, and so far my verdict is some sort of networking/routing issue on the Fly API side :thinking: It is hard to replicate because it doesn’t happen every time, but it has been quite noticeable in the last few days, and I need to retry many times to get a machine created.

A machine create request to the Machines API often results in HTTP 422 {"error":"failed to get org"}. I also tried changing the region.

Request to Hong Kong

POST /v1/apps/coder-default-default/machines HTTP/1.1
Host: api.machines.dev
User-Agent: Go-http-client/1.1
Content-Length: 471
Authorization: Bearer MASKED
Content-Type: application/json
Accept-Encoding: gzip

{
  "config": {
    "auto_destroy": true,
    "env": {
      "FOO": "BAR"
    },
    "guest": {
      "cpu_kind": "shared",
      "cpus": 2,
      "memory_mb": 2048
    },
    "image": "busybox:latest",
    "init": {
      "exec": [
        "/bin/sleep",
        "inf"
      ]
    },
    "services": [
      {
        "internal_port": 80,
        "ports": [
          {
            "handlers": [
              "tls",
              "http"
            ],
            "port": 443
          },
          {
            "handlers": [
              "http"
            ],
            "port": 80
          }
        ],
        "protocol": "tcp"
      },
      {
        "internal_port": 8080,
        "ports": [
          {
            "handlers": [
              "tls",
              "http"
            ],
            "port": 8080
          }
        ],
        "protocol": "tcp"
      }
    ]
  },
  "name": "default",
  "region": "hkg"
}

HTTP 422 Response

HTTP/1.1 422 Unprocessable Entity
content-type: application/json; charset=utf-8
fly-span-id: c437e163c15d5945
fly-trace-id: d1c35bf6909052ba84267139b87a2b8f
date: Tue, 13 Feb 2024 10:39:23 GMT
x-envoy-upstream-service-time: 508
server: Fly/ba9e227a (2024-01-26)
transfer-encoding: chunked
via: 1.1 fly.io
fly-request-id: 01HPH0SHPXGVMW9FFJ7Q8RYN3H-sin

1e
{"error":"failed to get org"}

0

I repeated the same request a few minutes later and got HTTP 200.

POST /v1/apps/coder-default-default/machines HTTP/1.1
Host: api.machines.dev
User-Agent: Go-http-client/1.1
Content-Length: 471
Authorization: Bearer MASKED
Content-Type: application/json
Accept-Encoding: gzip

{
  "config": {
    "auto_destroy": true,
    "env": {
      "FOO": "BAR"
    },
    "guest": {
      "cpu_kind": "shared",
      "cpus": 2,
      "memory_mb": 2048
    },
    "image": "busybox:latest",
    "init": {
      "exec": [
        "/bin/sleep",
        "inf"
      ]
    },
    "services": [
      {
        "internal_port": 80,
        "ports": [
          {
            "handlers": [
              "tls",
              "http"
            ],
            "port": 443
          },
          {
            "handlers": [
              "http"
            ],
            "port": 80
          }
        ],
        "protocol": "tcp"
      },
      {
        "internal_port": 8080,
        "ports": [
          {
            "handlers": [
              "tls",
              "http"
            ],
            "port": 8080
          }
        ],
        "protocol": "tcp"
      }
    ]
  },
  "name": "default",
  "region": "hkg"
}

HTTP 200 Response

HTTP/1.1 200 OK
content-type: application/json; charset=utf-8
fly-span-id: f0ae086a123e7a9d
fly-trace-id: 854ebb9dfd159033056b1318abe5dd6a
date: Tue, 13 Feb 2024 10:41:01 GMT
x-envoy-upstream-service-time: 2409
server: Fly/ba9e227a (2024-01-26)
transfer-encoding: chunked
via: 1.1 fly.io
fly-request-id: 01HPH0WFVQB9NXTV0KHMVTES8F-sin

{
  "id": "17811ee6a02608",
  "name": "default",
  "state": "created",
  "region": "hkg",
  "instance_id": "...",
  "private_ip": "...",
  "config": {
    "env": {
      "FOO": "BAR"
    },
    "init": {
      "exec": [
        "/bin/sleep",
        "inf"
      ]
    },
    "guest": {
      "cpu_kind": "shared",
      "cpus": 2,
      "memory_mb": 2048
    },
    "services": [
      {
        "protocol": "tcp",
        "internal_port": 80,
        "ports": [
          {
            "port": 443,
            "handlers": [
              "tls",
              "http"
            ]
          },
          {
            "port": 80,
            "handlers": [
              "http"
            ]
          }
        ],
        "force_instance_key": null
      },
      {
        "protocol": "tcp",
        "internal_port": 8080,
        "ports": [
          {
            "port": 8080,
            "handlers": [
              "tls",
              "http"
            ]
          }
        ],
        "force_instance_key": null
      }
    ],
    "image": "busybox:latest",
    "auto_destroy": true,
    "restart": {}
  },
  "image_ref": {
    "registry": "registry-1.docker.io",
    "repository": "library/busybox",
    "tag": "latest",
    "digest": "sha256:538721340ded10875f4710cad688c70e5d0ecb4dcd5e7d0c161f301f36f79414",
    "labels": null
  },
  "created_at": "2024-02-13T10:41:01Z",
  "updated_at": "2024-02-13T10:41:01Z",
  "events": [
    {
      "id": "01HPH0WHSE8E6F3V1PTKJMSV3C",
      "type": "launch",
      "status": "created",
      "source": "user",
      "timestamp": 1707820861230
    }
  ]
}

I changed the region to Singapore, same behavior :person_shrugging:

Hi @pi3ch! I’m starting to investigate your issue. Is it possible that you are creating the app and then immediately trying to create the Machine via the API? If so, could you try creating the app, pausing for a few seconds to let the state propagate, and then creating the Machine?


Hi @aschiavo, interesting, that’s pretty much what I tried yesterday after my post, and so far I haven’t seen the issue again :tada:. I use my own self-maintained Terraform setup to provision machines, and I have tried different providers (e.g. rest_api). Adding a time delay between the fly_app and fly_machine resources seems to do the job.

# SNIP
resource "fly_app" "workspace_fly_app" {
  name = "my-app-name"
  org  = var.fly_org
}

resource "time_sleep" "wait_2_seconds" {
  depends_on = [fly_app.workspace_fly_app]

  create_duration = "2s"
}

resource "fly_machine" "workspace" {
  depends_on   = [time_sleep.wait_2_seconds]
  # ...

}

It looks like a new race condition :thinking: I will be doing more testing over the coming days.
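
For anyone calling the Machines API directly rather than through Terraform, the same workaround can be sketched as a retry with backoff around the create request. This is only a minimal sketch under assumptions: the helper names (`is_org_not_ready`, `retry_create`, `create_machine`), the attempt count, and the delays are made up for illustration, and it assumes the 422 "failed to get org" response is transient right after app creation, as observed above. The endpoint and headers are taken from the request shown earlier in the thread.

```python
import json
import time
import urllib.error
import urllib.request

API = "https://api.machines.dev/v1"  # host and path prefix from the request above


def is_org_not_ready(status, body):
    """True for the 422 'failed to get org' response seen in this thread."""
    return status == 422 and "failed to get org" in body


def retry_create(send, attempts=5, delay=1.0, sleep=time.sleep):
    """Call send() until it stops returning the transient 422 or attempts run out.

    send is any callable returning (status, body). The delay doubles after
    each failed attempt; attempts and delay are arbitrary illustrative values.
    """
    for attempt in range(attempts):
        status, body = send()
        if not is_org_not_ready(status, body):
            return status, body
        if attempt < attempts - 1:
            sleep(delay)
            delay *= 2
    return status, body  # still failing: hand the last response to the caller


def create_machine(app, token, payload):
    """POST the machine config once; return (status, body) without raising on 4xx/5xx."""
    req = urllib.request.Request(
        f"{API}/apps/{app}/machines",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode()
```

A call would then look like `retry_create(lambda: create_machine("my-app-name", token, payload))`, where `token` and `payload` are your own API token and the machine config JSON. Only the specific "failed to get org" 422 is retried, so genuine validation errors still surface immediately.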
