Machine stuck in lease loop: is this a known bug?

Hi everyone,

We’ve been running a Phoenix/Elixir app (`rivena`) on Fly with 2 shared-cpu-1x machines in CDG. Last Friday, one machine got permanently stuck and never recovered.

The error, repeated by the proxy every 1-2 seconds for hours:

machine ID 48e3e4ef57d358 lease currently held by 93c3d86b-a3ac-5d3d-9a47-0725b7c98f14@tokens.fly.io, expires at 2026-06-13T14:06:26Z

Some retries also got `rate limit exceeded`.

This happened 3 times on the same machine between 13:36 and 14:06 UTC on June 13. The second machine (`d89492eb145718`) was completely unaffected and handled all traffic fine — so no downtime for users.

We ended up destroying the stuck machine and creating a fresh one, which started immediately without issues.

A few questions:

- Is this a known issue with Fly’s proxy / machine leasing?

- What triggers a lease to become permanently held like this?

- Is there a way to recover without destroying the machine?

- Should we expect it to happen again?

Thanks!

Hi there!

A lease is a temporary lock on a machine that prevents other processes from updating it. While a machine has an active lease, some actions on it will be blocked. Leases expire, and the error you ran into should show the expiration time.

You can manually clear leases with `fly machine leases clear <machine_id>`. The documentation for that is here: https://fly.io/docs/flyctl/machine-leases-clear/
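If it helps to decide between waiting for the lease to expire and clearing it, here is a minimal Python sketch that pulls the machine ID, lease holder, and expiry out of the error text. The regex is based only on the single error line quoted in this thread, so the exact message format is an assumption, and `parse_lease_error` and `seconds_until_expiry` are hypothetical helper names, not part of flyctl or any Fly API.

```python
import re
from datetime import datetime, timezone

# Pattern derived from the one error line quoted in this thread.
# The real message format may vary; treat this as an assumption.
LEASE_ERROR = re.compile(
    r"machine ID (?P<machine_id>\w+) lease currently held by (?P<holder>\S+), "
    r"expires at (?P<expires_at>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)"
)


def parse_lease_error(message):
    """Return the lease details found in an error message, or None if absent."""
    match = LEASE_ERROR.search(message)
    if match is None:
        return None
    # The trailing "Z" means UTC, so attach the UTC timezone explicitly.
    expires_at = datetime.strptime(
        match["expires_at"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return {
        "machine_id": match["machine_id"],
        "holder": match["holder"],
        "expires_at": expires_at,
    }


def seconds_until_expiry(lease, now):
    """Seconds left before the lease expires, never negative."""
    return max(0.0, (lease["expires_at"] - now).total_seconds())


if __name__ == "__main__":
    line = (
        "machine ID 48e3e4ef57d358 lease currently held by "
        "93c3d86b-a3ac-5d3d-9a47-0725b7c98f14@tokens.fly.io, "
        "expires at 2026-06-13T14:06:26Z"
    )
    lease = parse_lease_error(line)
    remaining = seconds_until_expiry(lease, datetime.now(timezone.utc))
    print(lease["machine_id"], lease["holder"], remaining)
```

Running this over successive error lines shows whether the expiry stays fixed or keeps moving forward. An expiry that keeps moving would suggest the lease is being re-acquired, but that is an interpretation and not something confirmed in this thread.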

Thanks! Good to know `fly machine leases clear` exists. We ended up destroying the machine, but we’ll use `leases clear` next time. Closing this.