Hitting rate limits? (502 Bad Gateway)

I am using GraphQL to deploy many apps in parallel and I am hitting what appear to be rate limits.

Things start breaking when I attempt to deploy around 20 applications simultaneously.

The server just starts responding with “502 Bad Gateway”.

At first I thought it was limited to how fast apps can be created, but I am hitting the limits even when deleting apps:

mutation($appId: ID!) {
  deleteApp(appId: $appId) {
    organization { id }
  }
}
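For context, this is how I'm sending it: a minimal Python sketch posting the mutation to the GraphQL endpoint (the app ID and token here are placeholders):

```python
import json
import urllib.request

FLY_GRAPHQL_URL = "https://api.fly.io/graphql"

DELETE_APP_MUTATION = """
mutation($appId: ID!) {
  deleteApp(appId: $appId) {
    organization { id }
  }
}
"""

def build_delete_request(app_id: str, token: str) -> urllib.request.Request:
    """Build the HTTP POST carrying the deleteApp mutation."""
    payload = {"query": DELETE_APP_MUTATION, "variables": {"appId": app_id}}
    return urllib.request.Request(
        FLY_GRAPHQL_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
    )

req = build_delete_request("my-app-id", "FAKE_TOKEN")
# urllib.request.urlopen(req)  # would actually fire the mutation
```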

https://status.flyio.net/ indicates no incidents.

No additional details in the response.

If the GraphQL interface is rate-limited, how do I work around these limits?

Yeah, the GraphQL API isn’t designed to programmatically launch apps/VMs. We have a Machines API specifically for this.

You don’t need to create and remove apps, necessarily. You can create/start/stop/destroy individual Machines with the Machines API.
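To make the create/start/stop/destroy lifecycle concrete, here is a sketch of the corresponding Machines API calls. This only builds the HTTP requests; the app name, image, machine ID, and token are placeholders, and the endpoint/body shapes are my reading of the Machines docs rather than a definitive client:

```python
import json
import urllib.request

API_BASE = "https://api.machines.dev/v1"

def machines_request(method, path, token, body=None):
    """Build an authenticated Machines API request (does not send it)."""
    data = json.dumps(body).encode() if body is not None else None
    return urllib.request.Request(
        API_BASE + path,
        data=data,
        method=method,
        headers={"Authorization": "Bearer " + token,
                 "Content-Type": "application/json"},
    )

# Create, stop, and destroy a Machine inside one long-lived app:
create = machines_request("POST", "/apps/worker-pool/machines", "FAKE_TOKEN",
                          {"config": {"image": "registry.fly.io/worker:latest"}})
stop = machines_request("POST", "/apps/worker-pool/machines/MACHINE_ID/stop",
                        "FAKE_TOKEN")
destroy = machines_request("DELETE", "/apps/worker-pool/machines/MACHINE_ID",
                           "FAKE_TOKEN")
# urllib.request.urlopen(create)  # would actually create the machine
```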

What are you using the VMs for? We can give you some ideas about how best to use the API. GraphQL creates a lot of moving infrastructure, Machines are much simpler and more reliable for this kind of work.

Yeah, happy to refactor to whatever makes sense.

In short, I have a container that hosts a websocket server.

What I am doing at the moment is: I have a websocket server hosted outside of Fly (which I need anyway) that accepts connections, and for every accepted connection it spins up a new Fly app. This is because I need to reserve an exclusive container instance for the duration of the websocket connection.

If there was a lower level way to achieve the same, I could do that.

Using machines, is there a way to get a unique IP address (path, port, or another method) for that machine that I could use for the duration of the machine?

Ah! We have good support for that kind of workload with Machines, but it’s a big change from what you’re doing now.

The best way to do this on Fly.io is to create a router app that accepts all requests, then create a bunch of machines in a separate app. These machines should have a service with hard_limit=1 set for concurrency, and they should exit when they’re done with a request.
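The worker-app service config for that setup might look roughly like this fly.toml fragment (app name and port are placeholders; the key part is the concurrency block):

```toml
app = "worker-pool"

[[services]]
  internal_port = 8080
  protocol = "tcp"

  # One websocket per machine: refuse a second connection entirely.
  [services.concurrency]
    type = "connections"
    hard_limit = 1
    soft_limit = 1
```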

When you get a request to the router app, you can do whatever logic you need to authenticate it, then use the fly-replay header to reroute the request/websocket to one of your worker machines: The Fly-Replay Header · Fly Docs

Machines reset their state when they stop, so it’s safe to send different users to them between stop/start. Replay handles websockets just fine (we use this ourselves). The hard_limit ensures that only one websocket can be attached to a single machine at any given time. The important thing here is that the machine exits by itself when the websocket ends, otherwise it won’t clean up.
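The router side can be sketched with Python's stdlib HTTP server: after doing your auth logic, you answer with a fly-replay header and Fly's proxy re-routes the request (including websocket upgrades) to the worker app. The app name and the target format are illustrative, per my reading of the Fly-Replay docs:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def replay_target(path: str) -> str:
    # Real logic would authenticate and possibly pin a specific machine,
    # e.g. "app=worker-pool;instance=MACHINE_ID".
    return "app=worker-pool"

class Router(BaseHTTPRequestHandler):
    def do_GET(self):
        # ... authenticate the request here ...
        self.send_response(200)
        # Fly's proxy sees this header and replays the request elsewhere.
        self.send_header("fly-replay", replay_target(self.path))
        self.end_headers()

# HTTPServer(("", 8080), Router).serve_forever()  # run the router
```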

Part of the problem you’re running into now is that the load balanced IPs we give to apps are relatively complex. So you’re having to wait on coordination, the IP to propagate to our edges, etc. If you are up for making this work with Machines + Replay I think it’ll be a lot more reliable.

Our router actually starts a stopped machine if it needs to, so you can basically just replay requests and let us handle the rest (once a machine exists).

This is an interesting solution!

If I wanted to proxy the request through the router app, it sounds like I could just initiate a new websocket from the router app?

I think you mentioned fly-replay only because that would be more efficient.

Really cool design by the way!


Yeah you could handle the websocket from the router, then start a machine, then proxy to that machine.

Replay just saves all that effort. We handle the starting and proxying at that point, and your router can go back to doing other stuff.

Need to stress test both setups.

Being able to inspect traffic and inject messages makes my application design quite a bit simpler. However, if your design is noticeably more efficient, then (with quite a bit of work lol) I can move that logic to the individual nodes and have them talk to the router when needed.

Thank you!

@kurt I don’t understand how auto scaling is supposed to work.

How do I set the machine type in fly.toml? There is nothing about it in App Configuration (fly.toml) · Fly Docs

Do I need to start machines as described in this guide? Machines · Fly Docs

If yes, then what does auto scaling do?

If yes, then why does it ask for an image parameter as opposed to inheriting it from the app?

Putting the above aside, I would expect that if hard_limit=1, Fly creates a new machine for every request. Though, given that the app configuration does not mention anything about machine types, it is not clear how it would know what machine type to provision. Furthermore, even when I have a dozen open sockets, I don’t see anything under fly machines list --app ....

After a bit of searching, I discovered that there is fly autoscale. This appears to be disabled by default. It would probably be a good idea to mention this somewhere next to the hard_limit documentation.

Found Scaling and Autoscaling · Fly Docs which answered all my questions.

Autoscaling is deprecated and doesn’t work with machines.

Machines are very low level; they’re the plumbing we’re using to replace Nomad for all apps. Most of our flyctl features are higher level (including scaling).

The best way to autoscale machines is to create as many as you want to have on at any given time, then leave them stopped. We’ll boot one when a request comes in.

This is probably how autoscaling is going to work for all apps in the future. fly autoscale max=50 will just create 50 stopped machines.
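That pool-of-stopped-machines pattern can be sketched as one create request per slot; each machine would then be stopped (POST .../machines/ID/stop) and left for the proxy to wake on demand. This builds the requests only, with a placeholder app, image, and token:

```python
import json
import urllib.request

API_URL = "https://api.machines.dev/v1/apps/worker-pool/machines"

def pool_create_requests(n: int, token: str):
    """One Machines API create request per pool slot (not sent here)."""
    reqs = []
    for i in range(n):
        body = {"name": f"worker-{i}",
                "config": {"image": "registry.fly.io/worker:latest"}}
        reqs.append(urllib.request.Request(
            API_URL,
            data=json.dumps(body).encode(),
            method="POST",
            headers={"Authorization": "Bearer " + token,
                     "Content-Type": "application/json"}))
    return reqs

pool = pool_create_requests(50, "FAKE_TOKEN")
```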

Thanks Kurt!

autoscale was enough for me to prototype what I am building. Will refactor to manually managing machines. Very impressed with Fly.io!