Is Fly.io a Good Choice for Edge AI Computing?

Hey everyone,

I’m exploring options for deploying AI models at the edge and was wondering if Fly.io is a good choice for this use case. Given its focus on global app deployment and low-latency performance, it seems like a potential fit for AI inference workloads.

A few questions I have:

  • Does Fly.io provide GPU support for AI inference?
  • How well does it handle scaling AI workloads across different regions?
  • Are there any performance limitations when running AI models on Fly.io?

I would love to hear from anyone who has tried deploying AI workloads on Fly.io! Any insights or alternatives are also welcome.

Thanks!

I wonder if the question is too general to answer. Have a look at the GPU offerings to see if they might work for you. You’ll be setting up all your own software, so if you are after AI-as-a-service, Fly is probably not a good fit.

Do you have a system that runs in Docker already? If so, you can try Fly in a couple of hours, which makes it easy to experiment.
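
If it helps to picture it, here’s a minimal sketch of the kind of containerized inference endpoint you could try this way. It’s stdlib-only Python; the `predict` function, the request shape, and port 8080 are illustrative assumptions standing in for your real model and config, not anything Fly prescribes.

```python
# Minimal inference HTTP server sketch (Python stdlib only).
# predict() is a placeholder -- swap in your actual model call.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(text: str) -> dict:
    # Placeholder "model": returns a trivial score for the input.
    # In a real deployment this would invoke your loaded model.
    return {"input": text, "score": float(len(text) % 10) / 10}

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        result = predict(payload.get("text", ""))
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Fly routes external traffic to the port your app listens on
    # (configured in fly.toml); 8080 here is just an example.
    HTTPServer(("0.0.0.0", 8080), InferenceHandler).serve_forever()
```

Wrap something like this in a standard Dockerfile and `fly launch` can pick it up, which is what makes the couple-of-hours experiment realistic.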

  1. Yes
  2. Keep in mind Fly GPUs are limited in number and in the regions where they’re offered.
  3. I’m not sure AI compute at the edge has any significant benefit. LLM inference already carries a large latency cost, and users expect it: when a second or so is the norm, you’re not gaining much by shaving 100ms off the network hop (rough numbers below).
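
To put rough numbers on point 3 (these figures are assumptions for illustration, not measurements): if inference time dominates, edge placement only trims the network round trip, which is a small slice of the total.

```python
# Back-of-envelope latency math (illustrative numbers, not benchmarks).
inference_ms = 1000   # assumed model inference time
central_rtt_ms = 120  # assumed round trip to a distant central region
edge_rtt_ms = 20      # assumed round trip to a nearby edge region

central_total = inference_ms + central_rtt_ms   # 1120 ms
edge_total = inference_ms + edge_rtt_ms         # 1020 ms
saved = central_total - edge_total              # 100 ms
print(f"saved {saved} ms, i.e. {saved / central_total:.0%} of total latency")
# -> saved 100 ms, i.e. 9% of total latency
```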

My 2c

We came to similar conclusions: We Were Wrong About GPUs · The Fly Blog

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.