Deploying fasttrackr.ai-like AI Assistants on Fly.io - Any Gotchas?

I’m experimenting with deploying a lightweight AI assistant platform (similar to Fasttrackr.AI, an AI stack for RIAs and financial advisors) on Fly.io to reduce latency for a global user base.

The setup involves FastAPI + queue workers, and some endpoints call external LLM APIs.
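For context, here’s a stripped-down sketch of the kind of endpoint I mean. The route, env var names, and the LLM request shape are placeholders, not the real app:

```python
import os

import httpx
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Placeholder config: the real app reads these from Fly secrets.
LLM_API_URL = os.environ.get("LLM_API_URL", "https://example.com/v1/complete")
LLM_API_KEY = os.environ.get("LLM_API_KEY", "")


class AssistRequest(BaseModel):
    prompt: str


@app.post("/assist")
async def assist(req: AssistRequest):
    # Latency is dominated by the upstream LLM call rather than local
    # compute, which is why cold starts on rarely-hit routes hurt so much.
    async with httpx.AsyncClient(timeout=30.0) as client:
        resp = await client.post(
            LLM_API_URL,
            headers={"Authorization": f"Bearer {LLM_API_KEY}"},
            json={"prompt": req.prompt},
        )
    resp.raise_for_status()
    return resp.json()
```

A few questions: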

  1. Has anyone had success minimizing cold starts for sporadically-used AI endpoints?
  2. Are there patterns for region-specific routing when users log in from multiple countries? (I’ve sketched the fly-replay pattern I’m considering just below the list.)
  3. For usage that fluctuates during the workday (like a productivity tool), is there a cost-efficient autoscaling strategy?
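On question 2, what I’ve been eyeing is Fly’s fly-replay response header to pin a logged-in user to a "home" region. Rough sketch of what I mean below; the home-region lookup is a made-up placeholder, so treat this as a starting point rather than something I know works well:

```python
import os

from fastapi import FastAPI, Response

app = FastAPI()

# Fly sets FLY_REGION on every Machine; "local" is just a dev fallback.
CURRENT_REGION = os.environ.get("FLY_REGION", "local")


def home_region_for(user_id: str) -> str:
    # Placeholder: in the real app this would come from the user's profile,
    # e.g. a region stored at signup based on their country.
    return "fra"


@app.get("/api/session/{user_id}")
async def session(user_id: str):
    target = home_region_for(user_id)
    if target != CURRENT_REGION:
        # Ask Fly's proxy to replay the original request on a Machine
        # in the user's home region instead of serving it here.
        return Response(headers={"fly-replay": f"region={target}"})
    return {"served_from": CURRENT_REGION, "user": user_id}
```

No idea whether that holds up under real traffic, hence the question.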

Appreciate any real-world advice - especially if you’ve worked on a Fasttrackr.AI-style app or anything AI-heavy on Fly.io.