I’m running a FastAPI project on Fly.io, deployed as a microservice using Docker. I’m taking advantage of Fly.io’s serverless-like behavior: when there’s no incoming traffic for a while, the machine shuts down, and it spins up again when a request comes in. This is great for reducing costs — but I’m facing an issue with cold starts.
I currently have 2 machines in my Fly.io app. When there’s no traffic for some time, both go to sleep. The issue occurs when a request comes in while both machines are cold.
Here’s what happens:
A request hits the app → machines start spinning up.
For about 2–3 seconds, I consistently get 502 Bad Gateway (nginx) errors.
After that, FastAPI starts responding correctly.
So the backend eventually comes online, but nginx seems to start before FastAPI is ready. During those first few seconds, it’s probably trying to proxy requests to port 8000, but uvicorn isn’t listening yet.
Dockerfile:
# Stage 1: install the Python dependencies
FROM python:3.11-slim AS backend
WORKDIR /app
COPY ./requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir --upgrade -r /app/requirements.txt
COPY . /app

# Stage 2: nginx image that also carries the Python runtime
FROM nginx:latest AS frontend
COPY ./nginx/nginx.conf /etc/nginx/nginx.conf
RUN ln -sf /usr/share/zoneinfo/Europe/Istanbul /etc/localtime
# Copy the Python interpreter and installed packages from stage 1
COPY --from=backend /usr/local /usr/local
COPY --from=backend /app /app
COPY start.sh /start.sh
RUN chmod +x /start.sh
EXPOSE 8080
CMD ["/start.sh"]
proxy_next_upstream and related directives may be helpful. Per the documentation, passing a request to the next server can be limited by the number of tries and by time.
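As a sketch of what that could look like in your nginx.conf (the 127.0.0.1:8000 backend address is taken from your setup; note that nginx only retries against a *next* server, so a common workaround for a single backend is to list the same server twice in the upstream block):

```nginx
upstream fastapi {
    # Listing the same backend twice gives nginx a "next" server to
    # retry against; with a single entry there is nothing to fail over to.
    server 127.0.0.1:8000 max_fails=0;
    server 127.0.0.1:8000 max_fails=0;
}

server {
    listen 8080;

    location / {
        proxy_pass http://fastapi;
        # Retry on connection errors, timeouts, and 502s while uvicorn
        # is still coming up, bounded by tries and by total time.
        proxy_next_upstream error timeout http_502;
        proxy_next_upstream_tries 5;
        proxy_next_upstream_timeout 10s;
    }
}
```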
You can reduce (and possibly all but eliminate) this window by changing auto-stop in your fly.toml from stop to suspend. At the moment this is limited to machines with at most 2GB of RAM, and some applications don't handle the clock skew that can occur on wake. But should it work for you, your application will be back up and running almost instantly.
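For reference, the change would look roughly like this in fly.toml (section and option names per Fly.io's http_service configuration; the internal_port value is an assumption based on the EXPOSE 8080 in your Dockerfile):

```toml
# fly.toml (sketch) — suspend instead of a full stop on idle
[http_service]
  internal_port = 8080              # assumed from EXPOSE 8080
  auto_stop_machines = "suspend"    # snapshot RAM; near-instant wake
  auto_start_machines = true
  min_machines_to_run = 0
```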
Thank you very much for your quick response. Solving this on the nginx side using proxy_next_upstream makes a lot of sense.
Wouldn’t changing the auto-stop setting from stop to suspend affect the cost? I assumed it would likely incur higher charges.
As an alternative, I solved the issue by updating my start.sh script as follows:
#!/bin/bash
cd /app

# Start uvicorn in the background so we can health-check it
uvicorn main:app --host 0.0.0.0 --port 8000 &
FASTAPI_PID=$!

echo "Waiting for FastAPI to start..."
MAX_RETRIES=15
RETRY_COUNT=0

# Poll the health endpoint once a second until it answers
until curl -s http://127.0.0.1:8000/health > /dev/null 2>&1; do
    RETRY_COUNT=$((RETRY_COUNT+1))
    if [ "$RETRY_COUNT" -ge "$MAX_RETRIES" ]; then
        echo "Timed out waiting for FastAPI after ${MAX_RETRIES} seconds. Starting nginx anyway."
        break
    fi
    echo "Waiting for FastAPI... ($RETRY_COUNT/$MAX_RETRIES)"
    sleep 1
done

if [ "$RETRY_COUNT" -lt "$MAX_RETRIES" ]; then
    echo "FastAPI is up and running"
fi

# Start nginx in the foreground so it becomes the container's main process
echo "Starting nginx..."
exec nginx -g 'daemon off;'
At the moment, there is no cost for using suspend. And while I obviously can’t say that will never change, I can say that I’m unaware of any plans to change it. It’s something we encourage people to use; the primary reason it isn’t the default is that it can cause problems for applications that are sensitive to clock issues.
I very much like your start script; it solves the problem nicely. That hadn’t occurred to me. I’ll try to remember to point others with similar issues at your solution.