I have a GPU-enabled Fly app that I use to run some Python-based inference. I’m running a custom Docker image with a FastAPI/uvicorn server. After 60 seconds of idle time, the server shuts down, and so does the Fly Machine. As soon as a request is received, the machine starts and processes it. This all works great, but anybody with the URL can make requests and wake it up (even though I have authorization logic in place).
Is there some kind of built-in authentication to avoid this, or do I have to put a cheap API gateway in front that forwards requests to the GPU-enabled Fly app only if the request is authorized?
I believe you have to have some kind of proxy since the GPU is tied to a machine.
I would make your GPU app private, then proxy each request to it via Flycast once it has been authenticated and authorized.
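To make that concrete, here is a rough sketch of what the cheap gateway could look like. It is not an official Fly pattern, just one way to do it with the Python standard library. It assumes the GPU app has no public IPs and is reachable at a Flycast address; the app name `my-gpu-app`, the `GPU_UPSTREAM` and `GATEWAY_API_TOKEN` variables, and port 8080 are all made up for illustration. The gateway rejects unauthorized requests itself, so only authorized ones ever reach (and wake) the GPU machine.

```python
import hmac
import os
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical values: swap in your own app name and secret.
UPSTREAM = os.environ.get("GPU_UPSTREAM", "http://my-gpu-app.flycast")
API_TOKEN = os.environ.get("GATEWAY_API_TOKEN", "")


def is_authorized(auth_header, expected_token):
    """Return True only for a matching 'Bearer <token>' header."""
    if not expected_token or not auth_header:
        return False
    scheme, _, token = auth_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison so timing does not leak the token.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())


def upstream_url(base, path):
    """Join the private upstream base URL with the incoming request path."""
    return base.rstrip("/") + "/" + path.lstrip("/")


class Gateway(BaseHTTPRequestHandler):
    def _reply(self, status, payload=b"", content_type="text/plain"):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_POST(self):
        # Reject here, before anything touches the GPU app.
        if not is_authorized(self.headers.get("Authorization"), API_TOKEN):
            self._reply(401, b"unauthorized")
            return
        length = int(self.headers.get("Content-Length", 0))
        req = urllib.request.Request(
            upstream_url(UPSTREAM, self.path),
            data=self.rfile.read(length),
            headers={"Content-Type": self.headers.get("Content-Type", "application/json")},
            method="POST",
        )
        try:
            # Generous timeout, since the GPU machine may need to cold start.
            with urllib.request.urlopen(req, timeout=120) as resp:
                ctype = resp.headers.get("Content-Type", "application/octet-stream")
                self._reply(resp.status, resp.read(), ctype)
        except urllib.error.HTTPError as err:
            # Relay upstream error responses unchanged.
            self._reply(err.code, err.read())
        except urllib.error.URLError:
            self._reply(502, b"upstream unreachable")


if __name__ == "__main__":
    ThreadingHTTPServer(("0.0.0.0", 8080), Gateway).serve_forever()
```

The gateway can run on the smallest CPU machine, since it only checks a header and forwards bytes. You would still want to keep the authorization logic in the GPU app as a second layer, and check the Fly docs for the exact steps to remove the public IPs and set up Flycast for your app.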