I have a simple application that runs Bumblebee, an Elixir library that lets me run machine learning models alongside Phoenix. The repo is at GitHub - dwyl/image-classifier: 🖼️ Classify images and attempt to extract data from or describe their contents.
I have this application deployed on fly.io, where the machine goes to sleep after a period of inactivity. I’ve followed the instructions in Speed up your boot times with this one Dockerfile trick · The Phoenix Files to speed up the boot time of the fly.io instance, and it works to a certain extent: the model is properly cached in a volume, so it is not re-downloaded every time the instance comes back up. It is only loaded from the volume when the app goes from sleeping to active.
I’ve also written a small guide to deploying this on fly.io, in case anyone is interested. This was also discussed in Build failed with elixir bumblebee - #4 by matthewford.
The app works normally, but even with the models cached it takes roughly 25 seconds to go from sleeping to active (I’m using a fairly large model, Salesforce/blip-image-captioning-large · Hugging Face, on a performance-4x machine instance). I don’t know whether this is normal, and I realise it might be beyond my control.
2023-11-21T09:33:01.113 - Starting machine
2023-11-21T09:33:26.140 - Access AppWeb.Endpoint at https://imgai.fly.dev
I assume those 25 seconds are spent loading the model from the volume into memory (the app runs on CPU).
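I haven’t confirmed that assumption yet. One way to check it would be to time each loading step at boot, along the lines of the sketch below. `BootTiming`, `timed/2` and `load_captioning_parts/1` are hypothetical names I made up for illustration and are not in the repo; the sketch also assumes the model is loaded with the usual `Bumblebee.load_*` functions.

```elixir
# Hypothetical helper, not part of the repo: prints how long each
# Bumblebee loading step takes so the 25 seconds can be broken down.
defmodule BootTiming do
  # Runs `fun`, prints the elapsed wall-clock time in milliseconds,
  # and returns whatever `fun` returned.
  def timed(label, fun) when is_function(fun, 0) do
    {micros, result} = :timer.tc(fun)
    # IO.puts keeps the sketch dependency-free; Logger.info would also work.
    IO.puts("[boot] #{label} took #{div(micros, 1000)} ms")
    result
  end

  # Assumes the Bumblebee dependency is available at runtime.
  # The default repo is the model mentioned in this post.
  def load_captioning_parts(repo \\ {:hf, "Salesforce/blip-image-captioning-large"}) do
    {:ok, model_info} =
      timed("load_model", fn -> Bumblebee.load_model(repo) end)

    {:ok, featurizer} =
      timed("load_featurizer", fn -> Bumblebee.load_featurizer(repo) end)

    {:ok, tokenizer} =
      timed("load_tokenizer", fn -> Bumblebee.load_tokenizer(repo) end)

    {:ok, generation_config} =
      timed("load_generation_config", fn -> Bumblebee.load_generation_config(repo) end)

    %{
      model_info: model_info,
      featurizer: featurizer,
      tokenizer: tokenizer,
      generation_config: generation_config
    }
  end
end
```

Calling `BootTiming.load_captioning_parts()` from the application’s start callback should show in the fly.io logs whether the model load really dominates. The same `timed/2` wrapper could go around the serving setup to see whether that step contributes as well.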
However, is there anything I can do to reduce this time? This is very much a niche case, but I’m curious whether there’s anything I can do in practice besides scaling up the machines.
Thank you so much for reading!