I’m assuming it’s not possible to run Llama 3.1 405B on Fly.io yet, since you can’t chain A100 GPUs together (I believe you’d need about 8 of them).
You can request up to 8 GPUs using the `vm.gpus` config in the `fly.toml` file.
Here are the relevant docs: Fly Launch configuration (fly.toml) · Fly Docs
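As a rough sketch, a GPU Machine config might look something like this (the section layout and the `gpu_kind` value here are assumptions on my part — double-check the key names against the docs linked above):

```toml
# Hypothetical fly.toml excerpt for an 8x A100-80GB Machine.
# Verify exact key names against the Fly Launch configuration docs.
app = "llama-inference"
primary_region = "ord"

[[vm]]
  gpu_kind = "a100-80gb"  # assumed GPU kind identifier
  gpus = 8                # request 8 GPUs on a single Machine
```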
Be aware that you might run into GPU availability issues if you’re requesting a lot of Machines with 8x A100-80GB GPUs each. If you do, reach out to Fly’s customer success team and have a chat with them.