I’m assuming it’s not possible to run Llama 3.1 405B on Fly.io yet, since you can’t chain A100 GPUs together (I believe you’d need about 8 of them).
You can request up to 8 GPUs using the `vm.gpus` config in the `fly.toml` file.
Here are the relevant docs: Fly Launch configuration (fly.toml) · Fly Docs
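As a rough sketch, a GPU Machine config might look something like this (the section layout and the `gpu_kind` value here are assumptions on my part — double-check the key names against the docs linked above):

```toml
# Hypothetical fly.toml excerpt for an 8x A100-80GB Machine.
# Verify exact key names against the Fly Launch configuration docs.
app = "llama-inference"
primary_region = "ord"

[[vm]]
  gpu_kind = "a100-80gb"  # assumed GPU kind identifier
  gpus = 8                # request 8 GPUs on a single Machine
```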
Be aware that you might run into GPU availability issues if you’re requesting a lot of Machines with 8x A100-80GB GPUs each. If you do, reach out to Fly’s customer success team and have a chat with them.