I have an app that uses llama-cpp-python. On a regular CPU server it runs fine, but when I run it on a GPU server it crashes with:

Main child exited with signal (with signal 'SIGILL', core dumped? false)
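My guess (and it is only a guess) is that SIGILL means the llama.cpp binary inside the wheel was compiled with CPU instructions (e.g. AVX2 or AVX-512) that the GPU host's CPU doesn't advertise. Here's a small script I put together to compare CPU flags between the two machines; the list of instruction sets is just my guess at the relevant ones:

```python
# Quick diagnostic (assumption on my part: SIGILL usually means the binary
# was built with CPU instructions the host lacks). Reads the host's CPU
# feature flags from /proc/cpuinfo and reports which common llama.cpp
# build targets are present or missing.
flags = set()
with open("/proc/cpuinfo") as f:
    for line in f:
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            break

for isa in ("sse3", "ssse3", "avx", "avx2", "f16c", "fma", "avx512f"):
    print(f"{isa}: {'present' if isa in flags else 'MISSING'}")
```

If the flags do differ, I assume rebuilding llama-cpp-python from source on the target machine would fix it, but I'd like to confirm that before changing my build.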
Does anyone know what’s going on? Has anyone deployed llama-cpp-python successfully on a fly.io GPU server? Thanks!