I’ve been trying to run LLMs via the llama-cpp-python library, which requires CMake, CUDA, and its supporting libraries (cuBLAS) in order to run with GPU acceleration. Below is my Dockerfile (borrowed heavily from the Fly GPU quickstart docs):
FROM ubuntu:22.04
RUN apt update -q \
&& apt install -y ca-certificates cmake gcc-11 g++-11 git parallel wget ffmpeg python3.10 python3-pip \
&& wget -qO /cuda-keyring.deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb \
&& dpkg -i /cuda-keyring.deb \
&& apt update -q \
&& apt install -y cuda-nvcc-12-2 libcublas-12-2 libcudnn8 cuda-libraries-12-2
# Install pip pkgs to run python script to download, transcribe, and upload videos
WORKDIR /app
COPY . .
ENV CUDA_DOCKER_ARCH=all
ENV LLAMA_CUBLAS=1
RUN pip install -r requirements.txt \
&& pip install flash-attn --no-build-isolation \
&& CUDACXX=/usr/local/cuda-12.2/bin/nvcc CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade
CMD ["python3", "transcribe_yt_videos.py"]
However, the build fails with this error:
Target "ggml" links to:
CUDA::cublas
but the target was not found. Possible reasons include:
* There is a typo in the target name.
* A find_package call is missing for an IMPORTED target.
* An ALIAS target is missing.
CMake Generate step failed. Build files cannot be regenerated correctly.
I thought it was an issue with the library, but when I SSHed into the VM and ran ls /usr/local/cuda-12.2/include | grep cublas
to verify cuBLAS was installed, there were no cuBLAS header files! That, along with this issue, indicates to me that something went wrong during the installation process.
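For anyone wanting to reproduce the check, here is a minimal sketch of what I ran, wrapped in a function so the include path (CUDA 12.2's default location) can be overridden; the function name and wrapper are mine, not part of the original Dockerfile:

```shell
# Sketch: report whether any cuBLAS headers exist in the CUDA include dir.
# Pass a directory to override the CUDA 12.2 default path.
check_cublas_headers() {
  local inc_dir="${1:-/usr/local/cuda-12.2/include}"
  if ls "$inc_dir" 2>/dev/null | grep -q cublas; then
    echo "cuBLAS headers found in $inc_dir"
  else
    echo "no cuBLAS headers in $inc_dir"
  fi
}
```

On the broken image this prints the "no cuBLAS headers" branch, since only the runtime shared libraries were installed.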
I eventually got this working by using an NVIDIA CUDA 12.2.2 Ubuntu 22.04 base image instead, but I want to understand why this Dockerfile setup doesn’t install the cuBLAS headers despite explicit commands to install the CUDA toolkit and related libraries.
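For reference, the working version looked roughly like this. This is a sketch, not my exact file: I'm assuming the nvidia/cuda:12.2.2-devel-ubuntu22.04 tag here (the -devel variants ship nvcc and the CUDA headers, unlike the -runtime ones), so the cuda-keyring and apt CUDA steps drop out entirely:

```dockerfile
# Sketch: -devel base image already bundles nvcc, cuBLAS headers, and CUDA libs.
FROM nvidia/cuda:12.2.2-devel-ubuntu22.04

RUN apt update -q \
 && apt install -y ca-certificates cmake git parallel wget ffmpeg python3.10 python3-pip

WORKDIR /app
COPY . .

ENV CUDA_DOCKER_ARCH=all
ENV LLAMA_CUBLAS=1

RUN pip install -r requirements.txt \
 && CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade

CMD ["python3", "transcribe_yt_videos.py"]
```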