Can't find CUDA libraries

Hi all, I’ve been trying to build and run the llama.cpp server binary. Here’s my Dockerfile:

FROM ubuntu:22.04 as base

RUN apt update -q && apt install -y ca-certificates wget && \
  wget -qO /cuda-keyring.deb && \
  dpkg -i /cuda-keyring.deb && apt update -q

FROM base as builder

RUN apt install -y --no-install-recommends git cuda-nvcc-12-0 libcublas-dev-12-0 libcurl4-openssl-dev
ENV PATH=$PATH:/usr/local/cuda/bin

RUN git clone --depth 1 /llama.cpp
RUN cd /llama.cpp && \
  make LLAMA_CUDA=1 LLAMA_CURL=1  llama-server

FROM base as runtime


RUN apt install -y --no-install-recommends cuda-cudart-12-0 libcudnn8

RUN mkdir -p /models

COPY --from=builder /llama.cpp/llama-server /app/llama-server
COPY ./ /app/

RUN chmod +x /app/

CMD ["/app/"]

The build step works fine. However, when I run the container, I get this error:

/app/llama-server: error while loading shared libraries: cannot open shared object file: No such file or directory

Everything I found online suggested installing the NVIDIA toolkit or using the nvidia-runtime base image. However, I’m trying to stay close to the guidelines in the Fly documentation to keep the image as slim as possible. Am I missing a library that would explain this "file not found" error?
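For anyone debugging a similar loader error, here is a minimal sketch of one way to list which shared libraries a binary cannot resolve. It assumes `ldd` and `awk` are available inside the image. The helper name `list_missing` is hypothetical, not part of any tool mentioned in this thread.

```shell
# Hypothetical helper: print the names of shared libraries that ldd
# reports as "not found". It reads ldd output on stdin, so it can be
# piped from any binary.
list_missing() {
  awk '/=> not found/ { print $1 }'
}

# Usage inside the container (binary path taken from the Dockerfile above):
#   ldd /app/llama-server | list_missing
```

An empty result means every library resolved at that moment. Any names printed are the ones the loader could not find.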

Hi @hazelnutcloud, the missing library will be automatically “injected” into the machine when it starts, but only if it’s a Fly GPU machine.

A normal machine does not have these files added automatically, so you would have to install them manually from the corresponding .deb. That wouldn’t make much sense, though: on non-GPU machines the drivers won’t work, and on GPU machines the files are already there :slight_smile:

You’re right. I had set gpu_kind in my fly.toml, but when I ran fly deploy it overwrote the file and erased the gpu_kind setting, so my app ended up booting a machine with no GPU. Thank you!
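To catch this kind of silent overwrite, one option is a small guard that checks fly.toml for a gpu_kind assignment before or after deploying. This is only a sketch: the helper name `has_gpu_kind` is hypothetical, and it only checks that a `gpu_kind = ...` line exists somewhere in the file, not that the value is valid.

```shell
# Hypothetical helper: succeeds (exit status 0) if the given file
# contains a line assigning gpu_kind, with optional leading whitespace.
has_gpu_kind() {
  grep -Eq '^[[:space:]]*gpu_kind[[:space:]]*=' "$1"
}

# Usage:
#   has_gpu_kind fly.toml || echo "gpu_kind is missing from fly.toml" >&2
```

Running the check again after fly deploy would show whether the setting survived.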

