We have a 10GB layer from poetry install in our docker image
It's installing python packages for AI usage. The push seems to restart at around 3-4GB of data pushed.
Is there a way to let the full 10GB get pushed?
“You are doing it wrong”, I would say, even though the image size limit has been raised from 2GB to 8GB (ref: Docker image size limit raised from 2GB to 8GB).
Do not load models at the build stage; download them to a persistent volume (models probably do not change often) and use them from there.
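Roughly like this (a minimal sketch; the volume name "models" and the mount point /models are placeholders):
fly volumes create models --size 20 -a your-app
# fly.toml
[mounts]
  source = "models"
  destination = "/models"
The app can then download the weights into /models on first boot and reuse them on later boots.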
There are actually no models loaded.
It’s just the nvidia cublas portion of the pip install torch.
I isolated it by installing each package individually and saw this is the big layer:
RUN poetry add nvidia-cublas-cu11==11.10.3.66
pip install torch==2.0.0
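If it helps anyone else isolate this kind of thing, docker history on the locally built image lists the size each instruction added, so you can spot the offending layer without pushing:
docker build -t myimage .   # "myimage" is just a placeholder tag
docker history myimage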
Can you share your Dockerfile, your fly.toml, and the output of the fly command you're running with LOG_LEVEL=debug?
Sure, I can do that in a few hours. I got past it in a hacky way by pip installing our poetry project and forcing pip to install the CPU-only version of pytorch with:
RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
It's a bit hacky though, as poetry no longer installs it automatically. This reduced the layer down to 1.2GB.
So that was the fix to reduce image size. I'll grab the old Dockerfile in a few hours and get the prior version of those files and outputs.
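If you want Poetry to keep managing torch while still pulling the CPU wheels, newer Poetry versions (1.5+) support explicit package sources. An untested sketch, with "pytorch-cpu" as a made-up source name:
# pyproject.toml
[[tool.poetry.source]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
priority = "explicit"

[tool.poetry.dependencies]
torch = { version = "2.0.0", source = "pytorch-cpu" }
That should avoid the separate pip step, but I haven't verified it on this project.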
I believe I am hitting the same issue: anywhere from 3.5GB to 3.8GB in, the layer push retries. I'm also using pytorch for machine learning purposes.
--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/graphtestrun]
0b7921a299c9: Pushed
ef6492d8b2c5: Pushed
b03f84644f36: Pushed
ea2cbc668e9d: Pushing 7.181GB/7.181GB
38b7afa4b510: Pushed
fe055d693f15: Pushed
45edac8e009c: Pushed
d82a965980ed: Pushed
9364cfd0203d: Pushed
b9044eea833a: Pushed
a2d7501dfb35: Pushed
Error: failed to fetch an image or build from source: error rendering push status stream: unknown: unknown error
It did in fact not push the entire 7.181GB.
Maybe I can try the hacky solution referred to above.
Dockerfile:
# For more information, please refer to https://aka.ms/vscode-docker-python
FROM python:3.10-slim
EXPOSE 5002
# Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE=1
# Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED=1
# Install system dependencies for building rhino3dm
RUN apt-get update && apt-get install -y \
build-essential \
cmake \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Install pip requirements
COPY requirements.txt .
RUN python -m pip install -r requirements.txt
WORKDIR /app
COPY . /app
# Creates a non-root user with an explicit UID and adds permission to access the /app folder
# For more info, please refer to https://aka.ms/vscode-docker-python-configure-containers
RUN adduser -u 5678 --disabled-password --gecos "" appuser && chown -R appuser /app
USER appuser
# During debugging, this entry point will be overridden. For more information, please refer to https://aka.ms/vscode-docker-python-debug
CMD ["gunicorn", "--bind", "0.0.0.0:5002", "server:app"]
fly.toml:
# fly.toml app configuration file generated for graphtestrun on 2023-09-10T09:34:34-07:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#
app = "graphtestrun"
primary_region = "sea"
[build]
[http_service]
internal_port = 5002
force_https = true
auto_stop_machines = false
auto_start_machines = true
min_machines_running = 0
processes = ["app"]
[http_service.concurrency]
type = "requests"
soft_limit = 200
hard_limit = 250
For me, the hacky solution also seems to have worked.
I edited my Dockerfile with these lines, as I only needed torch, not the other libraries:
# Install pip requirements
COPY requirements.txt .
RUN python -m pip install torch --index-url https://download.pytorch.org/whl/cpu
RUN python -m pip install -r requirements.txt
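An alternative, if you'd rather keep everything in requirements.txt: pip requirements files accept an extra index line, and the CPU index publishes wheels with a +cpu local version tag. The version pin below is just an example:
--extra-index-url https://download.pytorch.org/whl/cpu
torch==2.0.0+cpu
Either way, the important part is that the CUDA variants of torch never make it into the image.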
I also had to run the following (from the venv terminal in VS Code):
flyctl scale memory 2048 -a graphtestrun
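Depending on your flyctl version, you may also be able to pin the memory in fly.toml instead of scaling after deploy. A sketch, not verified against this app:
[[vm]]
  memory = "2gb"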