Is this too big? I’m deploying the app with flyctl from a DigitalOcean VPS. Once it’s done uploading, it just restarts. I think I can reduce it by around 500MB if I put some work into it. Is there anything else that could be the problem?
Should be fine! You might need to set a higher grace_period on the health check in fly.toml. It’s possible it’s just taking too long for the check to start passing.
Docker keeps trying and failing to push the image in a loop. There were some problems that would have caused my app not to deploy correctly. I seem to have fixed them.
I changed the services.tcp_checks to this:
[[services.tcp_checks]]
grace_period = "30s"
interval = "15s"
port = "5005"
restart_limit = 6
timeout = "2s"
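One way to pick a grace_period is to measure how long the app actually takes to open its port. The sketch below is a local probe, not how Fly runs its checks: the function name and its parameters are hypothetical, and it only times the first successful TCP connect to the port from the config above.

```python
import socket
import time


def seconds_until_listening(host, port, max_wait=60.0, poll_interval=0.5, timeout=2.0):
    """Return seconds until host:port accepts a TCP connection, or None if it never does."""
    start = time.monotonic()
    while time.monotonic() - start < max_wait:
        try:
            # A successful connect is all this probe looks for.
            with socket.create_connection((host, port), timeout=timeout):
                return time.monotonic() - start
        except OSError:
            # Port closed or not listening yet: wait and retry.
            time.sleep(poll_interval)
    return None
```

Start the container locally, call `seconds_until_listening("127.0.0.1", 5005)` right away, and set grace_period comfortably above the result.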
There’s nothing in “flyctl logs”
Can you show us your flyctl deploy logs?
2.2 GB should be fine.
Are you saying you can’t upload it or the app won’t start?
Let us know which app it is (its name). Either here or via private message
The docker push just keeps restarting and then crashes. I removed the docker build part; it built just fine. I also removed the app name.
Deploying <the app name>
==> Validating app configuration
--> Validating app configuration done
Services
TCP 80/443 ⇢ 5005
==> Creating build context
--> Creating build context done
==> Building image with Docker
<removed docker build info>
got a message { <nil> 0 0 <nil> 0xc000292300}
Successfully built d723ec7f4e8e
Successfully tagged registry.fly.io/<app name>:deployment-1617043397
--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/<app name>]
4d92a0b0fbb9: Layer already exists
4ee7ba2d5386: Pushing [==================================================>] 2.227GB
5783b15f2d8c: Layer already exists
4f8c60dc2e48: Layer already exists
b65352c298b1: Layer already exists
7a654b19c23c: Layer already exists
885f75ddced3: Layer already exists
50036f2235cb: Layer already exists
606dfd710fed: Layer already exists
6f018c30ff57: Layer already exists
75d16189e3f8: Layer already exists
2346bf194641: Layer already exists
8bdee3f404a1: Layer already exists
e7f6ccee1295: Layer already exists
90864f0e5b9f: Layer already exists
59840d625c92: Layer already exists
da87e334550a: Layer already exists
c5f4367d4a59: Layer already exists
ceecb62b2fcc: Layer already exists
193bc1d68b80: Layer already exists
f0e10b20de19: Layer already exists
It looks like the VM is failing to start after deploying:
Running: `python3 main.py` as root
2021/03/29 19:39:34 listening on [***]:22 (DNS: [fdaa::3]:53)
INFO:root:process
Traceback (most recent call last):
File "main.py", line 16, in <module>
import process
File "/workspace/process.py", line 9, in <module>
import tflite_runtime.interpreter as tflite
ModuleNotFoundError: No module named 'tflite_runtime'
Main child exited normally with code: 1
Does the VM start locally with docker?
That was without the build-arg arguments, so I would have expected it not to start. I’m going to build a version that doesn’t require the build-arg arguments and see if that helps.
I did “flyctl deploy -i image_name”, so I don’t think it has anything to do with the build-arg. I think the full image size is 4.7GB, not 2.2GB.
I’m going to try to reduce the docker image size.
Are you able to run the docker image locally without getting the ModuleNotFoundError: No module named 'tflite_runtime' error?
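A quick way to answer this before pushing is to check, inside the image, that every module the app imports can be found. This is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper, and the module list is whatever your app needs.

```python
import importlib.util


def missing_modules(names):
    """Return the entries of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # For dotted names find_spec imports the parent package first and
            # raises ModuleNotFoundError (an ImportError) if it is absent.
            found = False
        if not found:
            missing.append(name)
    return missing
```

Running something like `missing_modules(["tflite_runtime.interpreter"])` with the image’s python3 (for example via `docker run --rm image_name python3 ...`) should return an empty list for an image that will start.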
A 4GB+ image should be fine, but a single 2GB+ layer is likely the problem. I tried to reproduce it, but docker keeps crashing on my machine.
What is the app doing? What type of data are you putting in the image?
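To spot an oversized layer, the push output can be scanned for "Pushing" lines and their sizes. The sketch below assumes the line format shown in the deploy log earlier in this thread and that docker reports decimal units (kB, MB, GB); the function name and the 2GB threshold are illustrative, not a documented limit.

```python
import re

# Assumption: sizes are printed in decimal units.
_UNITS = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9}

# Matches e.g. "4ee7ba2d5386: Pushing [=====>] 2.227GB", with an optional
# "sent/" prefix before the total size.
_PUSHING = re.compile(
    r"^(?P<layer>[0-9a-f]{12}): Pushing \[[=> ]*\]\s+"
    r"(?:[\d.]+[kMG]?B/)?(?P<size>[\d.]+)(?P<unit>[kMG]?B)\s*$"
)


def large_layers(log_text, threshold_bytes=2 * 10**9):
    """Return (layer_id, size_in_bytes) for pushed layers at or above the threshold."""
    hits = []
    for line in log_text.splitlines():
        match = _PUSHING.match(line.strip())
        if match is None:
            continue  # "Layer already exists" and other lines carry no size
        size = round(float(match["size"]) * _UNITS[match["unit"]])
        if size >= threshold_bytes:
            hits.append((match["layer"], size))
    return hits
```

Fed the push log above, this flags only layer 4ee7ba2d5386, which is the one that keeps restarting.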
The working version of the docker image doesn’t upload. I’m running neural networks on faces. How small do I need to get it?
Yes, I can run the working version of the docker image locally without a problem.
How big is the working image vs not working? I’m not sure why one image would fail to upload and another wouldn’t unless the size was radically different.
We recommend keeping images as small as reasonably possible, though we have many multi-GB images from other customers running now. This is the first image that has failed to push, though. Would it be possible to save the large content to S3 or to MinIO on Fly, then pull it down on launch? Storing the content in a volume would be great too if there’s a good way to get it there.
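Pulling the content down on launch could look roughly like the sketch below. It assumes the model is reachable with a plain HTTP GET (a public or presigned URL, which the thread does not confirm); `ensure_model`, the URL, and the destination path are all hypothetical. Pointing the destination at a volume means the download only happens once.

```python
import os
import shutil
import tempfile
import urllib.request


def ensure_model(url, dest_path):
    """Download `url` to `dest_path` unless a file is already there."""
    if os.path.exists(dest_path):
        return dest_path  # already fetched on an earlier launch
    dest_dir = os.path.dirname(dest_path) or "."
    os.makedirs(dest_dir, exist_ok=True)
    # Download to a temp file in the same directory and rename at the end,
    # so an interrupted download never leaves a truncated file at dest_path.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as tmp, urllib.request.urlopen(url) as resp:
            shutil.copyfileobj(resp, tmp)
        os.replace(tmp_path, dest_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return dest_path
```

Calling this at the top of main.py, before the models are loaded, keeps the large files out of the image layers entirely.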
There’s around 1 to 1.5 GB of model data in the docker image. I was grabbing most of it from Backblaze and some from GitHub during the build. The image that uploaded didn’t have the model data or the Python modules required to run. One of the Python modules is also quite large. I’ll look into volumes. Downloading all of it from Backblaze or some other third party on launch didn’t seem like a good idea to me.
I got the deployment working. I’m now having it download the models from a MinIO instance. For some reason the internal domain for it isn’t working, so right now I’m using the fly.dev domain. Is internal bandwidth charged when it goes over fly.dev, or is it free? Do .internal domains have higher bandwidth speeds? If the speeds are the same and internal bandwidth over fly.dev domains is free, then I won’t bother fixing it.
Using .internal should be faster since it doesn’t have to go through an extra layer of proxying. You don’t have access to it from your local computer or from your instance? The latter is abnormal. It doesn’t resolve?
Both internal bandwidth and external bandwidth cost something. Connections can span multiple datacenters and even infrastructure providers, meaning transit is not free for us.
Eventually, we will be able to optimize costs and adjust pricing for our users. Our pricing structure is based on our own costs. As we get better deals, we can pass on the savings!
Some of this is logistics too. We don’t currently differentiate bandwidth within the same host or within the same datacenter vs. bandwidth that goes across the globe to a different provider. We just charge for the bandwidth we measure from your VM’s network interface.
So what you’re saying is that internal traffic within the same datacenter is charged? So it would be cheaper to download it from a source with really cheap bandwidth than to download it from an instance in the same datacenter? If that’s the case, then that needs to be fixed.
I’m using the minio client to download it. Both are in the ewr region. I get this error:
mc: Unable to initialize new alias from the provided credentials. Get “http://ewr.long-smoke-1174.internal/probe-bucket-sign-2rw06mzlskg9/?location=”: dial tcp [fdaa:0:f41:a7b:ab2:0:14cc:2]:80: connect: connection refused.
I have this in the fly.toml for both instances:
[experimental]
private_network=true
Is MinIO listening on IPv6? I think you have to pass --address [::]:80 to get it to listen on all addresses.
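The reasoning here is that the .internal name resolved to an IPv6 address (the fdaa:... one in the error), so a server bound only to an IPv4 address would refuse the connection. A simplified sketch of that check follows; `bind_covers` is a hypothetical helper, it assumes the server had been bound to an IPv4-only address such as 0.0.0.0 (the thread doesn’t say), and it ignores dual-stack sockets, where a listener on :: may also accept IPv4 depending on OS settings.

```python
import ipaddress


def bind_covers(bind_host, target_ip):
    """Simplified: can a listener bound to bind_host accept a connection to target_ip?"""
    bind = ipaddress.ip_address(bind_host)
    target = ipaddress.ip_address(target_ip)
    if bind.version != target.version:
        return False  # an IPv4 bind never sees IPv6 traffic in this model
    # A wildcard bind (0.0.0.0 or ::) covers every address of its family.
    return bind.is_unspecified or bind == target
```

With the address from the error, `bind_covers("0.0.0.0", "fdaa:0:f41:a7b:ab2:0:14cc:2")` is false while `bind_covers("::", "fdaa:0:f41:a7b:ab2:0:14cc:2")` is true, which matches the fix of listening on [::]:80.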
yeah that worked.