Image aware placements

dangra · March 22, 2024, 2:30am

Machines with large images can significantly increase deployment times and hurt the overall user experience. This is a big problem for GPU use cases, where image sizes can range anywhere from 2 GB to 100 GB+.

To address this, we are making our placement logic “image-aware”: it prioritizes hosts that already have the target image cached.

But it doesn’t stop there. A well-crafted Docker image can have many layers, but only a few of them change on application updates. And think about the many applications that share base layers from images like Ubuntu, Nvidia, Go, Elixir, Python … and so many other official public images.

By making our machine placement logic aware of what layers are available across the fleet, we can reduce image pull times, and by extension, the time to launch your machines.
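
To make the idea concrete, here is a minimal Python sketch of layer-aware ranking. It is an illustration only, not the actual placement code: the `Host` type, the `cached_bytes` and `rank_hosts` functions, and the layer digests and sizes are all hypothetical, and it assumes a host is preferred when more of the image’s bytes are already cached on it.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical host record: a name plus the layer digests it has cached."""
    name: str
    cached_layers: set[str] = field(default_factory=set)


def cached_bytes(host: Host, image_layers: dict[str, int]) -> int:
    """Sum the sizes of the image's layers that are already on the host."""
    return sum(size for digest, size in image_layers.items()
               if digest in host.cached_layers)


def rank_hosts(hosts: list[Host], image_layers: dict[str, int]) -> list[Host]:
    """Order hosts so the one with the fewest bytes left to pull comes first."""
    return sorted(hosts, key=lambda h: cached_bytes(h, image_layers), reverse=True)


# Made-up image: a large shared base layer plus two smaller layers (sizes in bytes).
image = {
    "sha256:base": 2_000_000_000,
    "sha256:deps": 300_000_000,
    "sha256:app": 20_000_000,
}

hosts = [
    Host("host-a"),                                  # nothing cached
    Host("host-b", {"sha256:base"}),                 # shares the base layer only
    Host("host-c", {"sha256:base", "sha256:deps"}),  # only the app layer is missing
]

best = rank_hosts(hosts, image)[0]
remaining = sum(image.values()) - cached_bytes(best, image)
print(best.name, "still needs to pull", remaining, "bytes")
```

In this made-up fleet, the host that already holds the base and dependency layers only has to pull the small application layer, which is the effect described above for apps that share common base images.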

As a last note, FLAME pattern fans will get the benefits out of the box as their ephemeral machines will quickly pick the best host to start faster than ever before.

What do I need to do?

Nothing, as long as you are using flyctl v0.2.20 or later.

This is now enabled in all regions. We are monitoring how start times change for newly created machines after this change. If you are interested in trying it for your application, no matter the region, just let us know.

flyctl propagates the image hint for the following commands:

fly launch and fly deploy, even when a [[mounts]] section is set
fly scale count, even when it has to create new volumes
fly volume fork, which considers the image if the source volume is attached to a machine
fly postgres create, which does the right thing too
the fly machine family of commands, even when volumes are involved

What about machines that use volumes?

The “CreateVolume” API endpoint already accepts compute requirement hints (cpu, mem, …) that help the placement logic pick a host with enough free compute capacity to run a machine with the volume attached.

We extended the compute requirements to also include an optional machine image field. Machines are usually launched seconds after the volume is created, so with the hint the volume is likely to land on a host where the image for that machine is ready to be used. That means less wait time before the machine starts and the application is serving requests.

In case you wonder, matching a host by image doesn’t mean it won’t respect the require-unique-zone constraint; volume placement always enforces constraints first and then scores candidate hosts by available capacity.
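
The “constraints first, scoring second” order can be sketched as follows. This is a hedged illustration, not the real placement logic: the `Candidate` type and `place_volume` function are hypothetical, and the post does not say how the image hint is weighed against capacity, so the sketch simply assumes that hosts with the image cached are preferred among eligible candidates, with free capacity as the next criterion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical host record for volume placement."""
    name: str
    zone: str
    free_gb: int
    has_image: bool  # True if the hinted machine image is already cached


def place_volume(hosts: list[Candidate], size_gb: int,
                 used_zones: set[str]) -> Optional[Candidate]:
    # Step 1: hard constraints. A host in an already-used zone, or without
    # enough free space, is never considered, even if it has the image.
    eligible = [h for h in hosts
                if h.zone not in used_zones and h.free_gb >= size_gb]
    if not eligible:
        return None
    # Step 2: scoring (assumed ordering): cached image first, then free capacity.
    return max(eligible, key=lambda h: (h.has_image, h.free_gb))


hosts = [
    Candidate("host-a", zone="z1", free_gb=500, has_image=True),  # zone already used
    Candidate("host-b", zone="z2", free_gb=200, has_image=False),
    Candidate("host-c", zone="z3", free_gb=100, has_image=True),
]

chosen = place_volume(hosts, size_gb=10, used_zones={"z1"})
print(chosen.name if chosen else "no eligible host")
```

Note that host-a is excluded even though it has the image and the most free space, because the unique-zone constraint is applied before any scoring happens.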

How does it look in the Volumes API?

In case you are interacting directly with the API, here is how to pass the image hint.

curl -i -X POST "https://api.machines.dev/v1/apps/my-app-name/volumes" \
  -H "Authorization: Bearer ${FLY_API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my_app_vol",
    "region": "ord",
    "size_gb": 10,
    "compute_image": "ollama/ollama:latest"
  }'
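
If you would rather script this than use curl, the same request can be built with Python’s standard library. This is a sketch that mirrors the curl call above; the helper name `build_create_volume_request` and the placeholder token are made up for illustration, and the example only builds the request so it can be inspected without touching the API.

```python
import json
import urllib.request

API_BASE = "https://api.machines.dev/v1"  # same host as the curl example


def build_create_volume_request(app: str, token: str, name: str, region: str,
                                size_gb: int,
                                compute_image: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for creating a volume."""
    payload = {
        "name": name,
        "region": region,
        "size_gb": size_gb,
        "compute_image": compute_image,  # the image hint used for placement
    }
    return urllib.request.Request(
        url=f"{API_BASE}/apps/{app}/volumes",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_volume_request(
    app="my-app-name", token="example-token", name="my_app_vol",
    region="ord", size_gb=10, compute_image="ollama/ollama:latest",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it, use a real token and call urllib.request.urlopen(req).
```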

What metric should I be watching?

We are working on adding more start-time-related machine metrics and exposing them as application metrics, but for now you can look for application log lines like these (scroll right if you can’t see the time it took to prepare the image):

[info]Pulling container image registry.fly.io/appname@sha256:f2217...
[info]Successfully prepared image registry.fly.io/appname@sha256:f2217... (88.648733ms)
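
Until dedicated metrics exist, the duration can be pulled out of those log lines with a small script. The following is a sketch under assumptions: it relies only on the line shape shown above, the function name `image_prepare_ms` is made up, and it assumes the value in parentheses is a single number followed by one unit (ns, µs/us, ms, or s). Compound durations such as `1m30s` are not handled and return `None`.

```python
import re
from typing import Optional

# Matches the trailing "(<number><unit>)" on a "Successfully prepared image" line.
PREPARED_RE = re.compile(
    r"Successfully prepared image (?P<image>\S+) "
    r"\((?P<value>\d+(?:\.\d+)?)(?P<unit>ns|µs|us|ms|s)\)"
)

# Conversion factors to milliseconds for the units assumed above.
TO_MS = {"ns": 1e-6, "µs": 1e-3, "us": 1e-3, "ms": 1.0, "s": 1000.0}


def image_prepare_ms(line: str) -> Optional[float]:
    """Return the image preparation time in milliseconds, or None if absent."""
    match = PREPARED_RE.search(line)
    if match is None:
        return None
    return float(match["value"]) * TO_MS[match["unit"]]


logs = [
    "[info]Pulling container image registry.fly.io/appname@sha256:f2217...",
    "[info]Successfully prepared image registry.fly.io/appname@sha256:f2217... (88.648733ms)",
]

for line in logs:
    duration = image_prepare_ms(line)
    if duration is not None:
        print(f"image prepared in {duration:.1f} ms")
```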

And that is all for now!

smorimoto · March 28, 2024, 6:02pm

Does this only work with images built with flyctl? Or is it available for images built with docker/build-push-action in GHA? In that case, I would like to see this on SIN and NRT!

kurt · March 28, 2024, 10:10pm

It’ll work for images built with docker actions too.

smorimoto · March 28, 2024, 10:55pm

Fantastic!

chrismccord · March 29, 2024, 7:29pm

To give some numbers on how much this can help: I have an Elixir app that starts headless Chrome, so at ~1 GB the image is heavier than a stock 180 MB Phoenix image. Under best-case conditions, a stock Phoenix image can be launched in 3.5-4.5s. For this app, I’m getting cold machine launches close to ideal numbers in ord and fra, even at 5x the image size:

ord: 4739ms
fra: 5363ms
syd: 6930ms
nrt: 7197ms

nina · April 1, 2024, 3:06pm

Updating to add that this is now enabled in all regions!
