Podman and gVisor

mbund · November 28, 2025, 1:11am

I am trying to execute untrusted user code as a sidecar container on Fly. My understanding is that Fly Machines run as Firecracker VMs, so while there is no KVM, a full Linux machine should be available. I would like to use podman to run containers, then switch the runtime to gVisor for enhanced security (though I have no strong preference between podman, Docker, etc.).

To do some initial testing, I’ve launched an ubuntu:24.04 image on Fly. I then got a shell on the machine with fly ssh console, and installed podman with apt-get -y install podman and gVisor by following its install instructions.

Podman works “normally”

root@185924c4565078:/# podman run -it alpine ash
/ #

But for my workload I need to restrict the amount of CPU time (and memory) each container can use. If I do that, it fails with these two errors (from consecutive runs of the same command):

root@185924c4565078:/# podman run --cpus 0.01 -it alpine ash
Error: crun: open `/sys/fs/cgroup/cpu/libpod_parent/libpod-3aa0ac5688232d1c677c591a49dad08825867eb2833c666b02bdb30a40705598`: No such file or directory: OCI runtime attempted to invoke a command that was not found

root@185924c4565078:/# podman run --cpus 0.01 -it alpine ash
Error: container create failed (no logs from conmon): conmon bytes "": readObjectStart: expect { or n, but found , error found in #0 byte of ...||..., bigger context ...||...

If I try to use gVisor’s runsc directly, it fails with similar errors:

root@185924c4565078:/# runsc do ls
creating container: cannot set up cgroup for root: configuring cgroup: stat /sys/fs/cgroup/cpu: no such file or directory

root@185924c4565078:/# runsc do ls
creating container: open /sys/fs/cgroup/cpu/runsc-385983/cgroup.procs: no such file or directory

I imagine that the post “Docker without Docker, now with containers” is related, and that the containers uploaded to Fly are not unpacked into a Firecracker VM (anymore?), which would explain why it doesn’t have permission to do a privileged action like creating a cgroup.

Also, I am aware of Fly’s multi-container machines, but I need to run user-supplied containers, so I do not think they are a good fit.

Does anyone have any ideas on how to run a container with gVisor within a Fly Machine?

PeterCxy · November 28, 2025, 2:25am

I don’t have an answer for running containers / VMs nested inside Fly machines, but your use case sounds very similar to

You can run untrusted user code in separate Fly Machines via the Machines API and even isolate them in independent apps. Your Fly Machines also have access to the Machines API, so you can do this programmatically all in an “orchestrator” Fly App.
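A rough sketch of what that orchestrator call could look like, assuming the public Machines API endpoint at api.machines.dev and a token in FLY_API_TOKEN. The function names, the app name, the image, and the memory size are hypothetical placeholders, and the config shown is only a minimal subset of what the API accepts:

```shell
#!/bin/sh
# Build the JSON body for one sandbox Machine.
# $1: image reference, $2: memory in MB (both example values, not from the thread).
machine_payload() {
  printf '{"config":{"image":"%s","guest":{"cpu_kind":"shared","cpus":1,"memory_mb":%s},"auto_destroy":true}}' "$1" "$2"
}

# Create a Machine in a dedicated app. Defined but not called here,
# because it needs a real app and a valid token.
# $1: app name, $2: image reference, $3: memory in MB.
create_sandbox() {
  curl -sS -X POST \
    -H "Authorization: Bearer ${FLY_API_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$(machine_payload "$2" "$3")" \
    "https://api.machines.dev/v1/apps/$1/machines"
}
```

Something like create_sandbox my-sandbox-app alpine:3 256 would then request one small Machine per untrusted job, with CPU and memory bounded by the guest size rather than by cgroups inside a shared Machine.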

lillian · November 28, 2025, 3:40pm

that is the same as the multi-container machines, and it does still run in Firecracker with runc managing each container in a machine. the new Pilot runtime is also not enabled unless you create a machine specifically with containers.

I was able to create a cgroup limiting CPU to 10% with these commands:

48e434db00ed38:/# cgcreate -g cpu:/limit01
48e434db00ed38:/# cgset -r cpu.cfs_period_us=100000 limit01
48e434db00ed38:/# cgset -r cpu.cfs_quota_us=10000 limit01

admittedly I’m not very familiar with cgroups, but what you’re seeing might be a difference between cgroups v1 / v2?

mbund · November 28, 2025, 9:27pm

I see, thanks for the callout on Pilot not being enabled by default.

Using the cgcreate family of commands you gave does work to create a cgroup v1 group:

root@185924c4565078:/# ls /sys/fs/cgroup/cpu,cpuacct/limit01/
cgroup.clone_children  cpu.cfs_burst_us   cpu.cfs_quota_us  cpu.rt_period_us   cpu.shares  cpu.stat.local  cpuacct.usage      cpuacct.usage_percpu	cpuacct.usage_percpu_user  cpuacct.usage_user  tasks
cgroup.procs	       cpu.cfs_period_us  cpu.idle	    cpu.rt_runtime_us  cpu.stat    cpuacct.stat    cpuacct.usage_all  cpuacct.usage_percpu_sys	cpuacct.usage_sys	   notify_on_release
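For reference, podman’s --cpus value corresponds to a CFS quota of cpus × period, so --cpus 0.01 with the 100000 µs period used above would be a quota of 1000 µs. A small sketch of that arithmetic applied to a v1 cpu cgroup directory (the function names are made up for illustration; the directory is whatever cgcreate produced):

```shell
#!/bin/sh
# Convert a fractional CPU count into a CFS quota in microseconds.
# $1: CPUs (e.g. 0.01), $2: period in microseconds (defaults to 100000).
cpus_to_quota_us() {
  awk -v cpus="$1" -v period="${2:-100000}" \
    'BEGIN { printf "%d\n", cpus * period + 0.5 }'
}

# Write period and quota into an existing cgroup v1 cpu directory.
# $1: cgroup directory, $2: CPUs.
apply_cpu_limit_v1() {
  period=100000
  echo "$period" > "$1/cpu.cfs_period_us" &&
    cpus_to_quota_us "$2" "$period" > "$1/cpu.cfs_quota_us"
}

# Example (needs root on the machine):
#   apply_cpu_limit_v1 /sys/fs/cgroup/cpu,cpuacct/limit01 0.01
#   echo $$ > /sys/fs/cgroup/cpu,cpuacct/limit01/cgroup.procs
```

Writing a PID into cgroup.procs moves that process (and anything it spawns afterwards) under the limit.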

It looks like, on the Ubuntu image, cgroup v2 is mounted at /sys/fs/cgroup/unified:

root@185924c4565078:/# mount | grep cgroup
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,mode=755)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
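A quick way to classify this layout without reading the whole mount table, sketched under the assumption that GNU stat is available. The function name is made up, and the labels “unified”, “hybrid”, and “legacy” follow the terms systemd uses:

```shell
#!/bin/sh
# Classify a cgroup layout.
# $1: filesystem type of the cgroup root, as printed by `stat -fc %T`
# $2: path of the cgroup root
cgroup_layout() {
  case "$1" in
    cgroup2fs) echo "unified" ;;   # v2 mounted directly at the root
    tmpfs)
      # v1 controllers mounted under a tmpfs; a unified/ subdirectory
      # indicates v2 is mounted alongside them
      if [ -d "$2/unified" ]; then echo "hybrid"; else echo "legacy"; fi ;;
    *) echo "unknown" ;;
  esac
}

root=/sys/fs/cgroup
fstype=$(stat -fc %T "$root" 2>/dev/null || echo none)
echo "cgroup layout: $(cgroup_layout "$fstype" "$root")"
```

With the mounts shown above (tmpfs at the root plus cgroup2 on unified), this would report the hybrid case.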

I can create a cgroup v2 group under it:

root@185924c4565078:/# mkdir /sys/fs/cgroup/unified/mycgroup
root@185924c4565078:/# ls /sys/fs/cgroup/unified/mycgroup/
cgroup.controllers  cgroup.freeze  cgroup.max.depth	   cgroup.pressure  cgroup.stat		    cgroup.threads  cpu.pressure  cpu.stat.local  memory.pressure
cgroup.events	    cgroup.kill    cgroup.max.descendants  cgroup.procs     cgroup.subtree_control  cgroup.type     cpu.stat	  io.pressure

So it looks like podman and gVisor are trying to use cgroup v1 at /sys/fs/cgroup/cpu, while on this machine the v1 cpu controller is mounted at /sys/fs/cgroup/cpu,cpuacct.
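One more thing worth checking: the v2 listing above has no cpu.max or memory.max, which would be consistent with those controllers being bound to the v1 hierarchies and therefore unavailable to v2. A sketch for confirming that from cgroup.controllers (the function name and the CGROUP2_ROOT variable are hypothetical; the default path is the one from the mount output):

```shell
#!/bin/sh
# Succeeds if controller $2 is listed in $1/cgroup.controllers.
v2_has_controller() {
  [ -r "$1/cgroup.controllers" ] || return 1
  for c in $(cat "$1/cgroup.controllers"); do
    [ "$c" = "$2" ] && return 0
  done
  return 1
}

root=${CGROUP2_ROOT:-/sys/fs/cgroup/unified}
for ctrl in cpu memory pids; do
  if v2_has_controller "$root" "$ctrl"; then
    echo "$ctrl: available to cgroup v2"
  else
    echo "$ctrl: not available to cgroup v2"
  fi
done
```

If cpu is reported as unavailable, limits written through the v2 tree cannot take effect no matter which runtime is used.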

I switched to the quay.io/podman/stable image for testing, which is an official Fedora-based image for running podman inside a container. Now the command does enter the container, but it prints a warning:

[root@185924c4565078 /]# podman run -it --cpus 0.01 alpine ash
-bash: /etc/machine-id: No such file or directory
WARN[0000] Using cgroups-v1 which is deprecated in favor of cgroups-v2 with Podman v5 and will be removed in a future version. Set environment variable `PODMAN_IGNORE_CGROUPSV1_WARNING` to hide this warning.
/ #

Also, in my testing it did not actually limit the CPU time of the container. According to this issue comment on podman, resource limits do not work when a hybrid cgroup v1/v2 setup is present.

I tried setting a kernel parameter to force cgroupv2 in fly.toml:

[[vm]]
  kernel_args = ["cgroup_no_v1=all"]

But this causes the machine to crash immediately on boot:

2025-11-28T21:11:26.700 app[185924c4565078] ord [info] 2025-11-28T21:11:26.700124084 [01KB64GX1YAD3BKBDZB5KRHX1X:main] Running Firecracker v1.12.1
2025-11-28T21:11:26.700 app[185924c4565078] ord [info] 2025-11-28T21:11:26.700300048 [01KB64GX1YAD3BKBDZB5KRHX1X:main] Listening on API socket ("/fc.sock").
2025-11-28T21:11:27.627 app[185924c4565078] ord [info] INFO Starting init (commit: 6f59af0a)...
2025-11-28T21:11:27.681 app[185924c4565078] ord [info] [ 0.892285] cgroup: Disabled controller 'net_cls'
2025-11-28T21:11:27.683 app[185924c4565078] ord [info] ERROR Error: couldn't mount cgroup onto /sys/fs/cgroup/net_cls,net_prio, because: EINVAL: Invalid argument
2025-11-28T21:11:27.683 app[185924c4565078] ord [info] [ 0.894214] reboot: Restarting system
2025-11-28T21:11:27.766 app[185924c4565078] ord [warn] Virtual machine exited abruptly
2025-11-28T21:11:27.849 runner[185924c4565078] ord [info] machine has reached its max restart count of 10

Is there a way to use only cgroup v2? Does Fly somehow require that the kernel has cgroup v1 enabled?