Deploying root file system applications

Hello,

We have our application packaged in a discoverable GPT partition image and typically boot it using systemd-nspawn. We’re now exploring how to deploy multiple instances of this application in the cloud and are looking for orchestration options. Note that our application does not fit Docker principles: it is a multi-process application.

I came across Fly.io and found some similarities with what we’re trying to achieve. I’m wondering if your platform could support our use case.

From what I understand, Fly.io extracts the root filesystem from a Docker image and boots it using Firecracker. Would it be possible to start our custom root filesystem in this way?

We’re open to packaging our setup as a Docker container if Fly.io can then extract the root filesystem. But we’re specifically wondering:

  • Can Fly.io boot a root filesystem using systemd as the init process?
  • Is there support for running systemd-nspawn within that environment?

Thank you for your support—looking forward to hearing your thoughts.

Best regards,

Umut

Hi,

What you need might be possible but it’s not necessarily supported - it’s uncharted territory and you’d be basically on your own.

You cannot override the init that Fly machines boot with; we inject a custom init process that then spawns a few other processes, notably the Dockerfile’s ENTRYPOINT and CMD entries (though both can be overridden). So if you find a way to make systemd happy as not-pid-1, you may be able to get it running. I can’t tell you if systemd-nspawn would work once you get systemd itself working; I think everything needed should be there as it’s really a VM running on top of a real kernel, not something already-confined like Docker proper.

So if you can package your system/application as a Docker image and ensure the ENTRYPOINT or CMD runs systemd, it might just work - but you’ll have to experiment on your side to see if it’s feasible.
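
A minimal sketch of that packaging step, assuming the root filesystem has already been exported as a tarball. The helper name write_dockerfile, the file name rootfs.tar, and the default systemd path are illustrative assumptions, not something Fly.io prescribes:

```shell
#!/bin/sh
# Hypothetical helper: print a Dockerfile that wraps a root filesystem tarball
# and starts systemd as the container command.
#   $1: path to the rootfs tarball, relative to the build context (assumed name)
#   $2: path of the systemd binary inside the rootfs (assumed default)
write_dockerfile() {
    rootfs_tar=$1
    systemd_path=${2:-/lib/systemd/systemd}
    # ADD unpacks a local tar archive into the target directory.
    cat <<EOF
FROM scratch
ADD $rootfs_tar /
CMD ["$systemd_path"]
EOF
}
```

For example, `write_dockerfile rootfs.tar > Dockerfile` followed by `docker build -t my-app .` would produce an image whose CMD is systemd. Whether systemd then starts cleanly on a Fly machine still has to be tested.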

For completeness, the way most folks run multiple processes in a single machine is by using something like supervisord.

- Daniel

@utezduyar, as @wjordan mentioned in “System has not been booted with systemd as init system (PID 1). Can't operate.” (#3 by wjordan), you can start systemd in its own namespace using unshare:

CMD ["unshare", "--pid", "--fork", "--mount-proc", "/lib/systemd/systemd"]

Then you can use nsenter to enter the namespace to run systemctl commands. For example:

nsenter --target $(pidof systemd) --mount --uts --ipc --net --pid -- bash -c "systemctl start <your service>"
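
If you run such commands often, the nsenter call can be wrapped in a small function. This is only a sketch: run_in_systemd_ns and the NSENTER override are hypothetical names, and the PID is passed explicitly because pidof may print more than one PID when several systemd processes are running.

```shell
#!/bin/sh
# Hypothetical wrapper around the nsenter call shown above.
#   $1:   PID of the systemd process started via unshare
#   rest: command to run inside its namespaces
# NSENTER can be set to another program (an assumption made for dry runs).
run_in_systemd_ns() {
    target_pid=$1
    shift
    "${NSENTER:-nsenter}" --target "$target_pid" \
        --mount --uts --ipc --net --pid -- "$@"
}
```

Usage would look like `run_in_systemd_ns 7 systemctl start <your service>`, where 7 stands for the PID of the namespaced systemd. Setting `NSENTER=echo` prints the arguments instead of entering the namespaces.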

However, I’m not sure how reliable your application would be using this approach. As @roadmr said, it needs to be tested.

I’d appreciate your feedback because I’m in a similar situation to yours. If you find a working solution, please share your experience as it would be very helpful for others facing the same issue.

Thank you both for your time and response. I think using a tool in a way it was not meant to be used is going to come back to bite me in the future. However, let me take this opportunity to discuss business opportunities with you (Fly), along with a technical proposal.

The reason we use nspawn is that we run the same software stack in the cloud as the one we deploy to our embedded devices. The cloud version mocks hardware features. Our stack has a startup order among its services. I think this kind of testing is quite new in the systemd/embedded community, but I am hoping it will pick up.

From a technical point of view, systemd has a concept of systemd extensions. Currently nspawn doesn’t support systemd extensions, but it has been on the upstream wish list. I don’t believe it is complicated to implement the missing piece. One thought would be that Fly’s initialization and communication processes would be packed in an extension which gets added on top of the customer’s bootable root fs.

Our containers should support running applications as PID 1. Perhaps that would work here?

Thank you all again.

Would Fly like to help me go through the process (requesting access to a Pilot-based machine, APIs, etc.) so we can give this setup a whirl and hopefully write a blog post on fly.io called “how to start root fs packed as Discoverable GPT” with its current state?

My understanding is that we have quietly enabled Pilot for everyone. If you use the Machines API and include containers in your request, per the blog post I linked, you should be able to run systemd as PID 1. If this is not the case, let me know and I’ll find out more.

As to the rest, I’ll be honest and say that I don’t know what Discoverable GPT is, why it would be useful, or whether it could work on Fly; I’m just responding to say that if running systemd as PID 1 was the roadblock, there is now a way to do exactly that.

Creating an app using the Machines API is kind of complicated. I opened the Fly Machines API docs and felt perplexed.
Is there a simple tutorial or an example of how to get started with the Machines API?
Also, I don’t understand whether we need to take some extra steps for systemd to work as PID 1, or whether it works automatically once we specify containers.

systemd should work as PID 1 automatically if you specify containers.

I plan to look into some usability enhancements for working with containers this week - if you can tell me a bit more about what you are trying to accomplish, that would both be helpful to me and would enable me to provide more helpful answers to your question.

Thanks, I appreciate your help.
Eventually I figured out how to create an app using the API.
I looked here: Machines · Fly Docs, and created a simple example app:

curl -i -X POST \
  -H "Authorization: Bearer ${FLY_API_TOKEN}" -H "Content-Type: application/json" \
  "${FLY_API_HOSTNAME}/v1/apps/my-test-app-name/machines" \
  -d '{
    "config": {
      "init": {
        "exec": [
          "/bin/sleep",
          "inf"
        ]
      },
      "containers": [
        {
          "name": "ubuntu",
          "image": "registry-1.docker.io/library/ubuntu:latest",
          "cmd": [
            "/bin/sleep",
            "inf"
          ]
        }
      ],
      "guest": {
        "cpu_kind": "shared",
        "cpus": 1,
        "memory_mb": 256
      }
    }
  }'

But when I try fly ssh console, I get:

ERROR unexpected error executing command error="FLY_SSH_CONTAINER env required"

Using FLY_SSH_CONTAINER=ubuntu fly ssh console doesn’t help.

Have you tried the --container parameter on fly ssh console?

For completeness, here is what I tried in order to run systemd as PID 1; I failed to get into the machine. Eventually the machine got killed, seemingly by the OOM killer, although the reported exit event says otherwise (exit_code=0,oom_killed=false,requested_stop=false):

flyctl apps create foo-my-test-app-name
export FLY_API_TOKEN=$(fly tokens deploy -a foo-my-test-app-name)
export FLY_API_HOSTNAME="https://api.machines.dev"
curl -i -X POST \
  -H "Authorization: Bearer ${FLY_API_TOKEN}" -H "Content-Type: application/json" \
  "${FLY_API_HOSTNAME}/v1/apps/foo-my-test-app-name/machines" \
  -d '{
    "config": {
      "init": {
        "exec": [
          "/lib/systemd/systemd"
        ]
      },
      "containers": [
        {
          "name": "ubuntu",
          "image": "registry-1.docker.io/library/ubuntu:latest",
          "cmd": [
            "/lib/systemd/systemd"
          ]
        }
      ],
      "guest": {
        "cpu_kind": "shared",
        "cpus": 1,
        "memory_mb": 2048
      }
    }
  }'
fly ssh console -a foo-my-test-app-name --container ubuntu

Connecting to fdaa:16:e37c:a7b:3e9:5a9d:c6c2:2... complete

2025-04-23T10:29:42.280452Z: error opening file `/run/crun/ubuntu/status`: No such file or directory

The logs indicate that systemd is not found:

2025-04-23 12:52:42.359  ubuntu container exited with code 1
2025-04-23 12:52:42.359  INFO Scheduling start of ubuntu in 32.512s
2025-04-23 12:52:42.358  INFO restart policy is 'on failure', restarting if possible. current restart count is 8/10
2025-04-23 12:52:42.357  INFO container exited exit status: 1, determining restart based on policy 'OnFailure { count: 10 }' name=ubuntu
2025-04-23 12:52:42.355  executable file `/lib/systemd/systemd` not found in $PATH: No such file or directory

After I SSHed:

Connecting to fdaa:11:82d1:a7b:404:62fd:5c0:2... complete
root@ubuntu:/# systemctl status
bash: systemctl: command not found

After I entered apt update && apt install systemd && systemctl status:

System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down

You need to build or find an image that already has systemd in it. Try the search bar in https://hub.docker.com/
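
One way to check a candidate image before pointing cmd at a path is to look for the systemd binary directly. The sketch below is an assumption-laden helper, not part of flyctl: find_systemd is a hypothetical name, and the candidate list only covers the paths seen in this thread plus /usr/lib/systemd/systemd.

```shell
#!/bin/sh
# Hypothetical helper: print the first systemd binary found under a root directory.
#   $1: root to search; empty (the default) means the running filesystem
# The candidate paths are assumptions and may not cover every distribution.
find_systemd() {
    root=${1:-}
    for candidate in /lib/systemd/systemd /usr/lib/systemd/systemd /usr/bin/systemd; do
        if [ -x "$root$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```

Run inside the container, `find_systemd || echo "no systemd in this image"` would have flagged the plain ubuntu image above. Against an unpacked root filesystem, pass its directory as the first argument; note that absolute symlinks inside that directory resolve against the host.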

I’ve got a pull request that makes fly ssh console better: Validate, default, and allow container to be selected by rubys · Pull Request #4325 · superfly/flyctl · GitHub

I’m starting to take a look at fly machine run to make this process easier: fly machine run w/containers · Issue #4328 · superfly/flyctl · GitHub

Nice to see!

I have tried to use this image: https://hub.docker.com/r/jrei/systemd-ubuntu
The result is:

root@ubuntu:/# systemctl status
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down

I tried this:

curl -i -X POST \
  -H "Authorization: Bearer ${FLY_API_TOKEN}" -H "Content-Type: application/json" \
  "${FLY_API_HOSTNAME}/v1/apps/my-test-app-name/machines" \
  -d '{
    "config": {
      "containers": [
        {
          "name": "ubuntu",
          "image": "jrei/systemd-ubuntu",
          "cmd": [
            "/usr/bin/systemd"
          ]
        }
      ],
      "guest": {
        "cpu_kind": "shared",
        "cpus": 1,
        "memory_mb": 256
      }
    }
  }'

Note the cmd.
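
To reuse the request above with other images, the JSON body can be generated by a small function. This is a sketch only: machine_config is a hypothetical helper, it covers just the fields used in this thread, and it does no JSON escaping of its arguments.

```shell
#!/bin/sh
# Hypothetical helper: print a Machines API request body with one container.
#   $1: container name   $2: image   $3: command (a single path)
#   $4: memory in MB (defaults to 256, as in the request above)
# Arguments are inserted verbatim, so they must not contain quotes or backslashes.
machine_config() {
    name=$1
    image=$2
    cmd=$3
    memory_mb=${4:-256}
    cat <<EOF
{
  "config": {
    "containers": [
      { "name": "$name", "image": "$image", "cmd": ["$cmd"] }
    ],
    "guest": { "cpu_kind": "shared", "cpus": 1, "memory_mb": $memory_mb }
  }
}
EOF
}

# Example (not executed here): pipe the body into curl, which reads it via -d @-
#   machine_config ubuntu jrei/systemd-ubuntu /usr/bin/systemd |
#     curl -i -X POST \
#       -H "Authorization: Bearer ${FLY_API_TOKEN}" -H "Content-Type: application/json" \
#       "${FLY_API_HOSTNAME}/v1/apps/my-test-app-name/machines" -d @-
```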

Then I ran fly ssh console --container ubuntu and see systemd as pid 1:

root@localhost:/# ps -elf
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S root         1     0  0  80   0 -  5029 ep_pol 13:27 ?        00:00:00 /usr/bin/systemd
4 S root        36     1  0  80   0 -  4562 ep_pol 13:27 ?        00:00:00 /usr/lib/systemd/systemd-journald
4 S root        45     0  0  80   0 -  1147 do_wai 13:27 pts/0    00:00:00 /bin/bash
4 R root        48    45  0  80   0 -  1984 -      13:27 pts/0    00:00:00 ps -elf
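
The same check can be scripted instead of read off the ps output. The sketch below uses two hypothetical helpers: one reads the command name of PID 1 from /proc, the other tests for /run/systemd/system, the directory whose existence sd_booted(3) checks. The optional arguments exist only so the functions can be pointed at a test directory.

```shell
#!/bin/sh
# Hypothetical helpers for checking whether systemd is the init process.

# Succeeds if the command name of PID 1 is "systemd".
#   $1: proc mount point (defaults to /proc)
pid1_is_systemd() {
    proc_root=${1:-/proc}
    [ "$(cat "$proc_root/1/comm" 2>/dev/null)" = "systemd" ]
}

# Succeeds if <root>/run/systemd/system exists.
#   $1: root directory (defaults to the running filesystem)
booted_with_systemd() {
    root=${1:-}
    [ -d "$root/run/systemd/system" ]
}
```

Inside the container, `pid1_is_systemd && booted_with_systemd && echo ok` should print ok in the setup shown above, assuming systemd has finished its early startup.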

Big thanks!
It’s working like a charm

It worked for me too, thanks. Looking at the systemd-journald logs, I believe the container part runs through OCI. Is the software stack of a machine something like this: Firecracker → Fly host software → Containers?

Bonus question: I am struggling to keep the machines alive. They time out after 5 minutes. I tried the autostop setting from the Fly Machines API in the JSON config, but no luck.

What is the message in the logs (fly logs) when they time out? The 7-day trial has an undocumented 5-minute limit, from what I’ve heard…

Right… The “Fly host software” in the middle of that sequence is the Pilot that @rubys linked to.

(This is init in the sense of PID 1.)

Thanks, yes, the trial limitation was the reason my machine was shutting down.