Restartless volume extension

We are glad to announce that we just shipped a new feature that lets volumes be extended without a machine restart!

$ fly vol extend vol_915grny71zp4n70q -s 4 -a APP_NAME
        ID: vol_915grny71zp4n70q
      Name: pg_data
       App: APP_NAME
    Region: scl
      Zone: d32e
   Size GB: 4
 Encrypted: true
Created at: 12 Jul 23 14:54 UTC
Your machine got its volume size extended without needing a restart

No one likes downtime, period. Unfortunately, whenever you extended a volume you’d be asked to restart the machine to apply the change to its file system. Until now, at least. Any machine version spawned from now on can handle volume extensions with online file system resizing.

Make sure you update flyctl to v0.1.55 or later so you can see the success message. Older versions don’t know about this feature, so they will always tell you to restart the machine.
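If you want to script that check, here is a minimal sketch of a version comparison. The helper name version_at_least and the INSTALLED variable are made up for illustration; you would fill INSTALLED with the version number your flyctl reports. It assumes a sort that supports -V (GNU coreutils and busybox do).

```shell
#!/bin/sh
# Hypothetical helper, not part of flyctl.
# Succeeds (exit 0) when version $1 is greater than or equal to version $2.
version_at_least() {
    have=${1#v}   # tolerate a leading "v", as in "v0.1.55"
    want=${2#v}
    # With version sort, the required version comes first (or ties)
    # only when the installed version is at least as new.
    lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}

# Assumption: replace this sample value with the version your flyctl reports.
INSTALLED="0.1.54"
if version_at_least "$INSTALLED" "0.1.55"; then
    echo "flyctl is new enough to show the restartless message"
else
    echo "please update flyctl"
fi
```

Comparing with sort -V rather than plain string comparison matters for versions such as 0.1.100, which sorts after 0.1.55 numerically but before it alphabetically.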

How does it work?

We tweaked our machines' init process so it can handle online file system resizing, and our orchestrator, flyd, makes sure to call it whenever a volume extension happens.

Internally we use resize2fs(8) to grow the volume's file system inside your machine. Not only that, since we use Firecracker, we also call its internal API to patch the drivers inside the guest VM, which can have some points of concern.

What happens to old machines?

Since this feature relies on our updated init process, machine versions created before today will not have it. All you need to do is trigger a new version update. Here are some examples of how to trigger that:

  1. Do a new deployment on Apps V2 to apply it to all machines.
  2. Update a machine with no changes, using --yes. Example: fly machine update MACHINE_ID -a APP_NAME --yes.

If you try the second option on a Postgres cluster, make sure you do it on the replicas first and then on the primary.
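The replicas-first ordering can be sketched as a small wrapper around the update command from option 2. This is only an illustration: update_in_order and DRY_RUN are hypothetical names, and the script assumes you already know which machine is the primary and which are replicas. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper, not part of flyctl.
# Usage: update_in_order APP_NAME PRIMARY_ID REPLICA_ID...
# Updates every replica first and the primary last.
update_in_order() {
    app=$1
    primary=$2
    shift 2
    # "$@" now holds only the replicas; the primary is appended at the end.
    for id in "$@" "$primary"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run (the default): print the command instead of running it.
            echo "fly machine update $id -a $app --yes"
        else
            # Stop at the first failure so the primary is never touched
            # after a replica update went wrong.
            fly machine update "$id" -a "$app" --yes || return 1
        fi
    done
}

# Example with placeholder IDs; prints three commands, primary last.
DRY_RUN=1 update_in_order APP_NAME PRIMARY_ID REPLICA_ID_1 REPLICA_ID_2
```

Set DRY_RUN=0 once the printed order looks right.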

You can verify the version got updated by using fly machine status before and after:

$ fly machine status MACHINE_ID -a APP_NAME
Instance ID: 01H55MKAMH66MZWVX7WFM4R9GD <== this is the version
State: started

  ID            = MACHINE_ID
  Instance ID   = 01H55MKAMH66MZWVX7WFM4R9GD <== this is the version
  State         = started
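To compare the two runs without eyeballing them, a rough sketch like the following pulls the Instance ID out of the status output. The instance_id helper is a made-up name, and the parsing assumes the output keeps the "Instance ID: ..." or "Instance ID = ..." shape shown above.

```shell
#!/bin/sh
# Hypothetical helper, not part of flyctl.
# Reads `fly machine status` output on stdin and prints the first Instance ID.
instance_id() {
    awk '/Instance ID/ {
        # Drop the label and its ":" or "=" separator, keep the first word after it.
        sub(/^[^:=]*[:=][ \t]*/, "")
        print $1
        exit
    }'
}

# Usage, with the placeholders from the commands above:
#   before=$(fly machine status MACHINE_ID -a APP_NAME | instance_id)
#   fly machine update MACHINE_ID -a APP_NAME --yes
#   after=$(fly machine status MACHINE_ID -a APP_NAME | instance_id)
#   [ "$before" != "$after" ] && echo "machine version was updated"
```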