For the past year or so, we’ve operated physically segregated hardware to support security functions like Fly Macaroon tokens and OIDC/OAuth tokens. We do this because we know that at any moment there could be a new AMD or Intel microarchitectural vulnerability, and we want customer secrets isolated as much as possible from customer workloads. We love you, everyone running code with Fly Machines, we really do. But we don’t trust you.
We’ve used that same hardware-segregated cluster to roll out a big internal change, one we should write about soon.
Since 2020, we’ve used HashiCorp Vault to store customer app secrets. That means that when you set a secret, like your database DSN or API key, we’ve hot-potatoed it off our API servers as quickly as possible to a segregated cluster of Vault servers, and arranged for those secrets to be available to our infrastructure only on the physical servers your app actually runs on.
Vault is great. No notes. Recommend without reservations.
But we’ve pushed its limits. Vault is designed to manage secrets for a single enterprise, and we run half a million different applications, on six continents. So we replaced it, with a system we call Pet Sematary (all the other cool crypt, coffin, vault, and sepulcher names were taken).
For many months now, your secrets have been maintained both in our Vault cluster and in “Petsem”, which means that if Vault experiences disruption, we’re fine, because Petsem has its back. This is a change you haven’t needed to know about. But this next bit, you will.
Today, Petsem is available only to our internal tooling. But soon, you’ll have access to it too. This is a Fresh Produce post about Fly KMS.
If you’re familiar with AWS or GCP, you know where this is going, but still: hold on to your butts.
Fly KMS makes it easy to encrypt and decrypt arbitrary blobs of data. Obvious examples: columns in a Postgres database, or uploaded files from users. Think of it as PGP, but for applications, easy to use, and with actually good cryptography.
The keys for Fly KMS are stored on isolated hardware, inside of Pet Sematary. Once they’re set, they never leave Petsem. You encrypt something with Fly KMS, you get a ciphertext blob. It will never (say never) be possible to decrypt that blob outside of Fly.io.
Fly KMS exposes just a few simple operations. We did this deliberately, so you don’t have to think about algorithms and block cipher modes. When this rolls out (shortly), it’ll support:
- Authenticating and verifying data with a private signing key (currently using NaCl’s “auth” primitives).
- Encrypting and decrypting blobs with a private encryption key (currently using NaCl’s “secretbox” primitives).
- Fly.io manages which primitives to use for encryption and signing keys and will pick the latest and greatest each time a new key is generated.
Authenticate or encrypt blobs of data. Just like every other cloud KMS (though: we have good taste in cryptography). Yadda yadda yadda.
Stop! Hold your yaddas. Here comes the fun bit.
What we don’t like about the idea of building a KMS is that it’s yet another API that you have to pull down libraries for and integrate into your app. If you were going to do that, why not just install and run HashiCorp Vault? We needed to do better. Here’s what we came up with:
Fly KMS is exposed directly as a Linux filesystem. You can drive it from a shell script. You can drive it from a shell script without installing any extra tooling. If KMS-style cryptography had been understood in 1976 when the Lions Commentary on 6th Edition Unix was published, this is what it would have looked like:
Your app starts, and /.fly/kms is mounted automatically, with a view of available keys. Now, Fly KMS is new, so you don’t have any of those. Create one with flyctl:
customer$ flyctl secrets keys gen encrypting myencrkey
Setting myencrkeyv0 encrypting (nacl_secretbox)
customer$ flyctl secrets keys ls
LABEL NAME VERSION TYPE
myencrkeyv0 myencrkey 0 encrypting (nacl_secretbox)
Now, ls /.fly/kms:
appmachine# ls /.fly/kms
myencrkey
appmachine# ls /.fly/kms/myencrkey
decr encr info
appmachine# find /.fly/kms |xargs ls -ld
dr-xr-x--- 2 root root 0 Sep 13 08:50 /.fly/kms
dr-xr-x--- 2 root root 0 Sep 13 09:32 /.fly/kms/myencrkey
-rw-rw---- 1 root root 0 Sep 13 09:32 /.fly/kms/myencrkey/decr
-rw-rw---- 1 root root 0 Sep 13 09:32 /.fly/kms/myencrkey/encr
-rw-rw---- 1 root root 0 Sep 13 09:32 /.fly/kms/myencrkey/info
appmachine# cat /.fly/kms/myencrkey/info
label: myencrkey
type: encrypting
ops: decr encr info
latest version: 0
version 0: label=myencrkeyv0 secrettype=nacl_secretbox
Want to encrypt the string hello world? Write it (probably base64’d) to /.fly/kms/myencrkey/encr. Read the ciphertext back out. Decrypt it? Write the ciphertext to decr, read the plaintext out.
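A minimal sketch of that write-then-read flow as a pair of shell helpers. Treat it as an illustration under stated assumptions, not the final interface: it assumes payloads are base64 in both directions, and that you write the input with one open and read the result with a second one. KMS_ROOT, kms_encrypt, and kms_decrypt are made-up names, not part of Fly KMS.

```shell
#!/bin/sh
# Sketch only. The base64 framing and the write-then-read protocol are
# assumptions; the real endpoints may behave differently.

# Hypothetical override so the helpers can be pointed at a test directory.
KMS_ROOT="${KMS_ROOT:-/.fly/kms}"

# kms_encrypt KEY
# Reads plaintext on stdin, writes it base64-encoded to KEY/encr,
# then prints whatever the encr file hands back.
kms_encrypt() {
    base64 | tr -d '\n' > "$KMS_ROOT/$1/encr" || return 1
    cat "$KMS_ROOT/$1/encr"
}

# kms_decrypt KEY
# Reads ciphertext on stdin, writes it to KEY/decr, then base64-decodes
# whatever the decr file hands back (assumes plaintext returns base64'd).
kms_decrypt() {
    cat > "$KMS_ROOT/$1/decr" || return 1
    base64 -d < "$KMS_ROOT/$1/decr"
}
```

With those in place, something like ct="$(printf 'hello world' | kms_encrypt myencrkey)" followed by printf '%s' "$ct" | kms_decrypt myencrkey would be the whole round trip. No libraries, no SDK, just file I/O.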
Behind the scenes, these filesystem endpoints proxy to calls, authenticated through flyd with Macaroon tokens, to Pet Sematary. But you don’t need to know anything about that.
All our keys, for all our operations, are versioned. Team member leaves the team? Rotate all your keys: just generate a new key with the same flyctl command:
customer$ flyctl secrets keys gen encrypting myencrkey
Setting myencrkeyv1 encrypting (nacl_secretbox)
customer$ flyctl secrets keys ls
LABEL NAME VERSION TYPE
myencrkeyv0 myencrkey 0 encrypting (nacl_secretbox)
myencrkeyv1 myencrkey 1 encrypting (nacl_secretbox)
The new key will be available for use in the machine within minutes of being created. Since the old key version is still available, decryption of data previously encrypted with it is still possible, but all new encryption operations will use the new key version. When you no longer need the old version, you just delete it:
customer$ flyctl secrets keys rm myencrkeyv0
? delete secrets key myencrkeyv0? Yes
Deleted myencrkeyv0
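Because a new key version can take a few minutes to reach the Machine, a rotation script may want to wait until that version shows up before it relies on it (or before the old version gets deleted). Here is a rough sketch that polls the info file. It assumes the "latest version:" line keeps the format shown in the cat output above, which may well change. KMS_ROOT, kms_latest_version, and kms_wait_for_version are hypothetical names.

```shell
#!/bin/sh
# Sketch only. Depends on the "latest version: N" line in the info file,
# as shown in the sample output; that format is not guaranteed.

# Hypothetical override so the helpers can be pointed at a test directory.
KMS_ROOT="${KMS_ROOT:-/.fly/kms}"

# kms_latest_version KEY
# Prints the number from the "latest version:" line of KEY/info.
kms_latest_version() {
    sed -n 's/^latest version: *//p' "$KMS_ROOT/$1/info"
}

# kms_wait_for_version KEY VERSION [TRIES]
# Polls once a second until KEY reports at least VERSION.
# Returns 0 on success, 1 after TRIES attempts (default 300).
kms_wait_for_version() {
    tries="${3:-300}"
    while [ "$tries" -gt 0 ]; do
        latest="$(kms_latest_version "$1" 2>/dev/null || true)"
        if [ -n "$latest" ] && [ "$latest" -ge "$2" ] 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}
```

In the rotation above, kms_wait_for_version myencrkey 1 would block until version 1 is visible inside the Machine, or give up after about five minutes.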
You can create lots of keys, for different purposes, and they’ll all show up dynamically on your Machine filesystem. Driving this API from TypeScript, Elixir, Rails, or Python is a snap.
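Since keys appear as directories, discovering them at startup is a directory walk. A small sketch, assuming each key directory holds an info file with a "type:" line as in the sample output earlier; KMS_ROOT and kms_list_keys are illustrative names, not anything Fly KMS ships.

```shell
#!/bin/sh
# Sketch only. Relies on the directory layout and the "type:" line shown
# in the sample output; both may change.

# Hypothetical override so the helper can be pointed at a test directory.
KMS_ROOT="${KMS_ROOT:-/.fly/kms}"

# kms_list_keys
# Prints one "NAME TYPE" line per key directory that has an info file.
kms_list_keys() {
    for dir in "$KMS_ROOT"/*/; do
        # Skip the unexpanded glob (no keys yet) and anything without info.
        [ -f "${dir}info" ] || continue
        name="$(basename "$dir")"
        key_type="$(sed -n 's/^type: *//p' "${dir}info")"
        printf '%s %s\n' "$name" "$key_type"
    done
}
```

The same walk translates directly to any language that can list a directory and read a file.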
Some of the details of this API are going to change within the next week! We ran this design past people who have built other KMS schemes and got amazing feedback. We’ll be generating ciphertexts with key-ids, easy to grep for, that make it apparent from the ciphertext which keys and algorithms were used. We’re abstracting away a lot of the details of the algorithms we’re using (under the hood, this will let us ratchet up security without you having to know or care). We may simplify the operations we expose.
But all this stuff has been up and running in our staging environment for a couple weeks, and it’s past time for us to let you know about it. Have you ever needed to encrypt something in a Fly app? Tell us about it. If Fly KMS doesn’t work for it, we want to know.