If you haven’t checked out the other post yet, go do that first!
There is some (I think) kinda neat engineering that went into those new features, and I have a feeling you nerds might find it kind of interesting too. So I’m introducing a new type of fresh produce accompaniment post called “How it’s Made” where we can geek out about the technology behind the features. Not every fresh produce will have them, but we’ll see how it goes :).
We broke the Machine config mutability model (but that’s OK)
The mutability model for Machines is supposed to be: “it isn’t [mutable].” This works out pretty well. When you make a request to update a Machine by, say, changing an environment variable:
- a new config version gets built and saved in flyd with your changed env
- the Machine record in our state stores gets pointed at the new version
- your Machine is magically replaced with one which matches the new version spec
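If it helps to see those three steps as code, here is a minimal sketch. It is a toy model, not how flyd actually stores things: `ConfigVersion`, `MachineRecord`, and the `restarts` counter are all made-up names for illustration.

```python
import itertools
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class ConfigVersion:
    """One saved config version. Frozen, with a read-only env mapping."""
    version: int
    env: MappingProxyType


class MachineRecord:
    """Hypothetical stand-in for a Machine record in a state store."""

    def __init__(self, env):
        self._counter = itertools.count(1)
        self.versions = {}   # version number -> ConfigVersion
        self.restarts = 0    # how many times the VM was replaced
        self.current = self._save(env)

    def _save(self, env):
        # Step 1: build and save a brand new config version.
        saved = ConfigVersion(next(self._counter), MappingProxyType(dict(env)))
        self.versions[saved.version] = saved
        return saved.version

    def update_env(self, key, value):
        old = self.versions[self.current]
        new_env = {**old.env, key: value}
        # Step 2: point the record at the new version.
        self.current = self._save(new_env)
        # Step 3: replace the Machine so it matches the new spec.
        self.restarts += 1
        return self.current
```

Calling `update_env("LOG_LEVEL", "debug")` on a record leaves version 1 exactly as it was, adds version 2, and bumps `restarts`.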
What this means is that generally a given Machine config version never changes once it’s created. This is great!
Until it isn’t.
Up until now, if you wanted to update the metadata field you’d update it like any other part of the Machine config, and the regular update process would happen, including a VM restart.
For dynamic metadata we obviously do not want to trigger a Machine restart. So we just… broke the Machine mutability model.
The metadata update endpoints go through a separate update process which writes directly to our state stores and changes the current config version in place, without creating a new version.
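To make the difference between the two paths concrete, here is a rough sketch. Everything in it (`Machine`, `update_config`, `set_metadata`, the `restarts` counter) is a hypothetical model for illustration, not the real flyd or state store code.

```python
import copy


class Machine:
    """Toy model with a list of config versions; the last one is current."""

    def __init__(self, config):
        self.versions = [copy.deepcopy(config)]
        self.restarts = 0

    @property
    def current(self):
        return self.versions[-1]

    def update_config(self, **changes):
        # Regular path: build a new version, point at it, restart the VM.
        new_version = copy.deepcopy(self.current)
        new_version.update(changes)
        self.versions.append(new_version)
        self.restarts += 1

    def set_metadata(self, key, value):
        # Dynamic metadata path: write straight into the current version.
        # No new version is created and nothing restarts.
        self.current["metadata"][key] = value
```

The regular path grows `versions` and bumps `restarts`; `set_metadata` does neither, which is exactly the rule we broke.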
Macaroons! (how we made this secure)
Macaroons are a special type of access credential that carries a list of “caveats” which restrict the power of the token. A macaroon starts off as all-powerful, and each caveat restricts it further. For example, a token that grants user A read-only access to app B in organization O might have the caveats:
- Must be logged in as user A
- RWX organization O
- R app B
The interesting thing about Macaroons is that anybody holding one can add additional caveats, further restricting what it can do. An added caveat can only reduce the scope of a token (every caveat must evaluate true, independently).
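Here is a small sketch of why holders can add caveats but never remove them. It uses the textbook chained-HMAC construction, where each caveat is folded into the signature. This is an illustration only: the `key=value` caveat strings and the `mint`, `attenuate`, and `verify` helpers are assumptions made up for this example, not our actual token format.

```python
import hashlib
import hmac
from dataclasses import dataclass


def _mac(key: bytes, message: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()


@dataclass(frozen=True)
class Macaroon:
    identifier: str
    caveats: tuple
    signature: bytes


def mint(root_key: bytes, identifier: str) -> Macaroon:
    # A fresh macaroon has no caveats, so it is all-powerful.
    return Macaroon(identifier, (), _mac(root_key, identifier.encode()))


def attenuate(token: Macaroon, caveat: str) -> Macaroon:
    # Anyone holding the token can do this; no root key needed.
    # The old signature becomes the HMAC key for the new caveat.
    new_sig = _mac(token.signature, caveat.encode())
    return Macaroon(token.identifier, token.caveats + (caveat,), new_sig)


def verify(root_key: bytes, token: Macaroon, context: dict) -> bool:
    # Recompute the signature chain from the root key.
    sig = _mac(root_key, token.identifier.encode())
    for caveat in token.caveats:
        sig = _mac(sig, caveat.encode())
    if not hmac.compare_digest(sig, token.signature):
        return False
    # Every caveat must hold independently; unknown keys fail closed.
    for caveat in token.caveats:
        key, _, expected = caveat.partition("=")
        if context.get(key) != expected:
            return False
    return True
```

Dropping a caveat from the list breaks the signature chain, so the only thing a holder can do is append more restrictions.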
We’re gradually rolling out Macaroon tokens inside of Fly.io, and we already have an infrastructure for handling them.
The tokens we already issue always include a “must be logged in as user” caveat (this caveat means that our Macaroons are safe to pass around on insecure channels, because they’re not simple bearer tokens and can’t do anything by themselves).
For our use case, though, we don’t want any “Must be logged in as user” caveats. A Machine isn’t a user. On the flip side of this, we don’t want to be issuing Macaroons without some kind of safeguard to keep them from being pure bearer tokens.
So we came up with a notion of “service tokens”.
Where ordinary authentication tokens are granted to a user, service tokens are granted to an internal Fly.io service. Like all Macaroon tokens, they’re constrained to a specific set of actions on specific resources. The Metadata service tokens, for instance, are limited to a specific application.
For features like these, we want a token we could safely inject into VMs to allow them to only update metadata, and that would also be totally safe if exfiltrated from the VM. To make this happen we invented two new caveats:
- A Machine Feature caveat, in this case one that says only the metadata feature
- A From Machine caveat, which requires that a token be used from inside a specific Machine
The From Machine caveat is kind of cool. It works because we own the whole stack, so we know exactly where requests are coming from. When you make a request to our api via _api.internal in a Machine, flaps (our per host API gateway) knows which IPv6 address the request is coming from on the host. Because we assign the IPv6 addresses, we can easily map that to a Machine ID which we can use to ensure the caveat passes.
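A rough sketch of that check, under some loud assumptions: the address table, the Machine IDs, the example addresses, and the `from_machine` / `machine_feature` field names are all invented for illustration, and the real gateway logic is certainly more involved.

```python
import ipaddress

# Hypothetical allocation table: the platform assigned these addresses,
# so it can map a source address back to a Machine ID.
ADDRESS_TO_MACHINE = {
    ipaddress.ip_address("fdaa:0:1::2"): "machine-a",
    ipaddress.ip_address("fdaa:0:1::3"): "machine-b",
}


def machine_for(source_addr: str):
    """Return the Machine ID that owns source_addr, or None if unknown."""
    return ADDRESS_TO_MACHINE.get(ipaddress.ip_address(source_addr))


def caveats_pass(caveats: dict, source_addr: str, feature: str) -> bool:
    """Check the From Machine and Machine Feature caveats for a request."""
    machine_id = machine_for(source_addr)
    if machine_id is None:
        return False  # not an address we assigned: fail closed
    if caveats.get("from_machine") != machine_id:
        return False  # token is being used from some other Machine
    return caveats.get("machine_feature") == feature
```

A token scoped to `machine-a` and the `metadata` feature passes only when the request really comes from `machine-a`'s address and asks for `metadata`; copied anywhere else, it fails the From Machine check.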
Magic proxy
As cool as our magic token is, it didn’t feel seamless enough to just put it in an environment variable for people to use. So… we built an authenticated proxy into our init process that runs as PID 1 on every Machine. That proxy opens a unix socket at /.fly/api which you can proxy requests through. Every request that gets made through it gets the auth token added.
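From inside a Machine, talking to that socket could look something like the sketch below. Only the socket path /.fly/api comes from this post; the `Host` value, the request path you pass in, and the helper names are assumptions for illustration, so check the API docs for the real endpoints.

```python
import http.client
import socket


class UnixSocketHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a unix socket instead of TCP."""

    def __init__(self, socket_path, host="localhost", timeout=5.0):
        # host only fills in the Host header; it is a placeholder here.
        super().__init__(host, timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.settimeout(self.timeout)
            sock.connect(self.socket_path)
        except OSError:
            sock.close()
            raise
        self.sock = sock


def get_via_proxy(path, socket_path="/.fly/api"):
    """GET a path through the proxy socket; returns (status, body bytes).

    No Authorization header is set here: the proxy adds the token.
    """
    conn = UnixSocketHTTPConnection(socket_path)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

Note that the client never touches the token itself, which is the whole point of putting the proxy in init.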
Questions?
I think this stuff is pretty cool, but I’m biased.
Regardless, let us know if you have any questions! We’d be happy to answer them.