Incoming! SSH Support For Instances

thomas · February 4, 2021, 12:19am

New stuff!

This week we’re rolling out a feature that makes it possible to quickly pop a shell on your instances. I’m going to write a lot more about what exactly we’re doing sometime next week, but since it’ll start working this evening, I want to give you all a heads up.

Instances launched tonight will be running a tiny SSH server bound to their internal 6PN addresses.

As a practical matter, what this means is that you can only reach the SSH server by connecting with WireGuard, which you can do with the flyctl wireguard command after you install WireGuard on your host (it’s super easy, and the app store version works great on macOS).

Once you can reach your instances with WireGuard, you can use flyctl to mint SSH credentials. You’ll want to update: flyctl version update.

There are two commands you want to know about right now:

flyctl ssh establish creates a root SSH certificate for your organization. All SSH authorization is (currently) done on an organization-by-organization basis. You can just run that command, it’ll prompt you, and you don’t need to save the output.
flyctl ssh issue issues a new 24-hour SSH certificate based on your root certificate. By default, it’ll save your certificate in a pair of files (an id_foo and an id_foo-cert.pub; you’ll need both) which you can pass to ssh -i.

But handling SSH certificates by hand is tedious and I don’t recommend it; instead, make sure you’re running an SSH agent (a trivial way to do that is to run something like ssh-agent bash) and then run flyctl ssh issue -a. We’ll add the SSH credentials to your current agent and you don’t have to think about them.
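A minimal sketch of the agent-based flow described above, assuming a POSIX shell. The helper names agent_available and issue_into_agent are made up for illustration; only flyctl ssh issue -a and ssh-agent bash come from the post. It checks that an agent socket is advertised before asking flyctl to load a certificate into it.

```shell
# Returns 0 when an SSH agent socket is advertised in the environment
# and actually exists as a socket on disk.
agent_available() {
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}

# Issue a 24-hour certificate straight into the running agent.
# Refuses to run when no agent is reachable, so nothing is written to disk.
issue_into_agent() {
    if agent_available; then
        flyctl ssh issue -a
    else
        echo "no SSH agent found; start one first, e.g.: ssh-agent bash" >&2
        return 1
    fi
}
```

After sourcing this inside a shell started with ssh-agent bash, running issue_into_agent should leave the certificate in the agent, and ssh-add -l will list what the agent currently holds.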

You can log into a host as root or fly; we don’t currently do anything with usernames (not everyone runs a container that has them) but certainly will be adding that in the near future.

An obvious question you’ll have is, “how do I find addresses to log into”. The answer right now is clunky! Your WireGuard configuration, the one we generated for you, includes a private DNS server; what we do in practice is just use the dig command to find 6PN addresses. For instance, if your app is drastic-cobweb-39, you can dig aaaa drastic-cobweb-39.internal @your-dns-ip +short to find addresses to log into.
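The lookup above can be wrapped in a few small shell functions. This is only a sketch: the function names are hypothetical, and the DNS server address is whatever your own generated WireGuard configuration lists (shown in the post as the placeholder your-dns-ip).

```shell
# Build the private DNS name for an app (the ".internal" suffix is from the post).
internal_name() {
    printf '%s.internal\n' "$1"
}

# Print the app's 6PN addresses, one per line.
# $1 = app name, $2 = the private DNS server from your WireGuard config.
lookup_6pn() {
    dig aaaa "$(internal_name "$1")" "@$2" +short
}

# Keep only the first address from a list on stdin.
first_address() {
    head -n 1
}
```

With WireGuard up, something like lookup_6pn drastic-cobweb-39 your-dns-ip | first_address gives one address you can hand to ssh as root@address.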

So many caveats!

This is a prerelease feature. It will be especially janky tonight (in particular, give flyctl ssh establish a minute or two to propagate). It will get less janky over time.
The SSH implementation is right now pretty limited; you can get a shell, and you can run commands, but agent forwarding, port forwarding, rsync, all that stuff, I wouldn’t count on right now.
In the relatively near future, most of you won’t need WireGuard installed to do simple SSH commands, and you won’t have to manually look up IPv6 addresses. But right now, you do.

Let us know what you think or what questions you have or what you might want features-wise going forward. Thanks as always, fly-friends!

julia · February 4, 2021, 1:44am

This is awesome! I’d love to be able to set the same environment variables that my app has when it’s running (secrets, env vars from fly.toml, the same PATH, etc).

chr-s · February 4, 2021, 2:06am

This is wonderful! I logged into an app and gave it some swap. stress-ng suggests it’s only ~1000x slower than RAM

Two small things: I wish the flyctl ssh establish output ended with a newline the first time it’s run. The other thing: “Email address for user to issue cert” is an awkward phrase.

What’s the reasoning for writing hallpass? I can think of an interesting use case for X-forwarding (on demand localish dev env w/ pycharm or some gui editor) but I doubt you’re ever going to want to implement it.

The WireGuard networking is really slick once it works. For me, on Ubuntu 20.10, I had to add some PostUp commands to my config to make DNS work:

chris@chi2:~/work/fly$ sudo cat /etc/wireguard/fly.conf

[Interface]
PrivateKey = <omitted>
Address = fdaa:0:eb3:<omitted>
PostUp = systemd-resolve -i %i --set-domain internal
PostUp = systemd-resolve -i %i --set-dns fdaa:0:eb3::3

[Peer]
PublicKey = <omitted>
AllowedIPs = fdaa:0:eb3::/48
Endpoint = ord1.gateway.6pn.dev:51820
PersistentKeepalive = 15

Then, after I sudo wg-quick up fly, I can see the systemd gods are happy:

chris@chi2:~/work/fly$ systemd-resolve --status fly
Link 4 (fly)
      Current Scopes: DNS          
DefaultRoute setting: yes          
       LLMNR setting: yes          
MulticastDNS setting: no           
  DNSOverTLS setting: no           
      DNSSEC setting: no           
    DNSSEC supported: no           
  Current DNS Server: fdaa:0:eb3::3
         DNS Servers: fdaa:0:eb3::3
          DNS Domain: internal

thomas · February 4, 2021, 2:30am

That’s a good ask. Kurt and Jerome, opinions? It’s not hard to do; I can pass them on from init to SSH.

thomas · February 4, 2021, 2:33am

As opposed to OpenSSH? I would bet all the money that there’s never going to be a preauth vulnerability in OpenSSH again, but it’s still C code, and it’s sort of complicated. hallpass is the most minimal conceivable application of Golang’s x/crypto/ssh; basically just barely enough to grok an SSH certificate and, if needed, kick off a pty.

We’re also probably going to do more with hallpass — users and authorization for starters, but also audit trail stuff.

Do you really have port forwarding use cases, even with WireGuard set up? I’ve been wondering about that. I’m willing to implement port forwarding if it’s useful! It’s not hard to do.

jerome · February 4, 2021, 2:36am

Yes. I’d like it to do that too

chr-s · February 4, 2021, 2:47am

thomas wrote: “Do you really have port forwarding use cases, even with WireGuard set up? I’ve been wondering about that. I’m willing to implement port forwarding if it’s useful! It’s not hard to do.”

Y’know, I don’t think so. fly is really exciting and you’re right to leave OpenSSH out of it.

thomas · February 4, 2021, 2:47am

My feet are like wings! It does on dev now; it will everywhere tomorrow.

thomas · February 4, 2021, 5:15am

IT HAS BEEN AN EVENING HERE.

Just a quick note that I managed to accidentally ship a hallpass that was built with CGO, meaning, since hallpass uses the DNS, that it was dynamically linked.

Many of your containers are, sensibly, stripped to bare minimum binaries, and don’t have a full complement of dynamic libs in /lib. So a lot of you probably saw hallpass errors instead of a working SSH server. Our init knows hallpass might be janky, because it’s brand new, and keeps on chugging even if it can’t start the SSH server, so it shouldn’t have destabilized anything, except by giving you some annoying log lines.

bdd · February 5, 2021, 7:20am

For reference, if you use NetworkManager to configure network connectivity, it is rather easy to import the generated WireGuard config and bind .internal name resolution to the given DNS server (instead of routing everything over it).

Maybe this will be helpful for others:

# Get the WireGuard connection details, saving them to 'fly-wg0.conf'.
% flyctl wg create personal sjc workstation5
Creating WireGuard peer "workstation5" in region "sjc" for organization personal
[...]
? Filename to store WireGuard configuration in, or 'stdout':  fly-wg0.conf
Wrote WireGuard configuration to 'fly-wg0.conf'; load in your WireGuard client

% nmcli connection import type wireguard file fly-wg0.conf
Connection 'fly-wg0' (12a938ca-1b45-4326-b3bb-c160f3840d2a) successfully added.

# Define a "routing domain" for systemd-resolved.
# Only this connection's DNS server will be used for resolving names under '.internal' TLD.
% nmcli connection modify fly-wg0 ipv6.dns-search '~internal'
# Good to go.
% nmcli connection up fly-wg0
# Give us a hallpass.
% ssh fly@drastic-cobweb-39.internal

bdd · February 5, 2021, 7:38am

If hallpass were open source, I’d happily send a pull request for this, but I think it’d be nice to give explicit error messages when session creation fails. It can save some debugging and head-scratching time. For example, if there’s no /bin/sh in the container image (e.g. FROM scratch containers), tell the client what’s up instead of silently hanging up.

thomas · February 5, 2021, 8:03am

There’s a lot of code that we have that would be public if it wasn’t so specific to what we were doing, and thus sort of seemingly useless for everyone else. In hallpass’s case, it’s actually doing a little bit less than gliderlabs/ssh; the only thing it’s doing that gliderlabs hello-world doesn’t is pulling the per-org certs.

I’ll give some thought to posting the code for it, but it’s underwhelming, I promise.

thomas · February 5, 2021, 11:27pm

Just a quick note: your SSH sessions on instances started this evening should now see the same environment variables as your entrypoint does.

kurt · February 6, 2021, 12:13am

I was getting really good at dropping export $(cat /proc/518/environ | strings | xargs) into my sessions.
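For anyone still on an older instance, here is a hedged sketch of a gentler way to inspect those variables. /proc/<pid>/environ is a NUL-separated list, so splitting on NUL keeps values that contain spaces intact, which the strings | xargs route does not. The function name is made up, and the pid to read is whatever your entrypoint runs as (518 above is just that session’s example).

```shell
# Print NAME=value pairs from a NUL-separated environ file, one per line.
# $1 = path such as /proc/<pid>/environ
print_environ() {
    tr '\0' '\n' < "$1"
}
```

This only prints the pairs; exporting them into the current shell is left out on purpose, since values with embedded newlines or shell metacharacters need more care than a one-liner gives them.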

julia · February 13, 2021, 4:14pm

It looks like the SSH server isn’t respecting the user’s shell in /etc/passwd – I just sshed into an instance that has /bin/bash set as root’s shell, but it started /bin/sh instead.

$ grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
$ cat /proc/$$/cmdline
/bin/sh

thomas · February 13, 2021, 4:35pm

It indeed doesn’t! We can’t assume a libc (or, I guess, for that matter, a password file) so we don’t get username lookups. I can fix that, though I’ve just been using ssh -t <hostname> bash.

julia · February 13, 2021, 4:39pm

It’s really not a big deal to just run bash, but when I first sshed in I spent a few minutes being Very Confused about why my shell was “broken” (Ctrl+L and arrow keys didn’t work etc) until I realized what shell was running.

thomas · February 13, 2021, 4:41pm

I should probably just fix it, it’s not like /etc/passwd is hard to parse if it’s there. I could even… support uids for non-root users, and issue non-root certificates.
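To illustrate how little parsing is involved, here is a sketch of the lookup logic in shell. It is not how hallpass does it (hallpass is written in Go, and its code isn’t shown in this thread); the function name and the fall-back-to-/bin/sh behavior are assumptions based on the discussion above. The shell is the seventh colon-separated field of a passwd entry.

```shell
# Look up a user's login shell in a passwd-format file; fall back to /bin/sh
# when the file is missing, the user is absent, or the shell field is empty.
# $1 = username, $2 = passwd file (defaults to /etc/passwd).
login_shell() {
    user=$1
    passwd_file=${2:-/etc/passwd}
    shell=""
    if [ -r "$passwd_file" ]; then
        # Field 7 of a passwd entry is the login shell.
        shell=$(awk -F: -v u="$user" '$1 == u { print $7; exit }' "$passwd_file")
    fi
    printf '%s\n' "${shell:-/bin/sh}"
}
```

On the instance from the report above, login_shell root would print /bin/bash, while a FROM scratch image with no password file would get /bin/sh.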

ykd · August 1, 2022, 12:52pm

haha,

you can’t go wrong assuming sh in a containerised environment,
especially since the purpose of Firecracker VMs is to be as lightweight as possible. tbh I didn’t even know they had bash