No handler doesn't seem to mean no intermediary

% cat fly.toml
app = "h2c"

[build]
  image = "registry.fly.io/h2c:latest"

[[services]]
  internal_port = 8080
  protocol = "tcp"

  [[services.ports]]
    port = "80"
% curl --http2 -6 http://h2c.fly.dev
Proto: HTTP/2.0
RemoteAddr: 136.144.55.141:36696

The IP 136.144.55.141 is certainly not mine, I’m guessing it’s a Fly host from Packet, located in Equinix SV15 in Santa Clara.

I think I misunderstood the docs: I took the lack of a handler on an externally exposed port to mean direct routing to the application, with no Layer 4 (or above) intermediary in between.

I stopped and thought about whether this expectation is even realistic. At first I thought it passes the smell test: assuming IPs (v4 and v6) are dedicated to applications, they can be announced from the Firecracker hosts running them and attract traffic from whichever edge a client ingresses through. But wait, the docs say TLS termination is done at the ingress edge, and one can have multiple external ports with different sets of handlers. So it's not as straightforward as speaking BGP from all hosts and anycasting all the way up. Then I realized, hey, I'm not paid to think about this; in fact I'm trying to pay for the service, so here I am… :slight_smile:

Is it possible to terminate a TCP connection from the internet on the VM? If so, how? If not, do you intend to support this?

For the curious, this is the placeholder code running behind h2c.fly.dev.

package main

import (
	"fmt"
	"log"
	"net/http"

	"golang.org/x/net/http2"
	"golang.org/x/net/http2/h2c"
)

func main() {
	var h http.HandlerFunc = func(w http.ResponseWriter, req *http.Request) {
		w.Header().Set("content-type", "text/plain")
		w.WriteHeader(200)
		fmt.Fprintln(w, "Proto:", req.Proto)
		fmt.Fprintln(w, "RemoteAddr:", req.RemoteAddr)
	}

	// h2c.NewHandler lets this plaintext server speak HTTP/2 over cleartext TCP (h2c).
	log.Fatal(http.ListenAndServe(":8080", h2c.NewHandler(h, new(http2.Server))))
}

We have some experimental stuff set up for routing TCP directly with no intermediary, but it's not self-service yet. This is how we'd like it to work in the future when you remove handlers. But it's a fringe need (so far).

If all you’re after is client IPs, you can use the proxy_proto handler.


Just because there’s I think no reason to keep any of this under wraps:

We’ve got experimental-grade direct termination of TCP at VMs built in XDP (I think I call the feature “TCP cut-through”) and if you have an application for it that you want to play with, we can get it set up.

It works similarly to the way UDP does: the actual network termination is happening on an edge host; we re-encapsulate individual TCP segments with tunnel headers and relay them to the fly-global-services address in the VM. This eats a dozen or so bytes of every segment; we clamp MSS in XDP (which was a project to get working).

I think this works fine right now, but it hasn’t been tested the way the UDP stuff has. I’m happy to have someone testing it for us! If you want to do something with it, we’d be happy to make sure you don’t pay to get it working.


Neat. I’m familiar with the approach, as we did something essentially the same with katran (github.com/facebookincubator/katran), a high-performance layer 4 load balancer.

Katran is direct-response, right? I wonder about this a lot; we go through some trouble to route responses back over the same path and my degree of certainty that it’s the right call is low.

Yes. The things that terminate the L7 (or, if we’re being pedantic, L5, because TLS) protocol behind Katran are the ones that do the return. DSR (direct server return) sidesteps a bottleneck on the return path, both in terms of data rate (also, wrongly, known as bandwidth) and packet rate. Put it this way: the machines we run Katran on have 25GbE, 50GbE, or 100GbE cards, but the amount of full-cone traffic they balance is much greater than 100 Gbps. The other aspect is what you already mentioned. You either need to tag the responses from the application so that they get routed into an IP-in-IP tunnel (which you can now no longer define as external), or you start assigning DSCP tags to packets, ending up implementing makeshift source-based routing for reply packets ¯\_(ツ)_/¯. Gnarly, overall. The only winning move is not to play :nerd_face:.

DSR also ends up decreasing the “moving parts surface area”. What I mean by this is: as you spread packet acrobatics across the distributed system, you increase the odds of a dumb breakage somewhere taking down more than itself. Additionally, the mental model gets more complicated, resulting in a system that’s expensive to debug and operate.


Right. That was the next thing I wanted to try, but not because I want the client IP, just because I’m curious about how you laid things out.

I wanted to give this one a try as well so posting here in case it may help someone else.

Added "proxy_proto" to the handlers, and while I’m here I also wanted to cascade it with the tls handler.

% cat fly.toml
app = "h2c"

[build]
  image = "registry.fly.io/h2c:latest"

[[services]]
  internal_port = 8080
  protocol = "tcp"

  [[services.ports]]
    handlers = ["tls", "proxy_proto"]
    port = "443"

  [[services.ports]]
    handlers = ["proxy_proto"]
    port = "80"

…And this is what the application looks like now:

package main

import (
	"fmt"
	"log"
	"net"
	"net/http"

	
	"golang.org/x/net/http2"
	"golang.org/x/net/http2/h2c"
	pp "github.com/pires/go-proxyproto"
)

func main() {
	var h http.HandlerFunc = func(w http.ResponseWriter, req *http.Request) {
		w.Header().Set("content-type", "text/plain")
		w.WriteHeader(200)
		fmt.Fprintln(w, "Proto:", req.Proto)
		fmt.Fprintln(w, "RemoteAddr:", req.RemoteAddr)
	}

	l, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatal(err)
	}
	defer l.Close()

	// Wrap the listener so the PROXY header is consumed and the real
	// client address surfaces in req.RemoteAddr.
	ppl := &pp.Listener{Listener: l}
	defer ppl.Close()

	log.Fatal(http.Serve(ppl, h2c.NewHandler(h, new(http2.Server))))
}

I’m curious about why the TLS handler ends up picking http/1.1 in ALPN when h2 is an option. I think this is contrary to what’s said in the docs (Reference > Network Services > TLS).

% curl -v https://h2c.fly.dev
*   Trying 2a09:8280:1:5b2c:a7b2:24cb:642a:6ba4:443...
* Connected to h2c.fly.dev (2a09:8280:1:5b2c:a7b2:24cb:642a:6ba4) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
[...cut...]
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: CN=*.fly.dev
[...cut...]
*  SSL certificate verify ok.
> GET / HTTP/1.1
> Host: h2c.fly.dev
> User-Agent: curl/7.71.1
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Type: text/plain
< Date: Sat, 09 Jan 2021 23:06:21 GMT
< Content-Length: 75
<
Proto: HTTP/1.1
RemoteAddr: [~~REDACTED~~]:51336
* Connection #0 to host h2c.fly.dev left intact

We don’t advertise h2 with ALPN when you have the http handler off. The HTTP servers that people run on Fly mostly don’t do h2c, and the simplest fix was just to drop it. We’ll make the ALPN configurable for the TLS handler someday!


I’m a bit confused about this statement and the conclusion.

In this example, I’m also running something that supports h2c, but because the TLS handler won’t advertise h2, it’s not possible to speak HTTP/2 from the client to the service. If one needs to enable the http handler to speak HTTP/2 from client to service, then I don’t understand where running a service with h2c support becomes relevant.

Whoops, I left out an important “don’t”. :smiley:

Most people’s HTTP servers don’t do h2c. So when we advertised h2 with TLS but then didn’t terminate h2 for people, requests failed.

Haha :laughing:. I re-read your reply like 10 times before replying because the brevity was getting lost on me.

OK. Now everything makes sense. Looking forward to configurable ALPN support landing.

I just spent a lot of time trying to figure out why h2c to my container doesn’t work. Now I know why :smiley:
I’d really love it if that could be added (even as an org-level feature flag).

I’d be quite interested in testing this out. We’ve got a service we’re exposing over TCP directly for which it is quite important to have the actual remote IP.

@thomas following up on this: should binding TCP on fly-global-services just work for this, or is there something on your end that would need to be set up?

It won’t just work; the API for doing direct TCP routed around our proxy isn’t currently exposed. But if it were enabled, yes, that’s how it’d work for you: your container would just bind to the global services address.

Before we go down the road of lighting up that feature: the problem you’re talking about is why we have the proxy_proto handler. It forwards connections wrapped in haproxy’s PROXY protocol, which there are a bunch of libraries for. This is a much cleaner solution to the source-address problem than cut-through connections are; does that work for you?

The major win for cut-through TCP is for connections that must never break, long-running video feeds being the big one. The problem is that every once in a blue moon we’ll roll one of our fly-proxies to a newer version and it’ll break some connections. That, too, is a problem we’re working on fixing, and we’ll probably resolve it before we surface TCP cut-through.

We can talk through doing TCP cut-through if you still think it’s important! I’m OK (happy, actually) to do something experimental to test through the feature with someone else’s traffic. But that’s what we’d be doing. :slight_smile:

Oh, that will totally work. I think when I read through proxy_proto earlier, I saw haproxy, my eyes glazed over, and I assumed it’d only work for HTTP and similar protocols rather than being a fairly generic solution.

Thanks!

I wrote this BPF TCP feature a year ago and have successfully spent a year fending off any attempts by our users to actually use it. :wink: