Frequent 17s lag spikes over Wireguard

preston · February 5, 2025, 9:07pm

When I connect to Wireguard and ping the internal IP of my Fly machine, I see frequent 17s lag spikes.

# ping6 <redacted> | ts '[%Y-%m-%d %H:%M:%.S]'
[2025-02-05 12:50:07.797727] 16 bytes from <redacted>, icmp_seq=2156 hlim=62 time=22.678 ms
[2025-02-05 12:50:08.800706] 16 bytes from <redacted>, icmp_seq=2157 hlim=62 time=20.664 ms
[2025-02-05 12:50:09.807612] 16 bytes from <redacted>, icmp_seq=2158 hlim=62 time=23.210 ms
[2025-02-05 12:50:26.857059] 16 bytes from <redacted>, icmp_seq=2175 hlim=62 time=21.002 ms
[2025-02-05 12:50:27.855351] 16 bytes from <redacted>, icmp_seq=2176 hlim=62 time=18.700 ms
[2025-02-05 12:50:28.864771] 16 bytes from <redacted>, icmp_seq=2177 hlim=62 time=27.760 ms
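
The spike shows up in this output as a jump in icmp_seq (2158 to 2175) together with a jump in the timestamps. To find every spike in a long capture, a minimal sketch along these lines could scan the saved output. It assumes the exact `ts` timestamp format used above; the names `find_gaps` and `SAMPLE` are made up for illustration.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2025-02-05 12:50:09.807612] 16 bytes from <redacted>, icmp_seq=2158 hlim=62 time=23.210 ms
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\].*icmp_seq=(?P<seq>\d+)")


def find_gaps(lines, min_missing=1):
    """Return (last_seq, next_seq, missing, seconds) for each jump in icmp_seq.

    Assumes replies arrive in order; sequence-number wraparound is ignored.
    """
    gaps = []
    prev = None  # (timestamp, seq) of the previous reply
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip anything that is not a timestamped reply line
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f")
        seq = int(m["seq"])
        if prev is not None:
            missing = seq - prev[1] - 1
            if missing >= min_missing:
                elapsed = (ts - prev[0]).total_seconds()
                gaps.append((prev[1], seq, missing, elapsed))
        prev = (ts, seq)
    return gaps


SAMPLE = """\
[2025-02-05 12:50:08.800706] 16 bytes from <redacted>, icmp_seq=2157 hlim=62 time=20.664 ms
[2025-02-05 12:50:09.807612] 16 bytes from <redacted>, icmp_seq=2158 hlim=62 time=23.210 ms
[2025-02-05 12:50:26.857059] 16 bytes from <redacted>, icmp_seq=2175 hlim=62 time=21.002 ms
[2025-02-05 12:50:27.855351] 16 bytes from <redacted>, icmp_seq=2176 hlim=62 time=18.700 ms
"""

for last_seq, next_seq, missing, seconds in find_gaps(SAMPLE.splitlines()):
    print(f"seq {last_seq} -> {next_seq}: {missing} replies missing over {seconds:.1f}s")
```

For the lines above this reports a single gap of 16 missing replies. Note that the replies after the gap show normal round-trip times, so the missing packets look dropped rather than delayed.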

I don’t see the same problem pinging IPs on other networks, so I don’t think the problem is with the network connection to my office. If I’m pinging multiple Fly hosts, when there’s one of these lag spikes, the pings to all hosts are affected, but the pings between hosts are not. Is this a problem with Fly’s internal networking? I don’t have a ton of networking experience and I’m not sure how to debug it further.

lillian · February 5, 2025, 11:59pm

this is odd, could you share the peer name or IP address so I can take a look?

preston · February 6, 2025, 12:21am

Thanks! I sent it by email.

I tried debugging with Claude. It had me run:

sudo tcpdump -i en0 -n host sjc1.gateway.6pn.dev

Then said:

Looking at the UDP traffic between ports 52307 and 51820 (Wireguard), there’s consistent packet exchange every second, plus some bursts of larger packets. During a lag spike, UDP traffic continues but ICMP stops.

The packet sizes:

Regular keepalive: 96 bytes

Handshake/rekey bursts: 112-384 bytes

Responses get through during “lag” periods

This suggests the Wireguard connection stays active but something is selectively blocking ICMP.

Then we tried:

sudo tcpdump -i en0 -vvn 'udp port 51820 or icmp6'

And Claude said:

The trace shows:

Regular UDP keepalive packets (96 bytes) exchanging normally

Some ICMPv6 neighbor discovery traffic

Periodic larger UDP packets (128-192 bytes) - likely Wireguard handshakes

Since Wireguard traffic continues during the lag, this suggests something on Fly.io’s end is dropping ICMPv6 packets for 17s periods.

Other things tried:

Adjusting MTU and PersistentKeepalive in the Wireguard config
Using an IPv4 address for the endpoint

lillian · February 6, 2025, 1:09am

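followed up by email. one note: tcpdump -i en0 'udp port 51820 or icmp6' won’t show you the inner ICMP packets, because on en0 they only appear as encrypted Wireguard UDP payloads. you want tcpdump -i wg0 icmp6 on the tunnel interface (on macOS that is usually a utunN device rather than wg0), or tcpdump -i any if that works on macOS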
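
A capture on en0 still carries one hint about the inner traffic: the size of each Wireguard UDP payload. The sketch below uses the message sizes from the Wireguard protocol description (148-byte handshake initiation, 92-byte handshake response, 64-byte cookie reply, and transport messages of 16 bytes of header plus the inner packet padded to a multiple of 16 plus a 16-byte tag). It assumes the sizes quoted in the thread are UDP payload lengths as tcpdump prints them. The function names are illustrative, and size alone is only a heuristic, since the real message type is in the first byte of the payload.

```python
HEADER_LEN = 16  # type (1) + reserved (3) + receiver index (4) + counter (8)
TAG_LEN = 16     # Poly1305 authentication tag
HANDSHAKE_INITIATION = 148
HANDSHAKE_RESPONSE = 92
COOKIE_REPLY = 64


def transport_size(inner_len):
    """UDP payload size of a transport message carrying inner_len bytes."""
    padded = -(-inner_len // 16) * 16  # round up to a multiple of 16
    return HEADER_LEN + padded + TAG_LEN


def classify(udp_len):
    """Guess what a Wireguard UDP payload of this size contains."""
    if udp_len == HANDSHAKE_INITIATION:
        return "handshake initiation"
    if udp_len == HANDSHAKE_RESPONSE:
        return "handshake response"
    if udp_len == transport_size(0):
        return "keepalive (empty transport message)"
    if udp_len == COOKIE_REPLY:
        return "cookie reply or transport data (inner packet 17-32 bytes)"
    if udp_len > 32 and (udp_len - 32) % 16 == 0:
        padded = udp_len - 32
        return f"transport data (inner packet {padded - 15}-{padded} bytes)"
    return "unknown"


for size in (32, 96, 112, 148, 384):
    print(size, classify(size))
```

Under these assumptions a keepalive is 32 bytes, so the 96-byte packets seen every second are more likely data packets with a 49 to 64 byte inner packet, which would fit a 40-byte IPv6 header plus a small ICMPv6 echo. The 112 to 384 byte packets also match transport data sizes rather than handshakes. If so, the outer capture cannot tell an echo request that got no reply apart from a normal exchange, which is why capturing on the tunnel interface is needed.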

