Frequent 17s lag spikes over WireGuard

When I connect over WireGuard and ping the internal IP of my Fly machine, I see frequent lag spikes of about 17 seconds.

# ping6 <redacted> | ts '[%Y-%m-%d %H:%M:%.S]'
[2025-02-05 12:50:07.797727] 16 bytes from <redacted>, icmp_seq=2156 hlim=62 time=22.678 ms
[2025-02-05 12:50:08.800706] 16 bytes from <redacted>, icmp_seq=2157 hlim=62 time=20.664 ms
[2025-02-05 12:50:09.807612] 16 bytes from <redacted>, icmp_seq=2158 hlim=62 time=23.210 ms
[2025-02-05 12:50:26.857059] 16 bytes from <redacted>, icmp_seq=2175 hlim=62 time=21.002 ms
[2025-02-05 12:50:27.855351] 16 bytes from <redacted>, icmp_seq=2176 hlim=62 time=18.700 ms
[2025-02-05 12:50:28.864771] 16 bytes from <redacted>, icmp_seq=2177 hlim=62 time=27.760 ms

I don’t see the same problem pinging IPs on other networks, so I don’t think the problem is with the network connection to my office. If I’m pinging multiple Fly hosts, when there’s one of these lag spikes, the pings to all hosts are affected, but the pings between hosts are not. Is this a problem with Fly’s internal networking? I don’t have a ton of networking experience and I’m not sure how to debug it further.
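
One way to quantify the spikes is to scan the timestamped ping log for jumps in icmp_seq. The following is a minimal sketch, assuming the log has the same format as the ping6 | ts output above; the name find_gaps is made up for illustration, and sequence-number wraparound is not handled.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2025-02-05 12:50:09.807612] 16 bytes from <addr>, icmp_seq=2158 hlim=62 time=23.210 ms
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\].*\bicmp_seq=(?P<seq>\d+)")


def find_gaps(lines, min_missing=1):
    """Return a list of (seq_before, seq_after, missing_replies, gap_seconds).

    Hypothetical helper: it only compares consecutive reply lines and
    ignores icmp_seq wraparound.
    """
    gaps = []
    prev = None  # (timestamp, seq) of the previous reply line
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not a timestamped reply
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        seq = int(m.group("seq"))
        if prev is not None:
            missing = seq - prev[1] - 1
            if missing >= min_missing:
                gaps.append((prev[1], seq, missing, (ts - prev[0]).total_seconds()))
        prev = (ts, seq)
    return gaps


# The six reply lines from the post above.
SAMPLE = """\
[2025-02-05 12:50:07.797727] 16 bytes from <redacted>, icmp_seq=2156 hlim=62 time=22.678 ms
[2025-02-05 12:50:08.800706] 16 bytes from <redacted>, icmp_seq=2157 hlim=62 time=20.664 ms
[2025-02-05 12:50:09.807612] 16 bytes from <redacted>, icmp_seq=2158 hlim=62 time=23.210 ms
[2025-02-05 12:50:26.857059] 16 bytes from <redacted>, icmp_seq=2175 hlim=62 time=21.002 ms
[2025-02-05 12:50:27.855351] 16 bytes from <redacted>, icmp_seq=2176 hlim=62 time=18.700 ms
[2025-02-05 12:50:28.864771] 16 bytes from <redacted>, icmp_seq=2177 hlim=62 time=27.760 ms
"""

if __name__ == "__main__":
    for before, after, missing, seconds in find_gaps(SAMPLE.splitlines()):
        print(f"icmp_seq {before} -> {after}: {missing} replies missing over {seconds:.1f}s")
```

On the sample, this finds one gap between icmp_seq 2158 and 2175: 16 replies are missing over roughly 17 seconds, while the round-trip times of the replies that do arrive stay around 20 ms. That pattern looks more like a burst of lost packets than like delayed ones, though the log alone cannot say where they were lost.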

This is odd. Could you share the peer name or IP address so I can take a look?

Thanks! I sent it by email.

I tried debugging with Claude. It had me run:

sudo tcpdump -i en0 -n host sjc1.gateway.6pn.dev

Then said:

Looking at the UDP traffic between ports 52307 and 51820 (Wireguard), there’s consistent packet exchange every second, plus some bursts of larger packets. During a lag spike, UDP traffic continues but ICMP stops.

The packet sizes:

  • Regular keepalive: 96 bytes
  • Handshake/rekey bursts: 112-384 bytes
  • Responses get through during “lag” periods

This suggests the WireGuard connection stays active but something is selectively blocking ICMP.

Then we tried:

sudo tcpdump -i en0 -vvn 'udp port 51820 or icmp6'

And Claude said:

The trace shows:

  1. Regular UDP keepalive packets (96 bytes) exchanging normally
  2. Some ICMPv6 neighbor discovery traffic
  3. Periodic larger UDP packets (128-192 bytes) - likely Wireguard handshakes

Since Wireguard traffic continues during the lag, this suggests something on Fly.io’s end is dropping ICMPv6 packets for 17s periods.

Other things tried:

  • Adjusting MTU and PersistentKeepalive in the WireGuard config
  • Using an IPv4 address for the endpoint

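Followed up by email. One note: tcpdump -i en0 'udp port 51820 or icmp6' won’t show you the inner ICMP packets, because on en0 they are encrypted inside the WireGuard UDP packets. You want to capture on the tunnel interface instead: tcpdump -i wg0 icmp6 (on macOS the tunnel interface is usually named utunN rather than wg0), or tcpdump -i any if that works on macOS.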
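
To illustrate why the pings are invisible as ICMP on en0, here is a rough size calculation. It assumes the "96 bytes" seen in the earlier capture is the UDP payload length reported by tcpdump, and that the "16 bytes" in the ping6 output is the ICMPv6 message length (header plus data). The constants follow the WireGuard transport message layout; the function name transport_size is made up for illustration.

```python
# Sizes in bytes of the pieces of a WireGuard transport data message.
WG_HEADER = 16  # message type + reserved (4), receiver index (4), counter (8)
WG_TAG = 16     # Poly1305 authentication tag appended to the ciphertext
WG_PAD = 16     # the inner packet is zero-padded to a multiple of 16 bytes

IPV6_HEADER = 40  # fixed IPv6 header length


def transport_size(inner_len):
    """UDP payload size of a WireGuard data message carrying inner_len bytes."""
    padded = -(-inner_len // WG_PAD) * WG_PAD  # round up to a multiple of 16
    return WG_HEADER + padded + WG_TAG


if __name__ == "__main__":
    icmp6_len = 16  # assumed from "16 bytes from ..." in the ping6 output
    print("encapsulated ping:", transport_size(IPV6_HEADER + icmp6_len))
    print("empty keepalive:  ", transport_size(0))
```

Under those assumptions an encapsulated echo request or reply comes out at 96 bytes, and an empty keepalive at 32 bytes. So the steady once-per-second 96-byte UDP packets in the en0 capture may well have been the pings themselves rather than keepalives, which is consistent with the advice to capture on the tunnel interface.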
