When I connect to WireGuard and ping the internal IP of my Fly machine, I see frequent 17-second lag spikes:
# ping6 <redacted> | ts '[%Y-%m-%d %H:%M:%.S]'
[2025-02-05 12:50:07.797727] 16 bytes from <redacted>, icmp_seq=2156 hlim=62 time=22.678 ms
[2025-02-05 12:50:08.800706] 16 bytes from <redacted>, icmp_seq=2157 hlim=62 time=20.664 ms
[2025-02-05 12:50:09.807612] 16 bytes from <redacted>, icmp_seq=2158 hlim=62 time=23.210 ms
[2025-02-05 12:50:26.857059] 16 bytes from <redacted>, icmp_seq=2175 hlim=62 time=21.002 ms
[2025-02-05 12:50:27.855351] 16 bytes from <redacted>, icmp_seq=2176 hlim=62 time=18.700 ms
[2025-02-05 12:50:28.864771] 16 bytes from <redacted>, icmp_seq=2177 hlim=62 time=27.760 ms
I don’t see the same problem pinging IPs on other networks, so I don’t think the problem is with the network connection to my office. If I’m pinging multiple Fly hosts, when there’s one of these lag spikes, the pings to all hosts are affected, but the pings between hosts are not. Is this a problem with Fly’s internal networking? I don’t have a ton of networking experience and I’m not sure how to debug it further.
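In the ping output above, icmp_seq jumps from 2158 to 2175, so the replies in between never arrived at all (they were lost, not delayed). To measure how often this happens over a longer run, here is a minimal sketch that scans a saved log for such jumps. It assumes the exact line format shown above (the ts timestamp prefix plus icmp_seq=N); the function name find_gaps and the SAMPLE data are only illustrative.

```python
import re
from datetime import datetime

# Matches lines like:
# [2025-02-05 12:50:09.807612] 16 bytes from <addr>, icmp_seq=2158 hlim=62 time=23.210 ms
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\].*icmp_seq=(?P<seq>\d+)")


def find_gaps(lines, min_missing=1):
    """Return (last_seq, next_seq, missing_replies, seconds) for each jump in icmp_seq."""
    gaps = []
    prev = None  # (timestamp, seq) of the previous reply
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip anything that is not a reply line
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        seq = int(m.group("seq"))
        if prev is not None:
            missing = seq - prev[1] - 1
            if missing >= min_missing:
                seconds = round((ts - prev[0]).total_seconds(), 3)
                gaps.append((prev[1], seq, missing, seconds))
        prev = (ts, seq)
    return gaps


# Two lines taken from the output above, on either side of the spike.
SAMPLE = [
    "[2025-02-05 12:50:09.807612] 16 bytes from <redacted>, icmp_seq=2158 hlim=62 time=23.210 ms",
    "[2025-02-05 12:50:26.857059] 16 bytes from <redacted>, icmp_seq=2175 hlim=62 time=21.002 ms",
]

if __name__ == "__main__":
    for last_seq, next_seq, missing, seconds in find_gaps(SAMPLE):
        print(f"seq {last_seq} -> {next_seq}: {missing} replies missing over {seconds}s")
```

To use it on a real run, redirect the ping6 | ts output to a file and pass the open file to find_gaps. Sequence-number wraparound is not handled in this sketch.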
Looking at the UDP traffic between ports 52307 and 51820 (WireGuard), there’s a consistent packet exchange every second, plus some bursts of larger packets. During a lag spike, UDP traffic continues but ICMP stops.
The packet sizes:
Regular keepalive: 96 bytes
Handshake/rekey bursts: 112-384 bytes
Responses get through during “lag” periods
This suggests the WireGuard connection stays active but something is selectively blocking ICMP.
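One way to sanity-check the size labels above is to compare them with the fixed message sizes in the WireGuard protocol: 148 bytes for a handshake initiation, 92 for a handshake response, 64 for a cookie reply, and 32 bytes of overhead on every transport packet (so an empty keepalive is 32 bytes). The sketch below assumes the sizes in the capture are UDP payload lengths; if they include IP and UDP headers, the numbers shift. The function name is hypothetical. Under that assumption, a 96-byte packet looks like a transport packet carrying 64 bytes of inner data, which might be the tunnelled ping itself and not a keepalive.

```python
# Fixed WireGuard message sizes, in bytes of UDP payload, from the protocol description.
HANDSHAKE_INITIATION = 148
HANDSHAKE_RESPONSE = 92
COOKIE_REPLY = 64
TRANSPORT_OVERHEAD = 32  # 16-byte header + 16-byte authentication tag


def classify_wireguard_payload(length):
    """Guess the WireGuard message type from the UDP payload length alone.

    Size is only a heuristic: the first byte of the payload carries the real
    message type, and 64 bytes is ambiguous (cookie reply, or a transport
    packet with 32 bytes of inner data).
    """
    if length == HANDSHAKE_INITIATION:
        return "handshake initiation"
    if length == HANDSHAKE_RESPONSE:
        return "handshake response"
    if length == COOKIE_REPLY:
        return "cookie reply or transport data (32 bytes inner)"
    if length == TRANSPORT_OVERHEAD:
        return "keepalive"
    inner = length - TRANSPORT_OVERHEAD
    # Inner packets are normally padded to a multiple of 16 bytes before encryption.
    if inner > 0 and inner % 16 == 0:
        return f"transport data ({inner} bytes inner)"
    return "unknown"


if __name__ == "__main__":
    for size in (32, 92, 96, 148):
        print(size, classify_wireguard_payload(size))
```

If the 96-byte packets really are the encapsulated pings, then seeing them continue during a spike would mean the requests are still leaving the office and the replies are what go missing, which is worth confirming on the tunnel interface.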
Then we tried:
sudo tcpdump -i en0 -vvn 'udp port 51820 or icmp6'
Followed up by email. One note: tcpdump -i en0 'udp port 51820 or icmp6' won’t show you the inner ICMP packets, because on en0 they are encrypted inside the WireGuard UDP traffic. You want tcpdump -i wg0 icmp6 (or tcpdump -i any, if that works on macOS).