Internal routing flakiness (getaddrinfo ENOTFOUND)

hpx7 · September 4, 2022, 2:23pm

We have been observing flakiness when making TCP requests between regional instances of the same app.

Log snippet:

2022-09-04T13:50:37.030 app[24d8907eb49587] mia [info] store unsubscribeUser {"stateId":"XXXX","userId":"XXXX"}
2022-09-04T13:51:01.945 app[591854ea39d836] ewr [info] AxiosError: getaddrinfo ENOTFOUND 24d8907eb49587.vm.hathora-games-coordinator.internal

This shows a failed request from the ewr instance to the mia instance: the DNS lookup of the mia instance's internal hostname returns ENOTFOUND. The request succeeds roughly 90% of the time and fails with the above error the other 10%, with no pattern we can see. The mia instance appears to be up and available whenever we see the failure.

Please advise on whether we should be doing something different (such as using the instance's IPv6 address instead of its internal hostname) or how we can debug this further.
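
For anyone trying to reproduce or narrow this down, here is a minimal sketch of a retrying lookup that logs every failed attempt. It assumes Node.js and that the failure is a transient DNS miss. The helper name `lookupWithRetry`, the attempt count, and the delays are hypothetical choices, not part of Axios or any platform API.

```typescript
import { promises as dns } from "node:dns";

// Hypothetical helper type: resolves a hostname to one address string.
type Lookup = (hostname: string) => Promise<string>;

// Default resolver: ask getaddrinfo for an IPv6 address only.
const lookupIPv6: Lookup = async (hostname) => {
  const { address } = await dns.lookup(hostname, { family: 6 });
  return address;
};

// Retry the lookup on DNS-style errors, logging each failure so the
// failure rate and timing show up in the app logs.
export async function lookupWithRetry(
  hostname: string,
  attempts = 3,            // assumed value, tune as needed
  delayMs = 200,           // assumed base delay, grows linearly per attempt
  lookup: Lookup = lookupIPv6,
): Promise<string> {
  let lastError: unknown;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await lookup(hostname);
    } catch (err) {
      lastError = err;
      const code = (err as { code?: string }).code;
      // Only retry lookup failures; rethrow anything else immediately.
      if (code !== "ENOTFOUND" && code !== "EAI_AGAIN") throw err;
      console.warn(`lookup ${hostname} failed (${code}), attempt ${i}/${attempts}`);
      if (i < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs * i));
      }
    }
  }
  throw lastError;
}
```

The resolved address could then be used as the request target in place of the hostname. If the retries usually succeed on the second attempt, that would point at intermittent DNS resolution rather than the mia instance being unreachable.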
