Say I have 3 openresty instances which proxy to 3 varnish instances, spread across 3 regions. Varnish proxies to a Rails app.
Under normal circumstances, openresty should route to its region-local varnish pair. That can be done using the current_region.varnish.internal address. What happens if the current region’s varnish is unhealthy?
If the answer is that no IPs would be returned, would it be smart to get all healthy records from varnish.internal and ensure the regional one is weighted first?
Naturally, my next question: could this logic be handled somehow by service discovery directly? If not, I suppose it could be generalized into a script that returns the correctly ordered IPs.
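For what it's worth, here is a rough sketch of what such a script could look like. It assumes that both current_region.varnish.internal and varnish.internal only return healthy instances, which is exactly the part I'm unsure about. The function names (resolve, order_backends, backends_for) are made up for illustration.

```python
import socket


def resolve(name):
    """Return the sorted unique IPs for name, or [] if the lookup fails."""
    try:
        infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Assumption: a failed lookup means no records for this name.
        return []
    return sorted({info[4][0] for info in infos})


def order_backends(regional_ips, all_ips):
    """Put regional IPs first, then the remaining IPs, dropping duplicates."""
    ordered = []
    for ip in list(regional_ips) + list(all_ips):
        if ip not in ordered:
            ordered.append(ip)
    return ordered


def backends_for(region, app="varnish"):
    """Hypothetical helper: region-local backends first, then all others."""
    regional = resolve(f"{region}.{app}.internal")
    everything = resolve(f"{app}.internal")
    return order_backends(regional, everything)
```

If the regional lookup comes back empty, the result is just the global list, so openresty would still have somewhere to send traffic.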
One way to handle this weighting that would be compatible with Varnish vmod_dynamic is adding support for DNS SRV records. Is this something Fly would consider? The weighting could be done by placing the current region first, then the next nearest, etc.
This would be great for services like Varnish which do not require any additional configuration to refresh backends dynamically. Nginx also appears to support this in their upstream module.
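To make the SRV idea concrete, here is a small sketch of how the priorities could be assigned. It is purely hypothetical: it does not reflect anything Fly currently serves. It relies only on the SRV rule from RFC 2782 that a lower priority value is tried first, and that weight only matters among records of equal priority. The region codes, hostnames, port, and the srv_records name are illustrative assumptions.

```python
def srv_records(targets_by_region, regions_by_proximity, port, weight=10):
    """Build (priority, weight, port, target) tuples, nearest region first.

    regions_by_proximity is ordered from the current region outward, so the
    current region gets priority 0, the next nearest gets 1, and so on.
    """
    records = []
    for priority, region in enumerate(regions_by_proximity):
        for target in targets_by_region.get(region, []):
            records.append((priority, weight, port, target))
    return records


if __name__ == "__main__":
    # Example data only; hostnames and the port are assumptions.
    targets = {"lhr": ["lhr-1.varnish.internal"], "ams": ["ams-1.varnish.internal"]}
    for record in srv_records(targets, ["lhr", "ams"], 6081):
        print(record)
```

A client that honors SRV priorities would then only fall back to the next region once every target in the nearer one is unavailable.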
fly restart just cycles through each VM and restarts the process. It’s not very smart. If you want better control you can fly vm stop <id> one by one, or do a deploy.
An event stream would make a ton of sense. It should be pretty quick to add this to the NATS endpoint we’re using for logs. I don’t know when we’ll get to it, but we’re doing a lot of related replacing-of-parts.