Chasing the fast-writes-in-all-regions dream

Hey folks, I’m trying to explore the problem space around deploying an app that uses WebSockets, involves a lot of writes, and needs low latency in multiple regions.

For this use-case the single-primary multi-region Postgres setup via Stolon is “close but no cigar”, as you only get fast writes in one region and slow writes in all the others.

I’ve been looking at what options might be available. I started with multi-master replication for Postgres, but that led to a lot of expensive/proprietary dead ends. I’m also considering moving data out of Postgres and into a distributed active-active Redis cluster, or possibly switching to CockroachDB to take advantage of its multi-active features.

Has anyone else been exploring potential solutions in this area, and does anyone have any hot tips?

:pray:

Multiple writer nodes is a hard problem for sure. Cockroach is likely your best bet if you want a multi-primary model. We have an example app to get you started: GitHub - fly-apps/cockroachdb

We’ve had decent success with it ourselves in the past. It works best if a given index is always written to from the same region. Cockroach splits indexes into ranges. Don’t take my word for it, though; I’m no expert there, and maybe it’s only important for reads?

If you want to use Redis, KeyDB has a multi-primary feature.

It also comes down to what kind of writes you want to do. Is it append-only or do you also need to update? If it’s the former, you may not need such a setup.


Is it append-only or do you also need to update?

Updates for sure :+1:

I totally missed the fly-app for CockroachDB! Thanks, I’ll take a look.

Have you tried out YugabyteDB as well? I just realised Cockroach doesn’t support all Postgres SQL syntax for grouping / aggregate functions, and I just wrote a feature last week that uses them heavily :see_no_evil:

We haven’t tried Yugabyte but I’m assuming it’ll work fine with a similar setup! There may be subtleties with how it replicates and keeps stats consistent, but our private network is simple enough that it shouldn’t matter, as long as it can talk IPv6.

We’re wary of their claims, but it’s probably worth giving them a shot :slight_smile:

Have you checked out DynamoDB Global Tables? Can write to every AWS region, and they’ll replicate to each other with eventual consistency and last write wins in case of conflicts. I have a Redis-ish adapter for it as well.
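Since the app needs updates rather than append-only writes, it helps to see what "last write wins" means for two regions that update the same item at nearly the same time. The following is only a minimal sketch of the general idea, not DynamoDB's actual implementation: the `Versioned` shape, the `lastWriteWins` function, and the tie-break on region name are all assumptions made for illustration.

```typescript
// A value plus the metadata needed to resolve conflicts.
// All field names here are hypothetical.
interface Versioned<T> {
  value: T;
  updatedAtMs: number; // wall-clock time of the write, in milliseconds
  writerRegion: string; // region that accepted the write
}

// Resolve a conflict between two versions of the same item.
// The version with the later timestamp wins; the other is silently discarded.
function lastWriteWins<T>(a: Versioned<T>, b: Versioned<T>): Versioned<T> {
  if (a.updatedAtMs !== b.updatedAtMs) {
    return a.updatedAtMs > b.updatedAtMs ? a : b;
  }
  // Assumed tie-break: compare region names so that every replica
  // picks the same winner regardless of argument order.
  return a.writerRegion >= b.writerRegion ? a : b;
}
```

The thing to notice is that the losing update is dropped entirely rather than merged, so this strategy suits data where occasionally losing a concurrent update is acceptable.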

Or Azure’s CosmosDB. It has multi-master, global, low-latency etc. Its cost varies from free to completely ludicrous, depending on what you are doing with it.

+1 for DynamoDB Global Tables, it has very reasonable pricing. I’m currently set up with:

iad ~= 'us-east-2'
lhr ~= 'eu-west-2'
sin ~= 'ap-southeast-1'
sjc ~= 'us-west-1'
syd ~= 'ap-southeast-2'

(I would prefer to use ‘ap-south-1’ but Fly doesn’t have India yet)

On some ad-hoc GetItem requests via lhr I had ~5ms response times at the backend.

There’s also Google Spanner, which is much more expensive but SQL based.

I’ll also take a look at fly-apps/cockroachdb which jerome posted

Fly does have India as of a few days ago, the MAA region. It’s about 900km away from ap-south-1 (Mumbai) so not ideal, but it should work.

How’re you deciding which region to connect to? Request header - region mapping?


Fly does have India as of a few days ago, the MAA region. It’s about 900km away from ap-south-1 (Mumbai) so not ideal, but it should work.

Oh great, thanks for the tip! India is fricken huge. Still, for users in India that would be a better pairing than travelling out to Singapore for ap-southeast-1. So my updated “Anglophile” mapping looks like:

iad ~= 'us-east-2'
lhr ~= 'eu-west-2'
maa ~= 'ap-south-1'
sjc ~= 'us-west-1'
syd ~= 'ap-southeast-2'

How’re you deciding which region to connect to? Request header - region mapping?

I’ve not actually implemented this yet, but I think lookup based on the environment variable FLY_REGION would be the right option from the perspective of the backend.
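A rough sketch of what that lookup could look like in TypeScript, using the pairing listed above. The names `FLY_TO_AWS`, `DEFAULT_AWS_REGION`, and `awsRegionFor` are made up for illustration, and the fallback region is an assumption rather than anything decided in this thread.

```typescript
// Static pairing of Fly regions to DynamoDB regions, copied from the list above.
const FLY_TO_AWS: { [flyRegion: string]: string } = {
  iad: "us-east-2",
  lhr: "eu-west-2",
  maa: "ap-south-1",
  sjc: "us-west-1",
  syd: "ap-southeast-2",
};

// Assumed fallback for regions that are not in the table.
const DEFAULT_AWS_REGION = "us-east-2";

// Takes the value of the FLY_REGION environment variable, which is
// undefined when the app runs outside Fly (e.g. in local development).
function awsRegionFor(flyRegion: string | undefined): string {
  if (flyRegion === undefined) {
    return DEFAULT_AWS_REGION;
  }
  const key = flyRegion.toLowerCase();
  // hasOwnProperty guards against inherited keys such as "constructor".
  return Object.prototype.hasOwnProperty.call(FLY_TO_AWS, key)
    ? FLY_TO_AWS[key]
    : DEFAULT_AWS_REGION;
}
```

On Node this would be called once at startup as `awsRegionFor(process.env.FLY_REGION)`, and the result passed as the region when constructing the DynamoDB client.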

If you’re on Go, check out GitHub - sudhirj/aws-regions.go: Find the AWS region closest to your server, especially when deploying on systems like Fly.io. It starts a goroutine that pings all the DynamoDB regions once every few minutes and tells you which one is fastest from your server (and you can subset it to the regions you want).

Let me know which stack you’re on? Can build a lib for JS/Deno, or Ruby, or might be nice to use this as an Elixir learning project.

I’m on a JS stack atm, but I’ve also selected the regions to pair up like above, so any other pairing would be guaranteed to give terrible latency.

I think that concept might be helpful as a paranoid failover mechanism in case an entire DynamoDB region goes offline or becomes flaky, but my app isn’t at that stage of perfection yet.

Hmm. You might be surprised. Even for the India example it’s possible that MAA → Singapore will be faster than MAA → Mumbai (better fiber optic cabling). And if there’s more than one sequential DynamoDB call, the overall request time will be faster when served from SIN with a SIN DDB Table. Running a dynamic test would actually guarantee equal or better latency, not worse: if your static pairing is the best one, it’s what will be selected automatically.
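For a JS stack, the selection step of such a dynamic test might look roughly like this. It is a sketch only and not the API of aws-regions.go: the `fastestRegion` and `median` names are hypothetical, and the sketch assumes the round-trip samples have already been collected by some periodic probe.

```typescript
// Median of a non-empty list of round-trip times in milliseconds.
// The median is used so that a single slow probe does not skew the result.
function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Given recent probe samples per AWS region, return the region with the
// lowest median round-trip time, or undefined if no region has samples.
function fastestRegion(
  samples: { [region: string]: number[] }
): string | undefined {
  let best: string | undefined = undefined;
  let bestMs = Infinity;
  for (const region of Object.keys(samples)) {
    const rtts = samples[region];
    if (rtts.length === 0) {
      continue; // no successful probes: treat the region as unreachable
    }
    const m = median(rtts);
    if (m < bestMs) {
      best = region;
      bestMs = m;
    }
  }
  return best;
}
```

Restricting `samples` to the regions where a table replica actually exists gives the "subset" behaviour mentioned earlier, and a region whose probes all fail simply drops out of the running, which also covers the failover case.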

Great point, network conditions do change over the course of time despite what I hoped. Clearly I still need to have ‘the fallacies of distributed computing’ drilled into my head :slight_smile:

CockroachDB, YugaByte and Spanner all need consensus before considering writes successful. Because of that they’re much better suited to using locations which are close together, e.g. London, Amsterdam, Paris, or AWS us-east-2a, us-east-2b, us-east-2c, than far apart.

They’ll still work across longer distances, but throughput will be lower and latency will include at least one round trip to the next-furthest region.
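To put rough numbers on that, here is a simplified model of a majority-quorum commit. It assumes a single leader that counts its own vote and waits for acknowledgements from enough followers to form a majority, and it ignores disk and processing time. The function name is hypothetical and real systems have more moving parts.

```typescript
// Estimate the replication part of commit latency for a leader-based
// consensus group, given the round-trip time from the leader to each follower.
function quorumCommitLatencyMs(followerRttsMs: number[]): number {
  const replicas = followerRttsMs.length + 1; // followers plus the leader
  const majority = Math.floor(replicas / 2) + 1;
  const acksNeeded = majority - 1; // the leader counts as one vote
  if (acksNeeded === 0) {
    return 0; // single-node group: nothing to wait for
  }
  const sorted = followerRttsMs.slice().sort((a, b) => a - b);
  // The write commits when the slowest of the required acks arrives.
  return sorted[acksNeeded - 1];
}
```

With three replicas the leader waits only for its nearest follower, so followers at 8 ms and 12 ms give about 8 ms, while followers at 80 ms and 250 ms give about 80 ms (illustrative figures, not measurements). A client in another region also pays its own round trip to the leader on top of this.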

Therefore, in my opinion, applications which need SQL are better off using a typical PostgreSQL setup on a large server on the East Coast, with fail-over to a replica in a nearby data centre. The latency will typically be quite reasonable due to the East Coast having great connectivity with the rest of the world (albeit more like 50-200ms than 5-50ms).

DynamoDB Global Tables, Cloud Bigtable, KeyDB, and CosmosDB can all do low latency global writes using a last-write-wins strategy, albeit losing the benefits of SQL. But some apps could potentially mix-and-match SQL and NoSQL.

It’s possible to host your own Cassandra or KeyDB multi-region but I wouldn’t recommend it; such deployments are best done with 3 AZs in each region, and at that point you’re better off using one of the SaaS solutions. It’s fine to have a PostgreSQL read replica in ‘syd’, but having 3 Cassandra nodes as a ‘syd’ region leaves you with a heck of a lot less redundancy than if you’d used DynamoDB’s ap-southeast-2(a,b,c).


It looks like YugabyteDB supports that “mix-and-match” approach by default? So you could have a three-node cluster in each region (with synchronous replication / Raft-based consensus within the cluster), but you can also add multi-active asynchronous replication between regional clusters (with last write wins): xCluster replication | YugabyteDB Docs


Doh, sorry for the misinformation!

That seems to put YugaByte in a class of its own then :smiley: (caveat Chasing the fast-writes-in-all-regions dream - #4 by jerome)

The Jepsen analysis is usually what I check out to see the failure conditions of any distributed database. Everything fails at some point, I guess, so these analyses are helpful to see where and when they fail and if the failure case is manageable. Jepsen: YugaByte DB 1.3.1
