Multi region database guide

Happy Sunday! We wrote a guide for running multi-region read replicas and routing write requests properly.

It’s relatively straightforward to run read-heavy, PostgreSQL-backed apps all over the world on Fly. You should try it out: a Rails app needs ~10 lines of code and then it just works. See GitHub - fly-apps/rails-on-fly

We are going to ship libraries to make this almost transparent for various frameworks. Right now we have plans for Elixir, Ruby, and Node (Express, particularly). If you’re working with another framework, post here and we’ll write some code for you.

Cool, could you post a Phoenix-based example? Also, the guide says it’s not suitable for long-running connections like websockets, so can it be used with Phoenix LiveView?

Nice work!

For those of us who are scared of using exception handling to replay a request, would there be a downside to just redirecting all POST/PUT/DELETE requests right out of the gate?

For situations where you don’t want stale data to potentially be a problem, would it make sense to allow the client to request a region? For example, after creating a record and redirecting to that record’s permalink, the redirect URL might have a region parameter appended. The app could then ask for a regional replay of this request as well (or the router could interpret the parameter directly).

Joshua

Perhaps as part of future “routing rules” work we might do :slight_smile:.

Right now it is possible to target a specific region for a request by using the Fly-Prefer-Region header. This tells our proxy to look for instances in a specific region first; if none exist or none are healthy, we’ll use our normal load balancing algorithm.
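For anyone who wants to try the header from a client, here is a minimal sketch in Python. The URL and the helper name `build_regional_request` are placeholders made up for illustration; only the `Fly-Prefer-Region` header name comes from the post above.

```python
import urllib.request


def build_regional_request(url: str, region: str) -> urllib.request.Request:
    """Build a GET request that asks Fly's proxy to prefer instances in `region`."""
    req = urllib.request.Request(url)
    # urllib normalizes the capitalization of header names when storing them;
    # HTTP header names are case-insensitive, so the proxy sees the same header.
    req.add_header("Fly-Prefer-Region", region)
    return req


# Hypothetical app URL; "nrt" is one of the region codes that appears in this thread.
req = build_regional_request("https://example.fly.dev/records/42", "nrt")
# urllib.request.urlopen(req) would actually send it.
```

Since the proxy falls back to normal load balancing when the preferred region has no healthy instances, the client should not assume the response really came from that region.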

Ah nice, I didn’t know about that header. For now, this logic can be handled within the application, so maybe something useful to add to a Fly adapter.

@Mark has been working on Elixir instructions.

This won’t work with LiveView if you write to your DB – but it’s relatively simple to run all writes through the primary region using Elixir clustering. If you use LiveView with something like pgnotify or Phoenix pubsub, it’ll work just great though!

You can send this header to us for any reason. It would be pretty simple to add a middleware that tells us to replay all POSTs.
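A minimal sketch of such a middleware, written as plain Python WSGI. It assumes the header being discussed is the `fly-replay` response header with a `region=<code>` value, and that the app can read its own region and the primary region from the `FLY_REGION` and `PRIMARY_REGION` environment variables. The class name and the 409 status code are illustrative choices, not an official Fly adapter.

```python
import os

WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


class ReplayWritesInPrimary:
    """WSGI middleware: ask the proxy to replay write requests in the primary region."""

    def __init__(self, app, current_region=None, primary_region=None):
        self.app = app
        # Assumption: these environment variables are set on each instance.
        self.current_region = current_region or os.environ.get("FLY_REGION", "")
        self.primary_region = primary_region or os.environ.get("PRIMARY_REGION", "")

    def __call__(self, environ, start_response):
        is_write = environ.get("REQUEST_METHOD", "GET").upper() in WRITE_METHODS
        elsewhere = bool(self.primary_region) and self.current_region != self.primary_region
        if is_write and elsewhere:
            # Do not run the app here; hand the request back to the proxy.
            start_response(
                "409 Conflict",
                [("fly-replay", f"region={self.primary_region}"), ("Content-Length", "0")],
            )
            return [b""]
        # Reads, and writes that already landed in the primary, run locally.
        return self.app(environ, start_response)
```

Wrapping an existing WSGI app would look like `app = ReplayWritesInPrimary(app)`. The trade-off versus the exception-handling approach is that every write pays the replay hop, even one that could have been caught later.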

Would something like Multiple Databases with Ecto and Phoenix · edmz work for Phoenix?

Having two Ecto Repos, one for writes and one read-only. For the write Repo we could use port 5432, and for the read-only Repo port 5433. It’s a little more involved when writing code, but it gives the opportunity to fine-tune the repos and incrementally move queries to read replicas.
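As a rough illustration of the two-port idea (in Python rather than Ecto config), the sketch below derives a write URL and a read URL from one connection string by swapping the port. The helper name `with_port` and the example URL are made up; the ports 5432 and 5433 are the ones mentioned above.

```python
from urllib.parse import urlsplit, urlunsplit

PRIMARY_PORT = 5432  # writes, per the post above
REPLICA_PORT = 5433  # read-only, per the post above


def with_port(database_url: str, port: int) -> str:
    """Return `database_url` with its port replaced (plain hostnames only, no IPv6 literals)."""
    parts = urlsplit(database_url)
    userinfo = ""
    if parts.username:
        userinfo = parts.username
        if parts.password:
            userinfo += f":{parts.password}"
        userinfo += "@"
    netloc = f"{userinfo}{parts.hostname or ''}:{port}"
    return urlunsplit(parts._replace(netloc=netloc))


# Hypothetical connection string, for illustration only.
base_url = "postgres://app_user:secret@my-db.internal:5432/my_app"
write_url = with_port(base_url, PRIMARY_PORT)
read_url = with_port(base_url, REPLICA_PORT)
```

Each URL would then back its own repo or connection pool, which is what lets queries be moved to the replica one at a time.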

Since they are handled at the write level, it should in theory work for LiveView, right?

That’s actually pretty close to the approach I’m taking! The Ecto approach uses two repos, a primary (with write access) and a replica, which is assumed to be a local regional replica.

Then I’ve wrapped the Repo functions so you can pretty much ignore it. Then, when you have those cases where a LiveView page does an insert/update and then a separate read, you can tell it, “this time, use the primary for the read”. So it’s not 100% transparent, but has an easy escape hatch when you need to do something explicit.
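To make the shape of that wrapper concrete, here is a language-neutral sketch in Python. It is not the actual Repo code being described; `RoutedRepo`, `FakeConn`, and the `use_primary` flag are hypothetical names, and the fake connections only record which side received each statement.

```python
class FakeConn:
    """Stand-in for a database connection; records the statements it receives."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def execute(self, sql):
        self.log.append(sql)
        return self.name  # report which connection handled the statement


class RoutedRepo:
    """Send writes to the primary and reads to the local replica by default."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica

    def write(self, sql):
        return self.primary.execute(sql)

    def read(self, sql, use_primary=False):
        # Escape hatch: read from the primary right after a write,
        # when the replica may not have received the new row yet.
        conn = self.primary if use_primary else self.replica
        return conn.execute(sql)


repo = RoutedRepo(FakeConn("primary"), FakeConn("replica"))
repo.write("INSERT INTO posts (title) VALUES ('hi')")
normal_read = repo.read("SELECT * FROM posts")                      # goes to the replica
fresh_read = repo.read("SELECT * FROM posts", use_primary=True)     # goes to the primary
```

Callers only think about the flag in the insert-then-read case; everything else uses the defaults.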

Can you please share the Repo code for reference?

Fastify for Node please!

Impressive! Would love to see code for Django.

Something for Go would be awesome!

Django and Go and Fastify are on our wishlist! Just need to find someone to build 'em. :slight_smile:

Hey everyone. I wrote a starter guide for Fly.io multi-region databases with Phoenix/Elixir (includes code for a working app):

https://nathanwillson.com/blog/posts/2021-09-25-fly-multi-db/

Hopefully it helps!

@nbw thanks for writing that up and sharing your guide! I’ve been working on a library that I hope to share with the community here fairly soon. I’ll be presenting it at ElixirConf as well in Oct. It helps provide a more transparent approach for working with Postgres in a read-replica/primary setup.

Hey! That’s good timing. We should talk and I look forward to seeing your talk. I decided to put together what I wrote in that post into a library while I still had everything in my head. Maybe we landed on a similar solution:

https://hexdocs.pm/fly_multi_region/0.0.1/FlyMultiRegion.html

Just a message to those at Fly: if for some reason you want to use fly-multi-region as a library name and my repo is preventing that, let me know. I’m not planning on actively maintaining it and would gladly take it down.


Unrelated but related:

I set up a server in Singapore with databases in Japan (primary), Chicago, and Amsterdam. Weirdly Japan takes the longest time. Odd eh?

# Instances
ID       TASK VERSION REGION DESIRED STATUS            HEALTH CHECKS      RESTARTS CREATED
3e8c62e4 app  4       ams    run     running (replica) 3 total, 3 passing 0        1h49m ago
46027f7d app  4       ord    run     running (replica) 3 total, 3 passing 0        2021-09-28T01:38:41Z
10f73492 app  4       nrt    run     running (replica) 3 total, 3 passing 0        2021-09-28T00:17:49Z
c88615ea app  4       nrt    run     running (leader)  3 total, 3 passing 0        2021-09-28T00:17:44Z

# My benchmarks:

Name           ips        average  deviation         median         99th %
ams           6.19      161.49 ms     ±0.06%      161.49 ms      161.80 ms
ord           3.96      252.32 ms     ±0.05%      252.32 ms      252.54 ms
nrt           2.56      390.27 ms     ±0.05%      390.24 ms      390.57 ms

Depends on what the benchmarks are doing and where they are run from. By default, you’ll connect to the nearest app to you… so the benchmarks might be running from there?

But to answer your question, our approach goes in a different direction. My first attempt used a Primary and a Replica repo, so each instance has a connection to the Primary and uses it for writes. The problem with this is that you end up with a LOT of DB connections open to your primary, they reach across the globe, and you still hit cases where you do a create on the primary, then read from the replica, and your newly inserted data isn’t there (yet)! It becomes a race condition with the asynchronous data replication process.
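The race can be shown with a toy, in-memory simulation. This is only a sketch of the failure mode, not how Postgres replication works internally; the `Primary` and `Replica` classes and the single shared queue are simplifications made up for illustration.

```python
from collections import deque


class Primary:
    """Accepts writes and queues them for later shipment to the replica."""

    def __init__(self):
        self.rows = {}
        self.pending = deque()

    def insert(self, key, value):
        self.rows[key] = value
        self.pending.append((key, value))


class Replica:
    """Serves reads from its own copy, which lags until catch_up() runs."""

    def __init__(self, primary):
        self.primary = primary
        self.rows = {}

    def get(self, key):
        return self.rows.get(key)

    def catch_up(self):
        # Stand-in for asynchronous replication finally arriving.
        while self.primary.pending:
            key, value = self.primary.pending.popleft()
            self.rows[key] = value


primary = Primary()
replica = Replica(primary)

primary.insert("post:1", "hello")
stale = replica.get("post:1")   # None: the write has not replicated yet
replica.catch_up()
fresh = replica.get("post:1")   # "hello": visible once replication catches up
```

Whether a real read lands before or after the catch-up step depends on timing, which is what makes the bug intermittent.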

Our current approach solves a lot of those issues. I’m very excited about it! Will be ready for community feedback soon I hope.

I wrote this up about the topic of taking Phoenix distributed with Postgres in multi-region deployments.
