Multi region database guide

You can send this header to us for any reason. It would be pretty simple to add a middleware that tells us to replay all POSTs

1 Like

Would something like Multiple Databases with Ecto and Phoenix · edmz work for Phoenix ?

Having two Ecto Repos, one for Write and one for readonly. For the write only we could use the port 5432 and for read only Repo 5433.It’s a little bit more involved when writing code, but gives opportunity to fine tune the repos and incrementally move queries to Read Replicas.

Since they are handled at the write level it should in theory work for LiveView right ?

1 Like

That’s actually pretty close to the approach I’m taking! The Ecto approach uses two repos, a primary (with write access) and a replica, which is assumed to be a local regional replica.

Then I’ve wrapped the Repo functions so you can pretty much ignore it. Then, when you have those cases where a LiveView page does an insert/update and then a separate read, you can tell it, “this time, use the primary for the read”. So it’s not 100% transparent, but has an easy escape hatch when you need to do something explicit.

2 Likes

Can you please share the Repo code for reference ?

1 Like

Fastify for Node please!

2 Likes

Impressive! Would love to see code for Django.

Something for Go would be awesome!

Django and Go and Fastify are on our wishlist! Just need to find someone to build 'em. :slight_smile:

3 Likes

Hey everyone. I wrote a starter guide for Flyio multi-region databases with Phoenix/Elixir (includes code for a working app):

https://nathanwillson.com/blog/posts/2021-09-25-fly-multi-db/

Hopefully it helps!

2 Likes

@nbw thanks for writing that up and sharing your guide! I’ve been working on a library that I hope to share with the community here fairly soon. I’ll be presenting it at ElixirConf as well in Oct. It helps provide a more transparent approach for working with Postgres in a read-replica/primary setup.

Hey! That’s good timing. We should talk and I look forward to seeing your talk. I decided to put together what I wrote in that post into a library while I still had everything in my head. Maybe we landed on a similar solution:

https://hexdocs.pm/fly_multi_region/0.0.1/FlyMultiRegion.html

Just message to those at Fly, if for some reason you want to use fly-multi-region as a library name and my repo is preventing that then just let me know. I’m not planning on actively maintaining it and would gladly take it down.


Unrelated by related:

I set up a server in Singapore with databases in Japan (primary), Chicago, and Amsterdam. Weirdly Japan takes the longest time. Odd eh?

# Instances
ID       TASK VERSION REGION DESIRED STATUS            HEALTH CHECKS      RESTARTS CREATED
3e8c62e4 app  4       ams    run     running (replica) 3 total, 3 passing 0        1h49m ago
46027f7d app  4       ord    run     running (replica) 3 total, 3 passing 0        2021-09-28T01:38:41Z
10f73492 app  4       nrt    run     running (replica) 3 total, 3 passing 0        2021-09-28T00:17:49Z
c88615ea app  4       nrt    run     running (leader)  3 total, 3 passing 0        2021-09-28T00:17:44Z

# My benchmarks:

Name           ips        average  deviation         median         99th %
ams           6.19      161.49 ms     ±0.06%      161.49 ms      161.80 ms
ord           3.96      252.32 ms     ±0.05%      252.32 ms      252.54 ms
nrt           2.56      390.27 ms     ±0.05%      390.24 ms      390.57 ms

Depends on what the benchmarks are doing and where they are run from. By default, you’ll connect to the nearest app to you… so the benchmarks might be running from there?

But to answer your question, our approach goes in a different direction. My first attempt used a Primary and Replica repo… so each instance has a connection to the Primary and chooses to use that for writes. The problem with this is you end up having a LOT of DB connections open to your primary… they reach across the globe, and you still have problems where you do a create on the primary, then read from the replica, and your newly inserted data isn’t there! (Yet) It becomes a race condition with an async process of data replication.

Our current approach solves a lot of those issues. I’m very excited about it! Will be ready for community feedback soon I hope.

3 Likes

I wrote this up about the topic of taking Phoenix distributed with Postgres in multi-region deployments.

3 Likes

retired the hex package I put up with a notice to refer to official fly ones. :love_you_gesture:

1 Like

Well @kurt I think I’ve got the fly-replay approach to work in Laravel :slight_smile: To compare I tried various config:

  1. reads and writes handled by the primary
  2. reads handled by a read-replica, writes handled by the primary
  3. reads handled by a read-replica, writes handled by the primary by using the fly-replay header to replay writes-to-a-read-replica in its location

… using a suitably distributed vm (in lhr) and database (in scl) to be sure to have a large amount of latency due to the enormous distance. And it appears to work: 3, the fly-replay approach, reduces the time for a write (from lhr) and is the fastest. Based on the guide it uses a read-replica (port 5433) unless the request is coming from the primary region. And so when it doesn’t (e.g from lhr), an exception is thrown (due to the write to a read-replica) and that exception is caught, triggering the replay in the primary region.

I can write about it if you like?

And/or can have a look at Fastify if nobody has done that yet.

Wow that’s amazing. We’d love to have you write about it, I think the first thing to do is create an example repository with a README. We’re happy to pay for it, too.

1 Like

@kurt Awesome. Ok, great, I’ll put something together over the weekend.

Hi @kurt

As discussed I have written a guide for how I deployed a Laravel application to Fly.

In the end I divided it into two parts: one repo to explain how to get a demo Laravel application deployed, and another repo which builds upon that to describe the changes I made to use the fly-replay header to improve database performance. I figured there will likely be people who will only need one or the other. And a single one became way too long!

They are:

Let me know what you think whenever you get a chance (email, or here, wherever!) and I can add/edit/delete whatever you want.

1 Like

Hey guys,

I got multiregion working for Typeorm, I followed this https://typeorm.io/multiple-data-sources#replication

here is a gist of my code

      const databaseUrl = process.env.DATABASE_URL as string;
      let options: ConnectionOptions = {
        type: 'postgres',
        name: 'default',
        logging: false,
        synchronize: false,
        entities: [__dirname + '/../modules/**/*.js'],
        migrations: [__dirname + '/../migration/*.js'],
      };
      if (process.env.PRIMARY_REGION !== process.env.FLY_REGION) {
        options = {
          ...options,
          replication: {
            master: {
              url: databaseUrl,
            },
            slaves: [
              {
                url: databaseUrl.replace('5432', '5433'),
              },
            ],
          },
        };
      } else {
        options = {
          ...options,
          url: databaseUrl,
        };
      }
      
        return createConnection(options)

do it for production environment.