Multi-region database guide

Hey everyone. I wrote a starter guide to Fly.io multi-region databases with Phoenix/Elixir (it includes code for a working app):

https://nathanwillson.com/blog/posts/2021-09-25-fly-multi-db/

Hopefully it helps!


@nbw thanks for writing that up and sharing your guide! I’ve been working on a library that I hope to share with the community here fairly soon. I’ll also be presenting it at ElixirConf in October. It helps provide a more transparent approach to working with Postgres in a read-replica/primary setup.

Hey! That’s good timing. We should talk and I look forward to seeing your talk. I decided to put together what I wrote in that post into a library while I still had everything in my head. Maybe we landed on a similar solution:

https://hexdocs.pm/fly_multi_region/0.0.1/FlyMultiRegion.html

Just a message to those at Fly: if for some reason you want to use fly-multi-region as a library name and my repo is preventing that, just let me know. I’m not planning on actively maintaining it and would gladly take it down.


Unrelated but related:

I set up a server in Singapore with databases in Japan (primary), Chicago, and Amsterdam. Weirdly, Japan takes the longest. Odd, eh?

# Instances
ID       TASK VERSION REGION DESIRED STATUS            HEALTH CHECKS      RESTARTS CREATED
3e8c62e4 app  4       ams    run     running (replica) 3 total, 3 passing 0        1h49m ago
46027f7d app  4       ord    run     running (replica) 3 total, 3 passing 0        2021-09-28T01:38:41Z
10f73492 app  4       nrt    run     running (replica) 3 total, 3 passing 0        2021-09-28T00:17:49Z
c88615ea app  4       nrt    run     running (leader)  3 total, 3 passing 0        2021-09-28T00:17:44Z

# My benchmarks:

Name           ips        average  deviation         median         99th %
ams           6.19      161.49 ms     ±0.06%      161.49 ms      161.80 ms
ord           3.96      252.32 ms     ±0.05%      252.32 ms      252.54 ms
nrt           2.56      390.27 ms     ±0.05%      390.24 ms      390.57 ms

Depends on what the benchmarks are doing and where they are run from. By default, you’ll connect to the nearest app to you… so the benchmarks might be running from there?

But to answer your question, our approach goes in a different direction. My first attempt used a Primary and Replica repo… so each instance has a connection to the Primary and chooses to use that for writes. The problem with this is you end up having a LOT of DB connections open to your primary… they reach across the globe, and you still have problems where you do a create on the primary, then read from the replica, and your newly inserted data isn’t there! (Yet) It becomes a race condition with an async process of data replication.

Our current approach solves a lot of those issues. I’m very excited about it! Will be ready for community feedback soon I hope.


I wrote this up on taking Phoenix distributed with Postgres in multi-region deployments.


Retired the hex package I put up, with a notice to refer to the official Fly ones. :love_you_gesture:


Well @kurt, I think I’ve got the fly-replay approach to work in Laravel :slight_smile: To compare, I tried various configs:

  1. reads and writes handled by the primary
  2. reads handled by a read-replica, writes handled by the primary
  3. reads handled by a read-replica, writes handled by the primary, using the fly-replay header to replay any write that hits a read-replica in the primary’s region

… using a suitably distributed VM (in lhr) and database (in scl) to guarantee plenty of latency between them. And it appears to work: option 3, the fly-replay approach, reduces the time for a write (from lhr) and is the fastest. Based on the guide, the app uses a read-replica (port 5433) unless the request comes from the primary region. So when it doesn’t (e.g. from lhr), the write to the read-replica throws an exception, which is caught, triggering a replay of the request in the primary region.
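The port-selection rule described here (read-replica on 5433 unless running in the primary region) boils down to a one-line check. A hedged sketch, using the PRIMARY_REGION and FLY_REGION env vars that appear elsewhere in this thread; the helper name is my own:

```typescript
// Pick the Postgres port: the primary (5432) only when this instance
// runs in the primary region, otherwise the nearest read replica (5433).
export function databasePort(
  primaryRegion: string | undefined,
  flyRegion: string | undefined,
): 5432 | 5433 {
  return primaryRegion !== undefined && primaryRegion === flyRegion ? 5432 : 5433;
}

// e.g. const port = databasePort(process.env.PRIMARY_REGION, process.env.FLY_REGION);
```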

I can write about it if you like?

And/or I can have a look at Fastify, if nobody has done that yet.

Wow, that’s amazing. We’d love to have you write about it. I think the first thing to do is create an example repository with a README. We’re happy to pay for it, too.


@kurt Awesome. Ok, great, I’ll put something together over the weekend.

Hi @kurt

As discussed I have written a guide for how I deployed a Laravel application to Fly.

In the end I divided it into two parts: one repo to explain how to get a demo Laravel application deployed, and another repo which builds upon that to describe the changes I made to use the fly-replay header to improve database performance. I figured there will likely be people who will only need one or the other. And a single one became way too long!

They are:

Let me know what you think whenever you get a chance (email, or here, wherever!) and I can add/edit/delete whatever you want.


Hey guys,

I got multi-region working for TypeORM. I followed this: https://typeorm.io/multiple-data-sources#replication

Here is a gist of my code:

import { ConnectionOptions, createConnection } from 'typeorm';

const databaseUrl = process.env.DATABASE_URL as string;

let options: ConnectionOptions = {
  type: 'postgres',
  name: 'default',
  logging: false,
  synchronize: false,
  entities: [__dirname + '/../modules/**/*.js'],
  migrations: [__dirname + '/../migration/*.js'],
};

if (process.env.PRIMARY_REGION !== process.env.FLY_REGION) {
  // Not in the primary region: send writes to the primary (port 5432)
  // and reads to the nearest read replica (port 5433).
  options = {
    ...options,
    replication: {
      master: {
        url: databaseUrl,
      },
      slaves: [
        {
          url: databaseUrl.replace('5432', '5433'),
        },
      ],
    },
  };
} else {
  // In the primary region: a single connection to the primary is enough.
  options = {
    ...options,
    url: databaseUrl,
  };
}

return createConnection(options);

I do this for the production environment.

I think I just hit a variation of this in Rails when doing OmniAuth authorization. The request is replayed at the primary, but then fails with bad_verification_code (Troubleshooting OAuth App access token request errors - GitHub Docs).

I will try the Fly-Prefer-Region header and report back!

I’m starting to look into this again.

So if I use port 5433 in the connection URL, will clients always automatically connect to the nearest replica, and port 5432 to the primary node? What if there are no replicas available at a particular moment?

BTW has anyone implemented the request replay in Node? The docs mention adding an HTTP header:

Once caught, just send a fly-replay header specifying the primary region. For chaos-postgres, send fly-replay: region=scl, and we’ll take care of the rest.

I’m not sure I understand where this header needs to be added. The example seems like a very specific Rails implementation.

@pier That’s correct. I’m not sure what would happen if a region was down (and so a replica you’ve created is not available). I’d assume the same as if it didn’t exist at all - the connection would be routed to the primary.

As for Node, yep, I had a go a while back at using the technique with Fastify, with Prisma as the ORM. Check out:

Hopefully the readme explains how it works, but let me know if not :slightly_smiling_face:

That replay approach sends all queries to the nearest database. If that results in a write being sent to a read-only replica, Postgres fails with an error. That exception/error is caught, and since you know the reason, you replay the whole HTTP request in the region the primary database is in. That region does allow writes, and so it works.
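For the Node question above, here is a minimal sketch of that catch-and-replay step using only Node’s built-in http module. It assumes node-postgres-style errors, where err.code carries the SQLSTATE (Postgres raises 25006, read_only_sql_transaction, on a write to a replica); runWrite and the 409 status are hypothetical stand-ins for your app’s real handler, and 'scl' is just the region from the docs quote:

```typescript
import http from 'node:http';

const PRIMARY_REGION = process.env.PRIMARY_REGION ?? 'scl';

// The header Fly's proxy watches for: it replays the request in that region.
export function replayHeader(region: string): [string, string] {
  return ['fly-replay', `region=${region}`];
}

// Hypothetical stand-in for your real write (pg, Prisma, ...).
async function runWrite(req: http.IncomingMessage): Promise<void> {
  // e.g. await pool.query('INSERT ...');
}

const server = http.createServer(async (req, res) => {
  try {
    await runWrite(req);
    res.writeHead(201).end();
  } catch (err: any) {
    // SQLSTATE 25006 = read_only_sql_transaction: we tried to write to a
    // replica. Instead of surfacing the error, ask Fly to replay the
    // request where the primary lives.
    if (err?.code === '25006') {
      const [name, value] = replayHeader(PRIMARY_REGION);
      res.writeHead(409, { [name]: value }).end();
      return;
    }
    res.writeHead(500).end();
  }
});
// server.listen(8080) in your entrypoint
```

As I understand it, the status code here barely matters: Fly’s proxy replays the request whenever it sees the fly-replay header, so the client never sees this intermediate response.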

Or you could avoid replays and instead use a separate read and write connection, where the ORM decides whether to use 5433 or 5432.

The only issue I recall having (with that replay approach) was ensuring there was an app VM in the same region as the primary database VM. With auto-scaling it wasn’t possible to enforce which regions the app’s VMs ran in, only to suggest a region pool. If no VM ends up in the primary region, writes will fail. I’m not sure if that has been resolved since.


Thanks @greg I will check this out in detail!

I’m not using an ORM (Prisma in particular is pretty slow) but I was planning on just having two PG client instances.

But if the replicas are down for some reason, would read queries made to 5433 be sent to the primary instance? :thinking:


@pier Hmm :thinking: … that I don’t know. I would assume/hope their proxy would be smart enough to do that, but that is a total guess. It would need someone from Fly to confirm.

As for the ORM, yep, makes sense. I’d think you could still do it, e.g. with a readClient and a writeClient (each set with the respective port) rather than a single client. The replay bit is independent of that anyway: it’s just catching an exception/error, which pg (or whatever) would also throw if you try to write to a read-replica.


Thanks I hope someone from Fly can confirm.

The docs only mention:

Port 5433 is direct to the PostgreSQL member, and used to connect to read replicas directly.


I did a little test in my dev environment using a primary client and a replica client using 5433 port.

Something like this:

import postgres from 'postgres';

const sql = postgres(process.env.DATABASE_URL);
const sqlReplica = postgres(process.env.DATABASE_URL_REPLICA);

When using the replica client, the queries were sent to the primary (and only) instance. So I guess my question has been answered.

So my PG instances are all in AMS. I cloned a v2 app into LAX and also cloned a PG replica into LAX.

I’ve also created a new PG client instance using the port 5433 to use the replicas.

It made no difference in performance on a page that runs multiple queries against the replica. Actually, some requests now randomly take 3-5x longer than when the LAX app was reading from AMS.

I restarted the LAX app machine, in case “have you tried turning it off and on again” applied… Same result.

This is the status of the new replica so I guess it should be working:

started	replica	lax   	3 total, 3 passing

My DB is very small. I doubt it’s still copying the data to the volume.

Not sure if I’m missing something to make the whole thing work. Is there anything else I could check?

For the time being it makes more sense to just run everything in AMS.

Hopefully someone from Fly can chime in and let me know if I’m doing something obviously wrong or maybe something else is failing.