…retired the hex package I put up, with a notice to refer to the official Fly ones.
Well @kurt I think I've got the `fly-replay` approach to work in Laravel. To compare, I tried various configurations:
1. reads and writes handled by the primary
2. reads handled by a read-replica, writes handled by the primary
3. reads handled by a read-replica, writes handled by the primary, using the `fly-replay` header to replay a write that hits a read-replica over in the primary's region
… using a suitably distributed vm (in `lhr`) and database (in `scl`) to be sure of a large amount of latency due to the enormous distance. And it appears to work: option 3, the `fly-replay` approach, reduces the time for a write (from `lhr`) and is the fastest. Based on the guide, it uses a read-replica (port `5433`) unless the request is coming from the primary region. So when it isn't (e.g. from `lhr`), an exception is thrown (due to the write to a read-replica), and that exception is caught, triggering the replay in the primary region.
I can write about it if you like?
And/or I can have a look at Fastify if nobody has done that yet.
Wow that's amazing. We'd love to have you write about it. I think the first thing to do is create an example repository with a README. We're happy to pay for it, too.
@kurt Awesome. Ok, great, I’ll put something together over the weekend.
Hi @kurt
As discussed I have written a guide for how I deployed a Laravel application to Fly.
In the end I divided it into two parts: one repo to explain how to get a demo Laravel application deployed, and another repo which builds upon that to describe the changes I made to use the fly-replay header to improve database performance. I figured there will likely be people who will only need one or the other. And a single one became way too long!
They are:
Let me know what you think whenever you get a chance (email, or here, wherever!) and I can add/edit/delete whatever you want.
Hey guys,
I got multi-region working for TypeORM. I followed this: https://typeorm.io/multiple-data-sources#replication
Here is a gist of my code:
```ts
import { ConnectionOptions, createConnection } from 'typeorm';

// Illustrative wrapper: in my app this lives in the database bootstrap code.
export function connectToDatabase() {
  const databaseUrl = process.env.DATABASE_URL as string;

  let options: ConnectionOptions = {
    type: 'postgres',
    name: 'default',
    logging: false,
    synchronize: false,
    entities: [__dirname + '/../modules/**/*.js'],
    migrations: [__dirname + '/../migration/*.js'],
  };

  if (process.env.PRIMARY_REGION !== process.env.FLY_REGION) {
    // Not in the primary region: write to the primary, read from the
    // nearest replica (port 5433).
    options = {
      ...options,
      replication: {
        master: {
          url: databaseUrl,
        },
        slaves: [
          {
            url: databaseUrl.replace('5432', '5433'),
          },
        ],
      },
    };
  } else {
    // In the primary region: a single connection to the primary is enough.
    options = {
      ...options,
      url: databaseUrl,
    };
  }

  return createConnection(options);
}
```
I only do this for the production environment.
I think I just hit a variation of this in Rails when doing OmniAuth authorization. The request is replayed at the primary, but then fails with `bad_verification_code` (Troubleshooting OAuth App access token request errors - GitHub Docs).
I will try the `Fly-Prefer-Region` header and report back!
I’m starting to look into this again.
So if I use port `5433` for the connection URL, this will always automatically connect the client to the nearest replica, and `5432` to the primary node? What if there are no replicas available at a particular moment?
BTW has anyone implemented the request replay in Node? The docs mention adding an HTTP header:
> Once caught, just send a `fly-replay` header specifying the primary region. For `chaos-postgres`, send `fly-replay: region=scl`, and we'll take care of the rest.
I’m not sure I understand where this header needs to be added. The example seems like a very specific Rails implementation.
@pier That’s correct. I’m not sure what would happen if a region was down (and so a replica you’ve created is not available). I’d assume the same as if it didn’t exist at all - the connection would be routed to the primary.
As for Node, yep, I had a go a while back at using the technique with Fastify, with Prisma as the ORM. Check out:
Hopefully the readme explains how it works, but let me know if not.
That replay approach sends all queries to the nearest database. If that results in a write being sent to a read-only replica, Postgres will of course fail with an error. That exception/error is caught, and since you know the reason, you replay the whole HTTP request in the region the primary database is in. That region does allow writes, and so it works.
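In Node, that catch-and-replay looks roughly like this. A minimal sketch, assuming Fastify and the `pg` client (the route, table and status code are just illustrative; Postgres reports a write against a read-only replica with SQLSTATE `25006`):

```ts
import Fastify from 'fastify';
import { Pool } from 'pg';

const app = Fastify();

// All queries go to the nearest database. Outside the primary region that is
// the read replica on port 5433, so any write against it will throw.
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

app.post('/todos', async (request, reply) => {
  try {
    const { title } = request.body as { title: string };
    await pool.query('INSERT INTO todos (title) VALUES ($1)', [title]);
    return { ok: true };
  } catch (err: any) {
    // 25006 = read_only_sql_transaction: we tried to write to a replica.
    if (err.code === '25006') {
      // Ask Fly's proxy to replay this whole request in the primary region,
      // where the writable database lives.
      reply.header('fly-replay', `region=${process.env.PRIMARY_REGION}`);
      return reply.code(409).send();
    }
    throw err;
  }
});

app.listen({ port: 8080, host: '0.0.0.0' });
```

Fly's proxy sees that response header and re-sends the original request to a vm in the primary region - which is also why you need at least one app vm running there.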
Or you could avoid replays and instead use a separate read and write connection, where the ORM decides whether to use 5433 or 5432.
The only issue I recall having (with that replay approach) was ensuring there was an app vm in the same region as the primary database vm. With auto-scaling it wasn't possible to enforce which regions the app's vms were in, only to suggest a region pool, and if no vm is in the primary region, writes will fail. I'm not sure if that has been resolved since.
Thanks @greg I will check this out in detail!
I’m not using an ORM (Prisma in particular is pretty slow) but I was planning on just having two PG client instances.
But if the replicas are down for some reason, would read queries made to `5433` be sent to the primary instance?
@pier Hmm … that I don’t know. I would assume/hope their proxy would be smart enough to do that, but that is a total guess. It would need someone from Fly to confirm.
As for the ORM, yep, makes sense. I'd think you could still do it, e.g. with a `readClient` and a `writeClient` (set with the respective port) rather than simply one `client`. The replay bit is independent of that anyway - that's just catching an exception/error, which `pg` or whatever would also throw if you try and write to a read-replica.
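For example, a rough sketch of that split with the `postgres` client (assuming `DATABASE_URL` points at the primary on port 5432; the client names and the `todos` table are just for illustration):

```ts
import postgres from 'postgres';

const databaseUrl = process.env.DATABASE_URL as string;

// Writes always go to the primary on port 5432 (accepting the cross-region
// latency when the app vm is elsewhere).
const writeClient = postgres(databaseUrl);

// Reads target port 5433, which connects to the nearest Postgres member
// (when no replica exists, that ends up being the primary anyway).
const readClient = postgres(databaseUrl.replace(':5432', ':5433'));

// Pick the client per query:
export async function listTodos() {
  return readClient`SELECT * FROM todos`;
}

export async function addTodo(title: string) {
  return writeClient`INSERT INTO todos (title) VALUES (${title})`;
}
```

With this split you avoid the replay entirely: reads stay local and writes accept the hop to the primary region.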
Thanks. I hope someone from Fly can confirm.
The docs only mention:
> Port `5433` is direct to the PostgreSQL member, and used to connect to read replicas directly.
I did a little test in my dev environment using a primary client and a replica client on port `5433`.
Something like this:
```js
import postgres from 'postgres';

// primary client (DATABASE_URL uses the default port 5432)
const sql = postgres(process.env.DATABASE_URL);

// replica client (DATABASE_URL_REPLICA points at port 5433)
const sqlReplica = postgres(process.env.DATABASE_URL_REPLICA);
```
When using the replica client, the queries were sent to the primary (and only) instance. So I guess my question has been answered.
So my PG instances are all in AMS. I cloned a v2 app into LAX and also cloned a PG replica into LAX.
I’ve also created a new PG client instance using the port 5433 to use the replicas.
It made no difference in performance in a page that has multiple queries reading from the replica. Actually, some requests randomly take 3-5x longer now compared to when the LAX app was reading from AMS.
I restarted the LAX app machine in case “have you tried turning it off and on” or something… Same result.
This is the status of the new replica so I guess it should be working:
`started replica lax 3 total, 3 passing`
My DB is very small. I doubt it’s still copying the data to the volume.
Not sure if I’m missing something to make the whole thing work. Is there anything else I could check?
For the time being it makes more sense to just run everything in AMS.
Hopefully someone from Fly can chime in and let me know if I’m doing something obviously wrong or maybe something else is failing.
@pier That's strange. Assuming there is nothing else affecting it (like the ongoing issue, but that should be unrelated), yep, I'm not sure why a request would take longer from an app in LAX to a database in LAX compared to a database in AMS. Internet routing can be weird, but given that's thousands of miles away, that's certainly unexpected. I recall in my experiments I'd get the response to include the FLY_REGION to see where the request was being handled, and also check the logs (where I'd log when a vm was handling or replaying a request). That would show which vms were getting involved.
I actually just noticed they have marked the docs as legacy (Multi-region Postgres (Legacy) · Fly Docs). That may just be because the old database used Nomad, however I wonder if that also means the `fly-replay` trick is legacy too? It would still be supported by Fly's proxy, however I wonder if there is a different approach when using the new database. If it makes the request take longer, there is indeed no point in adding the additional complexity and consistency issues of the replica at all.
I think that my app in LAX probably was not connecting to the replica in LAX for some reason.
Maybe something failed when cloning the replica or maybe some network shenanigans?
The HTTP request includes this header:
`fly-request-id: 01GWQF...KHY-lax`
AFAIK this is the entry point into Fly’s network although yeah it’s possible the request wasn’t actually going to the LAX app for some reason.
Do you have any idea how I could check which replica is being used by a PG client?
@pier Hmm … yes, if your request was going to the AMS vm, or the LAX vm was actually connecting to the AMS database, that would explain the slowness. As for how to see which database vm the app's vm is actually using for a read (for writes, you know, of course!), I'm not sure. I don't know if the logs for the database show that (after all, a database on Fly is just another app behind the scenes - you can even clone their repo and make your own version of it … at least you could with v1, not sure if that's the case with v2).
I was just looking at how I revealed debugging info when trying out Planetscale's read replicas (this time with an Express app, also Node, and since they are MySQL-only, with the `mysql2` client). The same idea, connecting to the closest one, except instead of using Fly's internal magic to decide which is the closest, here it does some logic itself. Ignore that though - you can see I made a read and a write route, and after doing a query they return the region they were served from:
The Fly region comes from an environment variable that Fly apps provide you with at run-time:
The `PRIMARY_REGION` is manually provided in the `fly.toml`, as of course you know the region your primary database is in.
Here I returned JSON, however you could return those region values in headers instead.
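For completeness, here's roughly what such debug routes can look like on Fly - a minimal Fastify sketch (the route path is made up; `FLY_REGION` is set by Fly at run-time and `PRIMARY_REGION` comes from your fly.toml as above):

```ts
import Fastify from 'fastify';

const app = Fastify();

// Report which Fly region served this request and where the primary
// database lives. Comparing the two makes it obvious when a request was
// replayed or routed somewhere unexpected.
app.get('/debug/region', async () => ({
  servedFrom: process.env.FLY_REGION,
  primaryRegion: process.env.PRIMARY_REGION,
}));

app.listen({ port: 8080, host: '0.0.0.0' });
```

You could equally set those two values as response headers on every route if you'd rather check them in the browser's network tab.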
Well, not sure if Fly changed something on their end… but I tried to connect to a nearby replica in a Node test app and it worked on the first try.
If anyone is curious, here's the Node project using Fastify and the `postgres` client:
Initially I used the atdatabases client/ORM, which has lots of cool features. Unfortunately it does not support LISTEN/NOTIFY, which I can't live without (and I really don't want to implement this myself).
@pier were you able to resolve this in your application? I’m having the exact same issue.
Apps deployed to `lhr`, for example, will go to the Postgres instance in `sjc` instead, and completely ignore the Postgres instance in `lhr`.
I never implemented it in my application because of data residency and GDPR headaches.
But in my previous comment I made it work with a dumb Node project.