After watching @Mark’s presentation I got inspired to try out this multi-region setup on Fly (for those who didn’t see it: https://www.youtube.com/watch?v=IqnZnFpxLjI). However, I’m not able to set it up properly just by following the guides/READMEs; it keeps failing to connect to the database. This is what I did:
I used the getting started guide, added libcluster, and got the app running in two regions:
This all seems to work fine. It also set up a database cluster in a single region. After that, I started changing the code to support the multi-region DB setup using the fly_postgres README, implementing every step. Then I followed the guide on scaling the Postgres cluster to multiple regions. Once the app was deployed with the new code and fly_postgres, I added a new volume (so that I have one in ams and one in sea) and scaled the DB cluster to multiple regions (with a max of one instance per region). This works, and PRIMARY_REGION is set to ams for both the app cluster and the DB cluster.
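For context, the volume/scale steps I ran looked roughly like this (`<db-app>` is a placeholder for the DB cluster app name, and the exact flags may differ between flyctl versions):

```shell
# Add a second volume in sea, then scale to one DB instance per region
fly volumes create pg_data --region sea --app <db-app>
fly scale count 2 --max-per-region=1 --app <db-app>
```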
However, the sea app cluster can’t connect to the database:
Does fly status -a <db cluster app name> say that sea is a replica region? The “status” column should look something like this:
Instances
ID PROCESS VERSION REGION DESIRED STATUS HEALTH CHECKS RESTARTS CREATED
6dad6fd9 app 2 ord run running (replica) 3 total, 3 passing 0 2022-01-04T09:55:39Z
016d6f8f app 2 fra run running (leader) 3 total, 3 passing 0 2022-01-04T09:53:17Z
8b93c566 app 2 fra run running (replica) 3 total, 3 passing 0 2022-01-04T09:53:17Z
I’m getting the following, without leader/replica in its status…
Instances
ID PROCESS VERSION REGION DESIRED STATUS HEALTH CHECKS RESTARTS CREATED
ec7c8d09 app 4 sea run running 2 total, 2 passing 0 14h44m ago
9a05dd5b app 4 ams run running 2 total, 2 passing 0 2022-01-06T10:23:38Z
Not sure how I can define which one is the leader and which is the replica?
I also have no idea if there’s a way to transform such a deployment into a cluster - ran into a similar issue recently and my workaround was to create a new postgres app, clustered from the get-go.
When I originally ran fly launch for my Phoenix project, there was no option to choose which kind of Postgres deployment you want (clustered or not), so I also initially had a single-instance Postgres. When I added a new region to it and configured my app with the fly_postgres_elixir lib, I got the exact same error: the app instance in my replica region attempted to connect to a Postgres instance in the right region, but that instance listened on 5432, not 5433 (the port fly_postgres_elixir tries to connect to in replica regions, and the port Postgres replica instances listen on in a clustered deployment).
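To make the port behavior concrete, here's a small shell sketch of the idea (this is not the library's actual code; the app name, URL, and variable names are made up for illustration):

```shell
#!/bin/sh
# Sketch of fly_postgres-style URL rewriting: in the primary region, connect to
# the leader on 5432; in a replica region, connect to the local replica on 5433.
PRIMARY_REGION="ams"
FLY_REGION="sea"                                  # region this instance runs in
DATABASE_URL="postgres://user:pass@my-db.internal:5432/app"

if [ "$FLY_REGION" = "$PRIMARY_REGION" ]; then
  REGION_URL="$DATABASE_URL"                      # leader, port 5432
else
  # Replica region: region-qualified host, replica port 5433
  REGION_URL=$(printf '%s' "$DATABASE_URL" | sed "s|my-db.internal:5432|${FLY_REGION}.my-db.internal:5433|")
fi
echo "$REGION_URL"
# → postgres://user:pass@sea.my-db.internal:5433/app
```

So a single-instance (non-clustered) Postgres never has anything listening on 5433, which is why the replica-region app instance fails to connect.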
So I ran fly pg create --name <new pg cluster app name> --region <my primary region> and chose a production configuration, which created an app with two instances in the primary region: one leader, one replica; then added a new region and scaled the pg cluster app to 3 instances (2 replicas, 1 leader).
Then I removed the DATABASE_URL secret from my app* and ran fly pg attach --postgres-app <new pg cluster app name> to attach the new PG cluster to the app (later I simply deleted the old Postgres app without any data migration, because nothing of value was lost in my case, but YMMV). After redeployment it worked: app instances in both regions connected successfully.
* actually, I tried fly pg detach first, but it wouldn’t let me, probably because I had some data in the db, didn’t investigate any further.
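Put together, the migration steps above look roughly like this (app names are placeholders, and the exact flags may differ between flyctl versions):

```shell
# 1. Create a new clustered PG app in the primary region
#    (choose a production configuration when prompted)
fly pg create --name my-new-pg --region ams

# 2. Add a volume in the replica region, then scale to 3 instances
#    (1 leader + 2 replicas)
fly volumes create pg_data --region sea --app my-new-pg
fly scale count 3 --app my-new-pg

# 3. Point the app at the new cluster and redeploy
fly secrets unset DATABASE_URL --app my-app
fly pg attach --postgres-app my-new-pg --app my-app
fly deploy --app my-app
```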
You’re right that you need the production instance to accomplish this. I will run some tests to see if everything is hooked up correctly.
Yes @kkonstant, a recent change within Fly was to create a non-clustered PG database so it could be a “free” database. That works well for traditional single-server Phoenix apps, but when you go multi-region and need the replicas… yeah.
Looks like we can certainly do something to improve the visibility, guides or choices here. This is why the guide didn’t work for @noegip going multi-region. It was written before the Free PG DB config was introduced!
As of right now, the easiest way to transition from a standalone Postgres to a clustered Postgres is to just provision a new HA Postgres app from a snapshot.
# List volumes of Standalone PG
fly volumes list --app <postgres-app>
# List the snapshots associated with the Volume
fly volumes snapshots list vol_xxx
# Provision a new PG while specifying the snapshot obtained from your standalone PG.
fly pg create --snapshot-id <standalone-pg-snapshot-id>
The Standalone Postgres configuration just runs vanilla PG and does not come configured with Stolon. I 100% agree we need better documentation conveying the differences and how to transition from Standalone → Clustered.
Just a quick update on this. We did tweak this a bit and new provisions of the Standalone PG will now come configured with Stolon. This will allow you to horizontally scale up in the event you need HA.
I am trying to implement the same thing and I’m stuck on one issue: what is my next step here?
Do I need to create a new app in the replica region and attach this DB to the new app?
Each added instance uses the same credentials. fly_postgres updates the DB URL to point at the nearest instance, using internal Fly.io network DNS queries.
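If you want to see what those internal DNS lookups return, you can try them by hand from inside a VM (e.g. via fly ssh console; `<pg-app>` is a placeholder for your Postgres app name, and these names only resolve on the Fly private network):

```shell
# Fly's internal DNS exposes per-region and "nearest" records for an app
dig +short aaaa sea.<pg-app>.internal               # instances in the sea region
dig +short aaaa top1.nearest.of.<pg-app>.internal   # the single nearest instance
```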