Request for code: a Fly multi region Ruby gem

Fly lets you run app servers and read replicas in multiple regions. We also give you a tool for getting requests to the right place. You can read all about it here: Multi-region PostgreSQL · Fly

The example is Rails-based, and we’d like to deliver this as a Ruby gem. The gem needs to do two things:

  1. Modify the DATABASE_URL environment variable before Rails boots.
  2. Install Rack middleware and modify the ApplicationController to handle Postgres readonly errors and send back the fly-replay header. This is easy to implement in a controller, but it might be a little tricky to do as middleware.
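
For the middleware half of item 2, a minimal sketch might look like the following. The class name, the 409 status, and the `primary_region:` option are assumptions made for illustration, not the gem's API. The error is matched by class name so the sketch loads without the pg gem, and the cause chain is walked because Rails typically wraps driver errors.

```ruby
# Sketch of a Rack middleware that turns a Postgres read-only error into a
# fly-replay response. Names here are illustrative, not the gem's API.
class ReplayOnReadOnlyError
  # Class name of the error Postgres raises for writes on a read replica.
  # Compared by name so this sketch loads without the pg gem.
  READ_ONLY_ERROR = "PG::ReadOnlySqlTransaction"

  def initialize(app, primary_region: ENV["FLY_PRIMARY_REGION"])
    @app = app
    @primary_region = primary_region
  end

  def call(env)
    @app.call(env)
  rescue StandardError => e
    # Anything that is not a read-only error is re-raised untouched.
    raise unless read_only_error?(e)

    # Assumption: a 409 status; the fly-replay header is the part that matters.
    [409, { "fly-replay" => "region=#{@primary_region}" }, []]
  end

  private

  # Walk the cause chain, since frameworks often wrap the driver error.
  def read_only_error?(error)
    while error
      return true if error.class.name == READ_ONLY_ERROR

      error = error.cause
    end
    false
  end
end
```

A middleware like this only sees errors that reach it, so anything earlier in the request path that rescues exceptions would hide the read-only error from it.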

If the gem doesn’t detect these environment variables, it should print a warning to logs and do nothing:

  • DATABASE_URL
  • FLY_REGION
  • FLY_PRIMARY_REGION

It would be nice to make these environment variables configurable, or at least make it easy to add config options for v2.

We think this is a ~4 hour project for experienced Rails and Ruby gem developers. We’ll pay you $1,000 to build the first version for us! Post here if you’re interested, link to previous Ruby / Rails / Rack code and we’ll let you know when to go.

I’m interested in this, as I had already started testing this behavior in a Rails app. Questions:

Why do you prefer middleware to hooking into controllers for exception handling?
Why not also offer a simpler option to redirect on any non-idempotent request (PUT/PATCH/DELETE)? For most apps, this would avoid unnecessary code paths.

Here’s some ridiculously old Rails plugin code, and a simple example of a helper module I use across projects.

It’s all yours if you want it!

To answer your questions:

I’d like to do both, but I’m ok starting with just controller hooks. We have a lot of mounted Rack apps in our Rails projects.

We might have a proxy that lets you make those kinds of choices someday. The problem we ran into is way too many apps write to the DB on GET requests, and people need to be ready to handle it anyway.

OK, here’s a first attempt!

This runs as a middleware at the end of the stack, which should keep exception handlers from grabbing the error first. This would need to be tested with tools like Sentry.

Gem source: GitHub - soupedup/fly-rails: Ruby gem for handling requests within a Fly.io multiregion database setup
Rails app source: https://github.com/soupedup/fly-rails-example
Demo app: https://js-fly-multiregion.fly.dev

I will work later on making it configurable. What sort of configuration did you have in mind regarding env vars?

Dang you’re fast. This looks like what we were after!

The config I can imagine is “use a different env var for database url”. Other config options are, I think, premature scope creep. But I can imagine wanting to hard code the FLY_PRIMARY_REGION too.

OK, I’ll add a config interface to be used in an initializer. I named the gem fly-rails as I think it would make sense to add other enhancements in the same place, such as a prometheus metrics exporter, rather than split it all up into smaller gems. What do you think?
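
A config interface for an initializer could be as small as the sketch below. The module name `FlySketch` and both option names are hypothetical; they cover only the two settings discussed so far (which env var holds the database URL, and a hard-coded primary region).

```ruby
module FlySketch
  # Holds the two settings discussed in this thread; names are illustrative.
  class Configuration
    attr_accessor :database_url_env_var, :primary_region

    def initialize
      @database_url_env_var = "DATABASE_URL"
      @primary_region = ENV["FLY_PRIMARY_REGION"]
    end
  end

  # Lazily built, shared configuration object.
  def self.configuration
    @configuration ||= Configuration.new
  end

  # Yields the configuration so an initializer can override defaults.
  def self.configure
    yield configuration
  end
end

# e.g. in config/initializers/fly.rb
FlySketch.configure do |config|
  config.database_url_env_var = "PRIMARY_DATABASE_URL"
  config.primary_region = "iad"
end
```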

Otherwise, a weakness with this approach would come up when a replica automatically gets promoted to primary, as I believe is the case with your Postgres cluster today. This is why I would add an option to send all POST/PUT/DELETE requests to the primary region, and only rely on read-only exceptions as a last resort, or not at all (IMO) by offering another out for forcing GET requests to redirect.
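
As a rough illustration of that option, the decision could be a single predicate on the request method and the two regions. The function name and the exact list of methods are assumptions for the sketch.

```ruby
# HTTP methods treated as writes in this sketch.
WRITE_METHODS = %w[POST PUT PATCH DELETE].freeze

# True when a request should be replayed in the primary region purely
# because of its HTTP method, before it ever touches the database.
def replay_write_to_primary?(request_method, current_region, primary_region)
  return false if current_region == primary_region

  WRITE_METHODS.include?(request_method.to_s.upcase)
end
```

A check like this would not catch GET requests that write to the database, so the read-only error handling would still be needed as a fallback.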

The way our Postgres clusters are configured, only nodes in the primary region should ever be elected leader. Secondary nodes aren’t electable or used in synchronous replication.

I do agree that having a Fly gem is a good idea. One thing we might want to add to this is simulated synchronous replication. You can actually query Postgres for replica status. If the write request knows which follower a user will hit next time, it could wait for replication there to finish before returning. But I think this’ll need some experimentation. :slight_smile:

Would this be to prevent something like ‘create and redirect to show’ from failing?

For this case, we could do something mentioned in another thread: force that subsequent read request to the primary region. The middleware could append a fly_region=iad param to the redirect URL, then replay any requests with this param.
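
Appending the param to a redirect URL could be sketched like this, using only the standard library. The helper name is hypothetical, and `fly_region` is simply the param name suggested above.

```ruby
require "uri"

# Returns the redirect location with the region param set, replacing any
# existing value so the param is not duplicated.
def with_primary_region_param(location, primary_region, param: "fly_region")
  uri = URI.parse(location)
  pairs = URI.decode_www_form(uri.query || "")
  pairs.reject! { |key, _value| key == param }
  pairs << [param, primary_region]
  uri.query = URI.encode_www_form(pairs)
  uri.to_s
end
```

The middleware would then look for this param on incoming requests in replica regions and answer with a fly-replay response instead of handling them locally.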

That said, checking for replication status could end up being faster, if more platform-specific. But, in a high traffic app it’s not clear you could always rely on this technique.

Yes, exactly. One simple thing to start would be to set a "read from primary region until now + 5 seconds" in the session. The Gem could look for that and send a fly-replay response like it does for errors.

I do think we can come up with something to solve this in the gem, though; it’s a good thing to add. We’ll probably end up porting any decisions here to other frameworks.

OK, this is similar to the default handler for automatic switching between replicas and primaries in Rails. There, also, non-idempotent verbs are automatically sent to the primary. So, it might be smart to build this in a similar fashion.

Just to clarify, what do we want here for a ‘first version’?

I think a session based equivalent of that “delay” setting is good for a first version. The simplest thing to do might be to just set this value on every request to the primary region, like:

if ENV["FLY_REGION"] == ENV["FLY_PRIMARY_REGION"]
  session[:fly_primary_until] = 4.seconds.from_now
end
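
The reading side of that session value might look like the sketch below, written in plain Ruby so it runs outside Rails. The function names are hypothetical, and storing the deadline as an integer epoch (rather than a time object) is an assumption made so the value serializes cleanly in any session store.

```ruby
# Seconds during which reads should stick to the primary region.
PRIMARY_STICKY_SECONDS = 4

# Call for requests handled in the primary region.
def mark_primary_until(session, now: Time.now)
  session[:fly_primary_until] = (now + PRIMARY_STICKY_SECONDS).to_i
end

# True when a request arriving in a replica region should be replayed in
# the primary region because the sticky window has not expired yet.
def replay_for_freshness?(session, current_region, primary_region, now: Time.now)
  return false if current_region == primary_region

  deadline = session[:fly_primary_until]
  !deadline.nil? && now.to_i < deadline
end
```

When `replay_for_freshness?` returns true, the gem would send the same fly-replay response it sends for read-only errors.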

This is more than I expected to solve in the first version so I’m pretty excited. :smiley:
