Custom deployment strategies

We are currently developing a Phoenix application on Fly. It’s a multiplayer backend that stores state in memory and dispatches updates to clients over WebSockets, clustered across the globe to keep ping times low.

The in-memory state is persisted to S3. We use the bluegreen deployment strategy, and it’s important to save all the state to S3 before routing traffic to the new cluster, so we don’t lose anything. The process we came up with goes roughly like this:

  1. release_command adds an entry in the DB to signal to the old cluster that a deployment is happening
  2. new cluster boots but its /health responds with a 503 as long as the DB entry is present, so traffic does not get routed to it yet
  3. old cluster sees the DB entry and starts a “pre shutdown” sequence
    a. close all channels
    b. save all projects to S3
    c. remove DB entry
  4. new cluster’s /health responds 200, traffic is routed to the new cluster and clients start reconnecting
  5. old cluster nodes get a SIGTERM and finish shutting down
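
To make the moving parts concrete, here is a minimal, language-neutral sketch of the steps above, written in Python rather than Elixir purely for illustration. Every name in it (`DeployFlagStore`, `OldNode`, `health_status`, `pre_shutdown`, `save_project`) is hypothetical and stands in for the real DB entry, the Phoenix channels, and the S3 upload.

```python
from dataclasses import dataclass, field


@dataclass
class DeployFlagStore:
    """Hypothetical stand-in for the DB entry written by release_command."""
    deploying: bool = False


@dataclass
class OldNode:
    """Hypothetical old-cluster node: open channels plus in-memory projects."""
    channels: list = field(default_factory=list)
    projects: dict = field(default_factory=dict)


def health_status(flags: DeployFlagStore) -> int:
    """Steps 2 and 4: /health answers 503 while the entry exists, 200 after."""
    return 503 if flags.deploying else 200


def pre_shutdown(node: OldNode, flags: DeployFlagStore, save_project) -> None:
    """Step 3: close channels, persist every project, then remove the entry.

    save_project is an injected callable (assumed to upload to S3). If it
    raises, the entry is left in place and the new cluster keeps answering 503.
    """
    node.channels.clear()                       # 3a. close all channels
    for project_id, state in node.projects.items():
        save_project(project_id, state)         # 3b. save all projects
    flags.deploying = False                     # 3c. remove DB entry
```

The ordering inside `pre_shutdown` is the important part: the entry is only removed after every save has returned, so the new cluster cannot start receiving traffic while state is still unsaved.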

This is a bit too complex for our taste; we would like to get rid of the pre-shutdown sequence and the DB state. Our ideal deployment would go like this:

  1. boot new cluster
  2. (optional) run some quick integration tests on the new cluster via a private tunnel
  3. gracefully shut down the old cluster with SIGTERM
  4. route traffic to the new cluster

Is it something that could be achieved with the current APIs? If not, we would be happy to start a discussion around this subject :slight_smile:

Have you considered storing the data in Fly-managed Redis instead, to keep the clusters in sync? It is quite expensive and still in preview, but using it might simplify the pre/post-deploy ceremonies.

Alternatively, you can consider using disks that persist across deploys (though be wary of zombie disks). You could even run SeaweedFS on top of them, if you’re adventurous.

The problem with failing the health check (steps 2 to 4) for a longer time is that Fly might roll back the deployment, which is another scenario the app would have to handle.

The issue is more about being able to customize the bluegreen strategy than about which tech we use to work around the lack of customizability :slight_smile:

Gotcha, but I’d avoid it if I were you (there is a bunch that can go wrong, as you know).

Is it something that could be achieved with the current APIs?

Anyways, for Machine apps, both release_command (code) and the rolling strategy (there’s no blue-green) are driven client-side by flyctl (code). So, if you want to implement a new strategy or customize the existing one, it is pretty straightforward to do so (don’t quote me on it, I’ve never had to do it ;)).

For regular apps, deployments are handled server-side by Fly, so short of them implementing a new strategy or modifying an existing one, I don’t see how it would be a worthwhile endeavour…