App appears to move between data centers

I have an app that I originally set up in the atl data center, and a database running in the dfw data center. When I realized these were not in the same place, I deleted the app and re-created it in dfw under the same name.

Now it appears that my app will spin up in either of the two data centers. My app has been crashing due to some bugs, and when it is restarted after a crash it appears to switch back to the original data center.

Is this because I used the same name for the app, and there’s a bug in flyctl which surfaces when the same app name is used with different data centers?

Basic repro case (not specifically tested, but this is what I think is happening):

  1. Create an app in atl data center
  2. Deploy
  3. Delete the app
  4. Create an app in the dfw data center with the same name as in step 1
  5. Deploy the app
    The app is in the dfw data center, as expected
  6. Let it crash for some reason (in my case it was memory exhaustion)
  • Sometimes the app appears back in the original data center from step #1
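The repro steps above would look roughly like this with flyctl. This is a sketch, not something I've run end to end: the app name is illustrative, and the exact flags may differ by flyctl version.

```shell
# Sketch of the repro steps above (app name is illustrative)
fly apps create my-app            # step 1: create the app
fly regions set atl -a my-app     #         place it in atl
fly deploy -a my-app              # step 2: deploy
fly apps destroy my-app -y        # step 3: delete the app
fly apps create my-app            # step 4: re-create with the same name...
fly regions set dfw -a my-app     #         ...but in dfw
fly deploy -a my-app              # step 5: deploy; app runs in dfw
# step 6: after a crash, the instance may come back up in atl
```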

In the logs below, we see the app in atl, then a deploy, then the app in dfw. I don’t have logs showing it jumping from dfw to atl, but I can definitively say that I didn’t modify the app config between the first and second part of the log. (This log is contiguous output from fly logs, not a cut and paste.)

2022-01-03T02:37:56.595 app[c3baf174] atl [info]02:37:56.594 request_id=FsaiODnErUYc-AIAAAGh [info] GET /
2022-01-03T02:37:56.596 app[c3baf174] atl [info]02:37:56.595 request_id=FsaiODnErUYc-AIAAAGh [info] Sent 200 in 747µs
2022-01-03T02:37:57.848 app[c3baf174] atl [info]02:37:57.847 request_id=FsaiOIR5me5_eu8AAAGx [info] GET /favicon.ico
2022-01-03T02:37:57.849 app[c3baf174] atl [info]02:37:57.848 request_id=FsaiOIR5me5_eu8AAAGx [info] Sent 404 in 490µs
2022-01-03T02:39:54.968 runner[074cf0c6] dfw [info]Starting instance
2022-01-03T02:39:54.998 runner[074cf0c6] dfw [info]Configuring virtual machine
2022-01-03T02:39:55.000 runner[074cf0c6] dfw [info]Pulling container image
2022-01-03T02:39:58.090 runner[074cf0c6] dfw [info]Unpacking image
2022-01-03T02:40:01.846 runner[074cf0c6] dfw [info]Preparing kernel init
2022-01-03T02:40:02.289 runner[074cf0c6] dfw [info]Configuring firecracker
2022-01-03T02:40:02.290 runner[074cf0c6] dfw [info]Starting virtual machine
2022-01-03T02:40:02.437 app[074cf0c6] dfw [info]Starting init (commit: 7943db6)...
2022-01-03T02:40:02.454 app[074cf0c6] dfw [info]Preparing to run: `/app/entry eval Inboxhero.Release.migrate` as nobody

Here’s some more log weirdness, notice the app changing from DFW to ATL on the last 10 lines or so:

2022-01-03T03:15:48.245 app[3fdfc04b] dfw [info]03:15:48.245 [info] CONNECTED TO Phoenix.LiveView.Socket in 47µs
2022-01-03T03:15:48.245 app[3fdfc04b] dfw [info]  Transport: :websocket
2022-01-03T03:15:48.245 app[3fdfc04b] dfw [info]  Serializer: Phoenix.Socket.V2.JSONSerializer
2022-01-03T03:15:48.245 app[3fdfc04b] dfw [info]  Parameters: %{"_csrf_token" => "ZyocQ3pZFVxmGiRSImI6NCUnHWEfDVhgQdLs9ny-2TnjJ1jwFAU0zOm2", "_mounts" => "0", "vsn" => "2.0.0"}
2022-01-03T03:18:08.147 runner[f976ffab] dfw [info]Starting instance
2022-01-03T03:18:08.175 runner[f976ffab] dfw [info]Configuring virtual machine
2022-01-03T03:18:08.176 runner[f976ffab] dfw [info]Pulling container image
2022-01-03T03:18:10.808 runner[f976ffab] dfw [info]Unpacking image
2022-01-03T03:18:12.696 runner[f976ffab] dfw [info]Preparing kernel init
2022-01-03T03:18:13.034 runner[f976ffab] dfw [info]Configuring firecracker
2022-01-03T03:18:13.035 runner[f976ffab] dfw [info]Starting virtual machine
2022-01-03T03:18:13.182 app[f976ffab] dfw [info]Starting init (commit: 7943db6)...
2022-01-03T03:18:13.197 app[f976ffab] dfw [info]Preparing to run: `/app/entry eval Inboxhero.Release.migrate` as nobody
2022-01-03T03:18:13.212 app[f976ffab] dfw [info]2022/01/03 03:18:13 listening on [fdaa:0:3b60:a7b:2203:f976:ffab:2]:22 (DNS: [fdaa::3]:53)
2022-01-03T03:18:16.185 app[f976ffab] dfw [info]03:18:16.182 [info] Migrations already up
2022-01-03T03:18:16.207 app[f976ffab] dfw [info]Main child exited normally with code: 0
2022-01-03T03:18:16.208 app[f976ffab] dfw [info]Reaped child process with pid: 563 and signal: SIGUSR1, core dumped? false
2022-01-03T03:18:16.208 app[f976ffab] dfw [info]Starting clean up.
2022-01-03T03:18:25.597 runner[a83f2ce7] atl [info]Starting instance
2022-01-03T03:18:25.639 runner[a83f2ce7] atl [info]Configuring virtual machine
2022-01-03T03:18:25.640 runner[a83f2ce7] atl [info]Pulling container image
2022-01-03T03:18:29.253 runner[a83f2ce7] atl [info]Unpacking image
2022-01-03T03:18:32.932 runner[a83f2ce7] atl [info]Preparing kernel init
2022-01-03T03:18:33.337 runner[a83f2ce7] atl [info]Configuring firecracker
2022-01-03T03:18:33.422 runner[a83f2ce7] atl [info]Starting virtual machine
2022-01-03T03:18:33.643 app[a83f2ce7] atl [info]Starting init (commit: 7943db6)...
2022-01-03T03:18:33.660 app[a83f2ce7] atl [info]Preparing to run: `/bin/sh -c /app/entry start` as nobody
2022-01-03T03:18:33.677 app[a83f2ce7] atl [info]2022/01/03 03:18:33 listening on [fdaa:0:3b60:a7b:ac0:a83f:2ce7:2]:22 (DNS: [fdaa::3]:53)
2022-01-03T03:18:34.671 app[a83f2ce7] atl [info]Reaped child process with pid: 548, exit code: 0
2022-01-03T03:18:36.285 app[a83f2ce7] atl [info]03:18:36.284 [info] Running InboxheroWeb.Endpoint with cowboy 2.9.0 at :::4000 (http)
2022-01-03T03:18:36.286 app[a83f2ce7] atl [info]03:18:36.286 [info] Access InboxheroWeb.Endpoint at
2022-01-03T03:18:36.675 app[a83f2ce7] atl [info]Reaped child process with pid: 569 and signal: SIGUSR1, core dumped? false
2022-01-03T03:19:11.804 runner[3fdfc04b] dfw [info]Shutting down virtual machine
2022-01-03T03:19:11.893 app[3fdfc04b] dfw [info]Sending signal SIGTERM to main child process w/ PID 510
2022-01-03T03:19:11.894 app[3fdfc04b] dfw [info]Main child exited with signal (with signal 'SIGTERM', core dumped? false)
2022-01-03T03:19:11.895 app[3fdfc04b] dfw [info]Starting clean up.
2022-01-03T03:21:05.954 app[a83f2ce7] atl [info]03:21:05.953 [info] CONNECTED TO Phoenix.LiveView.Socket in 57µs
2022-01-03T03:21:05.954 app[a83f2ce7] atl [info]  Transport: :websocket
2022-01-03T03:21:05.954 app[a83f2ce7] atl [info]  Serializer: Phoenix.Socket.V2.JSONSerializer
2022-01-03T03:21:05.954 app[a83f2ce7] atl [info]  Parameters: %{"_csrf_token" => "ZyocQ3pZFVxmGiRSImI6NCUnHWEfDVhgQdLs9ny-2TnjJ1jwFAU0zOm2", "_mounts" => "0", "vsn" => "2.0.0"}
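As an aside, the instance-ID-plus-region prefix on each log line makes these region hops easy to spot mechanically. A small sketch (the parsing pattern is assumed from the log format above, not from any documented spec):

```python
import re

# Each Fly log line appears to start with: timestamp, source[instance-id], region.
# Pattern inferred from the logs pasted above.
LINE = re.compile(r"^(\S+)\s+(app|runner)\[([0-9a-f]+)\]\s+(\w+)\s+\[info\]")

def regions_seen(log: str):
    """Return the ordered (instance_id, region) pairs, collapsing consecutive repeats."""
    seen = []
    for line in log.splitlines():
        m = LINE.match(line.strip())
        if m:
            pair = (m.group(3), m.group(4))
            if not seen or seen[-1] != pair:
                seen.append(pair)
    return seen

# A few lines condensed from the logs above
sample = """\
2022-01-03T03:15:48.245 app[3fdfc04b] dfw [info]CONNECTED TO Phoenix.LiveView.Socket
2022-01-03T03:18:08.147 runner[f976ffab] dfw [info]Starting instance
2022-01-03T03:18:25.597 runner[a83f2ce7] atl [info]Starting instance
2022-01-03T03:19:11.804 runner[3fdfc04b] dfw [info]Shutting down virtual machine
"""

print(regions_seen(sample))  # the third pair shows a fresh instance coming up in atl
```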

From what I’ve seen, Fly moves apps depending on demand and capacity.

I’ve read other people saying you can pin an app to a region by mounting a volume to it there. You would create a volume in the region you want the app to stay in, e.g. in dfw, mount that to the app, and then in theory the app should not be moved from there.
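For the volume-pinning approach, the configuration would look roughly like this. This is a sketch: the volume name is illustrative, and I'm assuming the standard fly.toml mounts syntax.

```toml
# Hypothetical fly.toml fragment: mount a volume created in dfw
# (e.g. via `fly volumes create app_data --region dfw`) to pin the app there.
[mounts]
  source      = "app_data"
  destination = "/data"
```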

But maybe there is now a better/official way to enforce the region.

Thanks - I was mainly trying to keep the database and app server in the same data center because of 1) performance and 2) data-transfer costs between the app server and the database. Do you know if there’s any documentation about exactly what counts as billable data? Is all data free within the Fly.io network regardless of data center?

I also noticed that the app builder was in a different region than my app, meaning the Docker build process has to traverse data centers. To keep build times optimal (I deploy several times a day and want that feedback loop to be tight), I was hoping to keep it all in the same data center.

Lastly, I’ll just point out that it’s a little confusing/misleading that the preferred data center you enter when spinning up a new app is in fact a preference, recommendation, or hint, rather than a hard-and-fast rule.

I still think it’s possible that there’s a bug here because I used the same app name in different data centers at different times. I hope someone from fly reads this and can comment.

When you create an app, we default to giving it backup regions. This was a feature from when we first launched that makes less sense now.

If you run fly regions list you should see a list of primary and backup regions. Since you’re running a database in dfw, you’ll want to run fly regions backups dfw to set the backup region the same as the primary region. Does that make sense?

Absolutely makes sense, this is great!

Is there a help doc on this? I thought I read things fairly carefully, but I missed these details.

Also, looks like the command is fly regions backup dfw (no “S” on backup):

It appears to have worked swimmingly. Thank you!

➜  fly regions list
Region Pool:
Backup Region:

➜ fly regions backup dfw
Region Pool:
Backup Region:

So it looks like it removed the backup regions, which I assume is what’s desired.

As a follow-up question, will this ensure that my runner for app deployment also runs in dfw? If not, is there anything I can do to keep the deployment runner in dfw so the Docker layers don’t have to move between data centers?

There’s not a great doc on backup regions, no. It’s a bit of a misfeature we’re working really hard to get away from.

Are you asking about the release_command runner? Release commands will run in the same regions an app process is restricted to. So I think it will do what you want!
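For context, the migrate step visible in the logs above (`/app/entry eval Inboxhero.Release.migrate`) is the kind of thing that would typically be configured as a release_command. A sketch of the fly.toml fragment, assuming the standard deploy section:

```toml
# Hypothetical fly.toml fragment: the migration step seen in the logs
# runs as a release_command before each deploy finishes.
[deploy]
  release_command = "/app/entry eval Inboxhero.Release.migrate"
```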

I was talking about an app I have in my dashboard (fly-builder-red-darkness-7135) which shows as “Dead”; I assumed this was the runner that gets spun up when I fly deploy. But if this is in fact something old that I can delete, and the deployment VM gets spun up in the target region and then destroyed afterwards, then I’m all good.

Thanks for the thorough responses here, I really appreciate it! As a Heroku (and Ruby/Rails) refugee, I am finding it to be a wonderful new home for my Elixir projects.