I’m puzzled because on a freshly created app on Fly when I run the following command…
fly deploy -a my-new-app --regions ams --vm-size shared-cpu-1x --vm-memory 512
… the machine and volume are created in the “fra” region (some default?). I don’t have a region set in fly.toml, as I want to be explicit about where each deployment goes.
Creating volumes and machines separately and providing a --region flag works, but fly deploy always gets me a machine+volume in “fra”.
Either it’s a bug in the fly CLI, or maybe, since you don’t have a default region set, it defaults to the one closest to you? You can try fly scale count 1 --region <region> to see if that works.
fly scale count 1 --region ams works. I just have to delete the old volume+machine in the fra region manually afterwards.
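For reference, the manual cleanup can also be done from the CLI. A sketch (the IDs are placeholders you’d take from the list output; double-check you’re destroying the right resources):

```
# List the resources that ended up in the wrong region
fly machine list -a my-new-app
fly volumes list -a my-new-app

# Destroy the stray machine first, then its volume
fly machine destroy <machine-id> -a my-new-app --force
fly volumes destroy <volume-id> -a my-new-app
```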
How can I set a default region (other than in fly.toml)? Agreed, it smells a bit like a bug in fly deploy. I’m aiming to have very simple instructions for self-hosting the app, so every additional step is bad. I hope it can be done with a clean fly deploy command.
Generally I’d expect that setting flags on fly deploy explicitly will override any default values. It’s also surprising that if I have provided
[[vm]]
memory = "256mb"
in fly.toml, running fly deploy with --vm-memory 512 will still use the 256. Explicit should always overrule implicit, in my opinion. Or is there some other reasoning behind that I’m overlooking?
There may be confusion as to what this flag means. I use it periodically when I want to only deploy a change to a subset of the regions to which I have deployed an application.
Looking at the code, --regions is an alias for --only-regions.
Later in the code, this flag is used to remove machines from the list to deploy to:
I see no evidence that this flag was intended to be used to determine where the initial deployment should be done to if there are no machines.
Disclaimer: I am a Fly.io employee, but I did not write this code.
Thanks very much for the investigation. Let me describe my use case really quick to add context:
We’re deploying live-editable dynamic websites for clients. Each deployment is a single machine + volume with an sqlite.db. It’s a SvelteKit app running on the Node adapter.
The current deployment workflow is this:
Create a new app.
fly apps create sams-website
Set the env vars.
fly secrets set -a sams-website \
DB_PATH='/data/db.sqlite3' \
ORIGIN='https://sams-website.fly.dev'
We have one codebase for many clients, so we cannot put configuration like region or volume size in fly.toml. That’s why we’d like to set the initial parameters with explicit fly deploy flags.
I have a few questions:
Would it be reasonable to have --regions also determine the initial deployment region? (This would make our lives easiest, I think; could you check whether your team could make that change?)
Is there an alternative way to specify which region the initial deployment should go to? Setting primary_region in fly.toml works, but since we want to deploy to different regions from the same codebase, we need an explicit way to set it (via the CLI?).
Or, asking more generally: what’s the officially suggested way when you can’t have one fly.toml per app (but need multiple variations of the config for different deployments)? Should we consider managing one fly.toml file per client that we don’t check into version control? (I’m a bit worried about maintenance here, as the shared settings would need to be kept in sync.)
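One low-maintenance variant of the per-client-file idea is to keep the shared settings in a single checked-in template and render a small per-client fly.toml at deploy time, then point fly deploy at it with -c/--config. A sketch below; the file names, the __PLACEHOLDER__ tokens, and the render_config helper are all my own inventions for illustration:

```shell
#!/bin/sh
# Hypothetical sketch: one shared template, a tiny per-client render step.
# Only the template is kept in version control; rendered files are ignored.
cat > fly.template.toml <<'EOF'
app = "__APP_NAME__"
primary_region = "__PRIMARY_REGION__"
swap_size_mb = 512
EOF

# render_config <app-name> <region> writes fly.<app-name>.toml
render_config() {
  sed -e "s/__APP_NAME__/$1/" -e "s/__PRIMARY_REGION__/$2/" \
    fly.template.toml > "fly.$1.toml"
}

render_config sams-website ams
# Then deploy with the rendered config:
#   fly deploy -c fly.sams-website.toml -a sams-website
```

This keeps the shared settings in exactly one place, at the cost of a one-line render step before each deploy.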
For more context, this is our fly.toml file with generic settings.
swap_size_mb = 512 # Allocates 512MB of swap memory (make sure --volume-initial-size is set to 2GB or more)
[build]
[experimental]
cmd = ["/app/scripts/start-fly.sh"]
entrypoint = ["sh"]
[mounts]
source = "data"
destination = "/data"
auto_extend_size_threshold = 80
auto_extend_size_increment = "1GB"
auto_extend_size_limit = "5GB"
[http_service]
internal_port = 3000
force_https = true
# set to true to automatically stop Machines when the app is idle for several minutes and reduce costs
auto_stop_machines = false
auto_start_machines = true
min_machines_running = 0
processes = ["app"]
I can make the change, but first I want to explore your use case to see if there is a better solution. In particular, I want to make sure that the CLI we provide isn’t getting in your way.
It sounds to me like you have written code to orchestrate the machines you are deploying, written in perhaps bash or Node.js, and that code is shelling out commands. If you have written an orchestrator, you probably should consider our machines API: Machines API · Fly Docs, with that you can create apps and start machines, no fly.toml required at all. You can even build Docker images yourself and push them to the registry of your choice, and reference those images using our API.
With that as a baseline, what conveniences does our CLI provide that you would miss if you were to take that approach? If you identify something important to you that our CLI does that our Machines API doesn’t, I’ll look into making the change you requested. Otherwise you may find that the Machines API is actually more convenient for you…
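To make the comparison concrete, here is roughly what creating a machine via the Machines API looks like. This is a hedged sketch: the endpoint shape follows the Machines API docs linked above, but the app name, image, and token handling are placeholders, and I have not run these exact calls.

```
# Assumes FLY_API_TOKEN is set (e.g. from `fly tokens create deploy`)
# and that the app already exists.
curl -X POST "https://api.machines.dev/v1/apps/sams-website/machines" \
  -H "Authorization: Bearer $FLY_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "region": "ams",
    "config": {
      "image": "registry.fly.io/sams-website:latest",
      "guest": { "cpu_kind": "shared", "cpus": 1, "memory_mb": 512 }
    }
  }'
```

Note that the region is explicit and per-request here, which is exactly the property missing from the fly deploy flow described above.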
Short-term (within the next 12 months) we’ll have fewer than 30 clients and prefer to do the initial deploy (with the 3 commands I mentioned) and updates (with fly deploy -a sams-website) manually. There are even clients with technical staff who want to do this themselves. So this is a great baseline setup for us.
Long-term: we’re absolutely interested in putting some automation/orchestration scripts in place (maybe even building an “app that manages the apps”) so we can manage 100+ sites (e.g. rolling out an update to all apps and reusing a Docker image). We just can’t invest too much into that in the coming weeks (we’re a two-man bootstrapped business).
For us, using only the CLI and a fly.toml (containing the generic config) is the fastest way to achieve our goals. We’re already there (the only thing that needs a workaround atm is setting the initial deployment region). I’m eager to study the more granular Machines API and use it in the future. Short-term, however, being able to do the initial fly deploy while specifying the region would be wonderful.
Side question, just to make sure I understand the intended philosophy behind your tools: the CLI’s main purpose is to get started fast (deploy the first app), and for more complex workflows the API is recommended. That said, the CLI is a subset of the Machines API. Are those assumptions correct?
That’s a good approximation. Internally to Fly, we have a different approximation which I will share:
There are two types of apps: framework apps (which run fly launch once, ideally use PostgreSQL, and have a pool of machines for high availability and/or local responsiveness), and machines apps (which have a pool of machines that are started potentially as often as when a new request comes in).
Both are, at best, approximations. I claim that there is a third type: a SQLite app with one machine per user. You can read more at Shared Nothing Architecture. My implementation is different from yours in that I use a single Fly app and dynamic request routing, which means that for my use case fly clone is how I provision a new machine.
It is unlikely that I will get to your request today, but I should be able to look into it this week. It is likely a small change, but if I see any issues I’ll report back here.
Cool. If I can stress just one point from those pages it would be: take responsibility for your own backups. You have clients. Their data is on a volume. Given enough time and clients, assume that you will eventually experience a volume failure. We have snapshots, but read the blue box near the top of this page: Manage volume snapshots · Fly Docs
Realistically, it is unlikely to happen to you, but we have a lot more users and over time we have had users lose data and not notice until the snapshots have expired. It is not a pleasant experience.
I implemented an rsync strategy before we had Tigris, and I’m still using it. But these days I recommend litestream and Tigris. I encourage you to give it a try.
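If you do try it later, the setup is small. A sketch of what a Litestream-to-Tigris configuration might look like (the bucket name and credentials handling are assumptions; the endpoint is Tigris’s S3-compatible one):

```
# /etc/litestream.yml, written here via a heredoc for completeness
cat > /etc/litestream.yml <<'EOF'
dbs:
  - path: /data/db.sqlite3
    replicas:
      - type: s3
        bucket: sams-website-backups   # assumption: your Tigris bucket
        path: db
        endpoint: https://fly.storage.tigris.dev
EOF

# Continuous replication (run alongside the app):
#   litestream replicate -config /etc/litestream.yml
# Restore onto a fresh machine:
#   litestream restore -config /etc/litestream.yml -o /data/db.sqlite3 /data/db.sqlite3
```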
Thanks for the warning on volumes. I put hourly backups in place two weeks ago. We do full backups to Tigris, keeping the latest hourly backup of the current day plus 30 daily backups (taken right before midnight).
The reason I haven’t tried Litestream (yet) is that we give clients access to the S3 bucket, so they own their backups too (they can access them via the Tigris dashboard). It would be confusing for them to see all the extra files (WAL logs etc.). Since they have the app source code and the sqlite3 file, they can redeploy on a new Fly machine (or on any other platform, even) without needing us (given that someone can follow the steps in the README).
Just one quick question: I use WAL mode in production, and in order to avoid creating a full copy of the DB on disk (a few clients have 500MB SQLite files, as we store WebP images in the DB as well), I flush the WAL log and temporarily disable auto-checkpointing to safely upload the db.sqlite3 file. Is this a legit approach? (See the code below.)
#!/bin/sh
# We assume the current directory is the root of the project (e.g. /app on Fly.io)
LOG_FILE="/data/backup.log"
touch "$LOG_FILE" # make sure the log file exists
echo "$(date -u):" | tee -a "$LOG_FILE"
# Flush the WAL log into the database file before uploading a backup
echo "Temporarily disabling auto-checkpointing and flushing the WAL log into /data/db.sqlite3..." | tee -a "$LOG_FILE"
sqlite3 /data/db.sqlite3 'PRAGMA wal_autocheckpoint = 2147483647;'
# Re-enable auto-checkpointing even if the upload fails or the script is interrupted
trap 'sqlite3 /data/db.sqlite3 "PRAGMA wal_autocheckpoint = 1000;"' EXIT
sqlite3 /data/db.sqlite3 'PRAGMA wal_checkpoint(FULL);'
# Upload to S3
node sqlite/backup.js 2>&1 | tee -a "$LOG_FILE"
echo "Re-enabling auto checkpointing..." | tee -a "$LOG_FILE"