Bug? Deploy of multi-process app reactivates process groups that had been set to 0

Initial state: A multi-process group app where some processes have been manually scaled down to zero (fly scale count app=1 worker=0)

Action: fly deploy

Expected result: worker remains scaled to zero / inactive.

Actual result: worker is scaled back up / reactivated. Excerpt:

Process groups have changed. This will:
 * create 1 "worker" machine and 1 standby machine for it
No machines in group worker, launching a new machine
  Machine [...] [worker] was created
[...]
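
For reference, the full sequence that reproduces this (group names are from our app; yours will differ):

  # Scale the worker group to zero; keep one machine in the app group.
  fly scale count app=1 worker=0

  # Confirm the worker group has no machines.
  fly scale show

  # Redeploy with an unchanged fly.toml...
  fly deploy
  # ...and the worker group comes back, per the excerpt above.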

I couldn’t find this behavior documented (and in our case it’s particularly unwanted). Is this a bug?

I’d consider this expected - the scale commands don’t really save any desired state. When we need to show the state of an app (like with scale show), we derive it by looking at all the machines.

Because of this, there’s no difference internally between scaling a group down to zero and having just added the group to the toml but not deployed it yet.
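
Concretely, these two paths end in the same observable state (just a sketch of the idea, not actual flyctl internals):

  # Path A: the group existed, then was scaled down:
  fly scale count worker=0   # now zero machines in group "worker"

  # Path B: the group was just added to fly.toml and never deployed:
  #   also zero machines in group "worker"

  # Either way, a later "fly deploy" sees a process group in fly.toml
  # with no machines behind it, and launches one.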

May I ask what you’re trying to do? I want to see if there’s another way to get the same result, and if not, I’ll look for some way (probably command-line flags!) to make flyctl support this use case.
(if you don’t want to talk about that on a public forum, I totally understand though :slight_smile: )

Hi Allison! Happy to talk about it, and thanks for the fast reply.

I’d consider this expected - the scale commands don’t really save any desired state […]

Interesting. I’d say it’s maybe less surprising to a Fly’er or someone who already knows this detail, but it’s still surprising to us. It’s a bit different from other orchestration environments we know (ECS, k8s, Cloud Run, Heroku). In those places, when you configure a static scale of N, N doesn’t change without you asking! :slight_smile:

May I ask what you’re trying to do? I want to see if there’s another way to get the same result, and if not, I’ll look for some way (probably command-line flags!) to make flyctl support this use case.

Definitely! Actually have two examples.

Example 1: We’re working to bring more of our infra over to Fly (yay!) while making as few changes as possible to how our services are structured. In other words, we don’t want to refactor anything now “just for Fly” if we can avoid it.

One service we’re bringing over from Heroku has 3-4 process groups over there. One process is our main dashboard, and another is a job queue. We want to be able to test the app server on Fly without the job server spinning up and taking over production jobs. So the idea was to set “scale=0” on that process.

For now, the workaround is to comment out the process group.
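
For example, in fly.toml (the commands here are placeholders, not our real ones):

  [processes]
    app = "bundle exec puma"
    # worker = "bundle exec sidekiq"  # commented out so deploy won't recreate it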

Example 2: We have a staging environment with the same complement of process groups. However, because staging usage is bursty, we’d rather not leave machines running that we’re not using.

In particular, one process (in prod) listens forever on an open WebSocket, so scaling to zero when idle can’t be used here. In staging, this process is usually irrelevant - unless we’re testing something related to it.

So the problem is that a general redeploy to staging will gratuitously bring the listener process back up until someone notices and scales it back down - which is basically the workaround I have right now.

Thanks for listening and any suggestions!

It’s imperfect, but until we have a better solution for this, there’s a new --update-only flag for fly deploy as of flyctl 0.1.58 that doesn’t create machines for empty process groups. I’d like to make this better at some point, but this should unblock you for now :slight_smile:
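
e.g.:

  # Update machines in groups that already have some, without creating
  # machines for groups that currently have zero (flyctl 0.1.58+):
  fly deploy --update-only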

(this also means, though, that if you add new process groups, that flag will prevent them from getting their machines)

Does that work for your use case at the moment?

Does that work for your use case at the moment?

Thanks! That looks like a decent workaround, and it’s cool to see this feature added so quickly.

I think the Fly Apps behavior is still perhaps a little unintuitive, though. I could imagine someone forgetting about, or not knowing about, this flag, putting us back at the original problem.

Perhaps it would be more intuitive/discoverable if the default for deploy were to prompt (or abort) when (rough sketch below):

  • machines would change, AND
  • neither --update-only nor --create-machines-if-needed (a hypothetical new flag meaning the opposite of --update-only) was supplied
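
Roughly (only --update-only exists today; the prompt behavior and --create-machines-if-needed are hypothetical):

  fly deploy                               # machines would change -> prompt or abort (hypothetical)
  fly deploy --update-only                 # explicit: don't create machines for empty groups
  fly deploy --create-machines-if-needed   # explicit: do create them (hypothetical flag)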

Just thinking out loud here. Thanks!
