Preview: multi-process apps (get your workers here!)

Fly.io apps now support multiple processes with straightforward config changes. The deploy process for this is likely to have bugs, but once you get an app deployed the feature is stable.

To run multiple processes in a Fly app, you need:

  1. a [processes] block: a map of process names to the commands they run
  2. <block>.processes entries on your services, mounts, and [statics] configurations: each accepts an array of process names that the config block applies to

Here’s an example config:

app = "fly-global-rails"

[processes]
web = "bundle exec rails server -b [::] -p 8080"
worker = "bundle exec sidekiqswarm"

[build]
  builder = "heroku/buildpacks:20"
  [build.args]
    RAILS_ENV="production"
    RAILS_SERVE_STATIC_FILES = "true"

[env]
  PORT = "8080"
  RAILS_ENV = "production"
  RAILS_SERVE_STATIC_FILES = "true"
  PRIMARY_REGION = "scl"

[[services]]
  processes = ["web"] # this service only applies to the web process
  http_checks = []
  internal_port = 8080
  protocol = "tcp"
  script_checks = []

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 6
    timeout = "2s"

Per-process commands

You will need flyctl version v0.0.234-pre-3 or greater to manage apps with multiple process groups.

Change VM counts: fly scale count web=2 worker=1

Change VM size: fly scale vm shared-cpu-1x --group worker

Change regions: fly regions set iad --group worker
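After running the commands above, you can sanity-check the resulting layout with flyctl's read-only commands (the app name here is taken from the example config earlier in this post):

```shell
# Show per-group VM counts and sizes after scaling
fly scale show -a fly-global-rails

# Show which regions the app's groups are placed in
fly regions list -a fly-global-rails
```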


This feature is spot on! Should we run migrations like any other process type? For example, adding a release process that runs the usual migration command.

At the moment I am using the release_command on the web process to run them. Not ideal, since I have to deploy the web instance if I want to run migrations. But I haven’t found another way to make sure that migrations are executed before web or worker is spun up.

[deploy]
release_command = "rails db:migrate data:migrate"

The release command will always run before web or worker get spun up. It’s designed for exactly what you’re doing. :slight_smile:

The release_command is separate from the process types; nothing in the [processes] block applies to it.


Is this multiple processes per ‘VM’ with a process supervisor, or multiple one-process-per-VM but defined in a single config?

The simple app + SQLite/Litestream dream is pretty much here now that Litestream supports running a child process, but it would be neat to remove an additional layer if multiple processes could share one VM.

This is multiple VM types, one process each. It’s useful for Rails apps where you want to run app servers close to people and workers close to the writable version of your Postgres. It also happens to be useful for several kinds of DB clusters. :slight_smile:

I have some ideas for doing sidecar like things for Litestream, I really cannot wait for it to get read replicas.

:clap::clap::clap::clap:

Nice! Out of curiosity, how would autoscaling work with multi-process apps, I wonder?

This is great! I was recently thinking that it’s a missing piece on Fly and now it is not :slight_smile:

TIL about [deploy] release_command, I believe it’s missing from the docs?


For anyone wanting to scale an app created before this feature existed: The default process/group name is app, so this should work:

fly scale count app=1 -a myapp

release_command is missing from docs, yes, we have a big doc update we need to do. :slight_smile:

Does the [processes] syntax work for both Docker-built and buildpack apps?

Yes, this should work. If you have any problems, let us know.


Hello,

This is an awesome feature. I am currently using it to run a Django API, a Celery Worker, and a Celery Beat instance.

fly.toml for reference:

kill_signal = "SIGINT"
kill_timeout = 5

[processes]
web = "gunicorn appname.wsgi --log-level 'info'"
worker = "celery -A appname worker -l INFO"
beat = "celery -A appname beat -l INFO"

[build]
  builtin = "python"

[deploy]
  release_command = "sh release.sh"

[experimental]
  allowed_public_ports = []
  auto_rollback = true

[[services]]
  processes = ["web"]
  http_checks = []
  internal_port = 8080
  protocol = "tcp"
  script_checks = []

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 6
    timeout = "2s"
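The release.sh script itself isn't shown in the post; for a Django app like this one, it would typically be something along these lines (a hypothetical sketch, not the poster's actual script):

```shell
#!/bin/sh
# Hypothetical release script: run Django migrations once,
# before any process group (web, worker, beat) is started.
set -e
python manage.py migrate --noinput
```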

Is there something special that I need to do to expose Fly Secrets to all of my processes? Does it differ in any way from just setting variables in [env]?

I ask because:

When I specify a SECRET_KEY variable in the [env] block and then deploy, all three instances get the variable and work fine.

With that working, I moved on to deploying via a GitHub Action (good docs around that too, BTW) and moved all my sensitive values from the [env] block into Fly Secrets.

However, this doesn't seem to work: my Celery Beat process can no longer find the SECRET_KEY variable, but my main app instance can.

I SSHed into my worker instance and my app instance, listed their environment variables, and both had SECRET_KEY set. I would check the Beat instance the same way, but I can't SSH into it because its deployment fails and gets rolled back. I guess I could disable health checks and then SSH in?
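For reference, that environment check can be done with flyctl's SSH console (the app name is a placeholder):

```shell
# Open a shell on a running instance of the app
fly ssh console -a myapp

# Then, inside the VM, confirm the secret is present in the environment
env | grep SECRET_KEY
```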

Thanks for your help!

Hey, great to see some Python love here! I’m going to steal your fly.toml for our Python guide :slight_smile:

Setting secrets should cut a new release. If that release failed for the beat instance, it may not get the secret assigned. With this config, your beat instance should not actually have any health checks. What does fly status look like?

Thanks! Feel free to; I just compiled stuff from the existing Flask guide and these forums, haha.

Hmm, so the secrets all already existed and are being used to deploy the worker and the API just fine. And I tested that the API can send tasks to the worker, and the worker executes them without issue.

For some reason, though, that env var does not make it to the Beat process.

fly status shows that my worker and API are running fine because of the rollback:

App
  Name     = little-frog-6396
  Owner    = personal
  Version  = 91
  Status   = running
  Hostname = little-frog-6396.fly.dev

Deployment Status
  ID          = 0123d165-3eca-3eba-4afa-74dba5fcf96c
  Version     = v91
  Status      = successful
  Description = Deployment completed successfully
  Instances   = 2 desired, 2 placed, 2 healthy, 0 unhealthy

Instances
ID       TASK   VERSION REGION DESIRED STATUS  HEALTH CHECKS      RESTARTS CREATED
529034bc web    91      lax    run     running 1 total, 1 passing 0        29m18s ago
5314c66d worker 91      lax    run     running                    0        29m18s ago

My failure message when I try to deploy the beat scheduler is the generic unhealthy-allocations one:

***v90 failed - Failed due to unhealthy allocations - rolling back to job version 89 and deploying as v91

When I hop into the logs I see

2021-10-05T07:58:31.206074824Z app[dc0d5ad4] lax [info]     SECRET_KEY = env('SECRET_KEY')
2021-10-05T07:58:31.206079407Z app[dc0d5ad4] lax [info]   File "/usr/local/lib/python3.8/site-packages/environ/environ.py", line 186, in __call__
2021-10-05T07:58:31.206082605Z app[dc0d5ad4] lax [info]     return self.get_value(
2021-10-05T07:58:31.206087637Z app[dc0d5ad4] lax [info]   File "/usr/local/lib/python3.8/site-packages/environ/environ.py", line 367, in get_value
2021-10-05T07:58:31.206091098Z app[dc0d5ad4] lax [info]     raise ImproperlyConfigured(error_msg) from exc
2021-10-05T07:58:31.206098486Z app[dc0d5ad4] lax [info] django.core.exceptions.ImproperlyConfigured: Set the SECRET_KEY environment variable

You could try setting the command to tail -f /dev/null then logging in.
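That trick amounts to temporarily swapping the failing process's command in fly.toml, for example (using the config from earlier in this thread):

```toml
[processes]
web = "gunicorn appname.wsgi --log-level 'info'"
worker = "celery -A appname worker -l INFO"
# Temporarily keep the VM alive without running beat, so you can
# `fly ssh console` in and inspect the environment:
beat = "tail -f /dev/null"
```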

Hmm, unsetting the SECRET_KEY secret and then re-setting it seems to have fixed it. Maybe it was a weird ordering thing.

Nice trick with tail -f /dev/null I will keep that in mind for future debugging.

Thanks!

Good catch there. A good way to avoid this in the future is to set the secrets before the first deployment. We’ll think about ways to improve this situation.
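In practice that means setting the secrets before the first fly deploy, so every release from v1 onward has them (the app name and value here are placeholders):

```shell
# Set secrets before the first deployment so every process group
# (web, worker, beat) gets them from the very first release.
fly secrets set SECRET_KEY=changeme -a myapp
fly deploy
```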


I have another couple of questions related to defining multiple processes in one fly.toml file

  1. Is there a way to use flyctl to easily view the logs for only a specific process?
  2. Is there a way to view metrics for each process?
  3. When looking at the metrics page for my app with multiple processes, am I only looking at the metrics for the first process I defined?

Thanks!

What are people using as a message broker for the worker? Is everyone just passing messages through the Fly Redis cache?