Having trouble getting machines to autostart

Yardboy · July 16, 2025, 4:07am

I’ve been trying to configure my toml file for autostart, but it just doesn’t seem to be working and I’m not sure what I’m missing. I did some testing with k6 against one machine, and it seemed to show that my soft limit should be around 50 rps and my hard limit around 70 rps (although I’m not at all sure I’m interpreting the k6 results correctly).

I configured as below, then scaled to 3 machines and suspended two of them. I then had k6 send a ton of traffic at the site and the suspended machines never started, even when I started getting 429 errors. I tried lowering the limits in steps, as far down as 20 and 25, and still the suspended machines never start when I start overloading the site.

Appreciate any help.

primary_region = "mia"
console_command = "bin/rails console"

[build]
  dockerfile = "Dockerfile.web"
  build-target = "deploy"

[build.args]
  APP_URL = "https://staging.floridacims.org"
  RAILS_ENV = "staging"
  RACK_ENV = "staging"
  APPUID = "1000"
  APPGID = "1000"

[deploy]
  processes = ["app"]
  release_command = "./bin/rails db:prepare"
  strategy = "bluegreen"

[env]
  RAILS_MAX_THREADS = 5

[http_service]
  processes = ["app"]
  internal_port = 3000
  auto_stop_machines = "suspend"
  auto_start_machines = true
  min_machines_running = 1

[[http_service.checks]]
  processes = ['app']
  grace_period = "10s"
  interval = "30s"
  protocol = "http"
  method = "GET"
  timeout = "5s"
  path = "/up"

[[http_machine.checks]]
  processes = ['app']
  grace_period = "30s"
  image = "curlimages/curl"
  entrypoint = ["/bin/sh", "-c"]
  command = ["curl http://[$FLY_TEST_MACHINE_IP]/up | grep 'background-color: green'"]
  kill_signal = "SIGKILL"
  kill_timeout = "5s"

[http_service.concurrency]
  processes = ['app']
  type = "requests"
  soft_limit = 50
  hard_limit = 70

[[vm]]
  processes = ["app"]
  size = "shared-cpu-2x"
  memory = '2gb'

[[vm]]
  processes = ["worker"]
  size = "shared-cpu-2x"
  memory = '2gb'

[[statics]]
  guest_path = "/rails/public"
  url_prefix = "/"

[processes]
  app = "bundle exec rails s -b 0.0.0.0 -p 3000"
  worker = "bundle exec sidekiq"

khuezy · July 16, 2025, 3:21pm

What’s the intention with your machine check? Try removing it.

Yardboy · July 16, 2025, 3:45pm

As I understand it, the purpose of a machine check is to ensure that the machine is operating correctly behind the proxy before the proxy starts sending requests to it, versus a service check which checks from the public facing side of the proxy. The machine checks pass when the application is deployed, scaled, or when a machine is started manually. But I will give your suggestion a try.

jfent · July 16, 2025, 4:02pm

Yardboy:

[[http_machine.checks]]
  processes = ['app']
  grace_period = "30s"
  image = "curlimages/curl"
  entrypoint = ["/bin/sh", "-c"]
  command = ["curl http://[$FLY_TEST_MACHINE_IP]/up | grep 'background-color: green'"]
  kill_signal = "SIGKILL"
  kill_timeout = "5s"

This is not valid config, refer to the docs: App configuration (fly.toml) · Fly Docs

Presumably you want a [[http_service.machine_checks]] section?

Also, you’re sprinkling keys in various sections that don’t support them (again, refer to docs to understand what keys should appear under what sections), e.g. processes is not a recognised key for [http_service.concurrency], at least as far as docs say.

Yardboy · July 16, 2025, 9:26pm

Thanks so much. I have gone through each section and I believe what I have below is compliant. Please give it a quick review and let me know if you see any other issues. I sort of went scattershot with the processes = ['app'] line before because I didn’t know where it mattered. The config validated, so I assumed it was good.

Anyway, with that and the machine checks issue taken care of, autostart does now appear to be working for me. I’m still not sure where the limits should be set, and I’m trying to find some info out there on interpreting the results from k6, but otherwise all good. I appreciate your help!

primary_region = "mia"
console_command = "bin/rails console"

[build]
  dockerfile = "Dockerfile.web"
  build-target = "deploy"

[build.args]
  APP_URL = "https://staging.floridacims.org"
  RAILS_ENV = "staging"
  RACK_ENV = "staging"
  APPUID = "1000"
  APPGID = "1000"

[deploy]
  processes = ["app"]
  release_command = "./bin/rails db:prepare"
  release_command_timeout = "10m"
  strategy = "bluegreen"

[env]
  RAILS_MAX_THREADS = 5

[http_service]
  processes = ["app"]
  internal_port = 3000
  auto_stop_machines = "suspend"
  auto_start_machines = true
  min_machines_running = 1

[http_service.concurrency]
  type = "requests"
  soft_limit = 20
  hard_limit = 25

[[http_service.checks]]
  grace_period = "10s"
  interval = "30s"
  protocol = "http"
  method = "GET"
  timeout = "5s"
  path = "/up"

[[http_service.machine_checks]]
  processes = ['app']
  grace_period = "30s"
  image = "curlimages/curl"
  entrypoint = ["/bin/sh", "-c"]
  command = ["curl http://[$FLY_TEST_MACHINE_IP]/up | grep 'background-color: green'"]
  kill_signal = "SIGKILL"
  kill_timeout = "5s"

[[vm]]
  processes = ["app"]
  size = "shared-cpu-2x"
  memory = '2gb'

[[vm]]
  processes = ["worker"]
  size = "shared-cpu-2x"
  memory = '2gb'

[[statics]]
  guest_path = "/rails/public"
  url_prefix = "/"

[processes]
  app = "bundle exec rails s -b 0.0.0.0 -p 3000"
  worker = "bundle exec sidekiq"
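
On the open question of where to set the limits: one possible way to relate a k6 throughput figure to these settings is Little's Law, which says the average number of in-flight requests equals the arrival rate multiplied by the average time each request takes. The sketch below assumes that the "requests" concurrency type counts requests in flight at once rather than requests per second, which is worth confirming against the fly.toml reference. The 0.5 second latency is a made-up value for illustration, not a measurement from this thread.

```python
def estimated_concurrency(requests_per_second: float,
                          mean_latency_seconds: float) -> float:
    """Little's Law: average in-flight requests = arrival rate * mean latency."""
    return requests_per_second * mean_latency_seconds


# Hypothetical latency; substitute the mean request duration k6 reports.
MEAN_LATENCY_SECONDS = 0.5

# Throughput figures mentioned in the first post (50 rps and 70 rps).
for rps in (50, 70):
    in_flight = estimated_concurrency(rps, MEAN_LATENCY_SECONDS)
    print(f"{rps} rps at {MEAN_LATENCY_SECONDS}s mean latency "
          f"is about {in_flight} requests in flight")
```

Under that assumption, a rate measured in requests per second and a concurrency limit are different quantities, so a limit copied directly from an rps figure may end up higher or lower than intended depending on latency.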

jfent · July 16, 2025, 9:33pm

Yeah, our config validation is not as strict as I think it should really be (this has spurred me to work on that, so thanks!)

I think sometimes an overlapping config ends up implicitly overwriting its counterpart in a way that is totally opaque to the user and is not mentioned by the config validation command.

FYI, I don’t think you really need the http_service.machine_checks section - it seems to do basically exactly the same thing that http_service.checks is doing. Our proxy will make a request to /up based on http_service.checks, so your machine check isn’t doing anything different except for grepping for that bit of text.

Yardboy · July 16, 2025, 10:08pm

I certainly don’t want to duplicate effort on the deployment. But the way I understood it was the machine check is looking to see if the machine is responding to requests on its internal IP address behind the proxy, so as to confirm the machine is ready to be added to the pool of available machines, while service checks test from the public side of the proxy to see if the app itself can be accessed on its public route. Is that wrong?

jfent · July 16, 2025, 10:12pm

Apologies, you’re absolutely right, I wasn’t looking carefully enough at the docs! Machine checks are run once per deploy, while http checks are run periodically and continuously.

Yardboy · July 16, 2025, 10:31pm

Excellent - thanks again for all your help yesterday and today.
