auto_stop_machines & start: machines keep getting restarted

alyx · July 7, 2023, 11:03am

Hi everyone,

I’ve set up a Mastodon instance on fly.io using the tmm1/flyapp-mastodon template on GitHub (mastodon on fly.io).

I’ve added a section for auto_stop_machines and auto_start_machines so the template looks like:

app = "XXX"

kill_signal = "SIGINT"
kill_timeout = 5

[deploy]
  strategy = "bluegreen"

[env]
  LOCAL_DOMAIN = "XXX"
  WEB_CONCURRENCY = "0"
  OVERMIND_FORMATION = "sidekiq=1"
  MALLOC_ARENA_MAX = "2"
  MAX_THREADS = "15"
  RAILS_ENV = "production"
  RAILS_LOG_TO_STDOUT = "enabled"
  RAILS_SERVE_STATIC_FILES = "false"
  REDIS_HOST = "XXX-redis.internal"
  REDIS_PORT = "6379"
  S3_ENABLED = true
  S3_BUCKET = "XXX"
  S3_ALIAS_HOST = "XXX.XXX"
  S3_ENDPOINT = "https://XXX.r2.cloudflarestorage.com/"
  S3_PERMISSION = "private"
  S3_PROTOCOL = "https"

  SMTP_SERVER = "smtp.eu.mailgun.org"
  SMTP_PORT = "587"
  SMTP_ENABLE_STARTTLS = "always"
  SMTP_FROM_ADDRESS = "mastodon@XXX"

[[statics]]
  guest_path = "/opt/mastodon/public"
  url_prefix = "/"

[[services]]
  # processes = ["rails"]
  internal_port = 8080
  protocol = "tcp"

  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 1

  [services.concurrency]
    type = "requests"
    hard_limit = 250
    soft_limit = 200

  [[services.ports]]
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

  [[services.http_checks]]
    path = "/health"
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

I saw some machines stopping but instantly restarting. What could cause this? The checks have restart_limit = 0, so I don’t think they are the cause.

I then tried setting a stupidly high hard_limit and soft_limit to make sure the instances would shut down, but it did not work. Some shut down and restarted right after; others did not shut down at all.

Mastodon has a websocket running, and I saw there was an assumption here that it could cause issues. Was it fixed?

I also tried switching to http_service and http_service.concurrency with type = "requests", but got the same result: even with very high fake values, machines would either restart or not stop at all.
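For reference, the http_service variant described above might look roughly like the sketch below. The exact config that was tried was not posted, so this is an assumption: the port, auto stop/start settings, and concurrency limits are carried over from the [[services]] block earlier in this post, and the health checks are left out.

```toml
# Sketch only: assumed equivalent of the [[services]] block above,
# rewritten with the http_service shorthand. Values are carried over
# from the original config, not taken from the config actually tried.
[http_service]
  internal_port = 8080

  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 1

  [http_service.concurrency]
    type = "requests"
    soft_limit = 200
    hard_limit = 250
```

http_service implies HTTP on port 80 and TLS on port 443, so the explicit [[services.ports]] entries are not needed in this form.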

I forgot to save the logs… If needed I can redeploy with these settings and add them to this thread.

Thanks for your help

senyo · July 7, 2023, 12:36pm

We haven’t gotten around to looking into this issue just yet. We have a hunch about why this is happening but haven’t had the time to reproduce and fix it. We should get to it in the next few weeks.

rubys · July 7, 2023, 12:36pm

The issue with websockets is that they are opened by the client and are intended to remain open as long as a tab containing your webpage is still open. If the server or network goes down, the JavaScript client responds by waiting a short period (normally a few seconds) and then retrying continuously.

alyx · July 7, 2023, 3:19pm

Thanks for your answer, senyo. Is there anything I could do in the meantime to have some kind of autoscaling? I don’t mind waiting, no worries. Thanks for working on that and for the quick answer.

system · July 14, 2023, 3:19pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

ben-io · August 15, 2023, 9:03pm

@alyx we’ve fixed a bug that caused machines to be auto-stopped incorrectly, which led to the behavior you saw. Can you try re-enabling autostart & autostop to see if it works better?
