I’m starting a new topic because this one was closed automatically: Flame elastic scaling
Thanks for the fix @chrismccord, this did in fact solve the elastic scaling problem. I had three machines for three cast calls.
Unfortunately I ran into a new issue: very shortly after booting up, the machines time out in the middle of execution.
These are the error logs I’m seeing repeatedly for each failing function call:
2024-02-21T11:26:44Z app[148e461a10d638] ams [info]11:26:44.592 [error] Task #PID<0.2971.0> started from #PID<0.2866.0> terminating
2024-02-21T11:26:44Z app[148e461a10d638] ams [info]** (stop) exited in: FLAME.Pool.call(PhoenixAlbums.ImageProcessor, #Function<4.5096088/0 in PhoenixAlbumsWeb.Upload.resize_and_save_images/2>, [timeout: 30000])
2024-02-21T11:26:44Z app[148e461a10d638] ams [info] ** (EXIT) time out
2024-02-21T11:26:44Z app[148e461a10d638] ams [info] (flame 0.1.9) lib/flame/pool.ex:242: FLAME.Pool.exit!/3
2024-02-21T11:26:44Z app[148e461a10d638] ams [info] (elixir 1.16.0) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
2024-02-21T11:26:44Z app[148e461a10d638] ams [info] (elixir 1.16.0) lib/task/supervised.ex:36: Task.Supervised.reply/4
2024-02-21T11:26:44Z app[148e461a10d638] ams [info]Function: #Function<5.33535024/0 in FLAME.Pool.cast/2>
2024-02-21T11:26:44Z app[148e461a10d638] ams [info] Args: []
In these examples I tried to process about 15 small images. After a few seconds the error messages flood the logs, and only the first two or three images are processed.
These are the relevant Pool config values:
shutdown_timeout: 120_000,
idle_shutdown_after: 120_000,
timeout: 120_000,
min: 0,
max: 10,
max_concurrency: 1,
single_use: true,
log: :debug,
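For context, these options are passed to the pool's child spec. Here is a rough sketch of how that looks, assuming the pool is started from the application supervisor as in the FLAME README. The PhoenixAlbums.Application and PhoenixAlbums.Supervisor names are illustrative assumptions; the pool name is the one that shows up in the logs above.

```elixir
# Sketch only: module and supervisor names are assumed for illustration,
# and the backend is whatever the application already configures.
defmodule PhoenixAlbums.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # Pool options copied from the values listed above.
      {FLAME.Pool,
       name: PhoenixAlbums.ImageProcessor,
       min: 0,
       max: 10,
       max_concurrency: 1,
       single_use: true,
       timeout: 120_000,
       idle_shutdown_after: 120_000,
       shutdown_timeout: 120_000,
       log: :debug}
    ]

    Supervisor.start_link(children,
      strategy: :one_for_one,
      name: PhoenixAlbums.Supervisor
    )
  end
end
```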
I also added this terminator config:
config :flame, :terminator, shutdown_timeout: :timer.minutes(5)
but these seem to have no effect.
I don’t know where the 30000 timeout is coming from; it seems to be a default value. Also, the logs refer to FLAME.Pool.call (which does have this exact timeout in the default Pool configs), but in my code I’m only calling FLAME.Pool.cast, and I have updated the timeouts.
I wonder if there are any more configs I can add to this? Please let me know if I can provide any more info or try other methods to resolve this.