Elixir: Infinite Loop when running Config.Provider?

My Elixir/Phoenix app deployed successfully, but after a few hours it seems to constantly restart.

I’ve followed the Elixir & Phoenix guide, latest version as of July 12th.

I’ve tried a few things to fix:

  1. Built my release locally with MIX_ENV=prod mix release, with the necessary environment variables set and IPv6 networking configured locally. The local server does not run into these issues.
  2. Built my deploy using flyctl deploy --remote-only to cut my local Docker setup out of the equation (I’m running a 16" MBP with an 8-core Intel Core i9). The image still has the same issue when deployed this way.

How should I proceed to test this?
Something is wrong with my app - but the issue only seems to show itself when running on Fly.

Here’s a sample of the logs:

2021-07-14T01:47:52.330482590Z app[848a90fd] ewr [info] Running: `bin/fitclub start` as nobody
2021-07-14T01:47:52.339742509Z app[848a90fd] ewr [info] 2021/07/14 01:47:52 listening on [fdaa:0:285f:a7b:ab3:848a:90fd:2]:22 (DNS: [fdaa::3]:53)
2021-07-14T01:47:53.336943776Z app[848a90fd] ewr [info] Reaped child process with pid: 547, exit code: 0
2021-07-14T01:47:56.678136687Z app[848a90fd] ewr [info] [info] Running FitclubWeb.Endpoint with cowboy 2.9.0 at :::4000 (http)
2021-07-14T01:47:56.681211093Z app[848a90fd] ewr [info] [info] Access FitclubWeb.Endpoint at http://fitclub-stage.fly.dev
2021-07-14T01:47:57.342251518Z app[848a90fd] ewr [info] Reaped child process with pid: 568 and signal: SIGUSR1, core dumped? false
2021-07-14T01:47:57.342766577Z app[848a90fd] ewr [info] Reaped child process with pid: 573, exit code: 0
2021-07-14T01:50:16.510121686Z app[848a90fd] ewr [info] Reaped child process with pid: 624, exit code: 0
2021-07-14T01:50:16.510638509Z app[848a90fd] ewr [info] Reaped child process with pid: 626, exit code: 0
2021-07-14T01:50:16.511397512Z app[848a90fd] ewr [info] Reaped child process with pid: 645 and signal: SIGUSR1, core dumped? false
2021-07-14T01:50:17.513662008Z app[848a90fd] ewr [info] Reaped child process with pid: 647 and signal: SIGUSR1, core dumped? false
2021-07-14T01:50:17.514576707Z app[848a90fd] ewr [info] Reaped child process with pid: 571 and signal: SIGUSR1, core dumped? false
2021-07-14T01:50:19.517573716Z app[848a90fd] ewr [info] Reaped child process with pid: 681, exit code: 0
2021-07-14T01:50:19.518132949Z app[848a90fd] ewr [info] Reaped child process with pid: 683, exit code: 0
2021-07-14T01:50:27.527585460Z app[848a90fd] ewr [info] Reaped child process with pid: 702 and signal: SIGUSR1, core dumped? false
2021-07-14T01:50:29.530423737Z app[848a90fd] ewr [info] Main child exited normally with code: 0
2021-07-14T01:50:29.530686025Z app[848a90fd] ewr [info] Starting clean up.
2021-07-14T01:50:31.400285786Z runner[848a90fd] ewr [info] Starting instance
2021-07-14T01:50:31.429738823Z runner[848a90fd] ewr [info] Configuring virtual machine
2021-07-14T01:50:31.430675855Z runner[848a90fd] ewr [info] Pulling container image
2021-07-14T01:50:31.588688542Z runner[848a90fd] ewr [info] Unpacking image
2021-07-14T01:50:31.591985772Z runner[848a90fd] ewr [info] Preparing kernel init
2021-07-14T01:50:32.029536408Z runner[848a90fd] ewr [info] Configuring firecracker
2021-07-14T01:50:32.057381328Z runner[848a90fd] ewr [info] Starting virtual machine
2021-07-14T01:50:32.193265345Z app[848a90fd] ewr [info] Starting init (commit: cc4f071)...
2021-07-14T01:50:32.208863466Z app[848a90fd] ewr [info] Running: `bin/fitclub start` as nobody
2021-07-14T01:50:32.218317434Z app[848a90fd] ewr [info] 2021/07/14 01:50:32 listening on [fdaa:0:285f:a7b:ab3:848a:90fd:2]:22 (DNS: [fdaa::3]:53)
2021-07-14T01:50:33.215583785Z app[848a90fd] ewr [info] Reaped child process with pid: 547, exit code: 0
2021-07-14T01:50:36.518814357Z app[848a90fd] ewr [info] [info] Running FitclubWeb.Endpoint with cowboy 2.9.0 at :::4000 (http)
2021-07-14T01:50:36.521640581Z app[848a90fd] ewr [info] [info] Access FitclubWeb.Endpoint at http://fitclub-stage.fly.dev
2021-07-14T01:50:37.220921674Z app[848a90fd] ewr [info] Reaped child process with pid: 568 and signal: SIGUSR1, core dumped? false
2021-07-14T01:50:37.221419030Z app[848a90fd] ewr [info] Reaped child process with pid: 573, exit code: 0
2021-07-14T01:51:13.333912150Z runner[848a90fd] ewr [info] Shutting down virtual machine
2021-07-14T01:51:13.390749512Z app[848a90fd] ewr [info] Sending signal SIGINT to main child process w/ PID 507
2021-07-14T01:51:13.391760515Z app[848a90fd] ewr [info] BREAK: (a)bort (A)bort with dump (c)ontinue (p)roc info (i)nfo
2021-07-14T01:51:13.392391454Z app[848a90fd] ewr [info]        (l)oaded (v)ersion (k)ill (D)b-tables (d)istribution
2021-07-14T01:51:24.224924073Z runner[848a90fd] ewr [info] Starting instance
2021-07-14T01:51:24.255400147Z runner[848a90fd] ewr [info] Configuring virtual machine
2021-07-14T01:51:24.256453119Z runner[848a90fd] ewr [info] Pulling container image
2021-07-14T01:51:24.404332917Z runner[848a90fd] ewr [info] Unpacking image
2021-07-14T01:51:24.407531790Z runner[848a90fd] ewr [info] Preparing kernel init
2021-07-14T01:51:24.841398651Z runner[848a90fd] ewr [info] Configuring firecracker
2021-07-14T01:51:24.873218870Z runner[848a90fd] ewr [info] Starting virtual machine
2021-07-14T01:51:25.002611424Z app[848a90fd] ewr [info] Starting init (commit: cc4f071)...
2021-07-14T01:51:25.019309807Z app[848a90fd] ewr [info] Running: `bin/fitclub start` as nobody
2021-07-14T01:51:25.027683611Z app[848a90fd] ewr [info] 2021/07/14 01:51:25 listening on [fdaa:0:285f:a7b:ab3:848a:90fd:2]:22 (DNS: [fdaa::3]:53)
2021-07-14T01:51:26.026085782Z app[848a90fd] ewr [info] Reaped child process with pid: 547, exit code: 0
2021-07-14T01:51:29.406696991Z app[848a90fd] ewr [info] [info] Running FitclubWeb.Endpoint with cowboy 2.9.0 at :::4000 (http)
2021-07-14T01:51:29.409969123Z app[848a90fd] ewr [info] [info] Access FitclubWeb.Endpoint at http://fitclub-stage.fly.dev
2021-07-14T01:51:30.030815354Z app[848a90fd] ewr [info] Reaped child process with pid: 568 and signal: SIGUSR1, core dumped? false
2021-07-14T01:51:30.031379517Z app[848a90fd] ewr [info] Reaped child process with pid: 573, exit code: 0
2021-07-14T01:51:56.063673726Z app[848a90fd] ewr [info] Reaped child process with pid: 619, exit code: 0
2021-07-14T01:51:56.064415686Z app[848a90fd] ewr [info] Reaped child process with pid: 621, exit code: 0
2021-07-14T01:51:56.065309566Z app[848a90fd] ewr [info] Reaped child process with pid: 640 and signal: SIGUSR1, core dumped? false
2021-07-14T01:51:56.066065894Z app[848a90fd] ewr [info] Reaped child process with pid: 642 and signal: SIGUSR1, core dumped? false
2021-07-14T01:51:56.067224898Z app[848a90fd] ewr [info] Reaped child process with pid: 571 and signal: SIGUSR1, core dumped? false
2021-07-14T01:52:03.074914798Z app[848a90fd] ewr [info] Main child exited normally with code: 0
2021-07-14T01:52:03.075269172Z app[848a90fd] ewr [info] Starting clean up.
2021-07-14T01:52:06.138262359Z runner[848a90fd] ewr [info] Starting instance
2021-07-14T01:52:06.168388547Z runner[848a90fd] ewr [info] Configuring virtual machine
2021-07-14T01:52:06.169328144Z runner[848a90fd] ewr [info] Pulling container image
2021-07-14T01:52:06.529652537Z runner[848a90fd] ewr [info] Unpacking image
2021-07-14T01:52:06.535950593Z runner[848a90fd] ewr [info] Preparing kernel init
2021-07-14T01:52:06.934280787Z runner[848a90fd] ewr [info] Configuring firecracker
2021-07-14T01:52:06.972644620Z runner[848a90fd] ewr [info] Starting virtual machine
2021-07-14T01:52:07.098776121Z app[848a90fd] ewr [info] Starting init (commit: cc4f071)...
2021-07-14T01:52:07.114653544Z app[848a90fd] ewr [info] Running: `bin/fitclub start` as nobody
2021-07-14T01:52:07.124007451Z app[848a90fd] ewr [info] 2021/07/14 01:52:07 listening on [fdaa:0:285f:a7b:ab3:848a:90fd:2]:22 (DNS: [fdaa::3]:53)
2021-07-14T01:52:08.121382630Z app[848a90fd] ewr [info] Reaped child process with pid: 547, exit code: 0
2021-07-14T01:52:11.621162950Z app[848a90fd] ewr [info] [info] Running FitclubWeb.Endpoint with cowboy 2.9.0 at :::4000 (http)
2021-07-14T01:52:11.624582383Z app[848a90fd] ewr [info] [info] Access FitclubWeb.Endpoint at http://fitclub-stage.fly.dev
2021-07-14T01:52:12.126066024Z app[848a90fd] ewr [info] Reaped child process with pid: 568 and signal: SIGUSR1, core dumped? false
2021-07-14T01:52:12.126650836Z app[848a90fd] ewr [info] Reaped child process with pid: 573, exit code: 0
2021-07-14T01:53:54.248339453Z app[848a90fd] ewr [info] Reaped child process with pid: 648, exit code: 0
2021-07-14T01:53:54.248897774Z app[848a90fd] ewr [info] Reaped child process with pid: 650, exit code: 0
2021-07-14T01:53:54.249740136Z app[848a90fd] ewr [info] Reaped child process with pid: 669 and signal: SIGUSR1, core dumped? false
2021-07-14T01:53:55.252734510Z app[848a90fd] ewr [info] Reaped child process with pid: 671 and signal: SIGUSR1, core dumped? false
2021-07-14T01:53:55.253912630Z app[848a90fd] ewr [info] Reaped child process with pid: 571 and signal: SIGUSR1, core dumped? false
2021-07-14T01:54:04.915436128Z app[848a90fd] ewr [info] ERROR! Got infinite loop when running Config.Provider. Please make sure "/app/releases/0.1.0/sys.config" is writable and accessible or choose a different path
2021-07-14T01:54:04.918893102Z app[848a90fd] ewr [info] {"init terminating in do_boot",{<<"aborting boot">>,[{'Elixir.Config.Provider',boot,2,[]}]}}
2021-07-14T01:54:04.920083185Z app[848a90fd] ewr [info] init terminating in do_boot ({,[{Elixir.Config.Provider,boot,2,[]}]})
2021-07-14T01:54:05.073372721Z app[848a90fd] ewr [info] Crash dump is being written to: erl_crash.dump...done
2021-07-14T01:54:05.264998777Z app[848a90fd] ewr [info] Main child exited normally with code: 1
2021-07-14T01:54:05.265775524Z app[848a90fd] ewr [info] Reaped child process with pid: 674 and signal: SIGUSR1, core dumped? false
2021-07-14T01:54:05.266044946Z app[848a90fd] ewr [info] Starting clean up.
2021-07-14T01:54:08.167341230Z runner[848a90fd] ewr [info] Starting instance
2021-07-14T01:54:08.201459069Z runner[848a90fd] ewr [info] Configuring virtual machine
2021-07-14T01:54:08.202456796Z runner[848a90fd] ewr [info] Pulling container image
2021-07-14T01:54:08.347426891Z runner[848a90fd] ewr [info] Unpacking image
2021-07-14T01:54:08.353660775Z runner[848a90fd] ewr [info] Preparing kernel init
2021-07-14T01:54:08.789044050Z runner[848a90fd] ewr [info] Configuring firecracker
2021-07-14T01:54:08.816168329Z runner[848a90fd] ewr [info] Starting virtual machine
2021-07-14T01:54:08.953194478Z app[848a90fd] ewr [info] Starting init (commit: cc4f071)...
2021-07-14T01:54:08.970114563Z app[848a90fd] ewr [info] Running: `bin/fitclub start` as nobody
2021-07-14T01:54:08.979830489Z app[848a90fd] ewr [info] 2021/07/14 01:54:08 listening on [fdaa:0:285f:a7b:ab3:848a:90fd:2]:22 (DNS: [fdaa::3]:53)
2021-07-14T01:54:09.978090339Z app[848a90fd] ewr [info] Reaped child process with pid: 547, exit code: 0
2021-07-14T01:54:13.378275290Z app[848a90fd] ewr [info] [info] Running FitclubWeb.Endpoint with cowboy 2.9.0 at :::4000 (http)
2021-07-14T01:54:13.381802126Z app[848a90fd] ewr [info] [info] Access FitclubWeb.Endpoint at http://fitclub-stage.fly.dev
2021-07-14T01:54:13.983761953Z app[848a90fd] ewr [info] Reaped child process with pid: 568 and signal: SIGUSR1, core dumped? false
2021-07-14T01:54:13.984319954Z app[848a90fd] ewr [info] Reaped child process with pid: 573, exit code: 0
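
The Config.Provider error in the logs above asks whether sys.config is "writable and accessible", so one thing worth checking is whether the user running the release (nobody, per the logs) can write to that path. Below is a minimal sketch of such a check; check_writable is a hypothetical helper name, and the default path is simply copied from the error message.

```shell
#!/bin/sh
# Hypothetical helper: report whether a file and its parent directory are
# writable by the current user. The default path is taken from the
# Config.Provider error message; pass another path as $1 to override it.
check_writable() {
  target="${1:-/app/releases/0.1.0/sys.config}"
  dir=$(dirname "$target")
  # The directory must be writable, and the file (if it exists) must be too.
  if [ -w "$dir" ] && { [ ! -e "$target" ] || [ -w "$target" ]; }; then
    echo "writable: $target"
  else
    echo "NOT writable: $target (running as $(id -un))"
    return 1
  fi
}
```

To be meaningful it has to run inside the VM as the same user that starts the release. A passing check does not rule out other causes; it only eliminates the one the error message names.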

env.sh.eex

#!/bin/sh
ip=$(grep fly-local-6pn /etc/hosts | cut -f 1)
# Check blank IP
if [ -z "$ip" ]
then
  ip="0:0:0:0:0:0:0:1"
fi
export RELEASE_DISTRIBUTION=name
export RELEASE_NODE=$FLY_APP_NAME@$ip
echo "--------------------------------------------------"
echo $RELEASE_NODE
echo "--------------------------------------------------"
export ELIXIR_ERL_OPTIONS="-proto_dist inet6_tcp"
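
To test the IP lookup from env.sh.eex outside of Fly, the same logic can be wrapped in a function that reads any hosts file. This is only a sketch: fly_private_ip is a hypothetical name, and the head -n 1 is an addition that guards against more than one matching line. Note that cut -f 1 splits on tabs, so it assumes the fly-local-6pn entry is tab-separated; with spaces it would return the whole line.

```shell
#!/bin/sh
# Hypothetical helper mirroring the lookup in env.sh.eex.
# $1: hosts file to read (defaults to /etc/hosts).
fly_private_ip() {
  hosts_file="${1:-/etc/hosts}"
  # First tab-separated field of the first line mentioning fly-local-6pn.
  ip=$(grep fly-local-6pn "$hosts_file" | head -n 1 | cut -f 1)
  # Fall back to the IPv6 loopback address when no entry is found.
  if [ -z "$ip" ]; then
    ip="0:0:0:0:0:0:0:1"
  fi
  printf '%s\n' "$ip"
}
```

Running fly_private_ip against a local copy of the VM’s /etc/hosts shows which value RELEASE_NODE would end up with.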

runtime.exs

import Config

if config_env() == :prod do
  secret_key_base =
    System.get_env("SECRET_KEY_BASE") ||
      raise """
      environment variable SECRET_KEY_BASE is missing.
      You can generate one by calling: mix phx.gen.secret
      """

  app_name =
    System.get_env("FLY_APP_NAME") ||
      raise "FLY_APP_NAME not available"

  config :fitclub, FitclubWeb.Endpoint,
    server: true,
    url: [host: "#{app_name}.fly.dev", port: 80],
    force_ssl: [rewrite_on: [:x_forwarded_proto]],
    http: [
      port: String.to_integer(System.get_env("PORT") || "4000"),
      # IMPORTANT: support IPv6 addresses
      transport_options: [socket_opts: [:inet6]]
    ],
    secret_key_base: secret_key_base

  database_url =
    System.get_env("DATABASE_URL") ||
      raise """
      environment variable DATABASE_URL is missing.
      For example: ecto://USER:PASS@HOST/DATABASE
      """

  config :fitclub, Fitclub.Repo,
    url: database_url,
    # IMPORTANT: Or it won't find the DB server
    socket_options: [:inet6],
    pool_size: String.to_integer(System.get_env("POOL_SIZE") || "10")
end

Looks like Elixir app not starting up - #13 by rushsteve12 is related.

My app is running on a shared CPU with 512 MB of memory. I’ve got AppSignal running and haven’t seen any issues, but I have scaled memory up to 1024 MB and am redeploying just to confirm.

Just a guess, but can you try removing the force_ssl: [rewrite_on: [:x_forwarded_proto]] from runtime.exs? My guess is that that’s not playing well with the health check and that’s failing and triggering restarts. In the documentation it seems to stick to http at the Phoenix level, with the https being handled at the Fly proxy level.

Thanks for the idea - I removed the force_ssl: ... line and re-deployed.

The app deploys successfully, but still doesn’t work - below are the deploy output and logs.

--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/fitclub-stage]
a442c7500079: Pushed
eb78042abc90: Layer already exists
4f7db9ae24a2: Layer already exists
33f9c12cdc7b: Layer already exists
0f7b3ff8b310: Layer already exists
deployment-1626281588: digest: sha256:c29c5cf43942ab548bd627178c06dd315b35be068b7d42b80912cbcf0d1b5509 size: 1365
--> Pushing image done
Image: registry.fly.io/fitclub-stage:deployment-1626281588
Image size: 92 MB
==> Creating release
Release v26 created

You can detach the terminal anytime without stopping the deployment
Monitoring Deployment

1 desired, 1 placed, 1 healthy, 0 unhealthy
--> v26 deployed successfully

logs during deploy

2021-07-14T16:57:10.663138750Z runner[a4605513] ewr [info] Starting instance
2021-07-14T16:57:10.692789394Z runner[a4605513] ewr [info] Configuring virtual machine
2021-07-14T16:57:10.693759774Z runner[a4605513] ewr [info] Pulling container image
2021-07-14T16:57:13.872091572Z runner[a4605513] ewr [info] Unpacking image
2021-07-14T16:57:15.232635465Z runner[a4605513] ewr [info] Preparing kernel init
2021-07-14T16:57:15.694306580Z runner[a4605513] ewr [info] Configuring firecracker
2021-07-14T16:57:15.748714784Z runner[a4605513] ewr [info] Starting virtual machine
2021-07-14T16:57:15.912848790Z app[a4605513] ewr [info] Starting init (commit: cc4f071)...
2021-07-14T16:57:15.928643800Z app[a4605513] ewr [info] Running: `bin/fitclub start` as nobody
2021-07-14T16:57:15.937784387Z app[a4605513] ewr [info] 2021/07/14 16:57:15 listening on [fdaa:0:285f:a7b:ab2:a460:5513:2]:22 (DNS: [fdaa::3]:53)
2021-07-14T16:57:15.944064080Z app[a4605513] ewr [info] --------------------------------------------------
2021-07-14T16:57:15.944593514Z app[a4605513] ewr [info] fitclub-stage@fdaa:0:285f:a7b:ab2:a460:5513:2
2021-07-14T16:57:15.945151552Z app[a4605513] ewr [info] --------------------------------------------------
2021-07-14T16:57:15.975026111Z app[a4605513] ewr [info] warning: -v (only in debug compiled code)
2021-07-14T16:57:16.933980486Z app[a4605513] ewr [info] Reaped child process with pid: 548, exit code: 0
2021-07-14T16:57:20.285203871Z app[a4605513] ewr [info] 16:57:20.284 [info] Running FitclubWeb.Endpoint with cowboy 2.9.0 at :::4000 (http)
2021-07-14T16:57:20.288704547Z app[a4605513] ewr [info] 16:57:20.287 [info] Access FitclubWeb.Endpoint at http://fitclub-stage.fly.dev
2021-07-14T16:57:20.939714104Z app[a4605513] ewr [info] Reaped child process with pid: 569 and signal: SIGUSR1, core dumped? false
2021-07-14T16:57:20.940295125Z app[a4605513] ewr [info] Reaped child process with pid: 574, exit code: 0
2021-07-14T16:57:43.297205439Z runner[15ee65e3] vin [info] Shutting down virtual machine
2021-07-14T16:57:43.401895329Z app[15ee65e3] vin [info] Sending signal SIGINT to main child process w/ PID 508
2021-07-14T16:57:43.403974667Z app[15ee65e3] vin [info] BREAK: (a)bort (A)bort with dump (c)ontinue (p)roc info (i)nfo
2021-07-14T16:57:43.406077611Z app[15ee65e3] vin [info]        (l)oaded (v)ersion (k)ill (D)b-tables (d)istribution

Hmm can you share your fly.toml? It looks like the shutdown is ~1s after startup which is the default grace_period for the TCP check - did you forget to update that possibly?

In the guide it’s changed to:

  [[services.tcp_checks]]
    grace_period = "30s" # allow some time for startup
    interval = "15s"
    restart_limit = 6
    timeout = "2s"

Here’s my fly.toml

# fly.toml file generated for fitclub-prod on 2021-07-09T14:15:01-05:00

app = "fitclub-stage"

kill_signal = "SIGTERM"
kill_timeout = 5

[env]

[[statics]]
  guest_path = "/app/priv/static"
  url_prefix = "/public"

[deploy]
  release_command = "/app/bin/fitclub eval Fitclub.Release.migrate"

[[services]]
  internal_port = 4000
  protocol = "tcp"

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20

  [[services.ports]]
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "30s" # allow some time for startup
    interval = "60s"
    restart_limit = 6
    timeout = "2s"
    method = "get"
    path = "/"
    protocol = "http"
    tls_skip_verify = true

There goes that theory!

Comparing with a working app of mine, the only thing that looks potentially significant is that I have:

[[services]]
  http_checks = []
  internal_port = 4000
  protocol = "tcp"
  script_checks = []

I have no idea what the default values are of http_checks and script_checks if they’re not set, but if they’re not [] then maybe one of them is failing and/or has a low timeout/grace period?

Tried your updated fly.toml - no dice. The app still crashes right after attempting to launch.

I’ve also tried raising the log level to log_level: :debug, but that doesn’t help either.

I’m going to look at building and running the Docker image locally. I tried this before and it worked fine, but I’m going to look at trying it again to see if I can unpack what’s going on.

If you run fly status instance <id> you might get better information on why it’s restarting. See if it says something about “restarting due to health check failure” or something?

Looks like it’s not the Fly instance - I think the app is crashing - but only on Fly - I’m trying to stand up a container to replicate it locally.

% flyctl status instance -a fitclub-stage 8374cdff

Instance
  ID            = 8374cdff
  Version       = 28
  Region        = ewr
  Desired       = run
  Status        = running
  Health Checks =
  Restarts      = 0
  Created       = 7m3s ago

Recent Events
TIMESTAMP            TYPE       MESSAGE
2021-07-14T19:28:11Z Received   Task received by client
2021-07-14T19:28:11Z Task Setup Building Task Directory
2021-07-14T19:28:15Z Started    Task started by client

Checks
ID SERVICE STATE OUTPUT

Recent Logs

This looks related: Server down: "can't run hallpass" - #3 by luke - I’m getting a similar error message.

I tore down my app - removed all running child processes from the Elixir Application (see below)

Any ideas @mcintyre1994 @kurt @thomas ?

defmodule Fitclub.Application do
  # See https://hexdocs.pm/elixir/Application.html
  # for more information on OTP Applications
  @moduledoc false

  use Application

  def start(_type, _args) do
    IO.puts("Arrived: #{__MODULE__}")

    children = [
    ]

    # See https://hexdocs.pm/elixir/Supervisor.html
    # for other strategies and supported options
    opts = [strategy: :one_for_one, name: Fitclub.Supervisor]
    Supervisor.start_link(children, opts)
  end

  # Tell Phoenix to update the endpoint configuration
  # whenever the application is updated.
  def config_change(changed, _new, removed) do
    FitclubWeb.Endpoint.config_change(changed, removed)
    :ok
  end
end

Logs with that code:

2021-07-14T20:04:38.555964777Z runner[d9e3500d] yyz [info] Starting instance
2021-07-14T20:04:38.600656806Z runner[d9e3500d] yyz [info] Configuring virtual machine
2021-07-14T20:04:38.609471948Z runner[d9e3500d] yyz [info] Pulling container image
2021-07-14T20:04:41.537722114Z runner[d9e3500d] yyz [info] Unpacking image
2021-07-14T20:04:46.281097492Z runner[d9e3500d] yyz [info] Preparing kernel init
2021-07-14T20:04:47.379751262Z runner[d9e3500d] yyz [info] Configuring firecracker
2021-07-14T20:04:47.457521537Z runner[d9e3500d] yyz [info] Starting virtual machine
2021-07-14T20:04:47.712698889Z app[d9e3500d] yyz [info] Starting init (commit: cc4f071)...
2021-07-14T20:04:47.745539451Z app[d9e3500d] yyz [info] Running: `bin/fitclub start` as nobody
2021-07-14T20:04:47.762220274Z app[d9e3500d] yyz [info] 2021/07/14 20:04:47 listening on [fdaa:0:285f:a7b:aa2:d9e3:500d:2]:22 (DNS: [fdaa::3]:53)
2021-07-14T20:04:47.775132582Z app[d9e3500d] yyz [info] --------------------------------------------------
2021-07-14T20:04:47.775808256Z app[d9e3500d] yyz [info] fitclub-stage@fdaa:0:285f:a7b:aa2:d9e3:500d:2
2021-07-14T20:04:47.776524933Z app[d9e3500d] yyz [info] --------------------------------------------------
2021-07-14T20:04:48.755419849Z app[d9e3500d] yyz [info] Reaped child process with pid: 546, exit code: 0
2021-07-14T20:04:52.763455504Z app[d9e3500d] yyz [info] Reaped child process with pid: 567 and signal: SIGUSR1, core dumped? false
2021-07-14T20:04:52.953632836Z app[d9e3500d] yyz [info] Arrived: Elixir.Fitclub.Application
2021-07-14T20:04:53.766825477Z app[d9e3500d] yyz [info] Reaped child process with pid: 570, exit code: 0
2021-07-14T20:05:12.550813569Z runner[bd408c1b] ord [info] Shutting down virtual machine
2021-07-14T20:05:12.631356554Z app[bd408c1b] ord [info] Sending signal SIGINT to main child process w/ PID 507
2021-07-14T20:05:12.634659896Z app[bd408c1b] ord [info] BREAK: (a)bort (A)bort with dump (c)ontinue (p)roc info (i)nfo
2021-07-14T20:05:12.636720549Z app[bd408c1b] ord [info]        (l)oaded (v)ersion (k)ill (D)b-tables (d)istribution

It’s probably not what’s going on, but I think your TCP check will fail now since you’re no longer starting the endpoint. I’d expect that to look different in the logs, though, and this looks exactly the same. I don’t think I have any useful suggestions, sorry! But if the children-less version of your app can be put on e.g. GitHub, I’d be happy to try deploying it and see if I get the same logs. That might narrow things down a bit.

I destroyed my apps and created them from scratch under new slugs, and now everything is fine.

Didn’t change any code. I will monitor and report back if something changes.


I just stumbled upon a similar issue. FYI, things started to go wrong when I scaled the app down from 2 instances to 1 and removed backup regions, which caused some service shuffling. After that, the exact same app that had previously worked stopped serving requests.

The deployment is successful and the app is running, yet I keep getting Reaped child process with pid: 568 and signal: SIGUSR1, core dumped? false all the time.

I eventually solved this by cleaning up my deploy command.

After being unable to solve this issue, I realized that deploying via GitHub Actions was the culprit. I could deploy manually and the app ran fine, but when attempting to deploy via CI it would hang.

Hope that helps

@mikehostetler Thanks for the info. I had also switched to GitHub Actions not long before the issues started, so that may have been the culprit. I deployed directly as well and the app started working again. I wonder what’s going on, because I’d really like to deploy via CI. The worst part is that the Fly app suddenly starts misbehaving and yet all output from fly status and fly logs looks 100% normal.

(btw, Reaped child process ... seems to be a completely normal log message for Phoenix apps so that part was irrelevant.)
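
Since those lines are noise, it can help to filter them out when reading the logs. A minimal sketch follows; drop_reaped is a hypothetical helper name, and it only assumes the message text matches what appears in the logs in this thread.

```shell
#!/bin/sh
# Hypothetical filter: read log text on stdin and drop the
# "Reaped child process" lines, leaving everything else untouched.
drop_reaped() {
  grep -v 'Reaped child process'
}
```

For example, fly logs -a fitclub-stage | drop_reaped would leave the boot, shutdown and error lines, which makes a restart loop easier to spot.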

I no longer have the GitHub Action that didn’t work for me, but here’s the one I use now, and it works fine.

I am following a monorepo pattern, so my primary Elixir/Phoenix server sits in the ./fitclub_server directory within my repository.

name: "Deploy Server Stage"
on:
  workflow_dispatch:

env:
  FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}

jobs:
  deploy:
    name: Deploy Server to Staging
    runs-on: ubuntu-18.04
    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - uses: superfly/flyctl-actions@1.1
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
          FLY_PROJECT_PATH: "./fitclub_server"
        with:
          args: "deploy -a hrmfitclub-stage --config ./fly.stage.toml"

If you’d like to make it faster via Docker layer caching, here’s mine: GitHub Action Docker image cache - #4 by ksluszniak.