Apps dying unexpectedly

After I changed the autoscale settings and then reverted them to normal, the app keeps dying. There seems to be an automatic scale change step just before it dies, even when autoscaling is disabled (see v32 -> v33).

❯ flyctl releases list
VERSION STABLE TYPE     STATUS    DESCRIPTION                                    USER  DATE                 
v34     true   scale    dead      Update autoscaling config                      email 4s ago               
v33     true   scale    dead                                                           15m56s ago           
v32     true   scale    succeeded                                                      16m46s ago           
v31     true   scale    succeeded Scale VM count: ["processB, 1", "processA, 1"] email 1h9m ago             
v30     false  rollback failed    Reverting to version 25                              2h16m ago            
v29     false  rollback failed    Reverting to version 25                              2h16m ago            
v28     false  scale    failed    Scale VM count: ["processB, 1", "processA, 1"] email 4h19m ago            
v27     true   scale    dead      Update autoscaling config                      email 4h22m ago            
v26     true   scale    dead                                                           5h16m ago            
v25     true   scale    succeeded                                                      5h36m ago            
v24     true   scale    succeeded Update autoscaling config                      email 6h6m ago             
v23     false  scale    failed    Scale VM count: ["processA, 1", "processB, 1"] email 6h10m ago            
v22     true   release  succeeded Deploy image                                   email 6h15m ago            
v21     true   release  succeeded Deploy image                                   email 6h23m ago            
v20     true   release  succeeded Deploy image                                   email 6h33m ago            
v19     true   rollback dead      Reverting to version 17                              6h37m ago            
v17     true   release  succeeded Deploy image                                   email 6h39m ago            
v16     true   scale    succeeded Update autoscaling config                      email 6h40m ago            
v15     true   release  succeeded Deploy image                                   email 2021-10-28T05:30:06Z
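
To make the dead and failed releases easier to spot in a long list, the output can be filtered. This is only a rough sketch: list_unhealthy_releases is a hypothetical helper name (not a flyctl command), and it assumes the first four columns (VERSION, STABLE, TYPE, STATUS) never contain spaces, as in the output above.

```shell
# list_unhealthy_releases is a hypothetical helper, not part of flyctl.
# It reads `flyctl releases list` output on stdin and prints the version,
# type, and status of every release whose STATUS column is dead or failed.
list_unhealthy_releases() {
    # Only rows whose first field looks like a version (v34, v33, ...) are
    # considered, so the header line and blank lines are skipped.
    awk '$1 ~ /^v[0-9]+$/ && ($4 == "dead" || $4 == "failed") { print $1, $3, $4 }'
}

# Usage (assumes flyctl is installed and an app is selected):
#   flyctl releases list | list_unhealthy_releases
```

Scale releases that show up here with an empty DESCRIPTION and USER in the full list (such as v33) are the ones that look automatic.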

Note: I tried to restart the app with flyctl scale count procA=1 procB=1, but it dies again after some time.

However, another app with the same contents and config, which has never had autoscaling or a scale change applied, keeps running and working.

What could be happening here?

Edit: Now I am unable to deploy as well:

❯ flyctl autoscale show
     Scale Mode: Balanced
      Min Count: 3
      Max Count: 10
❯ flyctl scale show
VM Resources for <app-id>
        VM Size: shared-cpu-1x
      VM Memory: 256 MB
          Count: procB=0 procA=0 
 Max Per Region: procC=0 procB=0 procA=0 
❯ flyctl releases list
VERSION STABLE TYPE     STATUS    DESCRIPTION                              USER  DATE       
v43     false  release  failed    Deploy image                             email 4m2s ago   
v42     false  release  failed    Deploy image                             email 6m31s ago  
v41     true   scale    dead      Update autoscaling config                email 27m40s ago 
v40     false  rollback failed    Reverting to version 32                        1h16m ago  
v39     false  rollback failed    Reverting to version 32                        1h20m ago  
v38     false  scale    failed    Scale VM count: ["procA, 1", "procB, 1"] email 2h7m ago   
v37     false  release  failed    Secrets updated                          email 2h8m ago   
v36     false  release  failed    Secrets updated                          email 2h22m ago  
v35     false  release  failed    Secrets updated                          email 2h25m ago

With LOG_LEVEL=debug flyctl deploy, the deployment status is null:

Release v43 created

You can detach the terminal anytime without stopping the deployment
Monitoring Deployment
DEBUG --> POST https://api.fly.io/graphql {{"query":"query ($appName: String!, $deploymentId: ID!) { app(name: $appName) { deploymentStatus(id: $deploymentId) { id inProgress status successful description version desiredCount placedCount healthyCount unhealthyCount allocations { id idShort status region desiredStatus version healthy failed canary restarts checks { status serviceName } } } } }","variables":{"appName":"rdns","deploymentId":""}}
}
DEBUG <-- 200 https://api.fly.io/graphql (459.77ms) {"data":{"app":{"deploymentStatus":null}}}

The fly checks list command shows whether the app is failing any health checks; that’s usually the reason apps die. Do you see any failing checks?

Nope. Empty.

❯ flyctl checks list
Health Checks for <app-id>
NAME STATUS ALLOCATION REGION TYPE LAST UPDATED OUTPUT

flyctl scale count procA=1 procB=1 spins up instances, but they shut down after about 40 minutes.

@amithm7 I’ve pulled this out to a separate thread, easier to track. Are you still having a problem?

I destroyed and relaunched the app with the same app id after 2 days (2021-11-01T07:19:28Z), which is a drastic solution that is only possible while we are in the testing phase, I guess.

It would be great to have some clarity on a few points before we go live:

  • Did this happen because it was a multi-process app? Should we avoid multi-process apps for now, since the feature is in preview and not ready?
  • Or did it happen because the scale count minimum was set to 0 at some point?

Also, I don’t quite understand how autoscaling interacts with count scaling. Does autoscaling always override count scaling?
I’m asking because flyctl scale count procA=1 procB=1 was restarting the app for some time, which is count scaling if I am right.

I don’t think autoscaling and count scaling work well together. I’d suggest disabling autoscaling (it’s disabled by default) if you’re setting scale counts. I also don’t think multi-process is very compatible with autoscaling: if you’re trying to control each process separately, it’s very hard to express that under the autoscaling primitives.

This is the first (and as of now only) report we have of the problem, so I’m inclined to think it’s specific to this configuration. I’d suggest disabling autoscaling, setting scaling counts as you’re doing and we can monitor the app.
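
As a rough sketch of that sequence (not an official procedure): turn autoscaling off first, then set the per-process counts, then print the resulting state. pin_scale_counts and the FLYCTL override variable are hypothetical names introduced here for illustration; the flyctl subcommands are the ones used earlier in this thread plus autoscale disable, which is assumed to be available in your flyctl version.

```shell
# FLYCTL can be overridden (for example FLYCTL=echo) to preview the commands
# without running them. This variable is an assumption of this sketch.
FLYCTL="${FLYCTL:-flyctl}"

# pin_scale_counts is a hypothetical helper name, not part of flyctl.
pin_scale_counts() {
    # Disable autoscaling first so it cannot override the counts set below.
    "$FLYCTL" autoscale disable || return 1
    # Set an explicit VM count per process group; the arguments are passed
    # through unchanged, e.g. procA=1 procB=1.
    "$FLYCTL" scale count "$@" || return 1
    # Print the autoscale and scale state for a manual check.
    "$FLYCTL" autoscale show || return 1
    "$FLYCTL" scale show
}

# Usage: pin_scale_counts procA=1 procB=1
```

If the instances still shut down after this, the flyctl releases list output from that point on would show whether another scale release was created without a user.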
