Are other people’s apps down?

Most of my apps are giving PR_CONNECT_RESET_ERROR, and show as “pending” in the dashboard. When I try to redeploy, it hangs and then aborts. However, I don’t see any active issues listed on the status page. Is this issue unique to me, or are other Fly users experiencing app downtime right now?

It may have been due to an issue we had earlier with deploys: Status - Increased error rates for

Are you still encountering problems?

I am still experiencing the same issue you’re describing, @Curiositry.

It looks like one of your applications is crashing due to running out of memory. You can view that by running fly vm status 3feb4378

I had one app, running in arn, that was in the “Pending” state. All other apps within that org were running perfectly.
This was the only app that had a volume. I created volumes in two other regions and the app started. The app in region arn has not started. I also tried to create a new volume in arn, but it failed.

Do you mind sharing the name of your app and/or posting logs from the app when it started failing / after starting it with a new volume?

I’d rather not share the name of the app, but the volume ID is vol_915grnwypxprn70q.

The volume I tried to create (which now has state ‘failed’) is vol_ez1nvxk2gdxrmxl7.

Last lines in the log for the failed instance:

{"event":{"provider":"runner"},"fly":{"app":{"instance":"0db9d82a","name":"REDACTED"},"region":"arn"},"host":"f1ec","log":{"level":"info"},"message":"Shutting down virtual machine","timestamp":"2023-03-16T19:09:32.706813753Z"}
{"event":{"provider":"app"},"fly":{"app":{"instance":"0db9d82a","name":"REDACTED"},"region":"arn"},"host":"f1ec","log":{"level":"info"},"message":"Sending signal SIGINT to main child process w/ PID 528","timestamp":"2023-03-16T19:09:32.951780516Z"}
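For anyone comparing timestamps across apps, here is a minimal sketch of pulling the useful fields out of JSON log lines like the two above. It assumes each line is one JSON object with the same keys as in this post; the function name `summarize` and the choice of fields are illustrative, not part of any Fly tooling.

```python
import json
from datetime import datetime, timezone

# The two log lines quoted in the post above (app name already redacted).
SAMPLE_LINES = [
    '{"event":{"provider":"runner"},"fly":{"app":{"instance":"0db9d82a","name":"REDACTED"},"region":"arn"},"host":"f1ec","log":{"level":"info"},"message":"Shutting down virtual machine","timestamp":"2023-03-16T19:09:32.706813753Z"}',
    '{"event":{"provider":"app"},"fly":{"app":{"instance":"0db9d82a","name":"REDACTED"},"region":"arn"},"host":"f1ec","log":{"level":"info"},"message":"Sending signal SIGINT to main child process w/ PID 528","timestamp":"2023-03-16T19:09:32.951780516Z"}',
]


def parse_timestamp(raw: str) -> datetime:
    """Parse a UTC timestamp with nanosecond precision and a trailing 'Z'.

    Python's datetime only keeps microseconds, so the fraction is cut
    down to six digits before parsing.
    """
    base, _, fraction = raw.rstrip("Z").partition(".")
    micros = (fraction + "000000")[:6]
    parsed = datetime.strptime(f"{base}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def summarize(line: str) -> dict:
    """Flatten one JSON log record into the fields useful for triage."""
    record = json.loads(line)
    return {
        "timestamp": parse_timestamp(record["timestamp"]),
        "region": record["fly"]["region"],
        "instance": record["fly"]["app"]["instance"],
        "provider": record["event"]["provider"],
        "message": record["message"],
    }


if __name__ == "__main__":
    for entry in map(summarize, SAMPLE_LINES):
        print(entry["timestamp"].isoformat(), entry["region"],
              entry["instance"], entry["provider"], entry["message"])
```

Run against these two lines, it shows the runner's shutdown message and the SIGINT to the app arriving within a quarter of a second of each other on instance 0db9d82a in arn.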

@savikko this looks like a race condition. It seems like you deleted volumes and created new ones. Nomad apps take a bit for those kinds of changes to reconcile, but Nomad itself will still try (and fail) to boot VMs.

When a volume is in state “failed”, it means we haven’t successfully created it yet. We actually keep trying, though.

Yes, it looks like a race condition, but I doubt it was. The problem with the arn instance was already there when I first tried to create a volume in arn, and that failed.

Then I created volumes in other regions, and those succeeded, so I started deleting the ones that were no longer needed. I also tried to delete the arn volumes, without success.

Anyways, my app is happy now 🙂

One of my apps went down at 3:11p Eastern time and has been down ever since. (It’s now 5:16p Eastern time.) It’s in a Pending state. I just tried a flyctl deploy. It looks like that hung.

Correction: two of my apps went down at 3:11p.

Here are the last lines in the log for one app:

2023-03-16T19:09:48.575 runner[6dba3c3b] yyz [info] Shutting down virtual machine
2023-03-16T19:09:48.756 app[6dba3c3b] yyz [info] Sending signal SIGINT to main child process w/ PID 528
2023-03-16T19:09:48.762 app[6dba3c3b] yyz [info] 2023-03-16T15:09:48-04:00 [SERVER] INFO: Shutdown requested
2023-03-16T19:09:48.762 app[6dba3c3b] yyz [info] 2023-03-16T15:09:48-04:00 [SERVER] INFO: Called signal: SIGINT
2023-03-16T19:09:48.763 app[6dba3c3b] yyz [info] 2023-03-16T15:09:48-04:00 [SERVER] INFO: Stopping all monitors
2023-03-16T19:09:50.767 app[6dba3c3b] yyz [info] 2023-03-16T15:09:50-04:00 [DB] INFO: Closing the database
2023-03-16T19:09:52.785 app[6dba3c3b] yyz [info] 2023-03-16T15:09:52-04:00 [DB] INFO: SQLite closed
2023-03-16T19:09:52.787 app[6dba3c3b] yyz [info] 2023-03-16T15:09:52-04:00 [CLOUDFLARED] INFO: Stop cloudflared
2023-03-16T19:09:52.789 app[6dba3c3b] yyz [info] 2023-03-16T15:09:52-04:00 [SERVER] INFO: Graceful shutdown successful!
2023-03-16T19:09:53.463 app[6dba3c3b] yyz [info] Starting clean up.
2023-03-16T19:09:53.463 app[6dba3c3b] yyz [info] Umounting /dev/vdc from /app/data

Here are the last lines in the log for the other app:

2023-03-16T19:09:50.928 runner[23a22887] yyz [info] Shutting down virtual machine
2023-03-16T19:09:50.981 app[23a22887] yyz [info] Sending signal SIGINT to main child process w/ PID 528
2023-03-16T19:09:51.084 app[23a22887] yyz [info] [INFO 2023-03-16 19:09:51] Server is shutting down
2023-03-16T19:09:51.255 app[23a22887] yyz [info] Starting clean up.
2023-03-16T19:09:51.255 app[23a22887] yyz [info] Umounting /dev/vdc from /app/data

The other app is also in a Pending state.
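To line these excerpts up with the reported downtime, here is a rough sketch that parses log lines in the plain-text format shown above and reports, per instance, when shutdown began and how long it took. The regular expression is inferred only from the lines quoted in this thread, and the fixed UTC-4 offset is taken from the `-04:00` in the app's own log output; both are assumptions rather than a documented format.

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern inferred from the quoted lines:
# <timestamp> <source>[<instance>] <region> [<level>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) (?P<source>\w+)\[(?P<instance>[0-9a-f]+)\] "
    r"(?P<region>\w+) \[(?P<level>\w+)\] (?P<message>.*)$"
)

# First and last line of each excerpt quoted in the thread.
SAMPLE = """\
2023-03-16T19:09:48.575 runner[6dba3c3b] yyz [info] Shutting down virtual machine
2023-03-16T19:09:53.463 app[6dba3c3b] yyz [info] Umounting /dev/vdc from /app/data
2023-03-16T19:09:50.928 runner[23a22887] yyz [info] Shutting down virtual machine
2023-03-16T19:09:51.255 app[23a22887] yyz [info] Umounting /dev/vdc from /app/data
"""

# Assumed offset for US Eastern on that date, matching the -04:00 in the app log.
EDT = timezone(timedelta(hours=-4))


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["ts"] = datetime.strptime(
        fields["ts"], "%Y-%m-%dT%H:%M:%S.%f"
    ).replace(tzinfo=timezone.utc)
    return fields


def shutdown_windows(text):
    """Map each instance ID to its (earliest, latest) log timestamp."""
    windows = {}
    for line in text.splitlines():
        entry = parse_line(line)
        if entry is None:
            continue
        start, end = windows.get(entry["instance"], (entry["ts"], entry["ts"]))
        windows[entry["instance"]] = (min(start, entry["ts"]), max(end, entry["ts"]))
    return windows


if __name__ == "__main__":
    for instance, (start, end) in shutdown_windows(SAMPLE).items():
        local = start.astimezone(EDT).strftime("%H:%M:%S")
        print(f"{instance}: shutdown began {local} Eastern, "
              f"took {(end - start).total_seconds():.3f}s")
```

Both instances begin shutting down at about 15:09 Eastern, a couple of minutes before the 3:11p at which the apps were noticed to be down.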

Both apps came back up at 5:36p Eastern time.

@senyo True, though unless I’m mistaken, that’s the one app that didn’t go down today.

Thanks for the replies everyone. All my apps are back online now; I hope yours are soon, if they aren’t already.

since yesterday: Testing local Docker instance… Nope
