Possible registry outages?

charsleysa · November 10, 2021, 12:24am

I'm getting unhealthy deployment errors, and when I check the instance logs I see the following:

2021-11-10T00:11:46.824 runner[a326d5ac] lhr [info] Starting instance
2021-11-10T00:11:46.853 runner[a326d5ac] lhr [info] Configuring virtual machine
2021-11-10T00:11:46.855 runner[a326d5ac] lhr [info] Pulling container image
2021-11-10T00:13:46.903 runner[a326d5ac] lhr [info] Pull failed, retrying (attempt #0)
2021-11-10T00:15:46.932 runner[a326d5ac] lhr [info] Pull failed, retrying (attempt #1)
2021-11-10T00:17:46.962 runner[a326d5ac] lhr [info] Pull failed, retrying (attempt #2)
2021-11-10T00:17:46.962 runner[a326d5ac] lhr [info] Pulling image failed

jsierles · November 10, 2021, 12:30am

Sorry for the trouble. We’re looking into a potential issue with our Docker registries. We’ll report back as soon as we have more information.

jsierles · November 10, 2021, 12:52am

Can you try again now? We’ve fixed the problem with the registry.

charsleysa · November 10, 2021, 1:32am

Still experiencing the issue

2021-11-10T01:20:26.467 runner[8d92de86] syd [info] Starting instance
2021-11-10T01:20:26.490 runner[8d92de86] syd [info] Configuring virtual machine
2021-11-10T01:20:26.491 runner[8d92de86] syd [info] Pulling container image
2021-11-10T01:22:26.683 runner[8d92de86] syd [info] Pull failed, retrying (attempt #0)
2021-11-10T01:24:26.786 runner[8d92de86] syd [info] Pull failed, retrying (attempt #1)
2021-11-10T01:26:26.890 runner[8d92de86] syd [info] Pull failed, retrying (attempt #2)
2021-11-10T01:26:26.890 runner[8d92de86] syd [info] Pulling image failed

The first node in the deployment started successfully, but the rest hit the issue. That made things worse, because some of the previous nodes had already been shut down and the rollback was unable to restore them.

thomas · November 10, 2021, 1:41am

We’re looking at this now. It may have cleared up (possibly a network issue outside of North America).

jsierles · November 10, 2021, 2:12am

Please do try again.

eeaa · November 10, 2021, 2:29am

I’m still seeing this in the dfw region:

flyctl --app foo logs
2021-11-10T02:17:20.743 runner[13dffc63] dfw [info] Starting instance
2021-11-10T02:17:20.771 runner[13dffc63] dfw [info] Configuring virtual machine
2021-11-10T02:17:20.772 runner[13dffc63] dfw [info] Pulling container image
2021-11-10T02:19:21.004 runner[13dffc63] dfw [info] Pull failed, retrying (attempt #0)
2021-11-10T02:21:21.145 runner[13dffc63] dfw [info] Pull failed, retrying (attempt #1)
2021-11-10T02:23:21.219 runner[13dffc63] dfw [info] Pull failed, retrying (attempt #2)
2021-11-10T02:23:21.219 runner[13dffc63] dfw [info] Pulling image failed

charsleysa · November 10, 2021, 3:09am

Still occurring

2021-11-10T02:27:43.222 runner[5745c593] lax [info] Starting instance
2021-11-10T02:27:43.290 runner[5745c593] lax [info] Configuring virtual machine
2021-11-10T02:27:43.292 runner[5745c593] lax [info] Pulling container image
2021-11-10T02:28:43.307 runner[5745c593] lax [info] Pull failed, retrying (attempt #0)
2021-11-10T02:29:43.321 runner[5745c593] lax [info] Pull failed, retrying (attempt #1)
2021-11-10T02:30:43.332 runner[5745c593] lax [info] Pull failed, retrying (attempt #2)
2021-11-10T02:30:43.332 runner[5745c593] lax [info] Pulling image failed

kurt · November 10, 2021, 3:26am

Did that roll back a deploy, or is the version you want running now?

kurt · November 10, 2021, 3:34am

If pull issues are breaking deploys (they are somewhat intermittent), you can get going by disabling auto rollback in fly.toml:

[experimental]
  auto_rollback = false

With this set, if a VM fails during a deploy, the rest are left in place. You can then run fly status to see outdated VMs, and try stopping them one by one with fly vm stop <id> to get them updated to the newest version.
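
The one-by-one stop described above could be scripted roughly as follows. This is only a sketch: the function name stop_vms_one_by_one and the PAUSE_SECONDS variable are made up for illustration, the 60-second default pause is an assumption, and the only fly command it runs is the fly vm stop <id> quoted in the post. It assumes you run it from the app directory and pass in the outdated VM IDs you read off fly status yourself.

```shell
#!/bin/sh
# Sketch only: stop_vms_one_by_one and PAUSE_SECONDS are hypothetical names,
# not part of flyctl. Pass the outdated VM IDs (as listed by `fly status`).
stop_vms_one_by_one() {
    for id in "$@"; do
        echo "Stopping VM $id"
        # `fly vm stop <id>` is the command quoted in the post above.
        if ! fly vm stop "$id"; then
            echo "Failed to stop $id; leaving the remaining VMs untouched" >&2
            return 1
        fi
        # Pause before touching the next VM so the replacement has time
        # to pull its image and start (assumed default: 60 seconds).
        sleep "${PAUSE_SECONDS:-60}"
    done
}
```

After sourcing the file, you would call it as stop_vms_one_by_one followed by the IDs, checking fly status between runs. Stopping on the first failure is a deliberate choice here: if image pulls are still failing, it avoids taking down more healthy VMs.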

charsleysa · November 10, 2021, 3:37am

It seems to be fixed now, managed to deploy without issues.

kurt · November 10, 2021, 3:39am

Ok well I guess the trick is not to say “hey it’s fixed, try again”.

We’re still monitoring for these kinds of errors. Feel free to post if you hit another.
