Broke Fly Init Process Loading Docker Image Generated By Nix

Looks like I have broken the Fly app init process. I’m building our app image with Nix, uploading it to the Fly registry with skopeo, and then having Fly instantiate the image. It used to work, but it has stopped working. Here are the logs from a failed start:

2023-07-13T20:02:48Z runner[e286000c913086] sjc [info]Pulling container image registry.fly.io/benwis-leptos:mgkyyry8s122nsd6670w0755kl5wpv7h
2023-07-13T20:02:52Z runner[e286000c913086] sjc [info]Successfully prepared image registry.fly.io/benwis-leptos:mgkyyry8s122nsd6670w0755kl5wpv7h (4.292878233s)
2023-07-13T20:02:53Z runner[e286000c913086] sjc [info]Configuring firecracker
2023-07-13T20:02:53Z app[e286000c913086] sjc [info] INFO Starting init (commit: 1d1821d)...
2023-07-13T20:02:53Z app[e286000c913086] sjc [info]thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', init/src/main.rs:401:37
2023-07-13T20:02:53Z app[e286000c913086] sjc [info]note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.244748] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00006500
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.245494] CPU: 0 PID: 1 Comm: init Not tainted 5.15.98--tigris-gHEAD #1
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.246154] Call Trace:
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.246391]  <TASK>
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.246600]  show_stack+0x52/0x5c
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.246915]  dump_stack_lvl+0x38/0x4d
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.247287]  dump_stack+0x10/0x16
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.247603]  panic+0x100/0x2b7
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.247880]  do_exit.cold+0x50/0xa0
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.248230]  do_group_exit+0x3b/0xb0
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.248674]  __x64_sys_exit_group+0x18/0x20
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.249209]  do_syscall_64+0x3b/0x90
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.249623]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.250121] RIP: 0033:0x7f9cbc44d99c
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.250527] Code: eb ef 48 8b 76 28 e9 25 0b 00 00 64 48 8b 04 25 00 00 00 00 48 8b b0 b0 00 00 00 e9 af ff ff ff 48 63 ff b8 e7 00 00 00 0f 05 <ba> 3c 00 00 00 48 89 d0 0f 05 eb f9 0f 1f 84 00 00 00 00 00 41 57
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.252222] RSP: 002b:00007ffcb2dfce48 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.252915] RAX: ffffffffffffffda RBX: 00007f9cbc0cc040 RCX: 00007f9cbc44d99c
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.253606] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000065
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.254265] RBP: 0000000000000001 R08: 0000555556d6f930 R09: 0000000000000000
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.254931] R10: 00007f9cbc4522b0 R11: 0000000000000246 R12: 00007ffcb2dfcea8
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.255598] R13: 00007ffcb2dfceb8 R14: 0000000000000000 R15: 0000000000000000
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.256246]  </TASK>
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.256488] Kernel Offset: disabled
2023-07-13T20:02:53Z app[e286000c913086] sjc [info][    0.256820] Rebooting in 1 seconds..

Is there anything I can do here to get this back? Redeploying does not seem to help. It might be related to this thread: "Kernel panic starting init in a nix-built container with tini as init" (post #6 by jade).

It was working before, and I’m not sure why it has started doing this.

:eyes: Very weird. I’m looking into it.
The panic points to a line where we process environment variables. Have you updated the environment variables for your machine recently?

I’ve added an env var which might duplicate an env var set in the Dockerfile, and added some env vars in the Dockerfile itself. I think it worked after I did those, but I can’t remember exactly.

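OK, I see. You have an environment variable written as SSL_CERT_FILE /nix/store/<redacted>.crt, with a space between the name and the value. Our init expects NAME=value and splits each entry on the = character, so it finds nothing to split on here, which appears to be what triggers the unwrap panic in your logs. We can add some validation for that on our side, but changing that entry to use a = instead of a space should get you unstuck.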
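
If you want to catch this before pushing an image, something like the following could work. It is only a rough sketch, not our actual init code: the function name find_malformed_env is made up for illustration, and it assumes you have already pulled the list of environment entries (the Env array of the image config) out of the image yourself.

```python
def find_malformed_env(entries):
    """Return the entries that are not in NAME=value form.

    `entries` is assumed to be a list of strings such as the Env array
    from an image config. This helper is hypothetical, for illustration.
    """
    malformed = []
    for entry in entries:
        # partition() returns (before, separator, after); the separator
        # is an empty string when '=' does not occur in the entry.
        name, sep, _value = entry.partition("=")
        if not sep or not name:
            malformed.append(entry)
    return malformed


env = [
    "PATH=/bin",
    "SSL_CERT_FILE /nix/store/<redacted>.crt",  # space instead of '='
]
print(find_malformed_env(env))
```

An empty value (NAME=) is treated as valid here, since only a missing = or a missing name is flagged. Anything the function returns needs to be rewritten as NAME=value in the Nix expression that builds the image.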


Thanks so much! That was exactly it. Some minor tweaks and things are back online.
