Container reverting to old version automatically?

I have an application that does not seem to be updating, and I am starting to suspect I am not talking to the same version that shows up in the logs. I added version numbers everywhere to try to figure out what is happening, so I will show how the application I am building is not the same one I am communicating with via a remote iex session.

Here is what my logs look like locally when the app starts up; note the version number 5:

[info] Running MkrandioWeb.Endpoint with cowboy 2.9.0 at 127.0.0.1:4000 (http)
Starting airplane manager V5
[info] Access MkrandioWeb.Endpoint at http://localhost:4000
5 Initializing airplane manager 5

Now, still local, I can view the state of the component I am trying to debug in iex:

iex(2)> manager = Process.whereis(Mkrandio.AirplaneManager)
#PID<0.555.0>
iex(3)> :sys.get_state(manager)
%{
  status: %{
    log: ["Initializing airplane manager 5", "Starting 100 simulators",
     "Plane supervisor has",
     "%{active: 100, specs: 100, supervisors: 0, workers: 100}"]
  },
  supervisor: #PID<0.557.0>,
  version: 5
}

I keep a copy of the log messages in the state, along with the version number, so I can be sure they come from that particular instance.
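For reference, the manager is shaped roughly like this. This is a minimal sketch rather than the real module: the module name and the state keys match what I showed above, but everything else (the @version attribute, how the supervisor pid gets filled in, and so on) is simplified:

defmodule Mkrandio.AirplaneManager do
  use GenServer

  @version 5

  def start_link(opts),
    do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    line = "Initializing airplane manager #{@version}"
    # every console line is prefixed with the version so the log output is traceable too
    IO.puts("#{@version} #{line}")
    {:ok, %{version: @version, status: %{log: [line]}, supervisor: nil}}
  end
end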

Now I build the container, push it and deploy:

docker build -t taguniversalmachine/mkrandio:latest .
docker images
REPOSITORY                     TAG      IMAGE ID       CREATED          SIZE
taguniversalmachine/mkrandio   latest   07e61623d066   52 seconds ago   128MB

docker push taguniversalmachine/mkrandio:latest
The push refers to repository docker.io/taguniversalmachine/mkrandio

From my fly.toml:

[build]
  image = "docker.io/taguniversalmachine/mkrandio:latest"

so the tag is the same everywhere.
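A quick way to double-check that (assuming a reasonably recent docker CLI) is to print the local image ID and compare it to the sha256 that fly deploy reports finding below:

docker image inspect taguniversalmachine/mkrandio:latest --format '{{.Id}}'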

Now deploying:

fly deploy
==> Verifying app config
→ Verified app config
==> Building image
Searching for image 'docker.io/taguniversalmachine/mkrandio:latest' locally…
image found: sha256:07e61623d0668973e445bfd8b07264394addb61f89737ffe0160d36531e2beef
==> Pushing image to fly
The push refers to repository [registry.fly.io/mkrandio]

It looks like the same image I just built, and the logs reflect the correct version:

2022-01-11T17:18:57.663 app[d253c291] atl [info]Starting airplane manager V5
2022-01-11T17:18:57.663 app[d253c291] atl [info]17:18:57.663 [info] Access MkrandioWeb.Endpoint at http://mkrandio.fly.dev
2022-01-11T17:18:57.664 app[d253c291] atl [info]5 Initializing airplane manager 5
2022-01-11T17:18:57.957 app[d253c291] atl [info]5 Plane supervisor has
2022-01-11T17:18:57.957 app[d253c291] atl [info]5 %{active: 10, specs: 10, supervisors: 0, workers: 10}
2022-01-11T17:19:41.346 proxy[d253c291] atl [warn]Health check status changed 'passing' => 'warning'
2022-01-11T17:20:03.049 proxy[d253c291] atl [error]Health check status changed 'warning' => 'critical'
2022-01-11T17:21:43.499 app[c39acd74] dfw [info]Reaped child process with pid: 755, exit code: 0
2022-01-11T17:21:43.500 app[c39acd74] dfw [info]Reaped child process with pid: 757, exit code: 0
2022-01-11T17:21:43.500 app[c39acd74] dfw [info]Reaped child process with pid: 776 and signal: SIGUSR1, core dumped? false
2022-01-11T17:24:06.079 runner[d253c291] atl [info]Shutting down virtual machine
2022-01-11T17:24:06.138 app[d253c291] atl [info]Sending signal SIGINT to main child process w/ PID 509

So it looks like something crashed, but the Fly dashboard shows it running, and I can connect to it via ssh:

fly ssh console
Connecting to top1.nearest.of.mkrandio.internal… complete
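(The missing step here: from that ssh session I still have to attach to the running node. With an Elixir release that is typically done with the release's remote command; the path below assumes the release is installed under /app, which may differ depending on the Dockerfile:)

/app/bin/mkrandio remote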

And now, if I look at the state of the module in question:

iex(mkrandio@c39acd74)2> manager = Process.whereis(Mkrandio.AirplaneManager)
#PID<0.2123.0>
iex(mkrandio@c39acd74)2> :sys.get_state(manager)
%{supervisor: #PID<0.2125.0>}

I get an old version of the state, from several deploys ago, before I added the status and version fields.
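One extra check I could run on both the local and remote nodes (assuming the OTP application is named :mkrandio) is asking Elixir which version of the application is actually loaded:

# returns the :vsn of whatever :mkrandio build this node has loaded
Application.spec(:mkrandio, :vsn)

If the remote node reports an older :vsn than the local one, that would confirm it is genuinely running old code rather than just holding old state.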

So my question is: how is it possible that the logs show my new code running, but when I log into the remote IEx shell it is old code? Is Fly automatically rolling back to an old version when the new one crashes? What confuses me about that is that I should see logs of the old version booting up if that were the case.
Any pointers to what I am not understanding?

By default, deploys do roll back if the new version fails. When you run fly deploy, we start a canary VM and ensure that it runs healthily. If it crashes, we kill it off.

You can run fly status to see a list of all VMs running at any given time, along with our internal version number. You’ll most likely find that there’s an old version still running and the newer versions exited/failed health checks.
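Two related commands that may also help dig further (exact flags depend on your flyctl version, so treat these as suggestions):

fly status --all   # include instances that have already exited, not just running ones
fly releases       # release history, with version numbers and whether each deploy succeeded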

I suspected that, but fly status has always shown me just one instance, and on top of that it says it has no stable version to revert to:

App
Name = mkrandio
Owner = personal
Version = 21
Status = running
Hostname = mkrandio.fly.dev

Deployment Status
ID = 037df172-0d5d-9d2c-a9a6-3fac168e7859
Version = v21
Status = failed
Description = Failed due to unhealthy allocations - no stable job version to auto revert to
Instances = 1 desired, 1 placed, 0 healthy, 1 unhealthy

Instances
ID       PROCESS VERSION REGION DESIRED STATUS  HEALTH CHECKS        RESTARTS CREATED
c39acd74 app     7       dfw    run     running 1 total, 1 critical  2        2022-01-10T19:34:47Z

But as I am typing this, I discovered on the dashboard, under Activity, that all the recent versions from 21 on down have a red dot, while version 13, further down, has a green dot. Now it all makes sense, and I know what to work on.
My only quibble is that fly status shows version 21 at the top and then version 7 in the instance list, so neither matches the running version 13 shown on the dashboard; and in any case it seems to incorrectly say that no stable version was found to revert to, when it is in fact running an old one.
