Hello. We deploy our application from a “rolling” tag on DockerHub. Every time we commit to our main branch on GitHub, we push to a tag called next on DockerHub (a lot of projects on DockerHub use a tag called latest in the same way). Our deploys are manually triggered. When we run a deploy, our deploy command is:
flyctl deploy --remote-only --image registry.hub.docker.com/shieldsio/shields:next
We also run scaling events on a schedule to increase or decrease the number of VMs we are running at different times of day, so at various points throughout the day we will run a command like:
flyctl scale count 14 --yes
Recently, I noticed a problem which boiled down to some of our machines running one version of the application and some machines running a different version. Both versions corresponded to code that was pushed after the most recent deploy (i.e. code I thought had not been deployed yet!).
My theory is that when scaling events occur and we need to add machines, they are now pulling the latest image from registry.hub.docker.com/shieldsio/shields:next instead of using whatever image the currently running machines are using. So if there is something new pushed to the next tag, any new machines come up running that image but any existing machines in the cluster are still running an older one.
I am pretty sure this is new behaviour. We’ve been deploying from the next tag since we migrated to fly.io over a year ago, and I’m fairly certain that until recently adding new machines used whatever image the currently running machines were using, even if a newer image had been pushed to the next tag on DockerHub. Maybe this is a change that came with the migration to fly apps v2?
For the moment I am going to switch to deploying from a specific digest. So instead of deploying registry.hub.docker.com/shieldsio/shields:next, our deploy command will become something like
flyctl deploy --remote-only --image registry.hub.docker.com/shieldsio/shields@sha256:0dc4ac18729b5fafb7191cd06055de51fd67b0e9bc25e8a3c6ca413c5281dcf6
each time.
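A minimal sketch of how the digest lookup could be scripted, assuming Docker is available on the machine that runs the deploy. The helper names pinned_ref and deploy_pinned are hypothetical (they are not flyctl features), and the exact form of the RepoDigests entry that Docker reports is worth checking locally before relying on this.

```shell
#!/bin/sh
# Sketch only: resolve the rolling tag to a digest, then deploy that exact image.
# pinned_ref and deploy_pinned are hypothetical helper names, not flyctl features.

REPO="registry.hub.docker.com/shieldsio/shields"

# Build "<repo>@sha256:..." from a RepoDigests entry such as "name@sha256:...".
pinned_ref() {
  digest="${2##*@}"            # keep only the part after the last "@"
  case "$digest" in
    sha256:?*) printf '%s@%s\n' "$1" "$digest" ;;
    *) echo "unexpected digest: $digest" >&2; return 1 ;;
  esac
}

# Pull the tag, read the digest Docker recorded for it, and deploy by digest.
deploy_pinned() {
  docker pull --quiet "$REPO:next" >/dev/null || return 1
  repo_digest=$(docker image inspect --format '{{index .RepoDigests 0}}' "$REPO:next") || return 1
  ref=$(pinned_ref "$REPO" "$repo_digest") || return 1
  echo "deploying $ref"
  flyctl deploy --remote-only --image "$ref"
}
```

Calling deploy_pinned would stand in for the manual deploy command. Because the reference names a digest rather than a tag, anything that later pulls that same reference should get the same image, regardless of what has since been pushed to next.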
Would it be possible to confirm:
- Is what we are seeing intended behaviour, or unexpected?
- Is this behaviour new since the migration to fly apps v2, or some other change?
Thanks