Overriding staged secrets for Machines

Our CI stages secrets with --stage (the last one was staged on Nov 4) because rolling deploys, which setting secrets triggers, don’t quite work for our app (udns) with its 40+ Machine VMs.

These staged secrets were deployed today (Nov 7). Because they broke prod, I am trying to stage a new set of secrets, without success. Is there a time limit after which staged secrets are deleted to allow for newer ones?

Right now, setting or staging a secret anew against the same key seems to be a no-op, and fly secrets list continues to show the older secret. No number of deploys or Machine updates makes any difference.

# set just now
➜  fly secrets set TLS_CERTKEY=- < certkey -a udns --stage
Secrets have been staged, but not set on VMs. Deploy or update machines in this app for the secrets to take effect.

# shows older staged secret
➜  fly secrets list -a udns                               
NAME       	DIGEST          	CREATED AT           
TLS_CERTKEY	1757f45b147a02ef	2022-11-04T22:17:55Z	

# deploy or update
fly deploy --config fly.machines.toml -a udns --remote-only --strategy immediate --dockerfile ./node.Dockerfile

# shows older staged secret
➜  fly secrets list -a udns                               
NAME       	DIGEST          	CREATED AT           
TLS_CERTKEY	1757f45b147a02ef	2022-11-04T22:17:55Z	
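
Rather than eyeballing the table, the before/after comparison above could be scripted. This is only a rough sketch: the helper names secret_digest and digest_changed are made up for illustration, and it assumes the DIGEST column changes whenever the stored value changes and that fly secrets list keeps the whitespace-separated NAME / DIGEST / CREATED AT layout shown above.

```shell
#!/bin/sh
# secret_digest NAME (hypothetical helper): read `fly secrets list` output on
# stdin and print the DIGEST column of the row whose NAME matches exactly.
secret_digest() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# digest_changed BEFORE AFTER (hypothetical helper): succeed only if both
# digests are non-empty and differ from each other.
digest_changed() {
  [ -n "$1" ] && [ -n "$2" ] && [ "$1" != "$2" ]
}

# Usage, with the app and secret names from the transcript above:
#   before=$(fly secrets list -a udns | secret_digest TLS_CERTKEY)
#   fly secrets set TLS_CERTKEY=- < certkey -a udns --stage
#   after=$(fly secrets list -a udns | secret_digest TLS_CERTKEY)
#   digest_changed "$before" "$after" ||
#     echo "staged secret did not replace the old one" >&2
```

Since this only covers overriding an existing key, a brand-new secret (empty "before" digest) is reported as unchanged; CI would need to handle that case separately.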

I’m not able to reproduce this on a machines app.

nginx $ fly secrets set A=3 --stage
Secrets have been staged, but not set on VMs. Deploy or update machines in this app for the secrets to take effect.
nginx $ fly secrets list
NAME	DIGEST          	CREATED AT
A   	f0247b96459e8044	2s ago

nginx $ fly secrets set A=4 --stage
Secrets have been staged, but not set on VMs. Deploy or update machines in this app for the secrets to take effect.
nginx $ fly secrets list
NAME	DIGEST          	CREATED AT
A   	07e97d0e40ef95b6	3s ago

I too often see fly secrets set commands having no effect. One reproducible way I’ve found to trigger this is to set a value that has been used before. Based on your example, @jsierles, I get:

$ fly secrets set A=3 --stage
Secrets have been staged, but not set on VMs. Deploy or update machines in this app for the secrets to take effect.
$ fly secrets list
NAME	DIGEST          	CREATED AT
A   	f0247b96459e8044	2s ago
$ fly secrets set A=4 --stage
Secrets have been staged, but not set on VMs. Deploy or update machines in this app for the secrets to take effect.
$ fly secrets list
NAME	DIGEST          	CREATED AT
A   	07e97d0e40ef95b6	3s ago
$ fly secrets set A=3 --stage
Secrets have been staged, but not set on VMs. Deploy or update machines in this app for the secrets to take effect.
$ fly secrets list
NAME	DIGEST          	CREATED AT
A   	07e97d0e40ef95b6	3s ago
$ fly secrets set A=4 --stage
Error No change detected to secrets. Skipping release.

Interesting. Thanks for the repro. Will take a closer look!

I believe we have a workaround in place for this now, can you give it another try?

Can’t reproduce anymore, so I’d say that worked, thank you! Tests on my Terraform secrets PR are passing now too :)

Hi Kurt,

Yes, it works now! I deployed 50 minutes ago. Thanks.

By the way, I monitored udns on and off this entire week while it was effectively down (as in, zero customer traffic was routed to it). Without fail, Machines were being spun up consistently in multiple regions, and oftentimes the same Machine was spun up again immediately after it went down with exit code 0, but no requests were sent to them (our process logs indicate this). That explains why our previous bill was $80+/mo, close to the maximum we could have paid for the 40-odd Machines we were running. We send ~15% of prod traffic to udns, and serving it costs us 2x what we pay our existing provider.