Best practice for build-time secrets?

After diving into the documentation on build-time secrets for deploying my Next.js application, and searching these forums for pointers, I'm still somewhat unclear on the actual best practice for using build-time secrets in a sane and safe way. Judging by the forum activity, I'm presumably not alone, so a recipe or guide on how to actually get the build process working when secrets are involved would be deeply appreciated.

In my case, I'm deploying a Next.js application that needs values such as DATABASE_URL and SESSION_SECRET already during the build, and I'm running flyctl deploy from GitLab CI.

Some questions are:

  1. Do I really need to add all secrets to GitLab CI, so that they can be passed with the --build-secret flag?
    • This seems to somewhat defeat the purpose of the secrets that already exist in Fly, which were only shown once for security reasons. In the GitLab settings they're visible, although they can be masked during jobs.
  2. Is there some way of accessing the secrets set on the builder from inside the Docker build?
    • Could flyctl deploy inject them for me, given that they're already set?

Ideally I'd set these secrets in a single place, and they'd be available wherever needed. I get that there are probably plenty of good reasons this isn't the case, so clear best practices or examples would really help. The build secrets docs show passing in a single secret; I have 20+.
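For concreteness, the kind of glue script I'd end up maintaining in CI looks roughly like this (the file name and secret values here are illustrative, not anything Fly prescribes):

```shell
#!/bin/sh
# Sketch: turn a .env-style file into repeated --build-secret flags
# for `flyctl deploy`. In real CI the file would come from masked
# GitLab variables instead of being written inline.
set -eu

# Illustrative secrets file
cat > build-secrets.env <<'EOF'
DATABASE_URL=postgres://example
SESSION_SECRET=dummy
EOF

args=""
while IFS='=' read -r key value; do
  # Skip blank lines and comments
  case "$key" in ''|'#'*) continue ;; esac
  args="$args --build-secret $key=$value"
done < build-secrets.env

# Echo instead of executing, so the sketch is safe to run as-is
echo "flyctl deploy$args"
# prints: flyctl deploy --build-secret DATABASE_URL=postgres://example --build-secret SESSION_SECRET=dummy
```

That deals with the flag repetition, but every one of those secrets still has to be duplicated in CI and mounted in the Dockerfile.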

I hope this rambling plea makes sense - I just want to migrate all my stuff to Fly.

Some related threads with unsatisfactory or missing solutions:


Hi!

Yep, build-time secrets as documented here are currently the way. We've had some internal discussions about whether we can improve that, so it's in progress, but there are no firm decisions yet.

Build-time secrets are partly annoying because of how Docker makes us "mount" secrets into a container. I've had a similar question about injecting them for you during the build step: it doesn't seem like Docker gives us a way to "mount" secrets into the build through the API it exposes there (or we haven't found it yet).

Right, that's what I guessed. What I'm therefore asking for is proper guidelines on how to work with this in a real-life application, instead of only showing how to pass a single secret in isolation. Especially if the recommended way really is to first add all secrets to the CI, then pass them all along in the pipeline, then mount them all in one gigantic RUN command in the Dockerfile, and on top of that set them all on the final app server using flyctl secrets - because doing all of this seems somewhat crazy.
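For reference, extrapolating from the single-secret example in the docs, my understanding is that the Dockerfile ends up looking something like this (base image and secret names are illustrative):

```dockerfile
# syntax=docker/dockerfile:1
# Every secret passed with --build-secret has to be mounted into the
# one RUN step that performs the build; BuildKit exposes each mount
# under /run/secrets/<id>.
FROM node:18-alpine AS build
WORKDIR /app
COPY . .
RUN --mount=type=secret,id=DATABASE_URL \
    --mount=type=secret,id=SESSION_SECRET \
    DATABASE_URL="$(cat /run/secrets/DATABASE_URL)" \
    SESSION_SECRET="$(cat /run/secrets/SESSION_SECRET)" \
    npm run build
```

Now imagine that RUN block with 20+ mounts.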

Better docs / recipes / guidelines / tutorials would at least give some confidence in doing this.


I'll come right out and say it: the current best practice for build-time secrets is: don't.

I’ll sketch out four different strategies that comply with the above recommendation.

Your first example, SESSION_SECRET, is a good one. Do you really have sessions during your build process? Your first quoted example is very much related, but in a Rails context. If you want to dive into that one, take a look at rails assets:precompile in production failed due to missing master key.

The “solution” in that case is embedded in the Dockerfile that fly will generate for Rails applications:

ENV SECRET_KEY_BASE 1

It turns out that assets:precompile doesn't involve client sessions, but it does load your Rails application configuration, including parts it doesn't need. So the solution in that case is to pass a fake secret that enables the configuration to load.

That’s strategy one.

Strategy two is a release machine. Make deployment a two-step process: build an image that doesn't require any secrets, run that image on your private Fly network with access to secrets, and let it do its thing. Once that step is complete, deploy your application normally. This technique is often used for database migrations.
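On Fly, this maps onto release_command in fly.toml, which runs a one-off machine with access to your app's secrets before the new version goes live. A sketch, with a hypothetical migration command:

```toml
# fly.toml sketch - the command shown is illustrative
[deploy]
  release_command = "npx prisma migrate deploy"
```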

Strategy three is just a minor variation on strategy two. Run the final step(s) of your build process on your deploy machine immediately before application startup. Oftentimes this can be accomplished via a shell script that runs the final build step(s) and then starts your application. This can be used for sqlite3 database migrations, and perhaps static site generation, but realistically you only want to do this for build steps that are fast and rarely fail.
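An entrypoint for strategy three can be sketched like this; the concrete commands are placeholders for whatever your final build step is, not Fly-provided tooling:

```shell
#!/bin/sh
# Sketch: run the last fast, rarely-failing build step(s) on the
# deploy machine, with runtime secrets available, then start the app.
set -e

finished_steps=""

run_final_build_steps() {
  # e.g. a sqlite3 migration or last-mile static generation
  echo "running final build steps"
  finished_steps="migrate"
}

run_final_build_steps

# In a real entrypoint you would now hand off with something like:
#   exec node build/index.js
# so the server replaces the shell and receives signals directly.
echo "starting server"
```

Pointing your Dockerfile's ENTRYPOINT at a script like this is the whole trick.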

The fourth approach is really wild, and not for everyone. Everything you can do on your development machine can be done on a remote server, including uploading images to a Docker repository and launching VMs.


Thank you for your detailed and direct response - albeit a somewhat discouraging one 🙂

I'm not expecting Fly to model itself after Heroku or to become Heroku, but given your own focus and existing guides on migrating from Heroku, and the removal of Heroku's free product plans, making the transition from Heroku to Fly as seamless as possible would seem to align with Fly's interests.

That being said, the frictionless way secrets and environment are handled on Heroku is certainly something to take inspiration from: secrets are defined in one place alone, and both building and running the application use them without any extra action.

As @kurt says in one of the linked threads:

We designed the secrets this way intentionally. Admittedly, we didn’t expect Heroku to drive away their users, or we’d have done some work to make this part smooth. 🙂

Since this isn't "smooth" for the time being, official docs or guidelines on how to recreate that experience, or at least cope with having multiple secrets in a real-world scenario, would somewhat make up for it. The build-time secrets documentation is sparse, and its example borders on being too simple to be useful.

As long as you can deal with radical candor, there is no time like the present to get started.

There are a number of products which can help with this. Vault, HSM, and KMS are examples. But before proceeding, let’s acknowledge that we are playing with fire:

  • You have 20+ secrets that you are willing to share with us, and that we therefore have an obligation to protect.
  • We provide you the ability to run the code of your choice on our machines with full access to these secrets.
  • You have a need to be able to run deploys from a platform that neither of us controls.

If you search the web, it is not hard to find examples of the damage that can occur once you lose control of your secrets. I'll decline to link to competitors' woes, but I will say that such links are not hard to find.

Now back to addressing your requirements. Fundamentally, the problem is one of putting all of your secrets in one place and then making it difficult to get access to them. So the question reduces to: how difficult do you want access to your secrets to be?

Circling back to your requirement, "cope with having multiple secrets in a real-world scenario", I can describe how Rails does it: custom credentials. The approach is absurdly simple, yet brilliantly effective:

  • A script that you run that, given a single master secret, decrypts a file if it exists, launches an editor, and upon exit from that editor encrypts the result.
  • An API that can be called from the deployed machine to extract a named secret, making use of the one master key which is deployed as a platform secret.

We could take that one step further and write a small script that extracts all of the secrets in the file, sets environment variables, then launches your application. You would then only need to modify the CMD/ENTRYPOINT in your Dockerfile to run this script instead of launching your application directly.
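A minimal sketch of that last idea, assuming openssl for the encryption and a plain KEY=value file format (Rails' actual credentials format differs; file names and the cipher choice here are illustrative):

```shell
#!/bin/sh
# Sketch only: one master key decrypts a committed credentials file,
# whose KEY=value lines become environment variables before launch.
set -eu

# Demo setup: encrypt a credentials file. In real use, MASTER_KEY
# would be the single secret set via `flyctl secrets` and the
# encrypted file would be committed alongside your code.
MASTER_KEY=demo-master-key
export MASTER_KEY
printf 'DATABASE_URL=postgres://example\nSESSION_SECRET=abc123\n' |
  openssl enc -aes-256-cbc -pbkdf2 -pass env:MASTER_KEY -out credentials.enc

# At boot: decrypt and export every KEY=value pair.
while IFS='=' read -r key value; do
  case "$key" in ''|'#'*) continue ;; esac
  export "$key=$value"
done <<EOF
$(openssl enc -d -aes-256-cbc -pbkdf2 -pass env:MASTER_KEY -in credentials.enc)
EOF

echo "$SESSION_SECRET"   # then: exec node server.js
```

The only secret that ever touches Fly (or your CI) directly is MASTER_KEY; everything else travels inside the encrypted file.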

Realize that I am trying to walk a fine line here. What I just described is something you could choose to do, and I've tried to make clear the risks you will be assuming if you do so. That being said, if you wish to explore this further, I would be glad to help with either extracting and adapting this code from Rails or exploring how to use one of the many existing tools that can help you securely manage secrets.


Once again thank you for your detailed reply.

I think that, at this point in our startup, the ease of Heroku wins, although I really wanted this to work - we have other things we need to prioritize. I'll continue exploring Fly for my side projects, and hopefully grow more accustomed to its way of doing things.


@rubys I’ve read through the entire thread and tried a couple of approaches but hit a snag at every turn.

I'm trying to deploy a SvelteKit application, which requires a build step (which builds SSG, i.e. server-side generated, HTML). To do this, the application needs access to various secret and not-so-secret env variables (in my case the Supabase URL and anon API key, plus separate Typesense read and write keys; and that's just the start - by the time this project is done we'll probably have at least a dozen of these).

Yes, you could argue that some of these keys might not be strictly necessary for SSG, and you'd be right (the Typesense write API key, for example); we could indeed set a dummy env value for that in the Dockerfile. But the majority of these variables are necessary for the build.

My issue is that the build process needs more resources than what's needed to later run the app. I'm planning to run the app on a fleet of 256 MB instances, but the build needs at least 512 MB, maybe more later as the code size increases.

What I’ve tried so far / options I considered:

  1. Just doing it on the production container right before launch: impossible without scaling the container to a size I don't otherwise need.
  2. Using release_command in fly.toml (with the Heroku builder). At first this sounds like exactly the right place, but the release machine's file system is ephemeral, so it can't be used to build files for production.
  3. In the build Dockerfile, passing the secrets one by one via GitHub Actions, mounting them, etc., as described in Build Secrets · Fly Docs. This is what I've ended up doing for now, but it isn't tenable: it's a workaround for one or two secrets, not for a dozen.
  4. A workaround to consider is using a service like the ones you mentioned above (I'd probably choose doppler.com for ease of use), but I'd much prefer if all of this could stay with fly.io.

All this to say: I'd be very interested in something like the Rails custom credentials tool you mentioned above, but implementing it myself is a bit above my pay grade. Or any other solution you could come up with as part of fly.io's infra. I'm sure the above use case is quite common, so it would be very useful for the community. Thanks!

OK, you definitely caught my attention with “fleet of 256 MB instances”.

A few brainstorming ideas.

  • If all you need is some memory right before launch, consider defining a swapfile. This may not be the solution you ultimately go with, but it is the quickest to get up and running. Here’s a script that you can steal from: dockerfile-rails/docker-entrypoint at c7f85b21d44078ba94a7d10c3da899227f420a0b · rubys/dockerfile-rails · GitHub
  • If you are indeed very interested in a custom credentials tool, check to see if there is one already implemented for you. node-credentials - npm looks promising.
  • You've got your credentials on your development machine, and have made them available to GitHub Actions. You could build the SSG HTML in either of those places.
  • Since you are using GitHub Actions, you already have a multi-stage build process. What's one more step? 🙂 Define a second Fly application. Give it as many resources as it needs. Make it a Fly Machine so that it's created when needed and goes away when you exit. A deploy of this app will have the full source of your application and access to all of your secrets. Have it do your SSG, then deploy your fleet from there. Just as you can install flyctl on GitHub Actions, you can install flyctl on your Fly application.
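For the first bullet, the swapfile idea can be sketched roughly like this (the size and path are illustrative; the guards let the script degrade gracefully where swap isn't permitted, and it is not the exact script linked above):

```shell
#!/bin/sh
# Sketch: give a small machine temporary memory headroom right
# before launch, then hand off to the real server command.
set -e

if [ "$(id -u)" = "0" ] && command -v mkswap >/dev/null 2>&1; then
  # Allocate a 512M swapfile, falling back to dd where fallocate
  # is unavailable
  fallocate -l 512M /swapfile 2>/dev/null ||
    dd if=/dev/zero of=/swapfile bs=1M count=512 2>/dev/null
  chmod 600 /swapfile
  mkswap /swapfile
  swapon /swapfile || echo "swap not permitted here; continuing anyway"
fi

swap_setup="done"
# exec "$@"   # hand off to the real application command
```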

I definitely empathize with the problem you are dealing with, and would like to help. I'm also aware of the damage that a wholesale leak of secrets can cause. Over time I expect Fly's auth tokens to become more fine-grained in the amount of access an individual token can provide. The approaches listed above should limit the impact such changes will have on you.


Thanks Sam! I can see you like solving puzzles like this one 🙂 Same here.

I think I’ll go with the last option. Sounds like the most elegant approach to me.

Good choice. It would seem like overkill if you were deploying only a single instance, but for a fleet? That's the way I would go. It also gives you a good place to add whatever future orchestration you might need down the road.