Currently with fly.io, running flyctl launch on a Rails application will produce a Dockerfile. That Dockerfile can be used to deploy your application on fly.io or elsewhere.
Starting with Rails 7.1, running rails new will produce a Dockerfile. For many (and likely most) people this Dockerfile will work without modification on fly.io.
Also currently being explored are Dockerfile-less deploys. The aim here is that the fly-rails gem doesn’t merely provide you with a starting point but maintains your Dockerfile for you.
This provides all sorts of possibilities. There are some things that the new Dockerfile in Rails does better than the one that fly.io has been producing. And vice versa.
A potential roadmap:
For Rails 7.1 applications, fly launch should leave the Dockerfile alone and use it. Everything that fly.io currently does to produce a Dockerfile that is of interest to the Rails team will be converted into pull requests and made available to all.
For Rails applications prior to 7.1, fly launch should produce a Dockerfile that closely matches the Dockerfile that Rails produces for 7.1 applications. There may be a temptation to do better, but I think we should resist that as it would result in a perceived downgrade and inconsistency for Rails 7.1 and later applications.
Everything else should be opt in, likely via the fly-rails gem.
This roadmap is being put out for discussion.
The things that the Rails Dockerfile already does better, namely making use of rails db:prepare and bootsnap when available, have already been added to the fly-rails gem. Over time this support will make its way into flyctl launch.
The remainder of this post identifies things that the Rails Dockerfile and the fly-rails gem do differently so that together with the Rails team we can make an informed decision as to where that support belongs.
Differences
For shorthand, the following will refer to the Dockerfiles being compared as “7.1” and “fly”, not making a distinction between flyctl launch produced Dockerfiles and fly-rails produced Dockerfiles. An initial attempt has been made to sort this list into things most likely to be of immediate benefit to Rails and which parts may end up being fly specific.
Bin files on Windows machines may contain strings like .exe and \r. Such files won’t run on Linux machines. If detected, sed commands are added to the Dockerfile to adjust the binstubs.
If node is used, 7.1 will install the latest version of Node 19, and then use npm to install the latest yarn. Fly uses volta to install the exact versions of Node and Yarn that are used in development.
Fly uses Multi-stage builds, 7.1 does not. This results in a number of differences:
Debian packages like build-essential that are only needed at build time are present in 7.1 deployed images.
Multi-stage builds can perform stages in parallel, resulting in shorter initial builds.
Results of stages can be cached, meaning that adding a gem and redeploying will only need to install the additional gem and not perform a bundle install from scratch.
7.1 is based on the Ruby images; fly is based on the debian slim images. A packaged rails new application results in a 1.6GB image with 7.1, and a 602MB image with fly. Smaller images load faster.
7.1 images are based on MRI/CRuby memory management. Fly is based on jemalloc. The difference can be significant.
Fly uses Rakefiles for build, release, and server steps, facilitating customization using Ruby scripts rather than bash scripts.
Fly will allocate a swapfile to handle OOM situations.
Fly detects the use of popular packages such as rmagick, execjs, and puppeteer and adds the necessary Debian dependencies. This is something that likely can’t be applied at rails new time.
By default, 7.1 runs db:prepare on every deployed machine whereas fly will run this as a “release” step prior to implementing a rolling deployment.
Fly will configure and deploy postgresql, redis, passenger, anycable, nginx and other packages as a part of the rails deployment, including setting of appropriate secrets.
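To make the binstub item at the top of this list concrete, here is a minimal Ruby sketch of the kind of cleanup those sed commands perform. It is an illustration only, not the code fly actually generates: the method name normalize_binstub is hypothetical, and the assumption is that the only damage is CRLF line endings plus a .exe suffix on the shebang line.

```ruby
# Hypothetical helper (not part of flyctl or fly-rails): make a binstub
# written on Windows usable on Linux.
def normalize_binstub(source)
  # Convert Windows (CRLF) line endings to Unix (LF).
  lines = source.gsub("\r\n", "\n").lines
  return "" if lines.empty?

  # Only touch the shebang line: "ruby.exe" becomes "ruby".
  lines[0] = lines[0].sub(/\.exe\b/, "") if lines[0].start_with?("#!")
  lines.join
end

windows_stub = "#!/usr/bin/env ruby.exe\r\nputs 'hello'\r\n"
puts normalize_binstub(windows_stub)
```

Restricting the .exe rewrite to the shebang line avoids touching any legitimate string elsewhere in the file.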
Would love some help getting the default Docker files in the best shape possible. Just right off the bat, I’d like to see:
Multi-stage builds when using node (but keep as-is on the default import map path)
Use jemalloc by default, if we can absolutely guarantee no issues with the underlying distro. Can’t risk segfaults.
Explore debian slim image, but with the directive that compatibility is more important than image size. So all major gems (and popular minor ones) must have their dependencies satisfied by the slim image.
Explore the swap file rather than die on OOM
Some questions:
Where are you tracking the node/npm version needed by Volta?
Can you show me the Rakefiles you use for these standard tasks? Including release?
Do you setup dependencies via docker-compose or another way?
The easiest way to answer these questions, if you are willing, is with a free demo. Install flyctl via brew, then login (setting up an account if necessary): Log in to Fly · Fly Docs
Then run rails new passing the -j option so you can see how node support is done. Edit config/routes:
root "rails/welcome#index"
Run fly launch, accepting the defaults, then fly deploy. Take a look at the Dockerfile and lib/tasks/fly.rake.
fly open will open a browser to your site. fly logs and fly dashboard are also of interest.
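For readers who don't want to run the demo, the following is a rough sketch of what a Rakefile with build, release, and server steps could look like. It is not a copy of the generated lib/tasks/fly.rake: the task bodies, the swapfile size, and the choice of db:prepare for the release step are all assumptions made for illustration.

```ruby
require "rake"

# Illustrative only: the real generated fly.rake may differ.
namespace :fly do
  # Build step: runs while the image is being built.
  task build: "assets:precompile"

  # Release step: runs once, before the rolling deployment starts.
  task release: "db:prepare"

  # Server step: runs on every machine when it boots.
  task server: :swapfile do
    sh "bin/rails server"
  end

  # Assumed swapfile setup (512M is an arbitrary size for this sketch).
  task :swapfile do
    sh "fallocate -l 512M /swapfile"
    sh "chmod 0600 /swapfile"
    sh "mkswap /swapfile"
    sh "swapon /swapfile"
  end
end
```

Because each step is an ordinary Rake task, customizing a step means adding Ruby code or prerequisites to a task rather than editing shell fragments in the Dockerfile.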
multi-stage builds for node only - no problem, will do.
jemalloc/debian slim: these images are here: Quay. Slim means that instead of having every database you could possibly use installed you need to apt-get install the database you may be using.
How much of a saving is it to do bundle install in a separate build step? Have you not found any gems that require header files or other dynamic dependencies from the build packages? If not, and the savings are material, then it seems worth to do this for the default Rails dockerfile too.
Sounds good with the slim image, but I don’t want users to have to edit the Dockerfile if they start on sqlite3 and then switch to mysql/postgresql later. So need to find a way where all 3 DBs can be supported without alterations to the Dockerfile.
bundle install with a cache can be a big savings. Add a gem to a Gemfile and that will cause the bundle install step to rerun. But instead of starting from scratch and installing everything (which can take tens of seconds) it will only install the one gem, which may take a second or so. That being said, modifying a Gemfile is a relatively rare event. But having bundle install run in parallel with yarn is worthwhile. fly deploy will show you the build steps running in real time.
Yes, there always will be more dependencies. I’ve seen people stumble because some package they use includes execjs and node needs to be installed for the server to start (or even assets:precompile) even though node isn’t actually used. Puppeteer and imagemagick are popular and have dependencies. I’m starting to collect up a list: fly-rails/Dockerfile.erb at main · superfly/fly-rails · GitHub, though that code needs to be moved out of the template.
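As a sketch of how such a list could live outside the template, the snippet below scans Gemfile.lock text for gem names and looks up the Debian packages they need. The mapping table, the method names, and the sample lockfile are all hypothetical; the real list in fly-rails may use different gems and package names.

```ruby
# Hypothetical gem-to-Debian-package table; entries are illustrative.
SYSTEM_PACKAGES = {
  "rmagick"   => %w[imagemagick libmagickwand-dev],
  "execjs"    => %w[nodejs],
  "puppeteer" => %w[chromium]
}.freeze

# Gem entries in the specs section of a Gemfile.lock are indented by
# exactly four spaces; their dependencies are indented by six.
def locked_gem_names(lockfile_text)
  lockfile_text.scan(/^ {4}([A-Za-z0-9_.-]+) \(/).flatten.uniq
end

def system_packages_for(lockfile_text)
  locked_gem_names(lockfile_text)
    .flat_map { |name| SYSTEM_PACKAGES.fetch(name, []) }
    .uniq
    .sort
end

# Made-up lockfile fragment in which execjs is only an indirect dependency.
SAMPLE_LOCKFILE = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:
      autoprefixer-rails (10.4.7.0)
        execjs (~> 2)
      bootstrap (5.2.0)
        autoprefixer-rails (>= 9.1.0)
      execjs (2.8.1)
LOCK

puts system_packages_for(SAMPLE_LOCKFILE).inspect
```

Working from the lockfile rather than the Gemfile matters here, because the lockfile also lists gems that are pulled in indirectly.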
The generated Dockerfile can install as many databases as you like. fly launch currently does sqlite3 and postgres. But it might be worth exploring a different path. Instead of generating a starter Dockerfile and requiring the developer to maintain it as new dependencies are added, how about generating the Dockerfile on every build? See Dockerfile-less-deploys · Fly. You can actually run that scenario yourself and “eject” the Dockerfile at any point.
I’m willing to contribute any or all of this to Rails itself, and maintain the rest as a separate gem.
Good point re: cache for bundle. That would be a nice level up.
I prefer having a stable Dockerfile, but thinking that if the extra dependencies for all the DB connectors just live in a build-step, then there’s no price to pay in the final image? So we might as well include the 3 majors.
Not so concerned about imagemagick (deprecated for new apps anyway, we use libvips now) or other stuff like Puppeteer. That’s one of the reasons I like a stable Dockerfile. You CAN edit it yourself, should you need to. Just want it not to be necessary for the majority of default things.
Anyway, glad we’re getting this on the road! Please cc me on any/all PRs to bring some of these improvements to the new stock Dockerfile, and I’ll review and merge immediately.
I’ll run some tests, but I believe that you will need the postgresql and mysql client at runtime (but not sqlite3 which is truly just a library); but I don’t think that this is anything to worry about. There is a huge difference between “everything a Debian user might need”, and “things commonly used by Rails applications”. And as you say, users CAN edit to strip things they don’t need.
I’m quite OK with not including node for importmap applications, but there be dragons here. I went back and checked, and an example where a rails 7 importmaps user had a problem was one where bootstrap requires autoprefixer-rails which requires execjs which requires nodejs - even though nodejs is not used at runtime. See: Rails app `fly deploy` failing with ExecJS::RuntimeUnavailable: Could not find a JavaScript runtime
Will do. I’m going to take a day or two to plan how to approach this, and then likely produce a steady stream of smaller pull requests. That way I can get feedback and make mid-course corrections without getting deep into the weeds making a change that Rails won’t accept.
Use jemalloc by default, if we can absolutely guarantee no issues with the underlying distro
Although using a jemalloc-backed image is great in terms of performance, I have some concerns here.
The image used by Fly (and maintained by Evil Martians) uses Fullstaq Ruby. Even though both the Docker image and other Fullstaq Ruby distributions have been battle-tested in production, they’re not official Ruby/MRI projects. This introduces potential points of failure: Fullstaq Ruby could become deprecated at some point, Evil Martians could migrate to a different Docker registry (why don’t we use Docker Hub?).
To sum up, I think sticking with the official Ruby image would be better. As an alternative to having jemalloc by default, I’d suggest adding a guide/doc on customizing Dockerfiles mentioning performance-optimized versions. And add a link to the guide right into the Dockerfile:
# See https://guides.rubyonrails.org/docker.html for other image options
FROM ruby:...
Speaking of multi-stage builds, etc., I’d recommend checking an example from the post. The Dockerfile defines a production-builder stage to both install Ruby gems and precompile assets. The resulting production image is usually way smaller than when using single-stage builds.
Btw, Node (and node_modules/) is not the only reason for image bloat. RubyGems with C extensions could also produce a lot of bloat during bundle install (in case no pre-compiled binaries provided). Unfortunately, I’ve seen this problem in almost every Rails project I worked on.
However, we can add a dedicated command set to manage Dockerfiles (something like bin/rails docker upgrade --interactive). So, a custom generator, which could be applied multiple times to upgrade the configuration (similarly to switching from one db to another, for which we have rails db:system:change).
Another idea is to separate Dockerfile template from moving parts: base image, system deps, etc. For example, in Ruby on Whales, we keep system deps in a separate file (Aptfile), and all the software versions are defined via args. So, changing Dockerfile is rarely needed. In Rails, we can add a config/deployment/docker.yml with the configuration or something like that.
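A minimal sketch of the Aptfile idea, assuming the file holds one package name per line with # comments. The method names and the sample contents are hypothetical, and this is a Ruby illustration rather than how Ruby on Whales itself reads the file.

```ruby
# Parse Aptfile-style text: one package per line, "#" starts a comment.
def apt_packages(aptfile_text)
  aptfile_text.each_line
    .map { |line| line.sub(/#.*/, "").strip }
    .reject(&:empty?)
    .uniq
end

# Build the install command that a Dockerfile RUN step could execute.
def apt_install_command(packages)
  "apt-get install --no-install-recommends -y #{packages.join(' ')}"
end

# Made-up Aptfile contents for illustration.
SAMPLE_APTFILE = <<~APT
  # database clients
  libpq-dev
  default-libmysqlclient-dev

  libvips   # image processing
APT

puts apt_install_command(apt_packages(SAMPLE_APTFILE))
```

With this split, adding a system dependency means editing the package list, while the Dockerfile itself stays unchanged.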
For comparison, the Fullstaq Ruby images are compiled with an outdated jemalloc version (3.6.0, released in 2014) which lacked the decay-based purging feature introduced in version 4.1, which is equivalent to dirty_decay_ms:0,muzzy_decay_ms:0. (muzzy_decay_ms defaults to 0 in recent versions so doesn’t need to be specified.) However, setting a small, non-zero decay gives significant performance gains at a very slight cost in memory utilization in my experience. Using the background thread further improves performance slightly, and limiting arenas is also a performance gain for Ruby (something Heroku has extensively tested in the context of glibc malloc).
First time I have heard about fullstaq ruby. It sounds like a no-no unless the ops know the implications of using a non-official, alpha ruby distribution in production. I don’t think it’s a good default. I’d still prefer paying for the supposedly 30% extra memory.
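As an illustration of the three settings mentioned above (a small non-zero decay, the background thread, and a limit on arenas), here is a sketch that assembles a MALLOC_CONF value. The option names are jemalloc 5.x options; the specific values are placeholders chosen for this example, not figures from this thread, and the helper name is hypothetical.

```ruby
# Placeholder values for illustration only; tune for your own workload.
JEMALLOC_OPTIONS = {
  "dirty_decay_ms"    => 1000,  # small, non-zero decay
  "narenas"           => 2,     # limit the number of arenas
  "background_thread" => true   # purge from a background thread
}.freeze

# jemalloc reads MALLOC_CONF as comma-separated name:value pairs.
def malloc_conf(options)
  options.map { |name, value| "#{name}:#{value}" }.join(",")
end

puts "MALLOC_CONF=#{malloc_conf(JEMALLOC_OPTIONS)}"
```

The resulting string would be set as an environment variable in the image and only takes effect when the Ruby process is actually running on jemalloc.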