Preparations for Rails 7.1

rubys · December 21, 2022, 1:05am

Introduction

Currently with fly.io, running flyctl launch on a Rails application will produce a Dockerfile. That Dockerfile can be used to deploy your application on fly.io or elsewhere.

Starting with Rails 7.1, running rails new will produce a Dockerfile. For many (and likely most) people this Dockerfile will work without modification on fly.io.

Also currently being explored is Dockerfile-less Deploys. The aim here is that the fly-rails gem doesn’t merely provide you with a starting point but maintains your Dockerfile for you.

This provides all sorts of possibilities. There are some things that the new Dockerfile in Rails does better than the one that fly.io has been producing. And vice versa.

A potential roadmap:

For Rails 7.1 applications, fly launch should leave the Dockerfile alone and use it. Everything that fly.io currently does to produce a Dockerfile that is of interest to the Rails team will be converted into pull requests and made available to all.
For Rails applications prior to 7.1, fly launch should produce a Dockerfile that closely matches the Dockerfile that Rails produces for 7.1 applications. There may be a temptation to do better, but I think we should resist that as it would result in a perceived downgrade and inconsistency for Rails 7.1 and later applications.
Everything else should be opt in, likely via the fly-rails gem.

This roadmap is being put out for discussion.

The things that the Rails Dockerfile already does better, namely making use of rails db:prepare and bootsnap when available, have already been added to the fly-rails gem. Over time this support will make its way into flyctl launch.

The remainder of this post identifies things that the Rails Dockerfile and the fly-rails gem do differently so that together with the Rails team we can make an informed decision as to where that support belongs.

Differences

For shorthand, the following will refer to the Dockerfiles being compared as “7.1” and “fly”, not making a distinction between flyctl launch produced Dockerfiles and fly-rails produced Dockerfiles. An initial attempt has been made to sort this list into things most likely to be of immediate benefit to Rails and what parts may end up being fly specific.

Bin files created on Windows machines may contain strings like .exe and \r. Such files won’t run on Linux machines. If detected, sed commands are added to the Dockerfile to adjust the binstubs.
If node is used, 7.1 will install the latest version of Node 19, and then use npm to install the latest yarn. Fly uses volta to install the exact versions of Node and Yarn that are used in development.
Fly uses multi-stage builds; 7.1 does not. This results in a number of differences:
- Debian packages like build-essential that are only needed at build time are present in 7.1 deployed images.
- Multi-stage builds can perform stages in parallel, resulting in shorter initial builds.
- Results of stages can be cached, meaning that adding a gem and redeploying will only need to install the additional gem and not perform a bundle install from scratch.
7.1 is based on the Ruby images; fly is based on the debian slim images. A packaged rails new application results in a 1.6GB image with 7.1, and a 602MB image with fly. Smaller images load faster.
7.1 images are based on MRI/CRuby memory management. Fly is based on jemalloc. The difference can be significant.
Fly uses Rakefiles for build, release, and server steps, facilitating customization using Ruby scripts rather than bash scripts.
Fly will allocate a swapfile to handle OOM situations.
Fly detects the use of popular packages such as rmagick, execjs, and puppeteer and adds the necessary Debian dependencies. This is something that likely can’t be applied at rails new time.
By default, 7.1 runs db:prepare on every deployed machine whereas fly will run this as a “release” step prior to implementing a rolling deployment.
Fly will configure and deploy postgresql, redis, passenger, anycable, nginx and other packages as a part of the rails deployment, including setting of appropriate secrets.
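
To make the first item in this list concrete, here is the same binstub adjustment sketched in Ruby rather than sed. This is only an illustration, not the code flyctl generates: normalize_binstub is a hypothetical helper, and it assumes the only problems are CRLF line endings and a ruby.exe interpreter name in the shebang line.

```ruby
# Hypothetical helper: make a binstub written on Windows runnable on Linux.
# Assumes the only damage is CRLF line endings and "ruby.exe" in the shebang.
def normalize_binstub(source)
  lines = source.gsub("\r\n", "\n").lines
  # Only touch the shebang line; the rest of the script is left alone.
  if lines.first&.start_with?("#!")
    lines[0] = lines[0].sub(/ruby\.exe\b/, "ruby")
  end
  lines.join
end

windows_stub = "#!/usr/bin/env ruby.exe\r\nputs 'hello'\r\n"
puts normalize_binstub(windows_stub)
```

Restricting the .exe rewrite to the shebang line avoids changing any string in the script body that merely happens to mention ruby.exe.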

DHH · December 21, 2022, 7:38am

Hey Sam,

Would love some help getting the default Docker files in the best shape possible. Just right off the bat, I’d like to see:

Multi-stage builds when using node (but keep as-is on the default import map path)
Use jemalloc by default, if we can absolutely guarantee no issues with the underlying distro. Can’t risk segfaults.
Explore debian slim image, but with the directive that compatibility is more important than image size. So all major gems (and popular minor ones) must have their dependencies satisfied by the slim image.
Explore the swap file rather than die on OOM

Some questions:

Where are you tracking the node/npm version needed by Volta?
Can you show me the Rakefiles you use for these standard tasks? Including release?
Do you setup dependencies via docker-compose or another way?

rubys · December 21, 2022, 7:58am

The easiest way to answer these questions, if you are willing, is with a free demo. Install flyctl via brew, then login (setting up an account if necessary): Log in to Fly · Fly Docs

Then run rails new passing the -j option so you can see how node support is done. Edit config/routes:

root "rails/welcome#index"

Run fly launch, accepting the defaults, then fly deploy. Take a look at the Dockerfile and lib/tasks/fly.rake.

fly open will open a browser to your site. fly logs and fly dashboard are also of interest.

rubys · December 21, 2022, 8:09am

Addressing your points:

multi-stage builds for node only - no problem, will do.
jemalloc/debian slim: these images are here: Quay. Slim means that instead of having every database you could possibly use installed, you need to apt-get install the database you may be using.
See flyctl/fly.rake at cebb10222899cc767e9866d541c19c62fb6ba25f · superfly/flyctl · GitHub for swapfile

As to the node/npm version: we run node -v and yarn -v, and the results are placed in the Dockerfile.
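
To illustrate the idea, a generator could take the output of node -v and yarn -v and emit pinned versions into the Dockerfile. A rough sketch follows; the helper names and the ARG names are hypothetical, not necessarily what flyctl emits.

```ruby
# Hypothetical helpers: turn `node -v` / `yarn -v` output into Dockerfile ARG lines.
def clean_version(raw)
  # `node -v` prints a leading "v" (e.g. "v19.3.0"); `yarn -v` does not.
  raw.strip.sub(/\Av/, "")
end

def version_args(node_output, yarn_output)
  [
    "ARG NODE_VERSION=#{clean_version(node_output)}",
    "ARG YARN_VERSION=#{clean_version(yarn_output)}"
  ].join("\n")
end

# In a real generator the inputs would come from running `node -v` and
# `yarn -v` in the development environment; literals are used here so the
# sketch runs without Node installed.
puts version_args("v19.3.0\n", "1.22.19\n")
```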

The whole rakefile can be found here: flyctl/fly.rake at cebb10222899cc767e9866d541c19c62fb6ba25f · superfly/flyctl · GitHub

We don’t use docker-compose. If you run fly launch we prompt for things like postgres and redis and set secrets as needed.

As an alternative to fly launch with prompts, I have been experimenting with a rails generator with thor flags: GitHub - superfly/fly-rails: Rails support for Fly-io

DHH · December 21, 2022, 8:54am

Just had a look at the Fly Dockerfile. Curious:

How much of a saving is it to do bundle install in a separate build step? Have you not found any gems that require header files or other dynamic dependencies from the build packages? If not, and the savings are material, then it seems worth to do this for the default Rails dockerfile too.
Sounds good with the slim image, but I don’t want users to have to edit the Dockerfile if they start on sqlite3 and then switch to mysql/postgresql later. So need to find a way where all 3 DBs can be supported without alterations to the Dockerfile.

rubys · December 21, 2022, 9:27am

bundle install with a cache can be a big savings. Add a gem to a Gemfile and that will cause the bundle install step to rerun. But instead of starting from scratch and installing everything (which can take tens of seconds) it will only install the one gem, which may take a second or so. That being said, modifying a Gemfile is a relatively rare event. But having bundle install run in parallel with yarn is worthwhile. fly deploy will show you the build steps running in real time.

Yes, there always will be more dependencies. I’ve seen people stumble because some package they use includes execjs and node needs to be installed for the server to start (or even assets:precompile) even though node isn’t actually used. Puppeteer and imagemagick are popular and have dependencies. I’m starting to collect up a list:
fly-rails/Dockerfile.erb at main · superfly/fly-rails · GitHub, though that code needs to be moved out of the template.
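
The kind of detection described here could look roughly like the following. This is a sketch, not the fly-rails implementation: the gem-to-package table is illustrative and assumed (package names vary by Debian release), and the lockfile scan is a simple pattern match rather than a real Bundler parse.

```ruby
# Illustrative mapping only; the package names are assumptions and would
# need checking against the Debian release actually in use.
DEBIAN_PACKAGES = {
  "rmagick" => %w[imagemagick libmagickwand-dev],
  "execjs"  => %w[nodejs],
  "pg"      => %w[libpq-dev]
}.freeze

# Return the sorted, de-duplicated Debian packages implied by a Gemfile.lock.
def packages_for(lockfile_text)
  # Gem entries in Gemfile.lock are indented and followed by a version or
  # constraint in parentheses, e.g. "    execjs (2.8.1)".
  gems = lockfile_text.scan(/^\s+([\w-]+) \(/).flatten.uniq
  gems.flat_map { |name| DEBIAN_PACKAGES.fetch(name, []) }.uniq.sort
end

lockfile = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:
      autoprefixer-rails (10.4.7.0)
        execjs (~> 2)
      execjs (2.8.1)
      rmagick (5.1.0)
LOCK

puts packages_for(lockfile).join(" ")
```

Scanning the lockfile rather than the Gemfile is what catches transitive dependencies such as execjs being pulled in by another gem.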

The generated Dockerfile can install as many databases as you like. fly launch currently does sqlite3 and postgres. But it might be worth exploring a different path. Instead of generating a starter Dockerfile and requiring the developer to maintain it as new dependencies are added, how about generating the Dockerfile on every build? See Dockerfile-less-deploys · Fly. You can actually run that scenario yourself and “eject” the Dockerfile at any point.

I’m willing to contribute any or all of this to Rails itself, and maintain the rest as a separate gem.

DHH · December 21, 2022, 9:41am

Good point re: cache for bundle. That would be a nice level up.

I prefer having a stable Dockerfile, but thinking that if the extra dependencies for all the DB connectors just live in a build-step, then there’s no price to pay in the final image? So we might as well include the 3 majors.

Not so concerned about imagemagick (deprecated for new apps anyway, we use libvips now) or other stuff like Puppeteer. That’s one of the reasons I like a stable Dockerfile. You CAN edit it yourself, should you need to. Just want it not to be necessary for the majority of default things.

Anyway, glad we’re getting this on the road! Please cc me on any/all PRs to bring some of these improvements to the new stock Dockerfile, and I’ll review and merge immediately.

rubys · December 21, 2022, 4:00pm

I’ll run some tests, but I believe that you will need the postgresql and mysql client at runtime (but not sqlite3 which is truly just a library); but I don’t think that this is anything to worry about. There is a huge difference between “everything a Debian user might need”, and “things commonly used by Rails applications”. And as you say, users CAN edit to strip things they don’t need.

I’m quite OK with not including node for importmap applications, but there be dragons here. I went back and checked, and an example where a rails 7 importmaps user had a problem was one where bootstrap requires autoprefixer-rails which requires execjs which requires nodejs - even though nodejs is not used at runtime. See: Rails app `fly deploy` failing with ExecJS::RuntimeUnavailable: Could not find a JavaScript runtime

Will do. I’m going to take a day or two to plan how to approach this, and then likely produce a steady stream of smaller pull requests. That way I can get feedback and make mid-course corrections without getting deep into the weeds making a change that Rails won’t accept.

palkan · December 22, 2022, 1:28am

Hey everyone

Re: jemalloc vs slim vs whatever

Use jemalloc by default, if we can absolutely guarantee no issues with the underlying distro

Although using a jemalloc-backed image is great in terms of performance, I have some concerns here.
The image used by Fly (and maintained by Evil Martians) uses Fullstaq Ruby. Even though both the Docker image and other Fullstaq Ruby distributions have been battle-tested in production, they’re not official Ruby/MRI projects. This introduces potential points of failure: Fullstaq Ruby could become deprecated at some point, or Evil Martians could migrate to a different Docker registry (why don’t we use Docker Hub?).

To sum up, I think sticking with the official Ruby image would be better. As an alternative to having jemalloc by default, I’d suggest adding a guide/doc on customizing Dockerfiles mentioning performance-optimized versions. And add a link to the guide right into the Dockerfile:

# See https://guides.rubyonrails.org/docker.html for other image options
FROM ruby:...

rubys · December 22, 2022, 1:35am

Perhaps there could be a --jemalloc flag on rails new? And --slim?

palkan · December 22, 2022, 1:38am

(And while I’m here)

Speaking of multi-stage builds, etc., I’d recommend checking an example from the post. The Dockerfile defines a production-builder stage to both install Ruby gems and precompile assets. The resulting production image is usually way smaller than when using single-stage builds.

Btw, Node (and node_modules/) is not the only reason for image bloat. RubyGems with C extensions could also produce a lot of bloat during bundle install (when no pre-compiled binaries are provided). Unfortunately, I’ve seen this problem in almost every Rails project I worked on.

palkan · December 22, 2022, 1:42am

I think, that’s too much for rails new

However, we can add a dedicated command set to manage Dockerfiles (something like bin/rails docker upgrade --interactive). So, a custom generator, which could be applied multiple times to upgrade the configuration (similar to how we have rails db:system:change for switching from one db to another).

Another idea is to separate Dockerfile template from moving parts: base image, system deps, etc. For example, in Ruby on Whales, we keep system deps in a separate file (Aptfile), and all the software versions are defined via args. So, changing Dockerfile is rarely needed. In Rails, we can add a config/deployment/docker.yml with the configuration or something like that.
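
As a sketch of the Aptfile idea: system packages live in a plain text file, one per line, and the Dockerfile only needs a generic install step. The parsing rules below (blank lines and # comments ignored) are an assumption about the file format, apt_install_command is a hypothetical helper, and the package names are only examples.

```ruby
# Hypothetical helper: turn Aptfile text into a single apt-get command.
# Assumes one package per line, with "#" starting a comment.
def apt_install_command(aptfile_text)
  packages = aptfile_text.lines
                         .map { |line| line.sub(/#.*/, "").strip }
                         .reject(&:empty?)
                         .uniq
  return nil if packages.empty?

  "apt-get install --no-install-recommends -y #{packages.join(' ')}"
end

aptfile = <<~APT
  # database clients
  libpq-dev
  default-libmysqlclient-dev

  libvips   # image processing
APT

puts apt_install_command(aptfile)
```

With this split, adding a system dependency means editing the Aptfile, and the Dockerfile itself stays stable.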

wjordan · December 22, 2022, 3:18pm

I’d prefer we stick with the official Ruby images as well. Using jemalloc with Ruby isn’t all that complicated:

RUN apt-get update -qq && apt-get install --no-install-recommends -y libjemalloc2
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2

You can also set MALLOC_CONF to tune the performance/memory tradeoff. I’ve had good results with the line:

ENV MALLOC_CONF=dirty_decay_ms:1000,narenas:2,background_thread:true

For comparison, the Fullstaq Ruby images are compiled with an outdated jemalloc version (3.6.0, released in 2014) which lacked the decay-based purging feature introduced in version 4.1, which is equivalent to dirty_decay_ms:0,muzzy_decay_ms:0. (muzzy_decay_ms defaults to 0 in recent versions so doesn’t need to be specified.) However, setting a small, non-zero decay gives significant performance gains at a very slight cost in memory utilization in my experience. Using the background thread further improves performance slightly, and limiting arenas is also a performance gain for Ruby (something Heroku has extensively tested in the context of glibc malloc).
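
One caveat with the LD_PRELOAD line above is that the library path is architecture-specific. Below is a sketch of computing it, assuming Debian's multiarch directory layout; jemalloc_path is a hypothetical helper and only the two architectures shown are handled.

```ruby
require "rbconfig"

# Debian multiarch directory names, keyed by CPU name as reported by
# RbConfig or `uname -m`. Only these two architectures are covered.
MULTIARCH_DIRS = {
  "x86_64"  => "x86_64-linux-gnu",
  "aarch64" => "aarch64-linux-gnu",
  "arm64"   => "aarch64-linux-gnu"
}.freeze

# Hypothetical helper: the path where libjemalloc2 is assumed to install
# its shared library on a Debian-based image.
def jemalloc_path(cpu = RbConfig::CONFIG["host_cpu"])
  dir = MULTIARCH_DIRS.fetch(cpu) { raise ArgumentError, "unsupported CPU: #{cpu}" }
  "/usr/lib/#{dir}/libjemalloc.so.2"
end

puts jemalloc_path("x86_64")
puts jemalloc_path("aarch64")
```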

rubys · December 22, 2022, 3:49pm

First pull request: Change dockerfile from using Node 19 to match dev environment by rubys · Pull Request #46794 · rails/rails · GitHub

damel · January 16, 2023, 7:57pm

Dockerfile.fly? Keep both

rubys · January 16, 2023, 8:40pm

Why have two files with the same content?

Seriously, the Dockerfile produced by the changes already merged into Rails main approaches what fly already produces, and the generator at GitHub - rubys/dockerfile-rails: Provide Rails generators to produce Dockerfiles and related files. already surpasses it. Give it a try, and let us know if you see anything missing, either with issues or pull requests or by posting here.

Having Rails produce a Dockerfile that is the best that we collectively can create is in everybody’s best interest.

dbackeus · January 17, 2023, 2:47pm

Another issue to consider regarding fullstaq-ruby (for built-in jemalloc): there is currently no ARM support, which would be a deal breaker in quite a few contexts: Raspberry Pi/ARM version? · Issue #38 · fullstaq-ruby/server-edition · GitHub

rubys · January 17, 2023, 3:22pm

dockerfile-rails defaults to the ruby slim image. jemalloc and fullstaq are opt-in.

containerops · January 17, 2023, 5:52pm

Awesome work.

First time I have heard about fullstaq ruby. It sounds like a no-no unless the ops know the implications of using a non-official, alpha ruby distribution in production. I don’t think it’s a good default. I’d still prefer paying for the supposed 30% extra memory.

rubys · January 17, 2023, 6:54pm

Trust me, we are entirely OK with that
