Feature Request: flyctl should state it hit the memory limit when process is killed during deploy

I have a number of Elixir apps that I’m migrating to Fly.io, and during the initial deploy the process is always killed. I usually see output like this:

#20 110.5 ==> html_sanitize_ex
#20 110.5 Compiling 11 files (.ex)
#20 110.6 warning: redefining @doc attribute previously set at line 15.
#20 110.6
#20 110.6 Please remove the duplicate docs. If instead you want to override a previously defined @doc, attach the @doc attribute to a function head:
#20 110.6
#20 110.6     @doc """
#20 110.6     new docs
#20 110.6     """
#20 110.6     def scrub(...)
#20 110.6
#20 110.6   lib/html_sanitize_ex/scrubber/no_scrub.ex:24: HtmlSanitizeEx.Scrubber.NoScrub.scrub/1
#20 110.6
#20 110.6 warning: redefining @doc attribute previously set at line 15.
#20 110.6
#20 110.6 Please remove the duplicate docs. If instead you want to override a previously defined @doc, attach the @doc attribute to a function head:
#20 110.6
#20 110.6     @doc """
#20 110.6     new docs
#20 110.6     """
#20 110.6     def scrub(...)
#20 110.6
#20 110.6   lib/html_sanitize_ex/scrubber/no_scrub.ex:29: HtmlSanitizeEx.Scrubber.NoScrub.scrub/1
#20 110.6
#20 120.1 Killed

The output looks just like a normal build; there’s no Elixir error there. Essentially, these apps seem to hit a memory limit during the deployment because they’re compiling NIFs, and the process is terminated without any further information.

It would be great if flyctl reported the reason why the deploy was killed. It took me a little while to work this out, and a clearer message could be useful for other developers starting with the Fly.io platform.
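To illustrate the kind of hint I mean, here is a minimal Python sketch. It is not how flyctl works; `explain_build_failure` and the hint wording are made up for illustration. It assumes the build output ends with a bare `Killed` line, as in the log above, which is what a shell prints when a step receives SIGKILL. That is often the kernel's OOM killer, but not always, so the message stays hedged.

```python
import re
from typing import Optional

# Matches a Docker build log line such as "#20 120.1 Killed".
KILLED_LINE = re.compile(r"^#\d+\s+[\d.]+\s+Killed\s*$")


def explain_build_failure(build_output: str) -> Optional[str]:
    """Return a hint if the last non-empty log line is a bare "Killed".

    Hypothetical helper for illustration only; not part of flyctl.
    """
    lines = [line for line in build_output.splitlines() if line.strip()]
    if lines and KILLED_LINE.match(lines[-1]):
        return (
            "A build step was killed (SIGKILL). This is often caused by "
            "running out of memory; consider giving the build more RAM."
        )
    return None


sample = """#20 110.5 Compiling 11 files (.ex)
#20 110.6
#20 120.1 Killed"""

print(explain_build_failure(sample))
```

A log that ends in an ordinary compiler line returns `None`, so the hint would only appear when the build actually stops on `Killed`.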

Thanks!

Is it being killed during the build or at runtime? I’m not too familiar with Elixir and NIFs, but I assume they can be pre-compiled at build time, so the runtime wouldn’t have any trouble.

If this is during the build, it might be hitting the remote builder’s memory limits. You can bump the memory on your remote builder with:

flyctl scale memory 16384 -a fly-builder-<your-builder>

(You’ll have to find your builder app name; it should be printed before the build.)

It’s being killed during the build. Scaling the VM’s memory to 2048 fixes the problem, but the feature I’d like to see is more details on why the build failed, especially if it’s due to a memory limitation during the build.

You scaled the builder’s VM’s memory to 2048? They come with 8GB by default. That sounds more like you scaled the app’s VM’s memory? Just trying to figure out when the Out Of Memory error is happening.

Normally you should see from the logs for a running app if it was OOM killed.

We do not have that for builds, that’s a problem.

No, not the builder’s VM, the app’s VM. It could be that the build happened successfully but then the app VM terminated immediately when trying to run the app. But it’s hard to tell from the logs as they’re presented above. All I know is:

  • there was output from the Elixir build
  • then something was killed
  • executing flyctl scale memory 2048 before the deploy fixes the problem

Normally, our init at runtime will detect if the main process was OOM killed. It would log something like this:

Out of memory: Killed process 6792 (node) total-vm:941864kB, anon-rss:137744kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:3324kB oom_score_adj:0
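If you want to pull the numbers out of a line like that, for example to compare the killed process's resident memory against the VM size, a small Python sketch could look like this. The regular expression is written against the sample line above only; the function name `parse_oom_line` is hypothetical, and other kernel versions may format the message differently.

```python
import re

# Pattern fitted to the sample log line above; an assumption, not a spec.
OOM_LINE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
    r".*?total-vm:(?P<total_vm_kb>\d+)kB, anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def parse_oom_line(line):
    """Return pid, process name, and memory figures (kB), or None if no match."""
    match = OOM_LINE.search(line)
    if match is None:
        return None
    return {
        "pid": int(match["pid"]),
        "name": match["name"],
        "total_vm_kb": int(match["total_vm_kb"]),
        "anon_rss_kb": int(match["anon_rss_kb"]),
    }


line = (
    "Out of memory: Killed process 6792 (node) total-vm:941864kB, "
    "anon-rss:137744kB, file-rss:0kB, shmem-rss:0kB, UID:0 "
    "pgtables:3324kB oom_score_adj:0"
)
info = parse_oom_line(line)
print(info)
```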

However that last Killed line in your logs looks like Docker logs. Meaning this happened at build time, before we were actually able to release your app.

Are you using a remote builder or a local Docker daemon? I see you have a remote builder app on your account, so I assume the former is true. Like I mentioned earlier, you probably need to bump the memory limit on your builder. I don’t believe your last image built properly.

Welcome @OldhamMade!

This looks like it might be from loading the source into a Dockerfile and starting the app from there, causing it to build the source when it is starting up… is that right?

Unless you want to increase the VM size to allow for the extra RAM required to build the application, I suggest building a release inside a Dockerfile. Then the build doesn’t happen on the VM, and you also get a faster startup time and a reduced RAM requirement.

One test would be to do the build locally and watch the RAM usage on your machine as it does that. Just to get an idea of how many resources are being used.
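One rough way to run that test is a small Python wrapper around the build command. This is only a sketch: it relies on the Unix-only `resource` module, `peak_child_rss_kib` is a made-up name, and the demo command is a stand-in that you would replace with your real build command. Note that `ru_maxrss` is reported in KiB on Linux but in bytes on macOS, and that it reflects the largest single child process waited for so far, not the sum of all of them.

```python
import resource
import subprocess
import sys


def peak_child_rss_kib(command):
    """Run command, then return (exit code, ru_maxrss of waited-for children).

    ru_maxrss is in KiB on Linux (bytes on macOS) and is the peak of the
    largest single child, including children from earlier calls.
    """
    completed = subprocess.run(command, check=False)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Stand-in for the real build: a child that allocates about 50 MiB.
    demo = [sys.executable, "-c", "data = bytearray(50 * 1024 * 1024)"]
    code, peak = peak_child_rss_kib(demo)
    print(f"exit code {code}, peak child RSS {peak} (KiB on Linux)")
```

If the peak comes out close to or above the memory of the machine doing the build, that would fit the `Killed` line in the log.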

Thanks. The issue is seen when simply calling fly deploy – I’m not doing anything different than those in the first article you posted. I simply have to remember to call flyctl scale memory 2048 before fly deploy for it to complete successfully. It’s not a blocker, all I’m suggesting is that there’s a bit more visibility around the “Killed” message so your other customers don’t get stuck at that point not knowing how to proceed.

I’m simply running fly deploy. The remote builder you can see on my account is an (unsuccessful) attempt to work around this issue which is actually a blocker for me right now.

As mentioned, if I run flyctl scale memory 2048 and then fly deploy the deploy is successful. I’m cool with this – it isn’t a bug. I’m just suggesting a potential DX improvement. :slight_smile:

Could you share the full output of flyctl deploy?

Unfortunately I’ve lost that terminal history, and right now I’m blocked from any deploys by this issue. Once that’s resolved I’ll be happy to scale down the VM and run the deploy again to see if the issue persists.

Ah okay, we’ll get that fixed today too. Could you DM me the app name and I’ll see if I can find any logs on our end?
