Deploy Rails 7.2 SQLite with LiteFS - Feels Impossible 😣

We followed this article to get a new Rails 7.2 app deployed on Fly with both SQLite and LiteFS.

We ran this:
bin/rails generate dockerfile --litefs

 INFO Preparing to run: `litefs mount bundle exec rake solid_queue:start` as root
 INFO [fly api proxy] listening at /.fly/api
2024/08/12 22:46:21 INFO SSH listening listen_address=[fdaa:0:9547:a7b:b0bb:dbc:d2b7:2]:22 dns_server=[fdaa::3]:53
Machine created and started in 5.111s
ERROR: too many arguments, specify a '--' to specify an exec command
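
From the error text, litefs mount apparently treats everything after a -- separator as the command to exec once the mount is ready, so presumably the solidq process command needs to be passed roughly like this (our guess, based solely on the error message):

litefs mount -- bundle exec rake solid_queue:start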

Something that is confusing: a db:prepare statement appears in both the bin/docker-entrypoint file and in config/litefs.yml:

exec:
  # Only run migrations on candidate nodes.
  - cmd: "./bin/rails db:prepare"
    if-candidate: true

  # Then run the application server on all nodes.
  - cmd: "./bin/rails server"
#!/bin/bash -e

# mount litefs
sudo -E litefs mount &

# If running the rails server then create or migrate existing database
if [ "${1}" == "./bin/rails" ] && [ "${2}" == "server" ] && [ "$FLY_REGION" == "$PRIMARY_REGION" ]; then
  ./bin/rails db:prepare
fi

exec "${@}"
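
We are guessing that only one of those two places should own db:prepare. If litefs.yml's if-candidate entry is the one that is supposed to do the work, would something like this untested entrypoint sketch be the right direction? (It hands everything off to LiteFS and ignores any arguments passed to the container.)

#!/bin/bash -e

# Untested sketch: hand off to LiteFS and let litefs.yml's exec list run
# db:prepare (on the candidate node) and then the app server, instead of
# duplicating the migration check in this script.
exec sudo -E litefs mount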

Any help would be greatly appreciated; we are excited to get this new app deployed on Fly!

It seems we are having issues getting a simple Rails app deployed with multiple processes (app, solidq).

There are a lot of LiteFS docs on Fly, but they all seem a bit out of sync and out of date. For being such a large part of the site, it feels very difficult to get set up. Is there an updated article or doc that I am missing somewhere?



We were able to get a bit further along by modifying the fly.toml file like so:

[processes]
  app = "" # use litefs.yml's exec
  solidq = "-- bundle exec rake solid_queue:start"

[[mounts]]
  source = 'litefs'
  destination = '/var/lib/litefs'
  processes = ['app', 'solidq']

Would we be able to get an updated resource or article on deploying a simple Rails app with multiple processes using SQLite / LiteFS?

Do workers need volumes as well? Should workers be able to be LiteFS primaries? We have tried the following, but our workers are still being assigned as the primary (and we are not even sure whether that is a problem):

candidate: ${FLY_REGION == PRIMARY_REGION && FLY_PROCESS_GROUP == 'app'}

(Not sure this is the correct syntax for only allowing Machines in the app process group to be selected as primaries.)
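
For reference, the form shown in Fly's LiteFS docs is a single expression that LiteFS evaluates inside ${...}; whether it accepts a compound expression like the one above is something we have not been able to confirm, so treat this as a sketch:

lease:
  type: "consul"
  # Documented form: only Machines in the primary region are candidates.
  candidate: ${FLY_REGION == PRIMARY_REGION}
  # Untested guess at also excluding the worker process group:
  # candidate: ${FLY_REGION == PRIMARY_REGION && FLY_PROCESS_GROUP == 'app'}
  promote: true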

Another question: with SQLite and LiteFS, why does the ${FLY_REGION == PRIMARY_REGION} condition matter at all?

Also, we had issues with Consul. To correct this without simply appending characters to the key (like so: key: "litefs/${FLY_APP_NAME}-v2"), we had to SSH into a random Machine, manually install the consul CLI, and then manually destroy the key. This seems like something that should be possible via the fly CLI? Or is there a way to have Consul better handle zombie primary cluster IDs?


Hi… These are a lot of good observations, and generally they are classic initial pain points with LiteFS (which, broadly speaking, seems stuck in pre-v1.0). Without claiming to address them comprehensively…

I would like to see this, as well. You can vote for it and/or chime in with your own perspective in the following docs feedback thread:

https://community.fly.io/t/blueprints-autostart-and-stop-internal-apps-resiliency-by-machine/20343

If they only require read access to the database, then they should have volumes but should not be primary-candidates.

But if your workers do need write access, then they will have to go into the same Machine as the app. (Both can't be primary simultaneously.)


Aside: It looks like there might be some workarounds, but they are rather intrusive changes and/or not very Rails-friendly.
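
One possible middle ground, if the jobs are light: Solid Queue ships a Puma plugin, so the queue supervisor can run inside the web process and the LiteFS primary stays the only writer. A sketch, assuming a solid_queue version that includes the plugin:

# config/puma.rb (sketch): run the Solid Queue supervisor inside Puma, so
# the Machine serving web traffic (the LiteFS primary) is also the only
# process writing jobs to SQLite.
plugin :solid_queue

That removes the separate solidq process group, though, so you give up scaling workers independently of the web process.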

So our workers can execute code that causes writes; in that case, are you saying we should not be using multiple Fly process groups? If so, are we losing out on horizontal scaling by adding more workers, etc.?

I guess this is all just very confusing as to how it should be set up for a standard application with a simple Solid Queue worker using SQLite. The idea of using SQLite to simplify things is beginning to feel more complex, fragile, and difficult to manage than Postgres…

Exactly… For something like that, you would (almost always) instead want a central database, such as Postgres.

There are definitely trade-offs, and LiteFS isn't automatically the right choice.

When it does fit, though, I would say that it's very much in the “less fragile” category…

What doesn't make sense to me is that everything is moving more towards SQLite (i.e. Solid Queue and Solid Cache in Rails), so why is this so hard to deploy on Fly?

For a simple app, we should be able to have multiple app processes with LiteFS; it is just very unclear how to do this on Fly at the moment. We are not trying to do anything complex, simply use SQLite. Is there a better alternative to LiteFS?

This has to be the most annoying part when trying to get this stuff setup:

2024-08-13T18:38:41.379 app[6e825496a57568] iad [info] level=INFO msg="cannot find primary, retrying: no primary"

2024-08-13T18:38:42.384 app[6e825496a57568] iad [info] level=INFO msg="cannot become primary, local node has no cluster ID and \"consul\" lease already initialized with cluster ID LFSCEB6A8B19657B8393"

There is no easy way to solve this problem. Why can things get stuck in this bad state?
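
(For completeness, the key-bump workaround mentioned above lives in litefs.yml's lease section; the suffix is arbitrary and simply forces LiteFS to initialize a fresh cluster ID:)

lease:
  type: "consul"
  consul:
    url: "${FLY_CONSUL_URL}"
    # Any new key makes LiteFS initialize a fresh cluster ID.
    key: "litefs/${FLY_APP_NAME}-v2"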

How would we do this?

exec:
  # Only run migrations on candidate nodes.
  - cmd: "./bin/rails db:prepare"
    if-candidate: true

  # Then run the application server on all nodes.
  - cmd: "./bin/rails server"

  # Only run workers on candidate nodes.
  - cmd: "./bin/bundle exec rake solid_queue:start"
    if-candidate: true

This doesn't work… (presumably because LiteFS runs the exec commands in order and waits for each to finish, so nothing listed after the long-running ./bin/rails server command ever starts?)

It also makes no sense that a candidate is defined just by being in a “primary region”… What if there are multiple Machines in that same region? Can there be multiple “candidates”? If so, how would it only run on read/write Machines?

A fellow user got this working with Node recently, but it was a fair amount of effort:

https://community.fly.io/t/litefs-get-request-containing-writes-litefs-with-multiple-processes/19510

As far as I know, there is no “omakase” for it, though.

(The local Rails experts may correct me.)

All this LiteFS stuff makes zero sense to me, and it simply doesn't work.

LiteFS seems to be advertised as a way to get read/write replicas anywhere you want, running on the local app Machine, but it simply doesn't do this.

I am at a loss as to why it even exists, to be completely honest.

This is compounded by tons of inconsistent and unclear documentation on how to use LiteFS with multiple Fly process groups that both need to read and write (Puma server and solid_queue workers).

The tl;dr is that using LiteFS with worker instances is not a good setup. LiteFS works well when you can fly-replay HTTP requests that do writes; it's less useful for running a bunch of processes that need to write to SQLite.

If you want to run workers that also use LiteFS, you'll need to come up with some mechanism for sending writes to the primary instance. I probably wouldn't bother doing this, though; it's not the sweet spot.
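
Roughly, that mechanism looks like: let replicas serve reads, and for requests that write, respond with a fly-replay header pointing at whatever hostname LiteFS reports in its .primary file, so the proxy re-runs the request on the primary. (LiteFS also has a built-in HTTP proxy that handles this automatically for simple cases.) A sketch, not a drop-in; the mount path and the GET-only heuristic are assumptions:

# Rack middleware sketch: replay write requests to the LiteFS primary.
# Assumes the FUSE mount is at /litefs; the .primary file exists only on
# replicas and contains the primary's hostname (typically the Machine ID
# on Fly).
class ReplayWritesToPrimary
  PRIMARY_FILE = "/litefs/.primary"

  def initialize(app)
    @app = app
  end

  def call(env)
    # Crude heuristic: treat anything that isn't GET/HEAD as a write.
    write = !%w[GET HEAD].include?(env["REQUEST_METHOD"])

    if write && File.exist?(PRIMARY_FILE)
      primary = File.read(PRIMARY_FILE).strip
      # Fly's proxy intercepts this header and replays the request there.
      return [409, { "fly-replay" => "instance=#{primary}" }, []]
    end

    @app.call(env)
  end
end

You would insert it near the top of the middleware stack (e.g. config.middleware.insert_before 0, ReplayWritesToPrimary) so the replay happens before the app touches the database.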

This is just a limitation of LiteFS and of using SQLite directly. For what you're doing, I expect you'll have much more success with https://turso.tech/


Hi @kurt

Thanks so much for your reply. I think the main confusion and frustration here comes from how difficult it is to completely understand how LiteFS works.

If I am being honest, I pretty much thought that LiteFS basically just kept a local copy of a file on all Machines: if a write happened anywhere, it would make sure that write was persisted on all other nodes, meaning every Machine was read/write capable and every Machine had a local copy of the DB. That would mean we could scale basically infinitely, horizontally as well as geographically.

We have looked a bit into Turso, but they have little to no Ruby / Rails documentation, plus it just feels like a database-as-a-service similar to Supabase, and there is no local copy of the database?

We have migrated this simple project back to Postgres to move things along.

There's no Ruby on Rails SDK yet, but it's on their to-do list. For now, you'll need to interface via HTTP.
They have embedded replicas, which sync the remote DB to your local FS for sub-millisecond reads.

