SQLite/LiteFS with "app servers" and a "worker server"

Hi all,

I’m working on a SvelteKit app which we’re expecting to have many, many more DB writes than reads. We’re using SQLite, since (IMO) this seems like a pretty good use case for it.

For now we’re only worrying about deploying in one region, since we only expect Australians to use it.

So far development has been good. I’ve set up a single app, which has a single volume; the DB lives on this volume at the moment. Everything is fine (although not that scalable?). However, I now need to host a component of our app which runs periodically (say, every 15 minutes). I’d like to have a server which stays up and runs cron (or similar) to do this.

This is where the problem occurs: these instances can’t share a volume (for good reason, as I’ve discovered reading this forum).

It seems like LiteFS is the solution to this problem, but I have a couple of questions.

  1. It seems that LiteFS can’t handle multiple instances of the DB being written to. Is that correct?
  2. In the example Go app this is solved by sending all write requests to the “primary” node as decided by Consul. Only a small subset of my users can write data (admins); my plan is to forward all of them to the primary node. Does this make sense? That way they won’t have any data consistency issues either.
  3. Can I have my “worker” app join the same Consul network as my “app” app? The alternative, it seems, would be to have the “worker” start in the background of the current primary node. This seems problematic if the primary node changes, though (I suppose you’d have to be constantly polling to check?)
  4. Is this even the best way to have my “worker” access the database? I guess unless I had some internal API I’d have to have it always be the primary (which brings us back to the previous idea?).
  5. How would I decide where migrations should run?

Sorry for the muddled thoughts, I’d really appreciate some feedback on how others have gone about achieving similar things.

hey Harrison, good questions. Hopefully I can clear some of it up.

SQLite has a single-writer/many-readers model so it actually excels at the opposite use case: many more reads than writes.

Yes, this stems from SQLite’s single-writer model more than LiteFS itself though. If your app can handle eventual consistency then something like Mycelial might work better.

Yes, that makes sense.
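
One way to do the forwarding is to check whether the local node is the primary. LiteFS exposes a .primary file in the mount directory on replicas, containing the primary’s hostname; on the primary the file doesn’t exist. A rough sketch, assuming a mount path of /litefs:

    # sketch: decide locally whether to handle an admin write or forward it
    # (assumes litefs is mounted at /litefs; the path is illustrative)
    if [ -f /litefs/.primary ]; then
      primary="$(cat /litefs/.primary)"
      echo "replica; forward admin writes to ${primary}"
      # on Fly, e.g. by responding with a fly-replay header,
      # which is roughly how the example Go app handles it
    else
      echo "this node is the primary; handle the write locally"
    fi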

The current version of LiteFS (v0.2.0) requires all writes to go to a single primary node which makes “worker” apps difficult to implement. The next version (v0.3.0) should have “write forwarding” so that non-primary nodes can execute writes locally and automatically forward them to the primary. Write forwarding will be slower than executing writes directly on the primary but it’ll make it easier to use and it’ll better allow for “worker” app use cases.

This is another issue that’s helped by write forwarding, since you’ll be able to run migrations from any node. Migrations are tricky in distributed apps in general, though, since they need to work with multiple versions of the app (multiple versions are running at once during a rolling deploy).

Let me know if you have any other questions or suggestions for LiteFS. It’s still in its early stages but we want to make it as easy to use as possible.

If multi-writer SQLite with support for high read/write concurrency and strong consistency is needed, maybe mvsqlite is a suitable choice. The downside is that you need to build and operate a FoundationDB cluster.

Disclaimer: I’m the author of mvsqlite :)

Welp, the perils of early adoption… So it’s quite hard to migrate if you have multiple instances and you’re doing rolling deployments, even assuming you have compatible tables. Since each new instance won’t be the primary at first, it can’t migrate the database and can’t start running, so that deployment fails. Write forwarding will make things a whole lot easier :)

tl;dr: read the fly litefs docs, they’re good. if you’re just trying to stand up an http server + a background worker, run multiple processes using bash or some kind of process manager

apologies in advance for bumping this thread, but it seems like the best place to jot some notes on the multi-worker approach, and the state of the world has changed a bit over the last year.

in particular, i thought it might be worthwhile to update this thread for future searchers (and hopefully drop some relevant references).

some questions:

  • how does one run database migrations? (there’s a fly.io guide on db migrations with litefs)
  • how does one setup a web-service with database writes and a worker with database writes?
  • how does one setup a fleet of workers?

i’ll drop my findings (and guesses) below, but i’d also love to be corrected if i get something slightly (or super) incorrect

how does one run database migrations?

tl;dr: check out the fly.io guide on performing database migrations using litefs

broadly, the approaches are:

  1. let each of the candidate litefs nodes become the primary, and then run migrations (this is the recommended approach)
  2. run migrations separately by promoting an ephemeral process to perform the migration (this seems like a nicer approach, conceptually. but you end up blocking for two promotions, which is perhaps why the previous method is recommended)
  3. use write-forwarding (this should only be for quick migrations)

i’ll go in order, below (i think that’ll make for a nicer reference) with some commentary based on my findings with a toy application.

let each of the candidate nodes become the primary and then run migrations

from an application’s perspective, this is nice: whichever litefs process ultimately becomes the primary (the last litefs mount candidate process to roll out) will end up not only “knowing” about the current database schema but will also, in the simple case, be the one to perform the database migration. this makes a tremendous amount of sense with a single writer node.

take an example Rails application written for a fly v2 app. in this setup, the user may have each of:

  • a litefs process
  • a process manager (bash script, supervisord, Procfile-runner, etc.)
  • a Rails web server process
  • a Rails worker process
  • a cron process

if a user wants to avoid the process manager in the image, there are similar ways to approach this with fly Machine processes or fly.toml processes. regardless of the approach, each has a different set of tradeoffs. that said, running a process manager on a single VM trades some slight Dockerfile complexity for deployment ease (the web service, worker, and cron running via the process manager can all use the same litefs-mounted sqlite db for writes)

with the process-manager approach, when the (single) new candidate VM rolls out it:

  • requests promotion to the primary litefs node
  • is promoted to become the primary
  • runs database migrations (which then propagate to all other litefs nodes)
  • starts the web server, worker, and cron processes
  • starts writing from any of the above three processes
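
in config terms, that rollout flow looks roughly like this (paths and commands are illustrative; the exec section and its if-candidate flag come from the litefs docs):

    # litefs.yml (sketch)
    fuse:
      dir: "/litefs"           # where the app sees the sqlite db
    data:
      dir: "/var/lib/litefs"   # litefs internal storage
    lease:
      type: "consul"
      candidate: true          # this node may become primary
      promote: true            # ask for promotion on startup
    exec:
      # only run migrations if this node is a candidate (i.e. may be primary)
      - cmd: "bin/rails db:migrate"
        if-candidate: true
      # then hand off to the process manager for web/worker/cron
      - cmd: "hivemind Procfile"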

in the case of a bunch of candidate vms, it still works out. the first candidate on a rollout becomes the primary and runs the database migration; every subsequent candidate becomes the primary and attempts to run the migration, but the migration itself results in a no-op (assuming the user has written the database migration to be idempotent and to work in a distributed setup; this is the rails default, since applied migrations are tracked in the schema_migrations table).

this approach ended up being the one i used because it had the fewest trade-offs and the least deployment complexity (despite pulling in an extra process manager binary)

it seems to be the easiest way when all of the worker processes have access to the primary sqlite database.

run migrations separately…

conceptually, i prefer this approach (and i’ll most likely iterate to using it for the toy app ismtbakererupting.com, because of the app’s particular usage patterns)

the usage here would be:

litefs run -promote -- bin/rails db:migrate

this could be run as part of a v2 app as a release_command.
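
in a v2 app’s fly.toml that’s roughly (a sketch, assuming the rails example above):

    # fly.toml (sketch)
    [deploy]
      release_command = "litefs run -promote -- bin/rails db:migrate"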

this uses litefs run to:

  • promote the litefs process (probably an ephemeral vm) to the primary
  • perform the database migration
  • fail over to a different litefs primary (once the command exits)

the only downside i note with this approach is stalling (or failing) writes from the application/worker/cron processes while the release_command performs the migration (in most cases the thing performing the migration wouldn’t have a web server or worker running). i haven’t tested this, but i think this could be on the order of hundreds of milliseconds… so for essentially any hobby/toy app it would be acceptable. it also translates nicely to a client-server db scenario, so using it for toy apps, even with the write failures, keeps things set up similarly to how they might be with a more traditional db (which may or may not be valuable).

use write-forwarding

litefs (v0.3.0) introduced write forwarding, and that’s a potential way to have all candidate (or non-candidate) nodes perform a database migration. the fly docs have a section on it; the simplest way to use it is via the cli:

litefs run -with-halt-lock-on /path/to/litefs/db -- bin/rails db:migrate

using litefs run with -with-halt-lock-on would probably work fine for most database migrations, but having to think more about migrations (e.g. will this run fast enough, should this really run on the primary, etc.) makes this less of an obviously good choice for running migrations

migrations seem a pretty well-trodden path for litefs, as far as both state of the art and documentation are concerned.

which takes us to stuff i have less of a good grasp on (and getting @benbjohnson, @tpalpallatoc, or @rubys to point me at the relevant docs, or code, would be amazing)

how does one setup a web-service with database writes and a worker with database writes?

i believe the options here are:

  • run the web service and worker in the same vm
  • have the process desired-to-write on not-the-primary use the HALT lock

run the web service and worker in the same vm

running multiple processes in a v2 app is well described in the fly.io user guides. essentially: users can use backgrounded processes on startup or use a process manager (of varying complexity; i went with hivemind)

this approach is straightforward and the user’s application code can be blissfully unaware of litefs
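
as an example, a minimal hivemind Procfile might look something like this (process names and commands are illustrative; the worker task depends on the job library, and supercronic is just one cron-like runner):

    # Procfile (sketch), run via: hivemind Procfile
    web: bin/rails server -b 0.0.0.0 -p 8080
    worker: bin/rails jobs:work
    cron: supercronic /app/crontab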

as an aside, this class of solution has some of the most-disjointed docs:

  • there’s the referenced guide, above
  • the different approach using v2 [processes], which is every-process-gets-a-vm; it doesn’t work with every process being a writer, because every process has a different volume mount. the guide for that approach has a very similar name to the first guide
  • there’s the Machine API (referenced in the first doc), which runs multiple processes in a single VM, doesn’t have a handy fly.toml config (yet), and has the downside that any given process blowing up will end up terminating the machine. the upside is all of the processes can use the same litefs sqlite db

have the process desired-to-write on not-the-primary use the HALT lock

the toy app i’m using has a webhook endpoint which will enqueue various jobs when invoked. this is the approach i took instead of standing up a scheduler (though with hivemind i’ll stand up a scheduler soon). in this way, the web service could run in a different VM and use the HALT lock to insert a job into the litefs database.

i find this approach interesting and wonder if it’s the approach fly used for workers referenced on Hacker News by tpalpallatoc:

until a few months ago, we were concurrently merging state updates from tens of thousands of VMs across our global worker fleet into a single SQLite database

(aside, that particular thread is one of the most amusing comment chains i’ve read this year)

i have no idea how this was set up and i’d love to read more about it. i could see this working by using the HALT lock: workers could perform fairly simple updates on the primary after using the local read replicas to grab all the data necessary for a given job. if workers are all in the same region as the primary, the HALT lock would be pretty speedy, so even with the lower throughput it could work well depending on the type of workload (@benbjohnson has mentioned 250ms latency when performed cross-region, so i imagine it’s much faster within the same region, or with multiple VMs on the same hardware).

this approach (currently) requires that the client be aware of litefs and perform the lock itself.

depending on the type of worker, the HALT lock might only be needed for the final update of the job to the writer. if background job lifecycles are being handled by something else (redis, beanstalk, NATS, SQS, whatever), the only update on the primary could come at the end of the job, as sketched below.
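
as a sketch, that final update could be wrapped in the same litefs run halt-lock form from the migrations section (the db path and the rails one-liner are hypothetical):

    # acquire the HALT lock on the primary, perform the one write locally, release
    litefs run -with-halt-lock-on /litefs/my.db -- \
      bin/rails runner 'Job.find(ENV.fetch("JOB_ID")).complete!'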

my toy app is using delayed with a single VM and a single worker. a database-polling job queue seems like a particularly bad fit for litefs, but with everything on a single host it works fine.

this brings us to:

how does one setup a fleet of workers?

i don’t have any sort of a good answer here.

but the obvious solutions are:

  • big VM with all of the workers running on the single writer
  • writers partitioned by some criteria (like region, customer, business domain)
  • all the HALT locks all the time?
  • ???

big VM with all of the workers running on the single writer

the simplest and most obvious approach i can see is a very large VM with multiple worker processes writing against that VM. fly has some decently burly 16-core 128GB RAM machines available, which will run a lot of workers even in a memory-hungry language like ruby. an approach here would be to forward all of the web-server writes to the primary chonky node and let the workers hammer on that single sqlite db.

i’d expect this to work to a hilariously large scale. there’s still the single-node failover scenario, which results in queued or lost writes. with a highly replicated database (probably backed up with LiteFS Cloud) that write-downtime could be as low as 100s of ms, and even at production scales that may not matter for a very long time.

of note, if the workers aren’t interacting directly with the database, but instead working with well-defined APIs, the litefs proxy (and fly replay header) would let all of the single-writer traffic happen in the context of a web server.
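
for reference, the proxy is only a few lines of litefs.yml (ports and db name are illustrative); it passes reads through to the local app and replays writes to the primary node:

    # litefs.yml (sketch)
    proxy:
      addr: ":8080"              # the litefs proxy listens here
      target: "localhost:3000"   # the local app server it fronts
      db: "my.db"                # database used to track replication position
      passthrough: ["*.png", "*.ico"]  # requests never replayed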

writers partitioned by some criteria

this is referenced at various points through various HN threads by both folks at fly and folks working in other areas doing similarly unhinged things with sqlite. i could see this working with different databases and different primaries configured as appropriate, and workers scaled based on the job load. for instance, if there are workers for each fly region, they could all be operating in that region, writing against that region’s primary writer.

even in the case of a background job processor like delayed (where the job runner is polling the database), all of the background jobs could have their own writer, separate from the application database (this is technically a better approach anyway, though if we’re discussing technical merits it’s worth asking whether a transactional database is the right place for background jobs at all; in practice it doesn’t matter until an application is operating at scale)

all the HALT locks all the time?

i’m guessing this is useful in a scenario where workers are operating in the same region (or on the same physical hardware, but a different VM) as the primary.

i think this would go something like

  • each worker not on the primary identifies that fact and uses the HALT lock when performing writes
  • some watchdog process notes when a candidate becomes the primary and bounces the workers? (rough sketch below. maybe this is unnecessary: checking whether the node is the primary by reading the filesystem at write time might be fast enough not to matter)
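
as a rough sketch of that watchdog idea (the mount path and the restart hook are hypothetical):

    # poll for primary changes; bounce the workers when the primary moves
    last=""
    while sleep 5; do
      if [ -f /litefs/.primary ]; then
        cur="$(cat /litefs/.primary)"  # replica: the file names the primary
      else
        cur="self"                     # no file: this node is the primary
      fi
      if [ -n "$last" ] && [ "$cur" != "$last" ]; then
        restart_workers              # hypothetical hook, e.g. signal the process manager
      fi
      last="$cur"
    done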

???

I have a hunch this is where the big wins come in

something like:

  • workers primarily interact with primary writers via HTTP APIs
  • when a new primary is elected, the litefs proxy can handle routing writes to the correct place
  • whatever state the workers need to manage for themselves, they can keep in their own local sqlite primary to accommodate that need