Sharing data across apps: can the LiteFS DB sync across app boundaries?
How does LiteFS handle multiple databases on Fly.io?
Item 1 - Sharing Data Across Apps
I know you can have multiple apps share the same database (backend / frontend).
I have a use case where I have a BaaS app that is the Primary.
The UI is then made up of separate apps that can scale horizontally and are read-only.
In this case I share the LiteFS Cloud key and the Consul key, and I make sure the Consul setup uses the same app name for both.
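As a rough sketch, the relevant part of each app's litefs.yml might look like this (the key name `litefs/my-baas-app` is a placeholder; the important part is that every app points at the same Consul key so they all join one cluster):

```yaml
# litefs.yml (used by both the BaaS primary and the UI replicas)
lease:
  type: "consul"
  consul:
    url: "${FLY_CONSUL_URL}"
    # Same key in every app so they join the same cluster;
    # "litefs/my-baas-app" is a placeholder name.
    key: "litefs/my-baas-app"
  # true on the BaaS app, false on the read-only UI apps
  candidate: true
```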
Item 2 - Handling Multiple Databases
I asked a similar question here, and the answer made it sound like the proxy only knows about one database.
That being said, I have a static lease set up locally, and I use NGINX to control where POST requests go. The way I understand it, the Fly.io proxy only knows of one database when using Consul.
With a static lease, I am not sure you have the option to use the Fly proxy, as you are defining the endpoint statically. But your apps still need to know "where" to write, and that goes for all databases. This model can involve downtime and manual intervention, and it won't handle scaling horizontally.
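For reference, a static lease is configured roughly like this in litefs.yml (the hostname and port are placeholders); only the one node marked as a candidate can ever be primary, which is why failover is manual:

```yaml
# litefs.yml with a static lease (no Consul)
lease:
  type: "static"
  # Placeholder address where replicas reach the primary's LiteFS port
  advertise-url: "http://primary.internal:20202"
  # true only on the fixed primary node; false everywhere else
  candidate: true
```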
The LiteFS example repo uses NGINX to route all non-GET requests to the right place. I think you could use this on Fly.io to handle something similar, effectively making your own proxy handle multiple databases for you.
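A minimal sketch of that NGINX routing might look like the following (the upstream addresses are placeholders and the example repo's config is more complete):

```nginx
# Send non-GET/HEAD (write) requests to the primary app,
# and let reads be served by the local replica.
map $request_method $backend {
    default  primary_app;   # POST/PUT/PATCH/DELETE -> primary
    GET      local_app;
    HEAD     local_app;
}

upstream primary_app {
    server backend-app.internal:8080;  # placeholder primary address
}

upstream local_app {
    server 127.0.0.1:8080;             # app running on this machine
}

server {
    listen 80;

    location / {
        proxy_pass http://$backend;
        proxy_set_header Host $host;
    }
}
```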
Hope that helps; really curious about the outcome of your work here.
One final thought - when I go to interact with LiteFS Cloud, I do have to select a database (not just the cloud instance), which makes the UI feel like it supports multiple. It's possible that it backs up all DBs, but I'm not 100% clear on this, and I have not found a direct question/answer about the cloud specifically.
Though it sounds like they may cap the Cloud instance backup at 10 GB. Not sure if your 100 DBs go beyond that. But you can always back up to S3 or B2 or some other S3-compatible storage pretty easily with Litestream (also made by @benbjohnson).
Thanks for the very thorough information; that gives a lot of food for thought. I think I understand the proxy stuff and its limitations, and I could work around those in the app that has to connect to more than one database.
My use case is clients that are 99.9% reads with occasional writes to their own database, but potentially a few instances, and I think using the proxy, or doing what the proxy does manually in the app, will be fine.
But the management app will update records in many databases, and potentially search across them.
I was thinking more about the actual FUSE mount and sync behaviour.
When you first stand up LiteFS, you create a database in the mount location with your app, and that then gets replicated. But as far as I can tell, you can also import more than one database into the share using litefs import. You then have one cluster with two databases.
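For what it's worth, importing a second database looks roughly like this (the database names, paths, and URL are placeholders; the command is run against the current primary):

```shell
# Import an existing SQLite file as another database in the cluster.
# "accounts.db" is a placeholder name for the database inside LiteFS.
litefs import -name accounts.db /path/to/accounts.db

# Or target a specific node's LiteFS HTTP API explicitly:
litefs import -url http://primary.internal:20202 -name reports.db /path/to/reports.db
```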
If I join a second instance to the cluster, will both databases get synced to the local machine before LiteFS runs the binary specified in exec? Or does the sync only happen when you open the database?
If it's the latter, then as I add more and more databases to the share, it will take longer and longer for new instances to spin up, as there is more data to sync to the machine. Does that make sense?
I am unclear whether the sync happens before the app starts. I think if you use the exec config and have LiteFS be the supervisor, it may do that. It did seem like my app was trying to connect before launch.
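The supervisor setup I mean is the exec section of litefs.yml: LiteFS mounts, connects to the cluster, and only then starts the command, so the app can't race ahead of the mount (the command path below is a placeholder):

```yaml
# litefs.yml: LiteFS acts as the supervisor for the app process.
exec:
  # Started only after LiteFS has mounted and joined the cluster.
  - cmd: "/app/server"
```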
For my local setup, I only have one FUSE directory that everything sits in, and the same goes for the LiteFS data directory.
And to my surprise, both databases are synced. I think the limitation is only the proxy.
The WAL and SHM files are named the same as the DB, so they don’t conflict. Works really well =).
If I understood your questions correctly, I think you're fine.
That being said, I am not sure you can prevent the sync from sharing all DBs across all instances. I think it would be up to the individual app to only read/write from its specific DB. And that might bloat your volumes…
The sync is for all databases in a cluster. We don't currently have a way to only replicate individual databases, but it's a feature that's been requested several times. Are you wanting to just specify a different set of databases on each replica? Do you need any kind of auth to restrict which databases each replica has access to?
A couple more questions: can a LiteFS cluster span apps? I want to run an instance per customer so I can have customers on different versions of my application. And I want a management app, accessible just to me, that has all the databases mounted into it.
Also, is there a way to configure the backup client to run on a machine that isn't a Fly machine?
I have managed to join the cluster from my local machine and become the primary. I can then create a database locally and import it.
Would I need to have the backup client running from here if I'm the primary for any length of time?
Yes, as long as your apps are in the same organization they can talk to one another.
The backup client doesn’t need to run on Fly. It just needs to have LITEFS_CLOUD_TOKEN set as an environment variable and it needs to be primary. That should be the only two conditions.
If you have your cluster configured to back up to LiteFS Cloud, then the cloud will be the data authority. So if you switch your local machine to be primary, you can update the database; however, when you switch back to your other machine that's connected to LiteFS Cloud, it won't see the updates and it'll revert state.
tl;dr: yes, you'll need to configure any node that can become primary to use LiteFS Cloud if you have it enabled.
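So, as a sketch, running a backup-capable node on a non-Fly machine is just (the token placeholder and config path are assumptions):

```shell
# Any machine can back up to LiteFS Cloud, as long as it can become
# primary and has the token in its environment.
export LITEFS_CLOUD_TOKEN="<your-litefs-cloud-token>"
litefs mount -config /etc/litefs.yml
```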
I think that would be helpful. I played around with multiple BaaS solutions that sit on top of SQLite, and many of them required more than one database, or optionally support multiple databases with SQLite.
AirByte (sync, with airtable)
Flow (sync, with airtable)
And probably some others. =)
It was an exhausting 2 weeks, and I ended up using my first choice.
I have fixed it; I started again. I think it had something to do with my local machine not being connected to the backup for a certain period of time. I got some strange errors like the one below once I had connected it to the backup.
http: POST /stream: error: stream error: db="kt.db" err=stream ltx (0000000000000001): write ltx snapshot to chunked stream: canceled, http server closed
I deleted the machines and volumes, created a new cluster, changed the Consul key, and re-imported the databases, and it all now seems to work as it should. I haven't tried to connect my local machine to it; I will save that for another day.
I have this working. All you need is to share the same Consul key. In my case I used my frontend app as the shared key, but I have that app set up to never be promoted. Ever. It is a read-only instance.
The backend app is set up to be the primary and can be promoted.
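Concretely, the only real difference between the two apps' litefs.yml files is the candidate flag (app names are placeholders); both share the same Consul key:

```yaml
# Frontend litefs.yml: read-only, can never be promoted.
lease:
  type: "consul"
  candidate: false
  consul:
    url: "${FLY_CONSUL_URL}"
    key: "litefs/frontend-app"   # the shared key
---
# Backend litefs.yml: eligible to be (and stay) primary.
lease:
  type: "consul"
  candidate: true
  consul:
    url: "${FLY_CONSUL_URL}"
    key: "litefs/frontend-app"   # same key so both join one cluster
```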