Hello! I am running a ChromaDB server, and have 2 fly.io machines, each with an attached volume to maintain persistent data. ChromaDB stores this data in an sqlite db. The issue I am running into is keeping the volumes in sync. Since fly.io is handling which machine to use for each user, some users will get the db containing data, and some will get the other one. I have not been able to figure out how to sync the databases periodically. Any help is super appreciated!
This is a pretty hard problem to solve (distributed systems). Do you HAVE to use ChromaDB for the vector embeddings? If not, I’d recommend Turso, which supports vector embeddings and manages the distributed replication for you.
Does ChromaDB store anything outside of SQLite? If it’s only SQLite, couldn’t you just keep the two databases synced with LiteFS?
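To make the "only SQLite" case concrete, here is a minimal sketch of taking a consistent snapshot of a single SQLite file with the online backup API in Python's standard `sqlite3` module. This is not LiteFS and it is not something ChromaDB provides: it is a plain one-way copy, it does not merge writes made on two machines, and it only covers the SQLite file itself. The function name `snapshot_sqlite` and the paths in the usage note are hypothetical.

```python
import sqlite3


def snapshot_sqlite(src_path: str, dest_path: str) -> None:
    """Copy the SQLite database at src_path into dest_path.

    Uses SQLite's online backup API, so the copy is consistent even if
    the source is being written to while the backup runs.
    Hypothetical helper for illustration; one-way copy only, no merging.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Connection.backup copies every page of src into dest,
        # overwriting whatever dest contained before.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

A call would look like `snapshot_sqlite("/data/chroma.sqlite3", "/data/snapshot.sqlite3")`, where both paths are assumptions about where the volume is mounted. Shipping the snapshot to the other machine and deciding which copy wins is the hard part this sketch leaves out.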
Just be aware that LiteFS is still pre-1.0 and hasn’t had any commits in over 8 months.