Database no longer synced by litefs after issue deploying

I’ve managed to break database syncing for a production app. Database writes seem to be happening locally and are not synced to litefs. litefs export returns the database as it was before the data changes. I’m looking for advice on steps to troubleshoot.

After having no luck for a few hours I decided to try a clean slate approach: destroying all volumes and machines and restoring the db from a backup on a single instance for simplicity. Unfortunately the problem persists. The app is running and is usable, but data changes appear to only happen locally, as the web interface and litefs export result in a database state identical to the pre-change backup.

Background

Last night some of the app’s secrets were accidentally tweaked. This seems to be the start of the problem, but I was also having issues with the deployment freezing which I resolved using fly deploy --local-only and I can’t rule that out either. I think the secrets have been fixed, but given it’s still not working right, I am not confident. One secret tweaked was DATABASE_URL, which the app uses to open the database. Prior to this issue, the app had been successfully running with multiple instances.

Relevant sections of config below.

litefs.yaml

fuse:
  dir: "/litefs"

data:
  dir: "/var/lib/litefs"
exit-on-error: false

proxy:
  addr: ":8080"
  target: "localhost:3000"
  db: "db"

exec:
  - cmd: "pnpm run start"

lease:
  type: "consul"
  candidate: ${FLY_REGION == PRIMARY_REGION}
  promote: true
  advertise-url: "http://${FLY_ALLOC_ID}.vm.${FLY_APP_NAME}.internal:20202"
  consul:
    url: "${FLY_CONSUL_URL}"
    key: "${FLY_APP_NAME}/primary"

fly.toml

[[mounts]]
  source = "litefs"
  destination = "/var/lib/litefs"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 1
  processes = ["app"]

Dockerfile

ADD litefs.yml /etc/litefs.yml
RUN mkdir -p /litefs /var/lib/litefs
ENTRYPOINT ["litefs", "mount"]

Hi… The logs and the full Dockerfile (including the LiteFS version) would likely shed light on this, I think, if we could see them. The value of DATABASE_URL (file:///litefs/te.db ?), with sensitive information removed, would help too, since it would catch the common problem of writes landing outside the FUSE mount.

It’s non-intuitive, but this on its own is known to cause problems, :dragon:.

(Basically, there are two separate copies of the cluster ID, and they have to be kept in synch.)
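
As a rough illustration of the DATABASE_URL check mentioned above, here is a minimal shell sketch. The function names are hypothetical (they are not part of LiteFS or flyctl), the /litefs default is taken from fuse.dir in the litefs.yaml posted above, and te.db is only the guessed file name from this reply. It handles plain file: URLs only, so a URL with a host part or a relative path is simply reported as outside the mount.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only; not part of LiteFS or flyctl.

# Turn a file: URL into a plain filesystem path.
db_path_from_url() {
  path=${1#file:}             # drop the scheme, if present
  case $path in
    //*) path=${path#//} ;;   # file:///litefs/te.db -> /litefs/te.db
  esac
  path=${path%%\?*}           # drop any ?query suffix
  printf '%s\n' "$path"
}

# Succeed only when the path sits below the given FUSE directory.
is_under_fuse_dir() {
  case $1 in
    "$2"/*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Report whether the database URL points inside the FUSE directory
# (second argument, assumed to default to /litefs as in the posted config).
check_database_url() {
  db_path=$(db_path_from_url "$1")
  if is_under_fuse_dir "$db_path" "${2:-/litefs}"; then
    echo "ok: $db_path is inside ${2:-/litefs}"
  else
    echo "warning: $db_path is outside ${2:-/litefs}"
    return 1
  fi
}
```

Something like `check_database_url "$DATABASE_URL"`, run inside the machine (for example from a fly ssh console session), would then flag a database path that LiteFS never sees. It only inspects the string, so it does not prove the app actually opens that file.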

Hope this helps a little!
