Updating litefs.yml lease consul key resets my data

I ran into an issue where a Prisma schema update caused deployments against my live database to fail. Whenever I deployed, I would get an infinite repeat of the following logs:

level=INFO msg="cannot find primary, retrying: no primary"
level=INFO msg="cannot become primary, local node has no cluster ID and \"consul\" lease already initialized"

There was another discussion that helped me get past these errors: "cannot find primary, retrying: no primary"

The problem is that now, whenever I deploy, I have to update the lease consul key in my litefs.yml file. Changing this value resets my live database, so any newly created accounts or uploaded data are gone.

How can I fix this so that I don't get the infinite "cannot find primary" logs, but also don't have to reset my data with every deployment?

Hi… Changing the Consul key doesn't itself remove existing rows from the database under /litefs/; I verified this myself this afternoon. It sounds like you may instead be in the related situation where LiteFS's internal bookkeeping data (typically /var/lib/litefs) is unknowingly falling outside the persistent volume entirely.

(So it's that entire portion of the filesystem that gets reset on each deployment, :dragon:, regardless of the Consul settings.)

This is admittedly confusing at first, because there are three different persistent stores that all need to line up:

                           typical location     purpose
LiteFS FUSE mount          /litefs/             web application writes and reads here
LiteFS internal storage    /var/lib/litefs/     never mess with this directly (!)
Consul key-value pairs     (invisible)          cluster state, managed by Fly.io
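
For orientation, it's the second row that has to correspond to a Fly volume. A minimal sketch of the fly.toml side, using the typical path from the table rather than anything from your app:

[mounts]
# "data" is just a placeholder volume name here
source = "data"
# must be the same directory as data.dir in litefs.yml
destination = "/var/lib/litefs"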

What do you have yourself, for the first two settings?

I’m honestly not really sure where to find my LiteFS FUSE mount or my LiteFS internal storage. Can you help direct me? Here is my other/litefs.yml file for reference though:

fuse:
  # Required. This is the mount directory that applications will
  # use to access their SQLite databases.
  dir: '/app/data/sqlite'

data:
  # Path to internal data storage.
  dir: '/app/data/litefs'

proxy:
  # matches the internal_port in fly.toml
  addr: ':${INTERNAL_PORT}'
  target: 'localhost:${PORT}'
  db: '${DATABASE_FILENAME}'

# The lease section specifies how the cluster will be managed. We're using the
# "consul" lease type so that our application can dynamically change the primary.
#
# These environment variables will be available in your Fly.io application.
lease:
  type: 'consul'
  candidate: ${FLY_REGION == PRIMARY_REGION}
  promote: true
  advertise-url: 'http://${HOSTNAME}.vm.${FLY_APP_NAME}.internal:20202'

  consul:
    url: '${FLY_CONSUL_URL}'
    key: 'litefs/${FLY_APP_NAME}-nex-1'

exec:
  - cmd: node ./other/setup-swap.js

  - cmd: npx prisma migrate deploy
    if-candidate: true

  - cmd: npm start

This shows that /app/data/sqlite is the FUSE mount and /app/data/litefs is the internal storage.

That second one (/app/data/litefs) needs to live on a persistent volume, which you do not get by default on Fly.
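
(If you're not sure whether the app has a volume at all, fly volumes list will show you; it then gets attached to the machine through a [mounts] section in fly.toml.)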

What do you have within fly.toml so far?

Here is my fly.toml:

primary_region = "atl"
kill_signal = "SIGINT"
kill_timeout = 5
processes = [ ]

[experimental]
allowed_public_ports = [ ]
auto_rollback = true

[mounts]
source = "data"
destination = "/var/lib/litefs"

[build]
  [build.args]
    NODE_VERSION = "20"

[deploy]
release_command = "node ./other/sentry-create-release"

[[services]]
internal_port = 8_080
processes = [ "app" ]
protocol = "tcp"
script_checks = [ ]

  [services.concurrency]
  hard_limit = 100
  soft_limit = 80
  type = "requests"

  [[services.ports]]
  handlers = [ "http" ]
  port = 80
  force_https = true

  [[services.ports]]
  handlers = [ "tls", "http" ]
  port = 443

  [[services.tcp_checks]]
  grace_period = "1s"
  interval = "15s"
  restart_limit = 0
  timeout = "2s"

  [[services.http_checks]]
  interval = "10s"
  grace_period = "5s"
  method = "get"
  path = "/resources/healthcheck"
  protocol = "http"
  timeout = "2s"
  tls_skip_verify = false
  headers = { }

This is the source of your problem, then… Your volume is mounted at /var/lib/litefs, but LiteFS is keeping its internal data in /app/data/litefs, which falls outside the volume. You want the destination to be /app/data/litefs instead, to match data.dir in litefs.yml.
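
In other words, the mounts section should look roughly like this, keeping your existing volume name:

[mounts]
source = "data"
# must match data.dir in litefs.yml
destination = "/app/data/litefs"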

(It's probably also prudent to double-check that the database path your app uses, built from DATABASE_FILENAME, points into /app/data/sqlite/, the FUSE mount.)
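
With Prisma that usually translates into a connection string along the lines of DATABASE_URL="file:/app/data/sqlite/sqlite.db"; the exact filename here is only an illustration, i.e. whatever your DATABASE_FILENAME is actually set to.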

Your Node application writes to the FUSE mount, and LiteFS then transmogrifies those writes into data on the persistent volume, :tiger:…
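
If you want to sanity-check things after redeploying, fly ssh console -C "ls -la /app/data/litefs" should show LiteFS's internal files sitting on the volume, and anything created through the app should now survive subsequent deploys.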

I was able to get this working, and the data persisted through a deployment. Thank you so much for your help @mayailurus!
