LiteFS Cloud: 2 replicas - no primary host

Hi everyone!

I managed to deploy my PocketBase test app successfully, but one of the machines never gets the primary role, even though I cloned it. Is there a way to set the primary manually? I believe I can't.

  1. When I start the app, there is an info message: level=INFO msg="cannot set primary status on host environment" err="cannot set machine metadata: code=429". What does the 429 status code mean here?
    I went through the recent commits, but that didn't help me solve it.
    Set host environment metadata by benbjohnson · Pull Request #357 · superfly/litefs · GitHub

  2. Later, when it tries to shut down, another info message appears: level=INFO msg="cannot unset primary status on host environment" err="Post \"http://localhost/v1/apps/mili-lifets-pocketbase/machines/080e522c650598/metadata/role\": context canceled". I believe it is related to the first one.

  3. Another question: we usually store the LiteFS data in /var/lib/litefs according to the example litefs.yml, but my PocketBase app creates another directory, pb_data, for its SQLite data. Do I need to tell LiteFS Cloud, or configure litefs.yml, so that it replicates that data, or will that happen automatically? I believe I'll get this answer once the errors/infos above are resolved.

I was right about the 3rd point, which I fixed by changing the FUSE directory, but I still need to monitor it because I only have one machine running and need to clone another.

  1. However, I still get the error: [info]level=INFO msg="cannot set primary status on host environment" err="cannot set machine metadata: code=429". I'd love to know what it means, if anybody has encountered it.
  2. After a while, there is another info message from the syd machine: [info]level=INFO msg="B39A8E039E97CA5C: cannot find primary & ineligible to become primary, retrying: no primary". I wonder if it is something I need to be worried about.

Here is my current litefs.yml:
# The fuse section describes settings for the FUSE file system. This file system
# is used as a thin layer between the SQLite client in your application and the
# storage on disk. It intercepts disk writes to determine transaction boundaries
# so that those transactions can be saved and shipped to replicas.
fuse:
  dir: "/pb_data"

# The data section describes settings for the internal LiteFS storage. We'll
# mount a volume to the data directory so it can be persisted across restarts.
# However, this data should not be accessed directly by the user application.
data:
  dir: "/var/lib/litefs"

# This flag ensures that LiteFS continues to run if there is an issue on startup.
# It makes it easy to ssh in and debug any issues you might be having rather
# than continually restarting on initialization failure.
exit-on-error: false

# This section defines settings for the optional HTTP proxy.
# This proxy can handle primary forwarding & replica consistency
# for applications that use a single SQLite database.
proxy:
  addr: ":8080"
  target: "localhost:8081"
  db: "db"
  passthrough:
    - "*.ico"
    - "*.png"

# This section defines a list of commands to run after LiteFS has connected
# and sync'd with the cluster. You can run multiple commands but LiteFS expects
# the last command to be long-running (e.g. an application server). When the
# last command exits, LiteFS is shut down.
exec:
  - cmd: "/pocketbase serve --http=0.0.0.0:8081"

# The lease section specifies how the cluster will be managed. We're using the
# "consul" lease type so that our application can dynamically change the primary.
#
# These environment variables will be available in your Fly.io application.
lease:
  type: "consul"
  advertise-url: "http://${HOSTNAME}.vm.${FLY_APP_NAME}.internal:20202"
  candidate: ${FLY_REGION == PRIMARY_REGION}
  promote: true

  consul:
    url: "${FLY_CONSUL_URL}"
    key: "litefs/${FLY_APP_NAME}"

HTTP 429 is “Too Many Requests”. We have a rate limit on the machine metadata endpoint of 1 req/sec to prevent abuse. I added a retry mechanism in PR #366 that should hopefully help that issue. What version of LiteFS are you using in your Dockerfile?
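
For reference, the LiteFS version is determined by the image tag you copy the binary from in your Dockerfile. A minimal sketch of the usual pattern (the COPY and ENTRYPOINT lines follow the LiteFS docs; the Alpine package step is just an example, your base image and other steps will differ):

# Copy the LiteFS binary from the official image; the tag pins the version.
COPY --from=flyio/litefs:0.5 /usr/local/bin/litefs /usr/local/bin/litefs

# FUSE must be available in the image (Alpine shown here as an example).
RUN apk add ca-certificates fuse3 sqlite

# LiteFS mounts the FUSE filesystem and then runs the commands from the
# exec section of litefs.yml.
ENTRYPOINT litefs mount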

This one might be from the shutdown canceling the internal context and cutting off the shutdown early. I'll take a look at that. These machine metadata issues shouldn't affect your LiteFS instance itself; the metadata just tells Fly.io which node is the primary or a replica so it shows up in your flyctl output.

This error is because your node is not a candidate to become primary. This setting in the config marks the node as a candidate for the primary role only if its region (FLY_REGION) matches the primary region set on the app (PRIMARY_REGION):

lease:
  ...
  candidate: ${FLY_REGION == PRIMARY_REGION}

If you ssh in and run env you can see the values of each of these environment variables.
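
If it's easier, you can also check from your own terminal. A quick sketch, assuming flyctl is installed and you're logged in (the app name below is taken from the error message earlier in the thread):

# Open a shell on one of the app's machines
fly ssh console -a mili-lifets-pocketbase

# Inside the machine, check the values LiteFS sees
env | grep -E 'FLY_REGION|PRIMARY_REGION'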


Hi Mr. @benbjohnson.

I used COPY --from=flyio/litefs:0.5 /usr/local/bin/litefs /usr/local/bin/litefs, which I believe is the latest one from the docs.

FLY_REGION=ams
PRIMARY_REGION=ams

So it doesn't need to send any info about it, since they are the same?


  1. This may come off as a dumb question: my PocketBase application (based on Go + SQLite) saves images locally (I can connect it to S3 later). I guess LiteFS will skip over them?

It looks like PR #366 isn't in a released version yet, so it won't get picked up by flyio/litefs:0.5. Once we release v0.5.2, you'll see it. You can also use the PR image with:

COPY --from=flyio/litefs:pr-366 /usr/local/bin/litefs /usr/local/bin/litefs

It looks like your log message about the node being ineligible above was coming from the "syd" region. With the way the lease.candidate field is set up, LiteFS will only become primary in the "ams" region. If you don't have a node running in that region, then you won't have a primary node to accept writes, and replicas outside the ams region will complain that they can't find the primary.
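
If you want to confirm where your machines are running and, if needed, add one in the primary region, here is a sketch (assuming flyctl; the machine ID is a placeholder):

# List the app's machines and their regions/roles
fly status -a mili-lifets-pocketbase

# If no machine is running in ams, clone one into it so a primary candidate exists
fly machine clone <machine-id> --region ams -a mili-lifets-pocketbase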

Not a dumb question at all! LiteFS expects all the files in your FUSE mount to be SQLite database files. If you're saving images, you should save them to a separate directory or to S3.
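
If you want to see what has ended up under the FUSE mount, a quick sketch (assuming the /pb_data mount from the config above and the app name from the logs):

# List the FUSE mount and check for non-database files (e.g. image uploads)
fly ssh console -a mili-lifets-pocketbase -C "ls -la /pb_data"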

