LiteFS Cloud: "Cannot set machine metadata: code=429"

Hi everyone.

Thank you for the LiteFS Cloud announcement.

I wanted to deploy a Go app with it, but I'm running into this error:
level=INFO msg="cannot set primary status on host environment" err="cannot set machine metadata: code=429"

FROM golang:1.18 AS build-backend

RUN mkdir /app
ADD . /app
WORKDIR /app

RUN CGO_ENABLED=0 GOOS=linux go build -o pocketbase .

FROM alpine:latest AS production
COPY --from=build-backend /app .
EXPOSE 8080

# Install required packages
RUN apk add --no-cache \
    ca-certificates \
    fuse3 \
    sqlite

COPY --from=flyio/litefs:0.5 /usr/local/bin/litefs /usr/local/bin/litefs

ENTRYPOINT litefs mount

# The fuse section describes settings for the FUSE file system. This file system
# is used as a thin layer between the SQLite client in your application and the
# storage on disk. It intercepts disk writes to determine transaction boundaries
# so that those transactions can be saved and shipped to replicas.
fuse:
  dir: "/litefs"

# The data section describes settings for the internal LiteFS storage. We'll
# mount a volume to the data directory so it can be persisted across restarts.
# However, this data should not be accessed directly by the user application.
data:
  dir: "/var/lib/litefs"

# This flag ensures that LiteFS continues to run if there is an issue on startup.
# It makes it easy to ssh in and debug any issues you might be having rather
# than continually restarting on initialization failure.
exit-on-error: false

# This section defines settings for the optional HTTP proxy.
# This proxy can handle primary forwarding & replica consistency
# for applications that use a single SQLite database.
proxy:
  addr: ":8080"
  target: "localhost:8080"
  db: "db"
  passthrough:
    - "*.ico"
    - "*.png"

# This section defines a list of commands to run after LiteFS has connected
# and sync'd with the cluster. You can run multiple commands but LiteFS expects
# the last command to be long-running (e.g. an application server). When the
# last command exits, LiteFS is shut down.
exec:
  - cmd: "/pocketbase serve --http=0.0.0.0:8080"

# The lease section specifies how the cluster will be managed. We're using the
# "consul" lease type so that our application can dynamically change the primary.
#
# These environment variables will be available in your Fly.io application.
lease:
  type: "consul"
  advertise-url: "http://${HOSTNAME}.vm.${FLY_APP_NAME}.internal:20202"
  candidate: ${FLY_REGION == PRIMARY_REGION}
  promote: true

  consul:
    url: "${FLY_CONSUL_URL}"
    key: "litefs/${FLY_APP_NAME}"

# fly.toml app configuration file generated for mili-lifets-pocketbase on 2023-07-05T22:09:00+02:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#

app = "mili-lifets-pocketbase"
primary_region = "ams"

[[mounts]]
  source = "litefs"
  destination = "/var/lib/litefs"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 0

Thanks for trying out LiteFS Cloud. The error sounds scarier than it actually is. We recently added a feature to LiteFS that updates the Fly Machine metadata so you can query which node is the primary and which are replicas through flyctl.

This error reports that LiteFS was unable to update that metadata, but it won’t affect the usage of LiteFS itself. I opened one issue to improve the error message and another to retry the machine metadata update.


Thank you for the reply. I don’t need to use it right now, but I’m looking forward to trying it out again! I’ll try to find out which node is the primary vs. replica through flyctl.

Right now, we just expose it through the JSON API, but we’ll be integrating metadata into more places soon. You can run the following:

lfsc-test-runner $ fly machine list --json

And you’ll see the "role" in config.metadata in the JSON output.

[
    {
        ...
        "config": {
            "metadata": {
                "fly_platform_version": "v2",
                "fly_process_group": "app",
                "fly_release_id": "...",
                "fly_release_version": "...",
                "role": "replica"
            }
        }
    },
    ...
]
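If you would rather not scan the JSON by eye, a small Go program can pull out the role for each machine. This is only a sketch: it assumes each array element carries a top-level `"id"` field (elided in the sample above) next to `config.metadata.role`, and `rolesFromJSON` and `machineRole` are names made up for this example. The sample input in `main` uses placeholder IDs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// machine mirrors only the fields used here; encoding/json ignores the rest.
type machine struct {
	ID     string `json:"id"` // assumed field name, not shown in the sample above
	Config struct {
		Metadata struct {
			Role string `json:"role"`
		} `json:"metadata"`
	} `json:"config"`
}

// machineRole pairs a machine ID with its reported role.
type machineRole struct {
	ID   string
	Role string
}

// rolesFromJSON parses the JSON array and returns one entry per machine, in
// input order. A machine without a "role" entry gets an empty Role.
func rolesFromJSON(data []byte) ([]machineRole, error) {
	var machines []machine
	if err := json.Unmarshal(data, &machines); err != nil {
		return nil, err
	}
	roles := make([]machineRole, 0, len(machines))
	for _, m := range machines {
		roles = append(roles, machineRole{ID: m.ID, Role: m.Config.Metadata.Role})
	}
	return roles, nil
}

func main() {
	// Made-up sample shaped like the output above; the IDs are placeholders.
	sample := []byte(`[
		{"id": "machine-a", "config": {"metadata": {"fly_process_group": "app", "role": "primary"}}},
		{"id": "machine-b", "config": {"metadata": {"fly_process_group": "app", "role": "replica"}}}
	]`)

	roles, err := rolesFromJSON(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, r := range roles {
		fmt.Printf("%s\t%s\n", r.ID, r.Role)
	}
}
```

To use it against a real app, you could read standard input instead of the embedded sample and pipe the output of `fly machine list --json` into the program.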

I redeployed and got a new error:
level=INFO msg="cannot set primary status on host environment" err="Post \"http://localhost/v1/apps/mili-lifets-pocketbase/machines/080e522c650598/metadata/role\": dial unix /.fly/api: connect: connection refused"

Here is my metadata:

"metadata": {
    "fly_flyctl_version": "0.1.49",
    "fly_platform_version": "v2",
    "fly_process_group": "app",
    "fly_release_id": "akRvx4BJJDKR8uMVJxo1LZobp",
    "fly_release_version": "1",
    "role": "replica"
},

PS: The old error is still there. My bad.

Hi Mr. Ben Johnson.

I saw that the fixes for those issues have been merged; however, I got a new error. Here are the logs:

 2023/07/08 10:37:26 listen tcp 0.0.0.0:8080: bind: address already in use
  subprocess exited with error code 1, litefs shutting down
  level=INFO msg="cannot unset primary status on host environment" err="Post \"http://localhost/v1/apps/mili-lifets-pocketbase/machines/080e522c650598/metadata/role\": context canceled"
  level=INFO msg="primary backup stream exiting"
  level=INFO msg="5D516DFA2C02F734: exiting primary, destroying lease"
  litefs shut down complete
   INFO Main child exited normally with code: 1
   INFO Starting clean up.
   INFO Umounting /dev/vdb from /var/lib/litefs
   WARN hallpass exited, pid: 240, status: signal: 15 (SIGTERM)
  2023/07/08 10:37:27 listening on [fdaa:2:57e0:a7b:141:7499:977:2]:22 (DNS: [fdaa::3]:53)
  [    2.186243] reboot: Restarting system
  machine did not have a restart policy, defaulting to restart
   INFO Starting init (commit: db101a53)...
   INFO Mounting /dev/vdb at /var/lib/litefs w/ uid: 0, gid: 0 and chmod 0755
   INFO Resized /var/lib/litefs to 10733223936 bytes
   INFO Preparing to run: `/bin/sh -c litefs mount` as root
  ERROR [fly api proxy] failed to start listener: Address in use (os error 98)

Ain't it the same issue?