Hello,
I’m starting a new IoT project and recently learned about LiteFS. It seems like a great fit for our needs.
Our use case consists of a central system (in Fly.io) that accepts all write operations, plus on-prem deployments that need read replicas of the primary database.
So far I have successfully configured a multi-machine setup in Fly, using Consul for lease management. Now I’m trying to get an on-prem replica to consume changes from the primary in Fly. I believe the LiteFS configuration itself is fine; the problem appears to be a networking issue.
I have a WireGuard connection established to Fly, and DNS resolution is working fine. However, when the replica tries to connect to the primary, it fails with the following error:
/ # litefs mount -debug
config file read from /etc/litefs.yml
LiteFS v0.5.11, commit=63eab529dc3353e8d159e097ffc4caa7badb8cb3
level=INFO msg="no backup client configured, skipping"
level=INFO msg="Using Consul to determine primary"
level=INFO msg="initializing consul: key=litefs/litefs-example-shy-log-5055 url=https://:4afcc737-8b9f-fe66-8393-778103640f76@consul-iad-13.fly-shared.net/litefs-example-shy-log-5055-g3zmqxej4n49dlp4/ hostname=54d9f4410df9 advertise-url=http://54d9f4410df9:20202"
level=INFO msg="LiteFS mounted to: /litefs"
level=INFO msg="http server listening on: http://localhost:20202"
level=INFO msg="waiting to connect to cluster"
level=INFO msg="cannot become primary, local node has no cluster ID and \"consul\" lease already initialized with cluster ID LFSC9A304FC90A4F3EC8"
level=INFO msg="4666E21935031903: existing primary found (874dd6b0192d18), connecting as replica to \"http://874dd6b0192d18.vm.litefs-example-shy-log-5055.internal:20202\""
level=INFO msg="4666E21935031903: disconnected from primary with error, retrying: connect to primary: Post \"http://874dd6b0192d18.vm.litefs-example-shy-log-5055.internal:20202/stream\": dial tcp [fdaa:1:829a:a7b:17a:828f:1705:2]:20202: connect: cannot assign requested address ('http://874dd6b0192d18.vm.litefs-example-shy-log-5055.internal:20202')"
level=INFO msg="cannot become primary, local node has no cluster ID and \"consul\" lease already initialized with cluster ID LFSC9A304FC90A4F3EC8"
While troubleshooting, I noticed that even a simple HTTP request to http://874dd6b0192d18.vm.litefs-example-shy-log-5055.internal:20202 fails:
/var/lib/litefs/dbs # curl -6 -I -v http://874dd6b0192d18.vm.litefs-example-shy-log-5055.internal:20202
* Host 874dd6b0192d18.vm.litefs-example-shy-log-5055.internal:20202 was resolved.
* IPv6: fdaa:1:829a:a7b:17a:828f:1705:2
* IPv4: (none)
* Trying [fdaa:1:829a:a7b:17a:828f:1705:2]:20202...
* Immediate connect fail for fdaa:1:829a:a7b:17a:828f:1705:2: Address not available
* Failed to connect to 874dd6b0192d18.vm.litefs-example-shy-log-5055.internal port 20202 after 130 ms: Could not connect to server
* closing connection #0
curl: (7) Failed to connect to 874dd6b0192d18.vm.litefs-example-shy-log-5055.internal port 20202 after 130 ms: Could not connect to server
I’m not sure how VM ports are exposed, or whether I need to do something special to open access. The same call works fine from within Fly.
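For anyone who wants to reproduce the check without curl, here is a minimal Python sketch of the same IPv6-only connection test. The function names `describe_error` and `probe_ipv6` are made up for illustration and are not part of LiteFS or Fly tooling. It reports the errno name of a failed connect, which helps distinguish a local problem from a remote one: as far as I understand, `EADDRNOTAVAIL` (the "cannot assign requested address" above) usually points at the local side having no usable IPv6 source address, while `ECONNREFUSED` or a timeout would point at the remote side or the route.

```python
import errno
import socket


def describe_error(exc: OSError) -> str:
    """Return the symbolic errno name (e.g. 'EADDRNOTAVAIL') for an OSError.

    Falls back to the exception class name when no errno is set,
    which is the case for timeouts.
    """
    return errno.errorcode.get(exc.errno, type(exc).__name__)


def probe_ipv6(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connect over IPv6 only and return a short status string."""
    try:
        # Restrict resolution to AAAA records, like `curl -6`.
        infos = socket.getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"dns-failed: {exc}"

    status = "dns-failed: no IPv6 addresses returned"
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return f"connected: {sockaddr[0]}"
        except OSError as exc:
            # Remember the failure and try the next resolved address, if any.
            status = f"connect-failed: {sockaddr[0]} {describe_error(exc)}"
    return status


# Example call, using the hostname and port from the log above:
# print(probe_ipv6("874dd6b0192d18.vm.litefs-example-shy-log-5055.internal", 20202))
```

Running it from both the on-prem replica and a machine inside Fly should show whether the two environments fail at the same step.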
Any help would be greatly appreciated. I’m really looking forward to using LiteFS as our synchronization mechanism.