Django + LiteFS: LiteFS Cloud Cluster reports '0 databases'

I’m experimenting with Django + LiteFS on fly.io. My app basically works (I followed the ‘Getting Started with LiteFS on Fly.io’ docs), but LiteFS Cloud reports that my cluster has 0 databases.

Additionally, there are some potentially suspicious log entries that suggest the app is crashing / rebooting periodically.

Any help would be appreciated.

 2023-07-18T07:26:12.597 proxy [17811757a59068] den [info] Downscaling app test-app in region den from 1 machines to 0 machines. Automatically stopping machine 17811757a59068

2023-07-18T07:26:12.600 app[17811757a59068] den [info] INFO Sending signal SIGINT to main child process w/ PID 246

2023-07-18T07:26:17.656 app[17811757a59068] den [info] INFO Sending signal SIGTERM to main child process w/ PID 246

2023-07-18T07:26:17.777 app[17811757a59068] den [info] INFO Main child exited with signal (with signal 'SIGTERM', core dumped? false)

2023-07-18T07:26:17.777 app[17811757a59068] den [info] INFO Starting clean up.

2023-07-18T07:26:17.778 app[17811757a59068] den [info] INFO Umounting /dev/vdb from /var/lib/litefs

2023-07-18T07:26:17.779 app[17811757a59068] den [info] ERROR error umounting /var/lib/litefs: EBUSY: Device or resource busy, retrying in a bit

2023-07-18T07:26:18.531 app[17811757a59068] den [info] ERROR error umounting /var/lib/litefs: EBUSY: Device or resource busy, retrying in a bit

2023-07-18T07:26:19.283 app[17811757a59068] den [info] ERROR error umounting /var/lib/litefs: EBUSY: Device or resource busy, retrying in a bit

2023-07-18T07:26:20.035 app[17811757a59068] den [info] ERROR error umounting /var/lib/litefs: EBUSY: Device or resource busy, retrying in a bit

2023-07-18T07:26:20.807 app[17811757a59068] den [info] [2023-07-18 07:26:20 +0000] [266] [INFO] Handling signal: term

2023-07-18T07:26:20.807 app[17811757a59068] den [info] [2023-07-18 03:26:20 -0400] [267] [INFO] Worker exiting (pid: 267)

2023-07-18T07:26:20.807 app[17811757a59068] den [info] [2023-07-18 03:26:20 -0400] [268] [INFO] Worker exiting (pid: 268)

2023-07-18T07:26:20.807 app[17811757a59068] den [info] [2023-07-18 03:26:20 -0400] [269] [INFO] Worker exiting (pid: 269)

2023-07-18T07:26:20.807 app[17811757a59068] den [info] [2023-07-18 03:26:20 -0400] [270] [INFO] Worker exiting (pid: 270)

2023-07-18T07:26:20.807 app[17811757a59068] den [info] [2023-07-18 03:26:20 -0400] [271] [INFO] Worker exiting (pid: 271)

2023-07-18T07:26:20.807 app[17811757a59068] den [info] sending signal to exec process

2023-07-18T07:26:20.807 app[17811757a59068] den [info] waiting for exec process to close

2023-07-18T07:26:20.807 app[17811757a59068] den [info] signal received, litefs shutting down

2023-07-18T07:26:20.808 app[17811757a59068] den [info] WARN hallpass exited, pid: 247, status: signal: 15 (SIGTERM)

2023-07-18T07:26:20.824 app[17811757a59068] den [info] level=INFO msg="cannot unset primary status on host environment" err="Post \"http://localhost/v1/apps/test-app/machines/17811757a59068/metadata/role\": context canceled"

2023-07-18T07:26:20.824 app[17811757a59068] den [info] level=INFO msg="7551BFFFBA9DF060: exiting primary, destroying lease"

2023-07-18T07:26:20.834 app[17811757a59068] den [info] 2023/07/18 07:26:20 listening on [fdaa:0:d518:a7b:15c:8252:5b9c:2]:22 (DNS: [fdaa::3]:53)

2023-07-18T07:26:20.948 app[17811757a59068] den [info] ERROR: exit status 1: fusermount3: failed to unmount /litefs: Device or resource busy

2023-07-18T07:26:21.809 app[17811757a59068] den [info] [ 309.654430] reboot: Restarting system 

Hi @creimers

The “Device or resource busy” errors are harmless, as they happen only during shutdown.

Where do you store your SQLite DBs? Are they under /litefs (that’s where the LiteFS filesystem is mounted)?

Have you configured the LiteFS Cloud token for your app as described here? LiteFS can work without LiteFS Cloud, and without the token it doesn’t send any data to the cloud.

Hi @pavel, thanks for your reply.

The database URL of the application is "sqlite:////litefs/test.sqlite3".
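
For reference, here’s a minimal sketch of how that URL maps into Django settings, assuming dj-database-url does the parsing (my actual settings may differ slightly):

# settings.py excerpt (hypothetical, assumes dj-database-url)
import dj_database_url

# "sqlite:////litefs/test.sqlite3" resolves to the absolute path
# /litefs/test.sqlite3, i.e. a file inside the LiteFS FUSE mount.
DATABASES = {
    "default": dj_database_url.parse("sqlite:////litefs/test.sqlite3"),
}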

The LiteFS Cloud token env var is set.

In litefs.yml, I have specified

fuse:
  dir: "/litefs"
data:
  dir: "/var/lib/litefs"

In fly.toml, it reads

[mounts]
  source = "data"
  destination = "/var/lib/litefs"

And here’s the output from fly volumes list:

ID                  	STATE  	NAME	SIZE	REGION	ZONE	ENCRYPTED	ATTACHED VM   	CREATED AT
vol_podq4qpy5w6vg8w1	created	data	1GB 	den   	aa12	true     	17811757a59068	14 hours ago

This looks correct.

What about LITEFS_CLOUD_TOKEN? Have you set the secret for your app? Without it, LiteFS will not attempt to communicate with LiteFS Cloud and send backups to it.

I set the token like this:

fly secrets set LITEFS_CLOUD_TOKEN="FlyV1 token copied from the dashboard"

Hang on, now the databases do show up in LiteFS Cloud. Can there be a delay?

And how about the very first log line from my original post?

Downscaling app test-app in region den from 1 machines to 0 machines.

Might that be related at all?

Yep, I see them in our activity logs. There shouldn’t be any delay; LiteFS tries to push changes to the cloud every second.

Is it possible that the LITEFS_CLOUD_TOKEN you had previously was for a different cluster? I saw in the activity log that you had created a test cluster that was later deleted. That might explain the behavior.

Might that be related at all?

This line just means that the machine was automatically stopped (downscaled) due to inactivity. It will be started again automatically on any incoming request: Automatically stop and start Machines · Fly Docs

This shouldn’t affect LiteFS, as we let the machines run for some time before downscaling them again, so LiteFS will have time to push data to the cloud.
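
If you’d prefer to keep a machine running anyway, that behavior is controlled from fly.toml; here’s a rough sketch of the relevant [http_service] settings (option names as in the Fly docs, values purely illustrative):

[http_service]
  internal_port = 8080          # whatever port your app listens on
  force_https = true
  auto_stop_machines = true     # set to false to disable automatic downscaling
  auto_start_machines = true    # start a machine again on incoming requests
  min_machines_running = 1      # keep at least one machine up in the primary region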

Ok thank you, things are clear now. :+1: