It seems that my site keeps falling over every few hours and won’t come up for a few minutes until it finally decides to reboot. I’m not sure what’s causing this and the logs are not helpful at all. They just stop for a while until the reboot happens. I’m not sure how to read these metrics, but something’s going on here with the VM Concurrency (whatever that means):
Anyway, any direction is appreciated.
Here are the logs. Note the 7 minute pause in here:
...
2022-11-23T01:33:23.657 app[db827b1f] den [info] GET /blog/migrating-to-reacts-new-context-api 200 - - 624.697 ms
2022-11-23T01:33:24.292 app[db827b1f] den [info] Updated the cache value for blog:my-modern-frontends-live-experience:rankings. Getting a fresh value for this took 539ms. Caching for 604800000ms + 86400000ms stale in SQLite cache.
2022-11-23T01:33:24.354 app[db827b1f] den [info] Background refresh for total-post-reads:compose-render-props successful. Getting a fresh value for this took 293ms. Caching for 60000ms + 86400000ms stale in LRU.
2022-11-23T01:33:24.409 app[db827b1f] den [info] Updated the cache value for blog:rankings. Getting a fresh value for this took 656ms. Caching for 3600000ms + 86400000ms stale in SQLite cache.
2022-11-23T01:40:51.142 runner[db827b1f] den [info] Shutting down virtual machine
2022-11-23T01:40:51.250 app[db827b1f] den [info] Sending signal SIGINT to main child process w/ PID 529
2022-11-23T01:40:51.253 app[db827b1f] den [info] sending signal to exec process
2022-11-23T01:40:51.253 app[db827b1f] den [info] waiting for exec process to close
2022-11-23T01:40:51.253 app[db827b1f] den [info] signal received, litefs shutting down
2022-11-23T01:40:51.254 app[db827b1f] den [info] exiting primary, destroying lease
2022-11-23T01:40:51.305 app[db827b1f] den [info] exit status 1: fusermount: failed to unmount /litefs/data: Device or resource busy
2022-11-23T01:41:05.421 runner[db827b1f] den [info] Starting instance
2022-11-23T01:41:05.595 runner[db827b1f] den [info] Configuring virtual machine
2022-11-23T01:41:05.596 runner[db827b1f] den [info] Pulling container image
2022-11-23T01:41:05.896 runner[db827b1f] den [info] Unpacking image
2022-11-23T01:41:05.910 runner[db827b1f] den [info] Preparing kernel init
2022-11-23T01:41:06.063 runner[db827b1f] den [info] Setting up volume 'data'
2022-11-23T01:41:06.066 runner[db827b1f] den [info] Opening encrypted volume
2022-11-23T01:41:06.201 runner[db827b1f] den [info] Configuring firecracker
2022-11-23T01:41:06.364 runner[db827b1f] den [info] Starting virtual machine
2022-11-23T01:41:06.566 app[db827b1f] den [info] Starting init (commit: 81d5330)...
2022-11-23T01:41:06.594 app[db827b1f] den [info] Mounting /dev/vdc at /data w/ uid: 0, gid: 0 and chmod 0755
2022-11-23T01:41:06.667 app[db827b1f] den [info] Preparing to run: `docker-entrypoint.sh litefs mount -- node ./other/start.js` as root
2022-11-23T01:41:06.695 app[db827b1f] den [info] 2022/11/23 01:41:06 listening on [fdaa:0:23df:a7b:d827:2:6f13:2]:22 (DNS: [fdaa::3]:53)
2022-11-23T01:41:06.756 app[db827b1f] den [info] config file read from /etc/litefs.yml
2022-11-23T01:41:06.756 app[db827b1f] den [info] LiteFS commit=33a80a3c5d7c3109170e7cb9cdab6ebf40b2fc3b
2022-11-23T01:41:06.756 app[db827b1f] den [info] Using Consul to determine primary
2022-11-23T01:41:08.987 app[db827b1f] den [info] initializing consul: key= url=https://:b832a863-60c2-48d4-8289-bdc5d80fc444@consul-iad.fly-shared.net/kcd-g3zmqx5x3y49dlp4/ hostname=db827b1f advertise-url=http://db827b1f.vm.kcd.internal:20202
2022-11-23T01:41:09.276 app[db827b1f] den [info] LiteFS mounted to: /litefs/data
2022-11-23T01:41:09.276 app[db827b1f] den [info] http server listening on: http://localhost:20202
2022-11-23T01:41:09.276 app[db827b1f] den [info] waiting to connect to cluster
2022-11-23T01:41:09.468 app[db827b1f] den [info] primary lease acquired, advertising as http://db827b1f.vm.kcd.internal:20202
2022-11-23T01:41:09.468 app[db827b1f] den [info] connected to cluster, ready
2022-11-23T01:41:09.469 app[db827b1f] den [info] starting subprocess: node [./other/start.js]
2022-11-23T01:41:09.473 app[db827b1f] den [info] waiting for signal or subprocess to exit
2022-11-23T01:41:09.710 app[db827b1f] den [info] No .primary file found.
2022-11-23T01:41:09.711 app[db827b1f] den [info] Using current instance (db827b1f) as primary (in den)
2022-11-23T01:41:09.711 app[db827b1f] den [info] Instance (db827b1f) in den is primary. Deploying migrations.
2022-11-23T01:41:11.238 app[db827b1f] den [info] Prisma schema loaded from prisma/schema.prisma
2022-11-23T01:41:11.247 app[db827b1f] den [info] Datasource "db": SQLite database "sqlite.db" at "file:/litefs/data/sqlite.db"
2022-11-23T01:41:11.323 app[db827b1f] den [info] 2 migrations found in prisma/migrations
2022-11-23T01:41:11.347 app[db827b1f] den [info] No pending migrations to apply.
2022-11-23T01:41:11.381 app[db827b1f] den [info] npm notice
2022-11-23T01:41:11.382 app[db827b1f] den [info] npm notice New major version of npm available! 8.19.2 -> 9.1.2
2022-11-23T01:41:11.382 app[db827b1f] den [info] npm notice Changelog: <https://github.com/npm/cli/releases/tag/v9.1.2>
2022-11-23T01:41:11.382 app[db827b1f] den [info] npm notice Run `npm install -g npm@9.1.2` to update!
2022-11-23T01:41:11.382 app[db827b1f] den [info] npm notice
2022-11-23T01:41:11.395 app[db827b1f] den [info] Starting app...
2022-11-23T01:41:11.810 app[db827b1f] den [info] > kentcdodds.com@1.0.0 start
2022-11-23T01:41:11.810 app[db827b1f] den [info] > cross-env NODE_ENV=production node --require ./node_modules/dotenv/config ./index.js
2022-11-23T01:41:14.672 app[db827b1f] den [info] Express server listening on port 8080
2022-11-23T01:41:14.853 app[db827b1f] den [info] prisma:info Starting a sqlite pool with 3 connections.
2022-11-23T01:41:16.138 app[db827b1f] den [info] Updated the cache value for total-post-reads:__all-posts__. Getting a fresh value for this took 278ms. Caching for 60000ms + 86400000ms stale in LRU.
2022-11-23T01:41:16.948 app[db827b1f] den [info] Updated the cache value for total-reader-count. Getting a fresh value for this took 1089ms. Caching for 300000ms + 86400000ms stale in LRU.
2022-11-23T01:41:16.989 app[db827b1f] den [info] prisma:query - 800ms -
2022-11-23T01:41:16.989 app[db827b1f] den [info] SELECT
2022-11-23T01:41:16.989 app[db827b1f] den [info] (SELECT COUNT(DISTINCT "userId") FROM "PostRead" WHERE "userId" IS NOT NULL) +
2022-11-23T01:41:16.989 app[db827b1f] den [info] (SELECT COUNT(DISTINCT "clientId") FROM "PostRead" WHERE "clientId" IS NOT NULL)
2022-11-23T01:41:18.898 app[db827b1f] den [info] HEAD / 200 89769 - 3061.132 ms
2022-11-23T01:41:18.904 app[db827b1f] den [info] prisma:query - 1783ms - SELECT COUNT(*), `main`.`PostRead`.`postSlug` FROM `main`.`PostRead` WHERE `main`.`PostRead`.`postSlug` NOT IN (?,?,?,?,?,?,?,?,?) GROUP BY `main`.`PostRead`.`postSlug` ORDER BY COUNT(`main`.`PostRead`.`postSlug`) DESC LIMIT ? OFFSET ?
2022-11-23T01:41:18.911 app[db827b1f] den [info] GET /blog/rss.xml 200 - - 171.851 ms
2022-11-23T01:41:18.931 app[db827b1f] den [info] Updated the cache value for total-post-reads:my-modern-frontends-live-experience. Getting a fresh value for this took 268ms. Caching for 60000ms + 86400000ms stale in LRU.
...
An important note is that this a Node.js app running in docker as a sub-process to LiteFS. Sadly I have no idea how to reproduce this issue so I really don’t have much insight into what could be causing this