Site falls over every few hours before rebooting

It seems that my site keeps falling over every few hours and won’t come up for a few minutes until it finally decides to reboot. I’m not sure what’s causing this and the logs are not helpful at all. They just stop for a while until the reboot happens. I’m not sure how to read these metrics, but something’s going on here with the VM Concurrency (whatever that means):

Anyway, any direction is appreciated.

Here are the logs. Note the 7 minute pause in here:

...
2022-11-23T01:33:23.657 app[db827b1f] den [info] GET /blog/migrating-to-reacts-new-context-api 200 - - 624.697 ms
2022-11-23T01:33:24.292 app[db827b1f] den [info] Updated the cache value for blog:my-modern-frontends-live-experience:rankings. Getting a fresh value for this took 539ms. Caching for 604800000ms + 86400000ms stale in SQLite cache.
2022-11-23T01:33:24.354 app[db827b1f] den [info] Background refresh for total-post-reads:compose-render-props successful. Getting a fresh value for this took 293ms. Caching for 60000ms + 86400000ms stale in LRU.
2022-11-23T01:33:24.409 app[db827b1f] den [info] Updated the cache value for blog:rankings. Getting a fresh value for this took 656ms. Caching for 3600000ms + 86400000ms stale in SQLite cache.
2022-11-23T01:40:51.142 runner[db827b1f] den [info] Shutting down virtual machine
2022-11-23T01:40:51.250 app[db827b1f] den [info] Sending signal SIGINT to main child process w/ PID 529
2022-11-23T01:40:51.253 app[db827b1f] den [info] sending signal to exec process
2022-11-23T01:40:51.253 app[db827b1f] den [info] waiting for exec process to close
2022-11-23T01:40:51.253 app[db827b1f] den [info] signal received, litefs shutting down
2022-11-23T01:40:51.254 app[db827b1f] den [info] exiting primary, destroying lease
2022-11-23T01:40:51.305 app[db827b1f] den [info] exit status 1: fusermount: failed to unmount /litefs/data: Device or resource busy
2022-11-23T01:41:05.421 runner[db827b1f] den [info] Starting instance
2022-11-23T01:41:05.595 runner[db827b1f] den [info] Configuring virtual machine
2022-11-23T01:41:05.596 runner[db827b1f] den [info] Pulling container image
2022-11-23T01:41:05.896 runner[db827b1f] den [info] Unpacking image
2022-11-23T01:41:05.910 runner[db827b1f] den [info] Preparing kernel init
2022-11-23T01:41:06.063 runner[db827b1f] den [info] Setting up volume 'data'
2022-11-23T01:41:06.066 runner[db827b1f] den [info] Opening encrypted volume
2022-11-23T01:41:06.201 runner[db827b1f] den [info] Configuring firecracker
2022-11-23T01:41:06.364 runner[db827b1f] den [info] Starting virtual machine
2022-11-23T01:41:06.566 app[db827b1f] den [info] Starting init (commit: 81d5330)...
2022-11-23T01:41:06.594 app[db827b1f] den [info] Mounting /dev/vdc at /data w/ uid: 0, gid: 0 and chmod 0755
2022-11-23T01:41:06.667 app[db827b1f] den [info] Preparing to run: `docker-entrypoint.sh litefs mount -- node ./other/start.js` as root
2022-11-23T01:41:06.695 app[db827b1f] den [info] 2022/11/23 01:41:06 listening on [fdaa:0:23df:a7b:d827:2:6f13:2]:22 (DNS: [fdaa::3]:53)
2022-11-23T01:41:06.756 app[db827b1f] den [info] config file read from /etc/litefs.yml
2022-11-23T01:41:06.756 app[db827b1f] den [info] LiteFS commit=33a80a3c5d7c3109170e7cb9cdab6ebf40b2fc3b
2022-11-23T01:41:06.756 app[db827b1f] den [info] Using Consul to determine primary
2022-11-23T01:41:08.987 app[db827b1f] den [info] initializing consul: key= url=https://:b832a863-60c2-48d4-8289-bdc5d80fc444@consul-iad.fly-shared.net/kcd-g3zmqx5x3y49dlp4/ hostname=db827b1f advertise-url=http://db827b1f.vm.kcd.internal:20202
2022-11-23T01:41:09.276 app[db827b1f] den [info] LiteFS mounted to: /litefs/data
2022-11-23T01:41:09.276 app[db827b1f] den [info] http server listening on: http://localhost:20202
2022-11-23T01:41:09.276 app[db827b1f] den [info] waiting to connect to cluster
2022-11-23T01:41:09.468 app[db827b1f] den [info] primary lease acquired, advertising as http://db827b1f.vm.kcd.internal:20202
2022-11-23T01:41:09.468 app[db827b1f] den [info] connected to cluster, ready
2022-11-23T01:41:09.469 app[db827b1f] den [info] starting subprocess: node [./other/start.js]
2022-11-23T01:41:09.473 app[db827b1f] den [info] waiting for signal or subprocess to exit
2022-11-23T01:41:09.710 app[db827b1f] den [info] No .primary file found.
2022-11-23T01:41:09.711 app[db827b1f] den [info] Using current instance (db827b1f) as primary (in den)
2022-11-23T01:41:09.711 app[db827b1f] den [info] Instance (db827b1f) in den is primary. Deploying migrations.
2022-11-23T01:41:11.238 app[db827b1f] den [info] Prisma schema loaded from prisma/schema.prisma
2022-11-23T01:41:11.247 app[db827b1f] den [info] Datasource "db": SQLite database "sqlite.db" at "file:/litefs/data/sqlite.db"
2022-11-23T01:41:11.323 app[db827b1f] den [info] 2 migrations found in prisma/migrations
2022-11-23T01:41:11.347 app[db827b1f] den [info] No pending migrations to apply.
2022-11-23T01:41:11.381 app[db827b1f] den [info] npm notice
2022-11-23T01:41:11.382 app[db827b1f] den [info] npm notice New major version of npm available! 8.19.2 -> 9.1.2
2022-11-23T01:41:11.382 app[db827b1f] den [info] npm notice Changelog: <https://github.com/npm/cli/releases/tag/v9.1.2>
2022-11-23T01:41:11.382 app[db827b1f] den [info] npm notice Run `npm install -g npm@9.1.2` to update!
2022-11-23T01:41:11.382 app[db827b1f] den [info] npm notice
2022-11-23T01:41:11.395 app[db827b1f] den [info] Starting app...
2022-11-23T01:41:11.810 app[db827b1f] den [info] > kentcdodds.com@1.0.0 start
2022-11-23T01:41:11.810 app[db827b1f] den [info] > cross-env NODE_ENV=production node --require ./node_modules/dotenv/config ./index.js
2022-11-23T01:41:14.672 app[db827b1f] den [info] Express server listening on port 8080
2022-11-23T01:41:14.853 app[db827b1f] den [info] prisma:info Starting a sqlite pool with 3 connections.
2022-11-23T01:41:16.138 app[db827b1f] den [info] Updated the cache value for total-post-reads:__all-posts__. Getting a fresh value for this took 278ms. Caching for 60000ms + 86400000ms stale in LRU.
2022-11-23T01:41:16.948 app[db827b1f] den [info] Updated the cache value for total-reader-count. Getting a fresh value for this took 1089ms. Caching for 300000ms + 86400000ms stale in LRU.
2022-11-23T01:41:16.989 app[db827b1f] den [info] prisma:query - 800ms -
2022-11-23T01:41:16.989 app[db827b1f] den [info] SELECT
2022-11-23T01:41:16.989 app[db827b1f] den [info] (SELECT COUNT(DISTINCT "userId") FROM "PostRead" WHERE "userId" IS NOT NULL) +
2022-11-23T01:41:16.989 app[db827b1f] den [info] (SELECT COUNT(DISTINCT "clientId") FROM "PostRead" WHERE "clientId" IS NOT NULL)
2022-11-23T01:41:18.898 app[db827b1f] den [info] HEAD / 200 89769 - 3061.132 ms
2022-11-23T01:41:18.904 app[db827b1f] den [info] prisma:query - 1783ms - SELECT COUNT(*), `main`.`PostRead`.`postSlug` FROM `main`.`PostRead` WHERE `main`.`PostRead`.`postSlug` NOT IN (?,?,?,?,?,?,?,?,?) GROUP BY `main`.`PostRead`.`postSlug` ORDER BY COUNT(`main`.`PostRead`.`postSlug`) DESC LIMIT ? OFFSET ?
2022-11-23T01:41:18.911 app[db827b1f] den [info] GET /blog/rss.xml 200 - - 171.851 ms
2022-11-23T01:41:18.931 app[db827b1f] den [info] Updated the cache value for total-post-reads:my-modern-frontends-live-experience. Getting a fresh value for this took 268ms. Caching for 60000ms + 86400000ms stale in LRU.
...

An important note is that this a Node.js app running in docker as a sub-process to LiteFS. Sadly I have no idea how to reproduce this issue so I really don’t have much insight into what could be causing this :frowning:

Hey there!

VM Concurrency is how much of your concurrency limits is being used. We called it generically “concurrency” because it could be either connections or requests depending on your settings.

Looking at more charts, it seems like CPU usage goes through the roof at some point.

You can see more detailed graphs on fly-metrics.net

I can’t tell what’s going on exactly, but here’s how you can maybe figure it out:

  • Add more logging for things expected to be happening. Perhaps marking the start / end of expensive operations, logging the duration, etc.
  • Next time you notice it stall, you can try fly ssh console -a kcd and run top (or similar) to see what’s using so much CPU inside the instance

Something is definitely pegging the CPU, preventing our proxy from connecting to your app altogether. I see 502 responses with “connection timed out” errors.

I added a logs for every request both before and after and I noticed this:

2022-11-23T16:02:18Z app[b3c9a879] den [info]REMIX: POST 213.39.27.190, 149.248.193.173 -> /__metronome https://kentcdodds.com/blog/using-fetch-with-type-script
2022-11-23T16:02:20Z app[b3c9a879] den [info]<--- Last few GCs --->
2022-11-23T16:02:20Z app[b3c9a879] den [info][666:0x5b34580]   229432 ms: Scavenge (reduce) 984.0 (1005.4) -> 983.2 (1005.4) MB, 2.8 / 0.0 ms  (average mu = 0.192, current mu = 0.168) allocation failure;
2022-11-23T16:02:20Z app[b3c9a879] den [info][666:0x5b34580]   229495 ms: Scavenge (reduce) 985.1 (1006.4) -> 984.6 (1006.6) MB, 3.0 / 0.0 ms  (average mu = 0.192, current mu = 0.168) allocation failure;
2022-11-23T16:02:20Z app[b3c9a879] den [info][666:0x5b34580]   230370 ms: Mark-sweep (reduce) 985.4 (1006.6) -> 984.0 (1006.9) MB, 868.8 / 0.0 ms  (average mu = 0.260, current mu = 0.324) allocation failure; scavenge might not succeed
2022-11-23T16:02:20Z app[b3c9a879] den [info]<--- JS stacktrace --->
2022-11-23T16:02:20Z app[b3c9a879] den [info]FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2022-11-23T16:02:20Z app[b3c9a879] den [info] 1: 0xb6e500 node::Abort() [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info] 2: 0xa7e632  [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info] 3: 0xd47f20 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info] 4: 0xd482c7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info] 5: 0xf25685  [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info] 6: 0xf26588 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info] 7: 0xf36a93  [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info] 8: 0xf37908 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info] 9: 0xf1226e v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info]10: 0xf13637 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info]11: 0xef480a v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info]12: 0x12b7daf v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info]13: 0x16e99f9  [node]
2022-11-23T16:02:20Z app[b3c9a879] den [info]node:internal/process/promises:288
2022-11-23T16:02:20Z app[b3c9a879] den [info]            triggerUncaughtException(err, true /* fromPromise */);
2022-11-23T16:02:20Z app[b3c9a879] den [info]            ^
2022-11-23T16:02:20Z app[b3c9a879] den [info][UnhandledPromiseRejection: This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason "undefined".] {
2022-11-23T16:02:20Z app[b3c9a879] den [info]  code: 'ERR_UNHANDLED_REJECTION'
2022-11-23T16:02:20Z app[b3c9a879] den [info]}
2022-11-23T16:02:20Z app[b3c9a879] den [info]Node.js v18.12.1
2022-11-23T16:02:20Z app[b3c9a879] den [info]subprocess exited, litefs shutting down
2022-11-23T16:02:20Z app[b3c9a879] den [info]exiting primary, destroying lease
2022-11-23T16:02:20Z app[b3c9a879] den [info]litefs shut down complete
2022-11-23T16:02:21Z app[b3c9a879] den [info]Starting clean up.
2022-11-23T16:02:21Z app[b3c9a879] den [info]Umounting /dev/vdc from /data
2022-11-23T16:02:27Z runner[b3c9a879] den [info]Starting instance

That metronome request shouldn’t be a problem (it’s just metrics measurements), but then it happened again:

2022-11-23T16:05:08Z app[b3c9a879] den [info]REMIX: POST 151.52.202.237, 149.248.193.173 -> /__metronome https://kentcdodds.com/blog/make-your-own-dev-tools
2022-11-23T16:05:12Z app[b3c9a879] den [info]<--- Last few GCs --->
2022-11-23T16:05:12Z app[b3c9a879] den [info][666:0x6bbb580]   156398 ms: Mark-sweep (reduce) 985.1 (1007.3) -> 984.0 (1006.8) MB, 925.6 / 0.0 ms  (average mu = 0.284, current mu = 0.224) allocation failure; scavenge might not succeed
2022-11-23T16:05:12Z app[b3c9a879] den [info][666:0x6bbb580]   157547 ms: Mark-sweep (reduce) 985.1 (1006.8) -> 984.2 (1007.1) MB, 1115.7 / 0.0 ms  (average mu = 0.162, current mu = 0.028) allocation failure; scavenge might not succeed
2022-11-23T16:05:12Z app[b3c9a879] den [info]<--- JS stacktrace --->
2022-11-23T16:05:12Z app[b3c9a879] den [info]FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2022-11-23T16:05:12Z app[b3c9a879] den [info] 1: 0xb6e500 node::Abort() [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info] 2: 0xa7e632  [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info] 3: 0xd47f20 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info] 4: 0xd482c7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info] 5: 0xf25685  [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info] 6: 0xf26588 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info] 7: 0xf36a93  [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info] 8: 0xf37908 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info] 9: 0xf1226e v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info]10: 0xf13637 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info]11: 0xef480a v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info]12: 0x12b7daf v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info]13: 0x16e99f9  [node]
2022-11-23T16:05:12Z app[b3c9a879] den [info]node:internal/process/promises:288
2022-11-23T16:05:12Z app[b3c9a879] den [info]            triggerUncaughtException(err, true /* fromPromise */);
2022-11-23T16:05:12Z app[b3c9a879] den [info]            ^
2022-11-23T16:05:12Z app[b3c9a879] den [info][UnhandledPromiseRejection: This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason "undefined".] {
2022-11-23T16:05:12Z app[b3c9a879] den [info]  code: 'ERR_UNHANDLED_REJECTION'
2022-11-23T16:05:12Z app[b3c9a879] den [info]}
2022-11-23T16:05:12Z app[b3c9a879] den [info]Node.js v18.12.1
2022-11-23T16:05:12Z app[b3c9a879] den [info]subprocess exited, litefs shutting down
2022-11-23T16:05:12Z app[b3c9a879] den [info]exiting primary, destroying lease
2022-11-23T16:05:13Z app[b3c9a879] den [info]Starting clean up.
2022-11-23T16:05:13Z app[b3c9a879] den [info]Umounting /dev/vdc from /data

So I’m deploying with metronome disabled to see whether that fixes it :man_shrugging:

Guess when I added the silly amount of logging :laughing:

(The deep valleys were redeploys).

Not sure if is related but I’ve seen a couple of errors in Metronome in the last 9 hours:

Error
Invalid `prisma.postRead.groupBy()` invocation: Error occurred during query execution: ConnectorError(ConnectorError { user_facing_error: None, kind: ConnectionError(Timed out during query execution.) })

And also Syntax Errors like these

SyntaxError
Invalid regular expression: /inversion-of-control)\/index.mdx?$/: Unmatched ')'

SyntaxError 
Invalid regular expression: /speed-up-your-app-with-web-workers)\/index.mdx?$/: Unmatched ')'

SyntaxError
Invalid regular expression: /why-i-love-remix)\/index.mdx?$/: Unmatched ')'

I removed metronome from the site, but there are still some clients that have the old code and are making metronome post requests. I don’t think metronome is the culprit here, just an innocent bystander. Anyway, I just observed another OOM error:

022-11-23T18:25:51Z app[6f5087ba] den [info]handleRequestS: POST /__metronome 405
2022-11-23T18:25:51Z app[6f5087ba] den [info]CatchBoundary { status: 405, statusText: '', data: '' }
2022-11-23T18:25:51Z app[6f5087ba] den [info]handleRequestS: POST /__metronome 500
2022-11-23T18:25:51Z app[6f5087ba] den [info]GET ::ffff:172.19.0.145 -> /build/routes/chats/$season-F3UTI4S3.js -
2022-11-23T18:25:51Z app[6f5087ba] den [info]GET ::ffff:172.19.0.145 <- /build/routes/calls.record-4BIPTRXO.js 404 - - 217.512 ms
2022-11-23T18:25:51Z app[6f5087ba] den [info]GET ::ffff:172.19.0.145 <- /healthcheck 200 - - 4344.459 ms
2022-11-23T18:25:51Z app[6f5087ba] den [info]GET ::ffff:172.19.0.145 <- /blog/props-vs-state 200 - - 2035.556 ms
2022-11-23T18:25:51Z app[6f5087ba] den [info]GET ::ffff:172.19.0.145 <- /build/_shared/chunk-36WYQQTY.js 200 926 - 4013.990 ms
2022-11-23T18:25:51Z app[6f5087ba] den [info]Error: Unhandled error: 405
2022-11-23T18:25:51Z app[6f5087ba] den [info]    at CatchBoundary5 (/app/build/index.js:14926:9)
2022-11-23T18:25:51Z app[6f5087ba] den [info]    at Fb (/app/node_modules/react-dom/cjs/react-dom-server-legacy.node.production.min.js:65:44)
2022-11-23T18:25:51Z app[6f5087ba] den [info]    at Ib (/app/node_modules/react-dom/cjs/react-dom-server-legacy.node.production.min.js:67:254)
2022-11-23T18:25:51Z app[6f5087ba] den [info]    at W (/app/node_modules/react-dom/cjs/react-dom-server-legacy.node.production.min.js:73:89)
2022-11-23T18:25:51Z app[6f5087ba] den [info]    at Ib (/app/node_modules/react-dom/cjs/react-dom-server-legacy.node.production.min.js:71:304)
2022-11-23T18:25:51Z app[6f5087ba] den [info]    at W (/app/node_modules/react-dom/cjs/react-dom-server-legacy.node.production.min.js:73:89)
2022-11-23T18:25:51Z app[6f5087ba] den [info]    at Ib (/app/node_modules/react-dom/cjs/react-dom-server-legacy.node.production.min.js:67:482)
2022-11-23T18:25:51Z app[6f5087ba] den [info]    at W (/app/node_modules/react-dom/cjs/react-dom-server-legacy.node.production.min.js:73:89)
2022-11-23T18:25:51Z app[6f5087ba] den [info]    at Ib (/app/node_modules/react-dom/cjs/react-dom-server-legacy.node.production.min.js:71:304)
2022-11-23T18:25:51Z app[6f5087ba] den [info]    at W (/app/node_modules/react-dom/cjs/react-dom-server-legacy.node.production.min.js:73:89)
2022-11-23T18:25:51Z app[6f5087ba] den [info]handleRequestE: POST /__metronome 500
2022-11-23T18:25:51Z app[6f5087ba] den [info]REMIX: GET 38.126.157.48, 149.248.193.173 -> /build/routes/chats/$season-F3UTI4S3.js undefined
2022-11-23T18:25:56Z app[6f5087ba] den [info]<--- Last few GCs --->
2022-11-23T18:25:56Z app[6f5087ba] den [info][666:0x530b580]   263186 ms: Mark-sweep (reduce) 986.1 (1007.3) -> 984.9 (1007.5) MB, 941.1 / 0.0 ms  (average mu = 0.230, current mu = 0.172) allocation failure; scavenge might not succeed
2022-11-23T18:25:56Z app[6f5087ba] den [info][666:0x530b580]   264321 ms: Mark-sweep (reduce) 985.9 (1007.5) -> 985.2 (1007.0) MB, 1133.5 / 0.0 ms  (average mu = 0.119, current mu = 0.002) allocation failure; scavenge might not succeed
2022-11-23T18:25:56Z app[6f5087ba] den [info]<--- JS stacktrace --->
2022-11-23T18:25:56Z app[6f5087ba] den [info]FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2022-11-23T18:25:56Z app[6f5087ba] den [info] 1: 0xb6e500 node::Abort() [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info] 2: 0xa7e632  [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info] 3: 0xd47f20 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info] 4: 0xd482c7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info] 5: 0xf25685  [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info] 6: 0xf26588 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info] 7: 0xf36a93  [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info] 8: 0xf37908 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info] 9: 0xf1226e v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info]10: 0xf13637 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info]11: 0xef480a v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info]12: 0x12b7daf v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info]13: 0x16e99f9  [node]
2022-11-23T18:25:56Z app[6f5087ba] den [info]node:internal/process/promises:288
2022-11-23T18:25:56Z app[6f5087ba] den [info]            triggerUncaughtException(err, true /* fromPromise */);
2022-11-23T18:25:56Z app[6f5087ba] den [info]            ^
2022-11-23T18:25:56Z app[6f5087ba] den [info][UnhandledPromiseRejection: This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason "undefined".] {
2022-11-23T18:25:56Z app[6f5087ba] den [info]  code: 'ERR_UNHANDLED_REJECTION'
2022-11-23T18:25:56Z app[6f5087ba] den [info]}
2022-11-23T18:25:56Z app[6f5087ba] den [info]Node.js v18.12.1
2022-11-23T18:25:56Z app[6f5087ba] den [info]subprocess exited, litefs shutting down
2022-11-23T18:25:56Z app[6f5087ba] den [info]exiting primary, destroying lease
2022-11-23T18:25:56Z app[6f5087ba] den [info]litefs shut down complete
2022-11-23T18:25:56Z app[6f5087ba] den [info]Starting clean up.
2022-11-23T18:25:56Z app[6f5087ba] den [info]Umounting /dev/vdc from /data
2022-11-23T18:26:03Z runner[6f5087ba] den [info]Starting instance
2022-11-23T18:26:03Z runner[6f5087ba] den [info]Configuring virtual machine
2022-11-23T18:26:03Z runner[6f5087ba] den [info]Pulling container image
2022-11-23T18:26:03Z runner[6f5087ba] den [info]Unpacking image
2022-11-23T18:26:03Z runner[6f5087ba] den [info]Preparing kernel init
2022-11-23T18:26:03Z runner[6f5087ba] den [info]Setting up volume 'data'
2022-11-23T18:26:03Z runner[6f5087ba] den [info]Opening encrypted volume
2022-11-23T18:26:04Z runner[6f5087ba] den [info]Configuring firecracker
2022-11-23T18:26:04Z runner[6f5087ba] den [info]Starting virtual machine
2022-11-23T18:26:04Z app[6f5087ba] den [info]Starting init (commit: 81d5330)...
2022-11-23T18:26:04Z app[6f5087ba] den [info]Mounting /dev/vdc at /data w/ uid: 0, gid: 0 and chmod 0755
2022-11-23T18:26:04Z app[6f5087ba] den [info]Preparing to run: `docker-entrypoint.sh litefs mount -- node ./other/start.js` as root
2022-11-23T18:26:04Z app[6f5087ba] den [info]2022/11/23 18:26:04 listening on [fdaa:0:23df:a7b:d827:2:6f13:2]:22 (DNS: [fdaa::3]:53)
2022-11-23T18:26:04Z app[6f5087ba] den [info]config file read from /etc/litefs.yml
2022-11-23T18:26:04Z app[6f5087ba] den [info]LiteFS commit=33a80a3c5d7c3109170e7cb9cdab6ebf40b2fc3b
2022-11-23T18:26:04Z app[6f5087ba] den [info]Using Consul to determine primary
2022-11-23T18:26:06Z app[6f5087ba] den [info]initializing consul: key= url=https://:b832a863-60c2-48d4-8289-bdc5d80fc444@consul-iad.fly-shared.net/kcd-g3zmqx5x3y49dlp4/ hostname=6f5087ba advertise-url=http://6f5087ba.vm.kcd.internal:20202
2022-11-23T18:26:07Z app[6f5087ba] den [info]LiteFS mounted to: /litefs/data-2
2022-11-23T18:26:07Z app[6f5087ba] den [info]http server listening on: http://localhost:20202
2022-11-23T18:26:07Z app[6f5087ba] den [info]waiting to connect to cluster
2022-11-23T18:26:07Z app[6f5087ba] den [info]primary lease acquired, advertising as http://6f5087ba.vm.kcd.internal:20202
2022-11-23T18:26:07Z app[6f5087ba] den [info]connected to cluster, ready
2022-11-23T18:26:07Z app[6f5087ba] den [info]starting subprocess: node [./other/start.js]
2022-11-23T18:26:07Z app[6f5087ba] den [info]waiting for signal or subprocess to exit
2022-11-23T18:26:07Z app[6f5087ba] den [info]No .primary file found.
2022-11-23T18:26:07Z app[6f5087ba] den [info]Using current instance (6f5087ba) as primary (in den)
2022-11-23T18:26:07Z app[6f5087ba] den [info]Instance (6f5087ba) in den is primary. Deploying migrations.
2022-11-23T18:26:08Z app[6f5087ba] den [info]Prisma schema loaded from prisma/schema.prisma
2022-11-23T18:26:09Z app[6f5087ba] den [info]Datasource "db": SQLite database "sqlite.db" at "file:/litefs/data-2/sqlite.db"
2022-11-23T18:26:09Z app[6f5087ba] den [info]2 migrations found in prisma/migrations
2022-11-23T18:26:09Z app[6f5087ba] den [info]No pending migrations to apply.
2022-11-23T18:26:09Z app[6f5087ba] den [info]npm notice
2022-11-23T18:26:09Z app[6f5087ba] den [info]npm notice New major version of npm available! 8.19.2 -> 9.1.2
2022-11-23T18:26:09Z app[6f5087ba] den [info]npm notice Changelog: <https://github.com/npm/cli/releases/tag/v9.1.2>
2022-11-23T18:26:09Z app[6f5087ba] den [info]npm notice Run `npm install -g npm@9.1.2` to update!
2022-11-23T18:26:09Z app[6f5087ba] den [info]npm notice
2022-11-23T18:26:09Z app[6f5087ba] den [info]Starting app...
2022-11-23T18:26:09Z app[6f5087ba] den [info]> kentcdodds.com@1.0.0 start
2022-11-23T18:26:09Z app[6f5087ba] den [info]> cross-env NODE_ENV=production node --require ./node_modules/dotenv/config ./index.js
2022-11-23T18:26:12Z app[6f5087ba] den [info]Express server listening on port 8080
2022-11-23T18:26:12Z app[6f5087ba] den [info]prisma:info Starting a sqlite pool with 3 connections.
2022-11-23T18:26:13Z app[6f5087ba] den [info]GET ::ffff:172.19.0.145 -> /healthcheck -
2022-11-23T18:26:13Z app[6f5087ba] den [info]REMIX: GET ::ffff:172.19.0.145 -> /healthcheck undefined
2022-11-23T18:26:13Z app[6f5087ba] den [info]HEAD ::ffff:172.19.0.146 -> / -
2022-11-23T18:26:13Z app[6f5087ba] den [info]REMIX: HEAD ::ffff:172.19.0.146 -> / undefined

I kinda think the logs are causing this because it started when I deployed the logs change:

So I’m going to reduce the amount of logging. Hopefully that doesn’t cause me to miss anything…

Did you change the node version by any chance?

I haven’t changed the node version. It’s running v18.12.1.

I’m afraid I can’t figure out how to install and use top within my VM. Any tips? I tried running apt-get install -y top and got:

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package top

Sorry, I’m pretty useless in this world :sweat_smile:

You’ll likely have to run apt update before. This will get the latest package list from the Ubuntu repositories.

Tried:

# apt update
Hit:1 http://deb.debian.org/debian bullseye InRelease
Hit:2 http://deb.debian.org/debian-security bullseye-security InRelease
Hit:3 http://deb.debian.org/debian bullseye-updates InRelease
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
4 packages can be upgraded. Run 'apt list --upgradable' to see them.
# apt install top
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package top
# 

In case it’s helpful at all, here’s my Dockerfile: kentcdodds.com/Dockerfile at main · kentcdodds/kentcdodds.com · GitHub

Oh this is Debian. My bad. apt install procps and then you should have top available.

2 Likes

Well. It happened again:

procps got me top, unfortunately when this happened again top didn’t reveal much of any use:

top - 14:42:52 up  8:41,  0 users,  load average: 1.00, 0.81, 0.46
Tasks:  53 total,   1 running,  52 sleeping,   0 stopped,   0 zomb
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,  0.0 id,100.0 wa,  0.0 hi,  0.
MiB Mem :   1982.5 total,   1029.2 free,    458.7 used,    494.7 b
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   1385.5 a

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM 
    1 root      20   0    6496   3448   3012 S   0.0   0.2 
    2 root      20   0       0      0      0 S   0.0   0.0 
    3 root       0 -20       0      0      0 I   0.0   0.0 
    4 root       0 -20       0      0      0 I   0.0   0.0 
    5 root      20   0       0      0      0 I   0.0   0.0 
    6 root       0 -20       0      0      0 I   0.0   0.0 
    7 root      20   0       0      0      0 I   0.0   0.0 
    9 root       0 -20       0      0      0 I   0.0   0.0 
   10 root      20   0       0      0      0 S   0.0   0.0 
   11 root      20   0       0      0      0 S   0.0   0.0 
   12 root      20   0       0      0      0 S   0.0   0.0 
   13 root      20   0       0      0      0 I   0.0   0.0 
   14 root      rt   0       0      0      0 S   0.0   0.0 
   15 root      20   0       0      0      0 S   0.0   0.0 
   17 root      20   0       0      0      0 S   0.0   0.0 
   18 root       0 -20       0      0      0 I   0.0   0.0 
   19 root       0 -20       0      0      0 I   0.0   0.0 
   20 root      20   0       0      0      0 S   0.0   0.0 

I tried to get that to show the process name with -c, but that’s the only output I could get. I don’t know if it means anything to anyone else, but it doesn’t mean anything to me :man_shrugging:

Here’s the log output at the time of the hang-up:

2022-11-25T14:35:57.603 app[4ecc7291] den [info] GET /blog/my-modern-frontends-live-experience 200 - - 63.310 ms
2022-11-25T14:35:57.635 app[4ecc7291] den [info] Background refresh for total-post-reads:my-modern-frontends-live-experience successful. Getting a fresh value for this took 90ms. Caching for 60000ms + 86400000ms stale in LRU.
2022-11-25T14:35:57.831 app[4ecc7291] den [info] GET /__metronome/metronome-5.6.1.js 200 - - 1.303 ms
2022-11-25T14:35:58.174 app[4ecc7291] den [info] GET /blog/rss.xml 200 - - 12.282 ms
2022-11-25T14:36:01.307 app[4ecc7291] den [info] POST /__metronome 204 - - 8.243 ms
2022-11-25T14:36:01.350 app[4ecc7291] den [info] HEAD / 200 91505 - 23.593 ms
2022-11-25T14:36:01.352 app[4ecc7291] den [info] GET /healthcheck 200 - - 36.657 ms
2022-11-25T14:36:02.638 app[4ecc7291] den [info] GET /blog/rss.xml 200 - - 14.334 ms
2022-11-25T14:36:02.692 app[4ecc7291] den [info] Updated the cache value for blog:use-ternaries-rather-than-and-and-in-jsx:rankings. Getting a fresh value for this took 545ms. Caching for 604800000ms + 86400000ms stale in SQLite cache.
2022-11-25T14:36:02.736 app[4ecc7291] den [info] Updated the cache value for blog:rankings. Getting a fresh value for this took 589ms. Caching for 3600000ms + 86400000ms stale in SQLite cache.
2022-11-25T14:43:31.780 runner[4ecc7291] den [info] Shutting down virtual machine
2022-11-25T14:43:31.887 app[4ecc7291] den [info] Sending signal SIGINT to main child process w/ PID 529
2022-11-25T14:43:31.889 app[4ecc7291] den [info] sending signal to exec process
2022-11-25T14:43:31.889 app[4ecc7291] den [info] waiting for exec process to close
2022-11-25T14:43:31.889 app[4ecc7291] den [info] signal received, litefs shutting down
2022-11-25T14:43:31.890 app[4ecc7291] den [info] exiting primary, destroying lease
2022-11-25T14:43:31.950 app[4ecc7291] den [info] exit status 1: fusermount: failed to unmount /litefs/data-2: Device or resource busy
2022-11-25T14:43:46.270 runner[4ecc7291] den [info] Starting instance
2022-11-25T14:43:46.386 runner[4ecc7291] den [info] Configuring virtual machine
2022-11-25T14:43:46.387 runner[4ecc7291] den [info] Pulling container image
2022-11-25T14:43:46.673 runner[4ecc7291] den [info] Unpacking image
2022-11-25T14:43:46.686 runner[4ecc7291] den [info] Preparing kernel init
2022-11-25T14:43:46.831 runner[4ecc7291] den [info] Setting up volume 'data'
2022-11-25T14:43:46.835 runner[4ecc7291] den [info] Opening encrypted volume
2022-11-25T14:43:46.973 runner[4ecc7291] den [info] Configuring firecracker
2022-11-25T14:43:47.028 runner[4ecc7291] den [info] Starting virtual machine
2022-11-25T14:43:47.241 app[4ecc7291] den [info] Starting init (commit: 81d5330)...
2022-11-25T14:43:47.272 app[4ecc7291] den [info] Mounting /dev/vdc at /data w/ uid: 0, gid: 0 and chmod 0755
2022-11-25T14:43:47.324 app[4ecc7291] den [info] Preparing to run: `docker-entrypoint.sh litefs mount -- node ./other/start.js` as root
2022-11-25T14:43:47.357 app[4ecc7291] den [info] 2022/11/25 14:43:47 listening on [fdaa:0:23df:a7b:d827:2:6f13:2]:22 (DNS: [fdaa::3]:53)
2022-11-25T14:43:47.436 app[4ecc7291] den [info] config file read from /etc/litefs.yml
2022-11-25T14:43:47.436 app[4ecc7291] den [info] LiteFS v0.3.0-beta5, commit=9bb4a2d406f69bdb513c4fd91a5986d62dc08612
2022-11-25T14:43:47.436 app[4ecc7291] den [info] Using Consul to determine primary
2022-11-25T14:43:49.800 app[4ecc7291] den [info] initializing consul: key= url=https://:b832a863-60c2-48d4-8289-bdc5d80fc444@consul-iad.fly-shared.net/kcd-g3zmqx5x3y49dlp4/ hostname=4ecc7291 advertise-url=http://4ecc7291.vm.kcd.internal:20202
2022-11-25T14:43:50.127 app[4ecc7291] den [info] LiteFS mounted to: /litefs/data-2
2022-11-25T14:43:50.127 app[4ecc7291] den [info] http server listening on: http://localhost:20202
2022-11-25T14:43:50.127 app[4ecc7291] den [info] waiting to connect to cluster
2022-11-25T14:43:50.505 app[4ecc7291] den [info] primary lease acquired, advertising as http://4ecc7291.vm.kcd.internal:20202
2022-11-25T14:43:50.505 app[4ecc7291] den [info] connected to cluster, ready
2022-11-25T14:43:50.505 app[4ecc7291] den [info] starting subprocess: node [./other/start.js]
2022-11-25T14:43:50.510 app[4ecc7291] den [info] waiting for signal or subprocess to exit
2022-11-25T14:43:50.770 app[4ecc7291] den [info] No .primary file found.
2022-11-25T14:43:50.770 app[4ecc7291] den [info] Using current instance (4ecc7291) as primary (in den)
2022-11-25T14:43:50.770 app[4ecc7291] den [info] Instance (4ecc7291) in den is primary. Deploying migrations.
2022-11-25T14:43:52.233 app[4ecc7291] den [info] Prisma schema loaded from prisma/schema.prisma
2022-11-25T14:43:52.243 app[4ecc7291] den [info] Datasource "db": SQLite database "sqlite.db" at "file:/litefs/data-2/sqlite.db"
2022-11-25T14:43:52.316 app[4ecc7291] den [info] 2 migrations found in prisma/migrations
2022-11-25T14:43:52.339 app[4ecc7291] den [info] No pending migrations to apply.
2022-11-25T14:43:52.371 app[4ecc7291] den [info] npm notice
2022-11-25T14:43:52.371 app[4ecc7291] den [info] npm notice New major version of npm available! 8.19.2 -> 9.1.2
2022-11-25T14:43:52.372 app[4ecc7291] den [info] npm notice Changelog: <https://github.com/npm/cli/releases/tag/v9.1.2>
2022-11-25T14:43:52.372 app[4ecc7291] den [info] npm notice Run `npm install -g npm@9.1.2` to update!
2022-11-25T14:43:52.372 app[4ecc7291] den [info] npm notice
2022-11-25T14:43:52.382 app[4ecc7291] den [info] Starting app...
2022-11-25T14:43:52.777 app[4ecc7291] den [info] > kentcdodds.com@1.0.0 start
2022-11-25T14:43:52.777 app[4ecc7291] den [info] > cross-env NODE_ENV=production node --require ./node_modules/dotenv/config ./index.js
2022-11-25T14:43:55.424 app[4ecc7291] den [info] Express server listening on port 8080
2022-11-25T14:43:55.609 app[4ecc7291] den [info] prisma:info Starting a sqlite pool with 3 connections.
2022-11-25T14:43:58.526 app[4ecc7291] den [info] Updated the cache value for total-post-reads:__all-posts__. Getting a fresh value for this took 313ms. Caching for 60000ms + 86400000ms stale in LRU.
2022-11-25T14:43:59.340 app[4ecc7291] den [info] GET /blog/rss.xml 200 - - 163.696 ms
2022-11-25T14:43:59.361 app[4ecc7291] den [info] GET /blog/rss.xml 200 - - 74.841 ms
2022-11-25T14:43:59.403 app[4ecc7291] den [info] Updated the cache value for total-reader-count. Getting a fresh value for this took 1189ms. Caching for 300000ms + 86400000ms stale in LRU.
2022-11-25T14:43:59.433 app[4ecc7291] den [info] prisma:query - 866ms -
2022-11-25T14:43:59.433 app[4ecc7291] den [info] SELECT
2022-11-25T14:43:59.433 app[4ecc7291] den [info] (SELECT COUNT(DISTINCT "userId") FROM "PostRead" WHERE "userId" IS NOT NULL) +
2022-11-25T14:43:59.433 app[4ecc7291] den [info] (SELECT COUNT(DISTINCT "clientId") FROM "PostRead" WHERE "clientId" IS NOT NULL)
2022-11-25T14:43:59.635 app[4ecc7291] den [info] GET /blog/rss.xml 200 - - 34.796 ms

One thing I noticed is that both times I have logs for this, the last thing logged was:

Updated the cache value for blog:rankings. Getting a fresh value for this took 589ms. Caching for 3600000ms + 86400000ms stale in SQLite cache.

It’s possible that’s a clue. It’s also possible that’s a fluke. It’s interesting to me that I didn’t have any issues yesterday (though I did have a couple deployments so maybe that’s why it was ok yesterday).

I have re-enabled metronome as well as sentry and I do not believe those are the issue here. My current theory is that there’s something in cachified (which is responsible for that log message) that is causing this.

It doesn’t look like much, but that little line shows something is very wrong:

%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,  0.0 id,100.0 wa,  0.0 hi,  0.

wa is the time spent waiting on IO (like disk operations). This shows that it’s spending 100% of its time doing that. A situation like that would definitely hang your instance.

I wonder if LiteFS is what’s causing this. @benbjohnson2 any ideas?

@kentcdodds I think we enabled strict-verify during debugging. Do you still have that flag enabled? I can’t remember if we removed it.

I don’t remember strict-verify and it’s not showing up in my litefs.yml

@kentcdodds Can you grab a stack trace of LiteFS and we can try to see if it’s hung on something:

http://$HOSTNAME:20202/debug/pprof/goroutine?debug=2

Ok, so next time this happens, I should go to http://kentcdodds.com:20202/debug/pprof/goroutine?debug=2 and that’ll download the stack trace? I tried just now and got ERR_CONNECTION_RESET. I also tried with kcd.fly.dev instead. Same result. I’m probably doing something wrong…

Sorry, you’ll need to hit the endpoint through the internal DNS since the LiteFS port isn’t publicly exposed. Should be something like "kcd.internal:20202".

Thanks. I got it with wireguard enabled:

goroutine 42734 [running]:
runtime/pprof.writeGoroutineStacks({0xb6a360, 0xc0002ba0e0})
	/usr/local/go/src/runtime/pprof/pprof.go:692 +0x70
runtime/pprof.writeGoroutine({0xb6a360?, 0xc0002ba0e0?}, 0xc00041c0d0?)
	/usr/local/go/src/runtime/pprof/pprof.go:681 +0x2b
runtime/pprof.(*Profile).WriteTo(0xa2dda0?, {0xb6a360?, 0xc0002ba0e0?}, 0xc?)
	/usr/local/go/src/runtime/pprof/pprof.go:330 +0x14b
net/http/pprof.handler.ServeHTTP({0xc000264251, 0x9}, {0xb6e300, 0xc0002ba0e0}, 0xc000423197?)
	/usr/local/go/src/net/http/pprof/pprof.go:253 +0x4a5
net/http/pprof.Index({0xb6e300?, 0xc0002ba0e0}, 0xc000130900)
	/usr/local/go/src/net/http/pprof/pprof.go:371 +0x13e
github.com/superfly/litefs/http.(*Server).serveHTTP(0xc000161400?, {0xb6e300, 0xc0002ba0e0}, 0xc000130900)
	/src/litefs/http/server.go:128 +0x186
net/http.HandlerFunc.ServeHTTP(0x40?, {0xb6e300?, 0xc0002ba0e0?}, 0x0?)
	/usr/local/go/src/net/http/server.go:2109 +0x2f
golang.org/x/net/http2/h2c.h2cHandler.ServeHTTP({{0xb6aec0?, 0xc000150aa0?}, 0xc000089bc0?}, {0xb6e300, 0xc0002ba0e0}, 0xc000130900)
	/go/pkg/mod/golang.org/x/net@v0.0.0-20220909164309-bea034e7d591/http2/h2c/h2c.go:113 +0x49f
net/http.serverHandler.ServeHTTP({0xc0004ab7d0?}, {0xb6e300, 0xc0002ba0e0}, 0xc000130900)
	/usr/local/go/src/net/http/server.go:2947 +0x30c
net/http.(*conn).serve(0xc00041e280, {0xb6ea98, 0xc00047c600})
	/usr/local/go/src/net/http/server.go:1991 +0x607
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:3102 +0x4db

goroutine 1 [select, 10 minutes]:
main.runMount({0xb6ea28, 0xc00002c080}, {0xc0000220c0, 0x3, 0x3})
	/src/litefs/cmd/litefs/main.go:103 +0x558
main.run({0xb6ea28?, 0xc00002c080?}, {0xc0000220b0?, 0xc000119f70?, 0x406719?})
	/src/litefs/cmd/litefs/main.go:47 +0x78
main.main()
	/src/litefs/cmd/litefs/main.go:30 +0x71

goroutine 23 [syscall, 11 minutes]:
os/signal.signal_recv()
	/usr/local/go/src/runtime/sigqueue.go:152 +0x2f
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:23 +0x19
created by os/signal.Notify.func1.1
	/usr/local/go/src/os/signal/signal.go:151 +0x2a

goroutine 40 [select]:
net/http.(*http2ClientConn).RoundTrip(0xc0000aa180, 0xc000130800)
	/usr/local/go/src/net/http/h2_bundle.go:7928 +0x451
net/http.(*http2Transport).RoundTripOpt(0xc00016a880, 0xc000130800, {0x80?})
	/usr/local/go/src/net/http/h2_bundle.go:7250 +0x1be
net/http.(*http2Transport).RoundTrip(...)
	/usr/local/go/src/net/http/h2_bundle.go:7211
net/http.http2noDialH2RoundTripper.RoundTrip({0xf11640?}, 0xc000130800?)
	/usr/local/go/src/net/http/h2_bundle.go:9743 +0x1b
net/http.(*Transport).roundTrip(0xf11640, 0xc000130800)
	/usr/local/go/src/net/http/transport.go:540 +0x38a
net/http.(*Transport).RoundTrip(0x575f45?, 0xb6a180?)
	/usr/local/go/src/net/http/roundtrip.go:17 +0x19
net/http.send(0xc000130800, {0xb6a180, 0xf11640}, {0xa70ae0?, 0xc0003a1a01?, 0x0?})
	/usr/local/go/src/net/http/client.go:251 +0x5f7
net/http.(*Client).send(0xf1ace0, 0xc000130800, {0x0?, 0xa5a340?, 0x0?})
	/usr/local/go/src/net/http/client.go:175 +0x9b
net/http.(*Client).do(0xf1ace0, 0xc000130800)
	/usr/local/go/src/net/http/client.go:715 +0x8fc
net/http.(*Client).Do(...)
	/usr/local/go/src/net/http/client.go:581
github.com/hashicorp/consul/api.(*Client).doRequest(0xc0001763c0, 0xa84af4?)
	/go/pkg/mod/github.com/hashicorp/consul/api@v1.11.0/api.go:927 +0x6d
github.com/hashicorp/consul/api.(*Session).Renew(0xc0003a1de0, {0xc000264210?, 0x12a05f200?}, 0xc0003fe3c0?)
	/go/pkg/mod/github.com/hashicorp/consul/api@v1.11.0/session.go:140 +0xce
github.com/superfly/litefs/consul.(*Lease).Renew(0xc00047ca80, {0xc0003a1e38?, 0x0?})
	/src/litefs/consul/consul.go:221 +0x46
github.com/superfly/litefs.(*Store).monitorLeaseAsPrimary(0xc000130500, {0xb6e9f0, 0xc000089b00}, {0xb6ed00, 0xc00047ca80})
	/src/litefs/store.go:533 +0x21f
github.com/superfly/litefs.(*Store).monitorLease(0xc000130500, {0xb6e9f0, 0xc000089b00})
	/src/litefs/store.go:451 +0x1fd
github.com/superfly/litefs.(*Store).Open.func1()
	/src/litefs/store.go:134 +0x25
golang.org/x/sync/errgroup.(*Group).Go.func1()
	/go/pkg/mod/golang.org/x/sync@v0.0.0-20220601150217-0de741cfad7f/errgroup/errgroup.go:75 +0x64
created by golang.org/x/sync/errgroup.(*Group).Go
	/go/pkg/mod/golang.org/x/sync@v0.0.0-20220601150217-0de741cfad7f/errgroup/errgroup.go:72 +0xa5

goroutine 38 [IO wait]:
internal/poll.runtime_pollWait(0x7f1959c26e08, 0x72)
	/usr/local/go/src/runtime/netpoll.go:305 +0x89
internal/poll.(*pollDesc).wait(0xc00016ac80?, 0xc0000bb300?, 0x0)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc00016ac80, {0xc0000bb300, 0x1300, 0x1300})
	/usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc00016ac80, {0xc0000bb300?, 0xc00006b560?, 0xc0000bb305?})
	/usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc0000127c0, {0xc0000bb300?, 0x7f1959eff000?, 0x7f1959efef00?})
	/usr/local/go/src/net/net.go:183 +0x45
crypto/tls.(*atLeastReader).Read(0xc000011d40, {0xc0000bb300?, 0x0?, 0xc0001c1860?})
	/usr/local/go/src/crypto/tls/conn.go:787 +0x3d
bytes.(*Buffer).ReadFrom(0xc00014de78, {0xb696c0, 0xc000011d40})
	/usr/local/go/src/bytes/buffer.go:202 +0x98
crypto/tls.(*Conn).readFromUntil(0xc00014dc00, {0xb6a0a0?, 0xc0000127c0}, 0x203000?)
	/usr/local/go/src/crypto/tls/conn.go:809 +0xe5
crypto/tls.(*Conn).readRecordOrCCS(0xc00014dc00, 0x0)
	/usr/local/go/src/crypto/tls/conn.go:616 +0x116
crypto/tls.(*Conn).readRecord(...)
	/usr/local/go/src/crypto/tls/conn.go:582
crypto/tls.(*Conn).Read(0xc00014dc00, {0xc00019e000, 0x1000, 0x771860?})
	/usr/local/go/src/crypto/tls/conn.go:1287 +0x16f
bufio.(*Reader).Read(0xc000194780, {0xc0002bb700, 0x9, 0x78e765?})
	/usr/local/go/src/bufio/bufio.go:237 +0x1bb
io.ReadAtLeast({0xb69540, 0xc000194780}, {0xc0002bb700, 0x9, 0x9}, 0x9)
	/usr/local/go/src/io/io.go:332 +0x9a
io.ReadFull(...)
	/usr/local/go/src/io/io.go:351
net/http.http2readFrameHeader({0xc0002bb700?, 0x9?, 0xc0004ab380?}, {0xb69540?, 0xc000194780?})
	/usr/local/go/src/net/http/h2_bundle.go:1565 +0x6e
net/http.(*http2Framer).ReadFrame(0xc0002bb6c0)
	/usr/local/go/src/net/http/h2_bundle.go:1829 +0x95
net/http.(*http2clientConnReadLoop).run(0xc0001c1f98)
	/usr/local/go/src/net/http/h2_bundle.go:8875 +0x130
net/http.(*http2ClientConn).readLoop(0xc0000aa180)
	/usr/local/go/src/net/http/h2_bundle.go:8771 +0x6f
created by net/http.(*http2Transport).newClientConn
	/usr/local/go/src/net/http/h2_bundle.go:7478 +0xaaa

goroutine 41 [select, 1 minutes]:
github.com/superfly/litefs.(*Store).monitorRetention(0xc000130500, {0xb6e9f0, 0xc000089b00})
	/src/litefs/store.go:613 +0xe7
github.com/superfly/litefs.(*Store).Open.func2()
	/src/litefs/store.go:138 +0x25
golang.org/x/sync/errgroup.(*Group).Go.func1()
	/go/pkg/mod/golang.org/x/sync@v0.0.0-20220601150217-0de741cfad7f/errgroup/errgroup.go:75 +0x64
created by golang.org/x/sync/errgroup.(*Group).Go
	/go/pkg/mod/golang.org/x/sync@v0.0.0-20220601150217-0de741cfad7f/errgroup/errgroup.go:72 +0xa5

goroutine 50 [select, 10 minutes]:
os/exec.(*Cmd).watchCtx.func1()
	/usr/local/go/src/os/exec/exec.go:648 +0x96
created by os/exec.(*Cmd).watchCtx
	/usr/local/go/src/os/exec/exec.go:647 +0x8e

goroutine 46 [syscall]:
syscall.Syscall(0x4780ee?, 0xf08980?, 0x0?, 0x9f8a60?)
	/usr/local/go/src/syscall/syscall_linux.go:68 +0x27
syscall.read(0xf08980?, {0xc000324000?, 0x8?, 0x0?})
	/usr/local/go/src/syscall/zsyscall_linux_amd64.go:696 +0x45
syscall.Read(...)
	/usr/local/go/src/syscall/syscall_unix.go:183
bazil.org/fuse.(*Conn).ReadRequest(0xc000194300)
	/go/pkg/mod/bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fuse.go:580 +0xc6
bazil.org/fuse/fs.(*Server).Serve(0xc0002ba000, {0xb699a0?, 0xc000194000})
	/go/pkg/mod/bazil.org/fuse@v0.0.0-20200524192727-fb710f7dfd05/fs/serve.go:501 +0x3ef
github.com/superfly/litefs/fuse.(*FileSystem).Mount.func2()
	/src/litefs/fuse/file_system.go:93 +0x28
created by github.com/superfly/litefs/fuse.(*FileSystem).Mount
	/src/litefs/fuse/file_system.go:92 +0x31c

goroutine 47 [IO wait]:
internal/poll.runtime_pollWait(0x7f1959c26fe8, 0x72)
	/usr/local/go/src/runtime/netpoll.go:305 +0x89
internal/poll.(*pollDesc).wait(0xc00016a600?, 0x6?, 0x0)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc00016a600)
	/usr/local/go/src/internal/poll/fd_unix.go:614 +0x234
net.(*netFD).accept(0xc00016a600)
	/usr/local/go/src/net/fd_unix.go:172 +0x35
net.(*TCPListener).accept(0xc000011608)
	/usr/local/go/src/net/tcpsock_posix.go:142 +0x28
net.(*TCPListener).Accept(0xc000011608)
	/usr/local/go/src/net/tcpsock.go:288 +0x3d
net/http.(*Server).Serve(0xc0000fa1e0, {0xb6e0f0, 0xc000011608})
	/usr/local/go/src/net/http/server.go:3070 +0x385
github.com/superfly/litefs/http.(*Server).Serve.func1()
	/src/litefs/http/server.go:74 +0x2d
golang.org/x/sync/errgroup.(*Group).Go.func1()
	/go/pkg/mod/golang.org/x/sync@v0.0.0-20220601150217-0de741cfad7f/errgroup/errgroup.go:75 +0x64
created by golang.org/x/sync/errgroup.(*Group).Go
	/go/pkg/mod/golang.org/x/sync@v0.0.0-20220601150217-0de741cfad7f/errgroup/errgroup.go:72 +0xa5

goroutine 51 [syscall, 10 minutes]:
syscall.Syscall6(0xc000048728?, 0x7a8c10?, 0xc00018cc00?, 0x40a7cd?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/syscall/syscall_linux.go:90 +0x36
os.(*Process).blockUntilWaitable(0xc000264390)
	/usr/local/go/src/os/wait_waitid.go:32 +0x87
os.(*Process).wait(0xc000264390)
	/usr/local/go/src/os/exec_unix.go:22 +0x28
os.(*Process).Wait(...)
	/usr/local/go/src/os/exec.go:132
os/exec.(*Cmd).Wait(0xc0001c5600)
	/usr/local/go/src/os/exec/exec.go:599 +0x4b
main.(*MountCommand).execCmd.func1()
	/src/litefs/cmd/litefs/mount_linux.go:332 +0x25
created by main.(*MountCommand).execCmd
	/src/litefs/cmd/litefs/mount_linux.go:332 +0x311

goroutine 42732 [select]:
net/http.(*http2clientStream).writeRequest(0xc00041a900, 0xc000130800)
	/usr/local/go/src/net/http/h2_bundle.go:8096 +0x9c7
net/http.(*http2clientStream).doRequest(0xc0002ba0c8?, 0xc0004f1fb8?)
	/usr/local/go/src/net/http/h2_bundle.go:7958 +0x1e
created by net/http.(*http2ClientConn).RoundTrip
	/usr/local/go/src/net/http/h2_bundle.go:7887 +0x30a

goroutine 42735 [runnable]:
net/http.(*connReader).startBackgroundRead.func2()
	/usr/local/go/src/net/http/server.go:674
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1594 +0x1
created by net/http.(*connReader).startBackgroundRead
	/usr/local/go/src/net/http/server.go:674 +0xca

Does that mean anything to you? Or do I need to wait until the next time it happens?