Logs displayed by `flyctl deploy` are missing lines

Here’s what flyctl deploy showed me (nevermind the actual work being done, I’m troubleshooting):

	 Starting instance
	 Configuring virtual machine
	 Pulling container image
	 Unpacking image
	 Preparing kernel init
	 Configuring firecracker
	 Starting virtual machine
	 Starting init (commit: 252b7bd)...
	 Preparing to run: `docker-entrypoint.sh sh -c /ping.sh && npx keystone prisma migrate deploy` as node
	 2022/04/19 21:23:38 listening on [fdaa:0:57f1:a7b:8aeb:1aea:b787:2]:22 (DNS: [fdaa::3]:53)
	 ping: bad address 'foo-postgres.internal'
	 ping: bad address 'iad.foo-postgres.internal'
	 PING fdaa:0:57f1:a7b:21e0:0:bbb4:2 (fdaa:0:57f1:a7b:21e0:0:bbb4:2): 56 data bytes
	 ping: permission denied (are you root?)
	 /ping.sh: line 8: dig: not found

This is what I see via the dashboard:

2022-04-19T21:23:34.270 runner[1aeab787] iad [info] Starting instance
2022-04-19T21:23:34.294 runner[1aeab787] iad [info] Configuring virtual machine
2022-04-19T21:23:34.294 runner[1aeab787] iad [info] Pulling container image
2022-04-19T21:23:35.803 runner[1aeab787] iad [info] Unpacking image
2022-04-19T21:23:37.138 runner[1aeab787] iad [info] Preparing kernel init
2022-04-19T21:23:38.626 runner[1aeab787] iad [info] Configuring firecracker
2022-04-19T21:23:38.626 runner[1aeab787] iad [info] Starting virtual machine
2022-04-19T21:23:38.767 app[1aeab787] iad [info] Starting init (commit: 252b7bd)...
2022-04-19T21:23:38.784 app[1aeab787] iad [info] Preparing to run: `docker-entrypoint.sh sh -c /ping.sh && npx keystone prisma migrate deploy` as node
2022-04-19T21:23:38.794 app[1aeab787] iad [info] 2022/04/19 21:23:38 listening on [fdaa:0:57f1:a7b:8aeb:1aea:b787:2]:22 (DNS: [fdaa::3]:53)
2022-04-19T21:23:43.802 app[1aeab787] iad [info] ping: bad address 'foo-postgres.internal'
2022-04-19T21:23:48.807 app[1aeab787] iad [info] ping: bad address 'iad.foo-postgres.internal'
2022-04-19T21:23:53.813 app[1aeab787] iad [info] ping: bad address 'global.foo-postgres.internal'
2022-04-19T21:23:53.813 app[1aeab787] iad [info] PING fdaa:0:57f1:a7b:21e0:0:bbb4:2 (fdaa:0:57f1:a7b:21e0:0:bbb4:2): 56 data bytes
2022-04-19T21:23:53.813 app[1aeab787] iad [info] ping: permission denied (are you root?)

Note that both have the timestamp 2022/04/19 21:23:38 and the same IP address; these are the same logs.

Note how the first listing is missing the line that says

ping: bad address 'global.foo-postgres.internal'

Sometimes the data loss is dramatic:

	 2022/04/19 21:40:20 listening on [fdaa:0:57f1:a7b:8aeb:392:a035:2]:22 (DNS: [fdaa::3]:53)
	 Datasource "postgresql": PostgreSQL database "postgres", schema "cms" at "top3.nearest.of.foo-postgres.internal:5432"
	 Main child exited normally with code: 1
	 Starting clean up.
Error release command failed, deployment aborted

vs

2022-04-19T21:40:20.163 app[0392a035] iad [info] 2022/04/19 21:40:20 listening on [fdaa:0:57f1:a7b:8aeb:392:a035:2]:22 (DNS: [fdaa::3]:53)
2022-04-19T21:40:27.839 app[0392a035] iad [info] Prisma schema loaded from schema.prisma
2022-04-19T21:40:27.862 app[0392a035] iad [info] Datasource "postgresql": PostgreSQL database "postgres", schema "cms" at "top3.nearest.of.foo-postgres.internal:5432"
2022-04-19T21:40:32.901 app[0392a035] iad [info] Error: P1001: Can't reach database server at `top3.nearest.of.foo-postgres.internal`:`5432`
2022-04-19T21:40:32.901 app[0392a035] iad [info] Please make sure your database server is running at `top3.nearest.of.foo-postgres.internal`:`5432`.
2022-04-19T21:40:33.177 app[0392a035] iad [info] Main child exited normally with code: 1
2022-04-19T21:40:33.177 app[0392a035] iad [info] Starting clean up.

That’s 3 out of 7 lines lost, with no rhyme or reason to which lines. It looks like an overloaded generic pubsub mechanism that’s unsuitable for logging :frowning:

This makes troubleshooting really frustrating. Can I trust the dashboard to have all the log lines, either?

Sometimes the data loss is not at line granularity:

2022-04-19T22:42:23Z app[ad0dce98] iad [info]✨ Starting Keystone
2022-04-19T22:42:25Z app[ad0dce98] iad [info]✨ Connecting to the database
2022-04-19T22:42:25Z app[ad0dce98] iad [info]✨ Creating server
2022-04-19T22:42:25Z app[ad0dce98] iad [info]✨ Preparing Admin U
2022-04-19T22:42:25Z app[ad0dce98] iad [info]✨ Preparing Admin U
 Next.js app
2022-04-19T22:42:26Z app[ad0dce98] iad [info]✅ Admin UI ready
2022-04-19T22:42:26Z app[ad0dce98] iad [info]⭐️ Server Ready on http://localhost:3000

More typical output:

2022-04-19T22:37:34.392 app[ad0dce98] iad [info] ✨ Starting Keystone
2022-04-19T22:37:35.472 app[ad0dce98] iad [info] ✨ Connecting to the database
2022-04-19T22:37:35.556 app[ad0dce98] iad [info] ✨ Creating server
2022-04-19T22:37:35.649 app[ad0dce98] iad [info] ✅ GraphQL API ready
2022-04-19T22:37:35.649 app[ad0dce98] iad [info] ✨ Preparing Admin UI Next.js app
2022-04-19T22:37:36.599 app[ad0dce98] iad [info] ✅ Admin UI ready
2022-04-19T22:37:36.602 app[ad0dce98] iad [info] ⭐️ Server Ready on http://localhost:3000

Note “Creating Admin U” no I, line break on the line that usually says “Admin UI Next.js app”.

For deployments, your log output is mixed in with output from the VM init process. Right now we don’t have an easy fix for that, but you should expect fly logs - or the dashboard - to be accurate. They’re the best tools for debugging.