I’m trying to debug something that happened about 14 hours ago where it seems like one of our services died and did not start back up until about 4 hours ago.
Running fly status --all gives me this:
Instances
ID       VERSION REGION DESIRED STATUS  HEALTH CHECKS RESTARTS CREATED
23524f06 10      sin    run     running               13       4h24m ago
048e3cf1 10      sin    stop    failed                1        2021-01-31T22:29:40Z
Running logs on 23524f06 with fly logs -i 23524f06 gives me only the last 100 lines. Since this is a particularly chatty application, that only covers the past 30 minutes.
Is there any way to see logs further back than that?
P.S. Not sure if it’s related to the issue I’m debugging, but the metrics page in the dashboard is also not showing full stats even when I have it set to 1d prior. Only data transfer is showing for the full 1d, while firecracker load and memory are only showing from 4 hours ago.
There isn’t yet. Logs are considered “best effort” from us until we figure out a way to make them more reliable and probably charge for retention or allow 3rd party apps to ingest them.
That said, I might be able to allow more logs to be fetched via flyctl.
The metrics gap looks like a bug in our UI. Thanks for reporting this.
+1 for accessing logs from further back. Currently I’ve been experimenting with pushing to AWS CloudWatch, which works but isn’t ideal, and it only covers app-made logs, not anything related to Fly itself.
Even 24 hours’ worth would be handy, and you could auto-delete them after that. An issue should be spotted within 24 hours, and that’s when the logs are most useful. For me at least.
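To make the 24-hour idea concrete, here is a rough sketch of a rolling buffer that drops anything older than the retention window. It is only an illustration: the class name, the in-memory storage, and the assumption that log lines arrive in timestamp order are all mine, not how Fly stores logs.

```python
from collections import deque
from datetime import datetime, timedelta, timezone

# Assumed retention window, taken from the 24-hour suggestion above.
RETENTION = timedelta(hours=24)


class RollingLogBuffer:
    """Hypothetical buffer that keeps log lines for a fixed window only."""

    def __init__(self, retention=RETENTION):
        self.retention = retention
        self._entries = deque()  # (timestamp, line) pairs, oldest first

    def append(self, timestamp, line):
        # Assumes timestamps arrive in non-decreasing order.
        self._entries.append((timestamp, line))
        self.prune(now=timestamp)

    def prune(self, now):
        # Auto-delete everything older than the retention window.
        cutoff = now - self.retention
        while self._entries and self._entries[0][0] < cutoff:
            self._entries.popleft()

    def lines(self):
        return [line for _, line in self._entries]
```

Pruning on every append keeps the buffer bounded without needing a separate cleanup job, which seems like the cheapest way to get "keep a day, then delete".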
Has there been any progress on exporting logs to a third party? I was thinking of something like a vector.dev VM (or anything, really) listening for log events for the whole org through the Fly API (which means customers would be able to roll their own solution), enriching them, and shipping them off to a service like Datadog. I used Vector as an example since you are already using it, and it also does metrics without too many moving parts.
I thought maybe you could stream through the API, and since Vector supports HTTP sources I could deploy it like just another app and have it forward to Datadog and co. But the NATS idea sounds even better.
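For what it’s worth, the enrich-and-forward step I have in mind could look roughly like this. It is a sketch only: the event fields, the org/app tags, and the destination URL are placeholders I made up, not anything from the Fly API, Vector, or Datadog.

```python
import json
import urllib.request


def enrich(event, org, app):
    """Return a copy of the event with hypothetical org/app tags added."""
    enriched = dict(event)
    enriched["org"] = org
    enriched["app"] = app
    return enriched


def to_ndjson(events):
    """Serialize events as newline-delimited JSON, one event per line."""
    return "\n".join(json.dumps(e, sort_keys=True) for e in events)


def forward(events, url):
    """POST a batch of events to a placeholder HTTP endpoint."""
    body = to_ndjson(events).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

Something like forward([enrich(e, "my-org", "my-app") for e in batch], "https://logs.example.invalid/ingest") would then ship a batch, with the URL swapped for whatever the real sink expects.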