Searchable application logs in Grafana

We’ve been working with the esteemed team at Quickwit to bring you an experimental application log cluster and search interface. You can try it out from your Grafana dashboard.

Note: If you’re already signed in to Grafana, you need to log out and log back in!

This project went smooth as butter, built on the back of our new Tigris Storage and Supabase Postgres extensions. Learn more about Quickwit’s tech.

For the period of this experiment, log search is free and we retain logs for 30 days. Learn how to build more complex queries in Quickwit’s Query Language Intro.
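
A few illustrative queries to give a flavor (the app name my-app is a placeholder; fly.app.name is the field carrying the app name):

  • fly.app.name:my-app returns logs from a single app
  • error returns any log line containing that token
  • fly.app.name:my-app error combines the two; bare terms are implicitly ANDed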

Is this useful for you?
Are you interested in querying the cluster directly, in traces, or in alerting on logs?

We’d love your feedback!

22 Likes

This looks great! It could completely eliminate the need for me to run a machine in order to export my logs to betterstack.

For me, it looks like it’s not capturing logs for most of my apps. Do I need to enable anything or should it just work?

Also, is there support for colored logs?

2 Likes

It should work! You may need to log out and log back in. Colored logs should work too. Feel free to write to extensions@fly.io with the names of any apps that aren’t working for you.

I tried logging out as you suggested, but it only partially fixed the problem. I’ve sent an email with the affected fly apps.

@dyc3 We found the issue. It should be fixed shortly, after which you should start to see logs roll in.

1 Like

Some feedback:

I do enjoy having my logs right next to my metrics. It’s very convenient.

I tried using the query functionality and I’m a little confused. In the Grafana panel, when I click on a message it expands to show me the different fields.

From reading the Quickwit docs, I should be able to query the message field for its content. However, if I try the query message:* I get no results. If I’m understanding the docs correctly, this query should return all entries that have a message field. With other fields it works perfectly.

Unfortunately, I don’t think this covers my user story completely enough to move off of betterstack. My app logs contain timestamps – not from the Fly platform, it’s part of the log message format in my app. Quickwit doesn’t seem to support wildcard-prefixed queries (e.g. fly.app.name:*foo), unless I’m missing something. If I can’t just grep logs by text in a crunch, then it’s a no-go.

1 Like

Thanks for this great feedback.

The fact that message:* returns no results is most likely a bug (plugin side or Quickwit side; I will check that).

Concerning the timestamp: I think the timestamps from Fly and from your app should be very similar. Is this a real problem in practice? Note that the plugin currently displays timestamps only at second precision; this is already fixed in the latest plugin version, which should roll out soon.

Concerning the query fly.app.name:*foo, Quickwit indeed does not support that yet. I imagine you have a lot of apps and don’t want to have to spell out the whole list.

The fact that message:* returns no results is most likely a bug (plugin side or Quickwit side; I will check that).

It was simply disabled in the index configuration. I’ve sent a PR; it will be fixed the next time we update the index config.

1 Like

My app emits logs that are in the format:

<date> <time> <logger-name> <log-level> <message>

Without the ability to query with wildcard prefixes, I can’t really search my logs at all because of the leading date and time. Of course I could write code to make that configurable (and I probably should), but it’s just another barrier to entry.
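
Something like this minimal sketch is probably all it would take, assuming a Python app on the stdlib logging module (the logger name and message here are made up):

import logging

# Omit %(asctime)s from the format so the Fly platform timestamp is the only
# timestamp; logger name, level, and message stay as plain searchable tokens.
logging.basicConfig(format="%(name)s %(levelname)s %(message)s", level=logging.INFO)

logging.getLogger("my.app.http").info("GET /sw.js 304")
# emitted line: "my.app.http INFO GET /sw.js 304"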

Ultimately, what I really want is to be able to write queries for the fields in my logs, like the logger name. But a nice (perhaps more generally useful) alternative would be the ability to quickly search by some substring.

@dyc3 Quickwit tokenizes your log lines.

If your log lines look like

2024-03-21T23:30:21.981Z INFO publish_splits PublishSplitsRequest { index_uid: "applogs292" }

then all of the following queries will match your document:

  • info
  • publish_splits
  • PublishSplitsRequest
  • applogs292

And you can combine them.
publish_splits PublishSplitsRequest is interpreted implicitly as
(publish_splits AND PublishSplitsRequest)

Do you experience cases where this tokenization is not sufficient?

I think what happened when I tried it before was I didn’t have my query range set for long enough, and was searching for a string that occurred outside of that time range. Thanks for the clarification.

I love this. I would pay for it, especially if it came with basic alerting.

1 Like

Aw I just built this for myself a month ago :rofl: looks great!

1 Like

My logs are emitted as JSON, but I don’t see any way to query a JSON field. All of it is dumped under the message field:

{
  "pid": "#PID<0.624304.0>",
  "status": 304,
  "time": "2024-03-30T15:52:53.758869Z",
  "path": "/sw.js",
  "level": "info",
  "mfa": "StructuredLogger.Plugs.Logger.call/2",
  "request_id": "F8GVvVNs9-rHr5oAbNeB",
  "params": {},
  "method": "GET",
  "duration": 0.319
}

Just thought I’d mention that I’m seeing invalid dates in the first column of all my app logs.

The ‘invalid date’ issue has been fixed.

3 Likes

confirmed - thank you!

Got nearly the same output, how do you map/configure those fields correctly?

I agree with the sentiment. We’re also pushing JSON. It appears to be formatted correctly, but I can’t really query on it: none of my fields are indexed or available when you expand the log to view details. What we end up with is message.message, because our logs emit structured data along with a message. It’s hard to query and find logs when the whole message block is treated as a string.

2 Likes

We (Quickwit) would love to parse those JSON logs, but we need to consider how to handle that correctly.

Taking inspiration from the OTEL log data model, here is what we propose to do:

  1. We try to parse the log line as JSON.
  2. If that succeeds, we put the JSON in the field body. All subfields of body will be tokenized so users can run full-text search queries on them. We also propose to extract the attributes, resources, and severity_text fields when present in the JSON and populate the log accordingly. The values of those fields won’t be tokenized, and users will be able to run analytics queries on them. For example, this opens up aggregations on status and method if those fields live under attributes or resources.
  3. If parsing fails, we fall back to the current behavior with one slight change: we put the log line in the field body.message.

Let’s take a concrete example with this JSON log:

{
  "pid": "#PID<0.624304.0>",
  "severity_text": "info",
  "attributes": {
      "request_id": "F8GVvVNs9-rHr5oAbNeB",
      "method": "GET",
      "duration": 0.319,
      "status": 304
   }
}

This will be transformed into the following log:

{
  "fly": {
    "app": {
      "id": 1,
      "instance": "instance-id",
      "name": "my-app"
    },
    "org": {
      "id": 1
    },
    "region": "fra"
  },
  "log": {
    "level": "info"
  },
  "body": {
    "pid": "#PID<0.624304.0>"
  },
  "attributes": {
    "request_id": "F8GVvVNs9-rHr5oAbNeB",
    "method": "GET",
    "duration": 0.319,
    "status": 304
  }
}

This way you will be able to execute these kinds of queries:

  • attributes.method:GET attributes.status:304 body.pid:624304
  • do a date histogram + terms aggregation on attributes.status, so you can follow the evolution of log count per status (see the sketch below)
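
To sketch that second item (illustrative only: the timestamp field name, the interval, and the exact request shape are assumptions, not a final design), the aggregation portion of a search request could look like:

{
  "query": "attributes.method:GET",
  "max_hits": 0,
  "aggs": {
    "count_over_time": {
      "date_histogram": { "field": "timestamp", "fixed_interval": "60s" },
      "aggs": {
        "by_status": { "terms": { "field": "attributes.status" } }
      }
    }
  }
}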

WDYT? (ping @Cade @tj1 @BrickInTheWall)

2 Likes