fly-log-shipper isn't parsing json message

I have an instance of fly-log-shipper deployed that sends logs to a Grafana Cloud Loki instance. My Elixir backend uses the Elixir logger_json library to output JSON log messages.

The log messages are appearing in Loki escaped like this:

{
  "event": {
    "provider": "app"
  },
  "fly": {
    "app": {
      "instance": "instance-id",
      "name": "app-name"
    },
    "region": "lax"
  },
  "host": "f2a8",
  "level": "info",
  "log": {
    "level": "info"
  },
  "message": "\u001b[1;38;5;2mdashboard          \u001b[0m | {\"time\":\"2022-03-09T04:03:01.711Z\",\"severity\":\"info\",\"message\":\"\",\"metadata\":{\"domain\":[\"elixir\"],\"erl_level\":\"info\",\"httpRequest\":{\"requestMethod\":\"GET\",\"requestUrl\":\"https://dashboard-composed-pr-1585-preview.fly.dev/login\",\"status\":200,\"userAgent\":\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.109 Safari/537.36\",\"remoteIp\":\"",\"referer\":\"https://fly.io/\",\"latency\":\"0.00694278s\"},\"node\":{\"hostname\":\"e5a049d5\",\"vm_pid\":536},\"phoenix\":{\"controller\":\"Elixir.VrepsWeb.SessionController\",\"action\":\"new\"},\"request_id\":\"Ftqalb8Q0WFmTw8AAAFR\"}}"
}

When running flyctl logs, I get output that looks like this:

app[b6748c8f] sea [info]dashboard           | {"time":"2022-03-09T09:00:58.546Z","severity":"info","message":"tzdata release in place is from a file last modified Tue, 22 Dec 2020 23:35:21 GMT. Release file on server was last modified Fri, 22 Oct 2021 02:20:47 GMT.","metadata":{"domain":["elixir"],"erl_level":"info"}}

It appears that fly-log-shipper is struggling to parse the message since it contains these leading instance id/region/app name identifiers. When I test this with a local version of Vector, it fails. Am I misunderstanding something or are other people having a similar issue?

This is the vector.toml that fly-log-shipper created for my instance:

data_dir = "/var/lib/vector/"

[api]
  enabled = true
  address = "0.0.0.0:8686"

[sources.fly_log_metrics]
  type = "internal_metrics"

[sources.fly_socket]
  type = "socket"
  mode = "unix" 
  path = "/var/run/vector.sock"

[transforms.log_json]
  type = "remap"
  inputs = ["fly_socket"]
  source = '''
  . = parse_json!(.message)
  '''

[sinks.fly_log_metrics_prometheus]
  type = "prometheus_exporter" # required
  inputs = ["fly_log_metrics"] # required
  address = "0.0.0.0:9598" # required
  default_namespace = "fly-logs" # optional, no default

[sinks.blackhole]
  type = "blackhole"
  inputs = ["log_json"]
  print_interval_secs = 100000
[transforms.loki_json]
  type = "remap"
  inputs = ["log_json"]
  source = '''
  .level = .log.level
  if starts_with(.message, "{") ?? false {
    # parse json messages
    json = object!(parse_json!(.message))
    del(.message)
    . |= json
  }
  '''

[sinks.loki]
  type = "loki"
  inputs = ["loki_json"]
  endpoint = "${LOKI_URL}"
  compression = "gzip"
  auth.strategy = "basic"
  auth.user = "${LOKI_USERNAME}"
  auth.password = "${LOKI_PASSWORD}"
  encoding.codec = "json"

  labels.event_provider = "{{event.provider}}"
  labels.fly_region = "{{fly.region}}"
  labels.fly_app_name = "{{fly.app.name}}"
  labels.fly_app_instance = "{{fly.app.instance}}"
  labels.host = "{{host}}"
  labels.level = "{{level}}"

Hi @Matt_Stewart-Ronnisc

Those weird characters at the start of your log message would indeed be causing it to fail. Try locating what’s causing those extra characters to be added to your log messages.

We’re successfully running fly-logs-shipper with Grafana Cloud Loki and are having no trouble parsing the json messages.

Upon further investigation, it looks like those extra characters might be getting added by fly.io if you use their processes feature.

@Matt_Stewart-Ronnisc can you please confirm if you use that feature and have a process called dashboard? (if you’re unsure you can post your fly.toml file)

@jsierles should the process name be part of the json provided by fly alongside app name and not in the message field?

Thanks! Your line of thought made me think of exactly what may be causing this. I use overmind for process management as recommended here. It is likely the one prepending these in the logs. I’ll dive into their documentation on how to fix this.

I’m having a very similar problem with my app, using the standard log-shipper image (Package fly-log-shipper · GitHub).

I’m using winstonjs in my Node app and Axiom as the sink for my deployed instance of the log-shipper.

Everything is showing up but the message field is not parsed as json, this is what Axiom receives as raw data:

{
  "event": {
    "provider": "app"
  },
  "fly": {
    "app": {
      "instance": "1111111a1111a1",
      "name": "my-app"
    },
    "region": "ams"
  },
  "host": "aaa1",
  "log": {
    "level": "info"
  },
  "message": "{\"job\":\"my-job\",\"level\":\"info\",\"message\":\"my-message\",\"service\":\"workers\",\"url\":\"random-metadata\"}",
  "timestamp": "2023-10-15T00:30:53.315678876Z"
}

The vector configs used by the log-shipper look fine, this log_json transform should parse the message as json:

[transforms.log_json]
  type = "remap"
  inputs = ["nats"]
  source = '''
  . = parse_json!(.message)
  '''

and is used by the axiom config:

[sinks.axiom]
  type = "axiom"
  inputs = ["log_json"]
  token = "${AXIOM_TOKEN}"
  dataset = "${AXIOM_DATASET}"

Any idea why this is not working?

For now I solved this by deploying a “custom” log-shipper forked from the main repo, where I modified the main vector config (vector-configs/vector.toml) by adding a new transform step:

[transforms.log_json_message]
type = "remap"
inputs = ["log_json"]
source = '''
if is_json(.message) ?? false {
  .message = parse_json!(.message)
}
'''

Then I can use this new transform as input to the sinks that I need, like this for Axiom (vector-configs/sinks/axiom.toml):

[sinks.axiom]
type = "axiom"
inputs = ["log_json_message"]
token = "${AXIOM_TOKEN}"
dataset = "${AXIOM_DATASET}"

Now the same example event that I posted before is parsed like this:

{
  "event": {
    "provider": "app"
  },
  "fly": {
    "app": {
      "instance": "1111111a1111a1",
      "name": "my-app"
    },
    "region": "ams"
  },
  "host": "aaa1",
  "log": {
    "level": "info"
  },
  "message": {
    "level": "info",
    "message": "my-message",
    "job": "my-job",
    "service": "workers",
    "url": "random-metadata"
  },
  "timestamp": "2023-10-15T00:30:53.315678876Z"
}