Adding Honeycomb observability to a Postgres cluster

I want to get some observability into my postgres cluster and I’m planning to use honeycomb for my app. So I was thinking about just following this post, but thought I’d check first. Thoughts?

That post will probably work, but it'll be a bit of a chore and, I think, require you to run your own build of Postgres on Fly.

The RDS post seems close to something that'll work. If I'm reading this right, you'd need two special settings on your Postgres cluster and a way to send logs to Honeycomb. Our NATS-based log shipper can output to Honeycomb.

Oh, nice. So if I follow the RDS post then I should be able to get data to honeycomb?

That RDS post uses a custom utility to get logs from AWS. I think the first thing to do is set up our log shipper as an app in your org and point it at Honeycomb. It should be as simple as a fly launch with the right config.
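For reference, the shipper setup might look roughly like this. The org/app/dataset names are placeholders, and the HONEYCOMB_* secret names are assumptions based on the fly-log-shipper README, so check it for the current sink variables:

```shell
# Sketch: deploy fly-log-shipper as its own app and point it at Honeycomb.
# ORG/ACCESS_TOKEN let the shipper subscribe to your org's log stream;
# the HONEYCOMB_* variable names are assumptions -- verify against the README.
git clone https://github.com/superfly/fly-log-shipper.git
cd fly-log-shipper
fly launch --no-deploy
fly secrets set \
  ORG=personal \
  ACCESS_TOKEN="$(fly auth token)" \
  HONEYCOMB_API_KEY="<your-honeycomb-key>" \
  HONEYCOMB_DATASET="fly-logs"
fly deploy
```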

We need to look at the safest way of changing PG settings; I'll let you know what we think later today.


What is the simplest way to configure PostgreSQL log shipping today?

I assume that honeytail will ship the logs in a way that will make querying them easy. On the other hand, fly-log-shipper seems super simple. The bit which I am missing is how to configure PostgreSQL. Maybe a PR that contributes PostgreSQL logs to postgres-ha would make sense. WDYT?

Fly log shipper isn't specific enough on its own to get interesting query data from Postgres into Honeycomb at the moment, but I was able to cobble together the shipper + Vector remaps + a file sink + honeytail and finally see something via queries and raw data tables. They aren't very useful yet, but it's something.

There are a few reasons this might not be worth it to you:

  1. Honeytail reports durations differently than Honeycomb documents for events, and it's hard to get trace IDs into the events, so this data will be missing from your spans, which are probably what you actually want.
  2. The normalizer doesn’t work if you or your underlying ORM writes queries with table/column quotes like "public"."column", which means even raw data tables have too much cardinality to see meaningful trends.

If you want to try it:

  • Clone the shipper
  • Update the Dockerfile to install honeytail
  • Update the entrypoint to launch honeytail, e.g.:
honeytail \
  --parser=postgresql \
  --file="/var/honeytail.sink" \
  --dataset="$DATASET" \
  --writekey="$HONEYCOMB_WRITE_KEY" \
  --postgres.log_line_prefix="%m [%p] " \
  --add_field env="$ENV" \
  --add_field service="$SERVICE" &
/usr/local/bin/fly-logs | socat -u - UNIX-CONNECT:/var/run/vector.sock
  • Update vector.toml to clean up as much as you want so that honeytail can actually parse the logs, and sink them to the file above:
[transforms.msg_no_prefix]
type = "remap"
inputs = ["fly_socket"]
source = '''
. = parse_json!(.message)
. = parse_regex!(.message, r'(?:.+ \| )(?P<log>.+)').log
'''

[transforms.filter_out_keeper]
type = "filter"
inputs = ["msg_no_prefix"]
condition = '''contains!(.message, "cmd/keeper.go") == false'''

# Sink name is illustrative.
[sinks.honeytail_sink]
type = "file"
inputs = ["filter_out_keeper"]
path = "/var/honeytail.sink"

[sinks.honeytail_sink.encoding]
codec = "text"
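As a sanity check on the remap above: the regex `(?:.+ \| )` is greedy, so it strips everything through the last " | " separator, leaving only the raw Postgres log line for honeytail. A rough shell equivalent (the sample line is made up):

```shell
# Greedy match removes everything up to and including the last " | ",
# mirroring the VRL parse_regex step above.
sample='2022-01-01T00:00:00Z app[abcd] iad [info] | 2022-01-01 00:00:00 UTC [528] LOG:  duration: 0.123 ms  statement: SELECT 1'
echo "$sample" | sed -E 's/^.* \| //'
# -> 2022-01-01 00:00:00 UTC [528] LOG:  duration: 0.123 ms  statement: SELECT 1
```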

Then fly pg connect and make sure the DB is configured correctly:

ALTER SYSTEM SET log_min_duration_statement=0;
ALTER SYSTEM SET log_statement='none';
SELECT pg_reload_conf();
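To confirm the settings took effect, something like this should work (the app name is a placeholder; the SHOW statements are typed at the psql prompt that fly pg connect drops you into):

```shell
# Open a psql session against the cluster (app name is a placeholder).
fly pg connect -a my-postgres-app
# Then, at the psql prompt:
#   SHOW log_min_duration_statement;  -- expect 0 (log every statement)
#   SHOW log_statement;               -- expect none
```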