Sending metrics to a Datadog agent via UDP

I have the following datadog agent config:

app = "black-bush-2163"

kill_signal = "SIGINT"
kill_timeout = 5

[experimental]
  auto_rollback = true

[env]
  DD_SITE="datadoghq.eu"
  DD_APM_NON_LOCAL_TRAFFIC = "true"
  DD_LOGS_ENABLED="true"
  DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL="true"
  DD_CONTAINER_EXCLUDE_LOGS="name:datadog-agent"
  DD_API_KEY="xxx"
  DD_APM_ENABLED="true"
  DD_PROCESS_AGENT_ENABLED="true"
  DD_DOGSTATSD_SOCKET="fly-global-services"

[build]
  image = "datadog/agent:7"

[[services]]
  internal_port = 8125
  processes = ["app"]
  protocol = "udp"

  [[services.ports]]
    port = 8125

[[services]]
  internal_port = 8126
  processes = ["app"]
  protocol = "tcp"

  [[services.ports]]
    port = "8126"

# [[services]]
#   internal_port = 8126
#   processes = ["app"]
#   protocol = "tcp"

# [[services.ports]]
#   handlers = ["udp"]
#   port = 8125

I see in the logs that the agent appears to bind correctly:

2022-07-24T13:15:44.685 app[8e14cce6] ams [info] 2022-07-24 13:15:44 UTC | CORE | INFO | (pkg/forwarder/forwarder.go:356 in Start) | Forwarder started, sending to 1 endpoint(s) with 1 worker(s) each: "https://7-37-1-app.agent.datadoghq.eu" (1 api key(s))

2022-07-24T13:15:44.685 app[8e14cce6] ams [info] 2022-07-24 13:15:44 UTC | CORE | INFO | (pkg/dogstatsd/listeners/uds_common.go:142 in Listen) | dogstatsd-uds: starting to listen on fly-global-services

2022-07-24T13:15:44.685 app[8e14cce6] ams [info] 2022-07-24 13:15:44 UTC | CORE | INFO | (pkg/dogstatsd/listeners/udp.go:95 in Listen) | dogstatsd-udp: starting to listen on 127.0.0.1:8125

2022-07-24T13:15:44.686 app[8e14cce6] ams [info] 2022-07-24 13:15:44 UTC | CORE | INFO | (pkg/tagger/collectors/workloadmeta_main.go:115 in stream) | workloadmeta tagger collector started

2022-07-24T13:15:44.711 app[8e14cce6] ams [info] 2022-07-24 13:15:44 UTC | CORE | INFO | (pkg/collector/runner/runner.go:95 in ensureMinWorkers) | Runner 1 added 4 workers (total: 4)

2022-07-24T13:15:44.711 app[8e14cce6] ams [info] 2022-07-24 13:15:44 UTC | CORE | INFO | (pkg/collector/python/init.go:322 in resolvePythonExecPath) | Using '/opt/datadog-agent/embedded' as Python home

2022-07-24T13:15:44.711 app[8e14cce6] ams [info] 2022-07-24 13:15:44 UTC | CORE | INFO | (pkg/collector/python/init.go:389 in Initialize) | Initializing rtloader with Python 3 /opt/datadog-agent/embedded

I can send traces correctly to DD (mostly because the agent listens for them via TCP), but metrics (StatsD) are sent to the agent via UDP and I can’t for the life of me make it work.

I tried playing around with the DD_AGENT_HOST env var but nothing seems to work. I tried black-bush-2163.internal and black-bush-2163.fly.dev, but metrics via UDP are simply not being sent.

So far I got:

  • Sending traces via this agent
  • Sending logs via fly-log-shipper

but metrics are still a mystery.

Has anyone had any luck with a dedicated DD agent via UDP?

UDP services need to listen on the fly-global-services address, but your config sets DD_DOGSTATSD_SOCKET, which makes the agent treat fly-global-services as a Unix Domain Socket (UDS) path. That won’t work: your logs show dogstatsd-udp still binding to 127.0.0.1:8125.

Maybe try setting DD_BIND_HOST = "fly-global-services" instead?
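A quick way to confirm which address the agent actually bound to is to scan its boot logs for the "starting to listen on" lines. Below is a minimal sketch of that check; the helper names (parseDogstatsdListeners, isLoopbackOnly) are made up for illustration, and the sample lines are abbreviated copies of the log output quoted earlier in this thread.

```javascript
// Hypothetical helper: extract what each DogStatsD listener bound to
// from the agent's boot logs.
const LISTEN_RE = /\| (dogstatsd-(?:udp|uds)): starting to listen on (\S+)/;

function parseDogstatsdListeners(logText) {
  const listeners = [];
  for (const line of logText.split('\n')) {
    const match = LISTEN_RE.exec(line);
    if (match) {
      listeners.push({ listener: match[1], address: match[2] });
    }
  }
  return listeners;
}

// True when the address is loopback, i.e. unreachable from other machines.
function isLoopbackOnly(address) {
  return address.startsWith('127.') || address.startsWith('[::1]');
}

// Abbreviated versions of the two relevant log lines from the original post.
const sample = [
  '... | (pkg/dogstatsd/listeners/uds_common.go:142 in Listen) | dogstatsd-uds: starting to listen on fly-global-services',
  '... | (pkg/dogstatsd/listeners/udp.go:95 in Listen) | dogstatsd-udp: starting to listen on 127.0.0.1:8125',
].join('\n');

for (const { listener, address } of parseDogstatsdListeners(sample)) {
  const loopback = listener === 'dogstatsd-udp' && isLoopbackOnly(address);
  console.log(`${listener} -> ${address}${loopback ? ' (loopback only)' : ''}`);
}
```

If the dogstatsd-udp line still shows a loopback address after a config change, the new setting has not taken effect.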

I read the agent docs a thousand times and missed DD_BIND_HOST :man_facepalming:

Anyway, it still didn’t work. I tried different addresses for the client (fly-global-services, fly-global-services.internal, black-bush-2163.internal, black-bush-2163.fly.dev, the IP address), and none of them delivered the metrics. Strangely, black-bush-2163.internal even failed to resolve…

I just gave up and will probably use a different app for metrics.

thanks a lot @wjordan

Sorry to revive an old post, but I’m curious whether anyone has found a solution to this. I’m in the same boat as @luizkowalski: we’ve had success forwarding logs and traces to DD, but not metrics.

I think I’ve set up the datadog-agent to properly listen on fly-global-services:8125. I see this in the logs when the agent boots up:

2022-09-02T20:35:32Z app[918be65d] iad [info]2022-09-02 20:35:32 UTC | CORE | INFO | (pkg/dogstatsd/listeners/udp.go:95 in Listen) | dogstatsd-udp: starting to listen on [::]:8125

However I can’t successfully send a metric from another fly app running in the same organization. I’ve been trying using this command when ssh’ing into an app:

echo -n "custom_dd_metric:25|g|#shell" | nc -4u -w0 fly-global-services 8125

but it’s unclear what address I should be using to talk to the datadog-agent. fly-global-services? <app-name>.internal? <app-name>.fly.dev?

Also @luizkowalski what did you end up going with for metrics if you don’t mind me asking?
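For reference, the payload piped into nc above is a plain-text DogStatsD datagram of the form name:value|type, with an optional sample rate and tags. Here is a small sketch of how such a string is put together; formatMetric is a made-up helper for illustration, not part of any library.

```javascript
// Build a DogStatsD datagram:
//   <name>:<value>|<type>[|@<sample_rate>][|#<tag1>,<tag2>]
// formatMetric is a hypothetical helper, not a library function.
function formatMetric(name, value, type, tags = [], sampleRate = 1) {
  let datagram = `${name}:${value}|${type}`;
  if (sampleRate !== 1) {
    datagram += `|@${sampleRate}`; // only emitted when sampling is in use
  }
  if (tags.length > 0) {
    datagram += `|#${tags.join(',')}`; // tags are comma-separated
  }
  return datagram;
}

console.log(formatMetric('custom_dd_metric', 25, 'g', ['shell']));
// prints "custom_dd_metric:25|g|#shell"
```

Since the format is this simple, a metric that never shows up is more likely a network or bind-address problem than a malformed payload.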

You’d need to connect over IPv6 (nc -4u is v4, I believe? ref).

Say the agent is listening on port 8125 in a Fly app named dd007, which may be running multiple instances (VMs). From another Fly app / Fly machine in the same org and 6pn network, you’d connect to it either by allocating a Flycast IP and using it (on port 8125), or by connecting directly to a running instance of dd007 via its 6pn DNS name (also on port 8125) (1, 2). The DNS name should look like <alloc-id>.vm.appname.internal (a specific instance of that app, in any region but the same 6pn network and org) or region.appname.internal (all instances of the app in that region).
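To see which of these names actually resolve from inside a Fly machine, something like the sketch below could help. It only builds the hostname forms mentioned in this thread (plus the bare <app-name>.internal form) and asks DNS for their AAAA records; internalHostnames and resolveInternal are illustrative names, and the region and alloc ID values are placeholders.

```javascript
const dns = require('dns').promises;

// Candidate 6pn hostnames for a Fly app, following the forms described above.
// `region` and `allocId` are optional placeholders supplied by the caller.
function internalHostnames(appName, { region, allocId } = {}) {
  const names = [`${appName}.internal`];
  if (region) names.push(`${region}.${appName}.internal`);
  if (allocId) names.push(`${allocId}.vm.${appName}.internal`);
  return names;
}

// Resolve each candidate to its IPv6 (AAAA) records.
// Lookup failures are recorded in the result instead of being thrown.
async function resolveInternal(names) {
  const results = {};
  for (const name of names) {
    try {
      results[name] = await dns.resolve6(name);
    } catch (err) {
      results[name] = `lookup failed: ${err.code}`;
    }
  }
  return results;
}

console.log(internalHostnames('dd007', { region: 'ams' }));
// From inside a Fly machine in the same org:
// resolveInternal(internalHostnames('dd007', { region: 'ams' })).then(console.log);
```

The lookups are only meaningful when run from inside the 6pn network; from a laptop they are expected to fail.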

6pn has been finicky in the past: .internal DNS returns large number of invalid IP addresses

hey @acr13
I ended up giving up on Datadog. I’m using New Relic for logs and such, and for metrics I instrumented the app with Prometheus and I’m sending them (or actually Fly is sending them, hahaha) to Fly’s dedicated Grafana (see this).

Working great tho I miss datadog

I think it will only work properly if Fly creates some kind of integration out of the box or something

@ignoramous - thanks for this. That was helpful: thanks to your comment, I finally got metrics working from a NodeJS server (Fly app) to the datadog-agent running on Fly! Rereading Running Fly.io Apps On UDP and TCP · Fly Docs made it finally click. I had to use the external port, so this now works when ssh’d into a Fly app in the same org:

echo -n "custom_dd_metric:25|g|#shell" | nc -4u -w0 <my-datadog-agent>.fly.dev 8125

@luizkowalski don’t give up, it’s possible :smile:

My datadog agent fly.toml:

app = ".."
kill_signal = "SIGINT"
kill_timeout = 5

[build]
  image = "datadog/agent:7"

[env]
  DD_APM_ENABLED="true"
  DD_APM_NON_LOCAL_TRAFFIC="true"
  DD_BIND_HOST="fly-global-services"
  DD_DOGSTATSD_NON_LOCAL_TRAFFIC="true"
  DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL="true"
  DD_PROCESS_AGENT_ENABLED="true"
  DD_API_KEY="...."

[experimental]
  allowed_public_ports = []
  auto_rollback = true

[[services]]
  internal_port = 8125
  processes = ["app"]
  protocol = "udp"

  [[services.ports]]
    port = 8125

[[services]]
  internal_port = 8126
  processes = ["app"]
  protocol = "tcp"

  [[services.ports]]
    port = "8126"

and in the NodeJS app, I’m using hot-shots for metrics:

    const StatsD = require('hot-shots');

    // DogStatsD client pointed at the agent app's public hostname (default port 8125).
    const dogstatsd = new StatsD({ host: '<datadog-agent-app-name>.fly.dev' });
    const statTags = ['...'];
    const stat = 'node.fastify.router';

    // Record a per-status-code counter and the response time for every reply.
    fastify.addHook('onSend', async (req, reply) => {
      const { statusCode } = reply;
      const responseTime = reply.getResponseTime();

      dogstatsd.increment(`${stat}.response_code.${statusCode}`, 1, statTags);
      dogstatsd.increment(`${stat}.response_code.all`, 1, statTags);
      dogstatsd.histogram(`${stat}.response_time`, responseTime, 1, statTags);
    });

Logs are being shipped via fly-log-shipper, and traces are being shipped via dd-trace.

Nice! I am glad my rather incohesive drivel helped.

I must point out, though, that my comment was about connecting to a datadog-agent over Fly’s internal network (aka 6pn), while the glimpses of your solution suggest that you’re connecting to it over the public internet (<datadog-agent-app>.fly.dev:8125 instead of top1.nearest.of.<datadog-agent-app>.internal:8125).

You might want to pass DD_API_KEY either as a build-time secret or as a runtime secret (Fly supports both).

@ignoramous Correct, I was connecting over the public internet for my local tests, and was assuming I needed to do that on Fly’s network as well, since this makes it seem like UDP over IPv6 is not supported.

However I think I spoke too soon. I can’t get a fly app → fly app (datadog-agent) UDP message to actually work.

From a local NodeJS server or from my local shell however, I can send UDP packets as expected:

echo "good_metric:13|g|#shell" | nc -4u -w0 <app-name>.fly.dev 8125

However I can’t seem to actually run this same command when ssh’d into a Fly App in the same org. For the hostname, I’ve tried:

  • <app-name>.fly.dev
  • getting an IPv4 address via fly ips list -a <app-name>

I’m going to look into your suggestion around multi-tenant apps, though.

This works on NodeJS with hot-shots:

const client = new StatsD({host: '<datadog-app-name>.internal', udpSocketOptions: {ipv6Only: true, type: 'udp6'}})

[env]
DD_APM_ENABLED = "true"
DD_APM_NON_LOCAL_TRAFFIC = "true"
DD_BIND_HOST = "fly-global-services"
DD_LOG_LEVEL = "info"
DD_DOGSTATSD_NON_LOCAL_TRAFFIC = "true"


[experimental]
allowed_public_ports = []
auto_rollback = true

[[services]]
internal_port = 8125
processes = ["app"]
protocol = "udp"

[[services]]
internal_port = 8126
processes = ["app"]
protocol = "tcp"

related: Add udpSocketOptions as a StatsD option by hjr3 · Pull Request #231 · brightcove/hot-shots · GitHub
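For anyone not using hot-shots, the same idea can be sketched with Node’s built-in dgram module: open a udp6 socket when the target is a .internal name and send the datagram directly. This is only an illustration; socketOptionsFor and sendDatagram are made-up helpers, the app name in the usage line is a placeholder, and the rule that .internal names need an IPv6 socket is an assumption drawn from this thread.

```javascript
const dgram = require('dgram');

// Pick socket options by hostname. Assumption from this thread:
// .internal names are only reachable over IPv6, everything else uses IPv4.
function socketOptionsFor(host) {
  return host.endsWith('.internal')
    ? { type: 'udp6', ipv6Only: true }
    : { type: 'udp4' };
}

// Fire-and-forget send of a single datagram. The promise resolves once the
// datagram is handed to the OS, which does not prove the agent received it.
function sendDatagram(host, port, payload) {
  return new Promise((resolve, reject) => {
    const socket = dgram.createSocket(socketOptionsFor(host));
    socket.send(payload, port, host, (err) => {
      socket.close();
      if (err) reject(err);
      else resolve();
    });
  });
}

// Usage (placeholder app name):
// sendDatagram('my-datadog-agent.internal', 8125, 'custom_dd_metric:25|g|#shell')
//   .catch(console.error);
```

Because UDP gives no delivery confirmation, the agent’s own logs or the Datadog metrics explorer remain the only real proof that a metric arrived.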
