Metrics from Go app hosted on fly.io not ending up in Datadog

I have a Go app hosted on fly.io. I created a second fly.io app for the Datadog agent, with this fly.toml:

app = "dd-agent"

kill_signal = "SIGINT"
kill_timeout = 5 

[experimental]
  auto_rollback = true

[env]
  DD_SITE = "datadoghq.com"
  DD_APM_NON_LOCAL_TRAFFIC = true

[build]
  image = "datadog/agent:7"

[[services]]
  internal_port = 8126
  protocol = "tcp"

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    force_https = true
    handlers = ["http"]
    port = 8126

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 8126

  [[services.tcp_checks]]
    grace_period = "30s"
    interval = "15s"
    restart_limit = 0 
    timeout = "10s"

My main app has this fly.toml:

app = "service-name"

[env]
  DD_ENV = "xxxx"
  DD_AGENT_HOST = "<hostname>.internal"
  USE_DATADOG_APM = true
  DD_SERVICE = "xxxx"

[[services]]
  internal_port = <service-port>
  protocol = "tcp"

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20

  [[services.ports]]
    handlers = ["http"]
    port = "80"

  [[services.ports]]
    handlers = ["tls", "http"]
    port = "443"

  [[services.tcp_checks]]
    interval = 10000
    timeout = 2000

For DD_AGENT_HOST I tried the agent VM's Linux hostname, that hostname plus .internal, dd-agent.internal, and sjc.dd-agent.internal, but no data shows up in Datadog with any of them.

The agent is spitting out these logs repeatedly:

2022-05-10T02:29:40.205 app[d8351925] sjc [info] 2022-05-10 02:29:40 UTC | CORE | INFO | (pkg/serializer/serializer.go:450 in SendProcessesMetadata) | Sent processes metadata payload, size: XXX bytes.
2022-05-10T02:29:40.276 app[d8351925] sjc [info] 2022-05-10 02:29:40 UTC | CORE | INFO | (pkg/forwarder/transaction/transaction.go:374 in internalProcess) | Successfully posted payload to "https://XXX-app.agent.datadoghq.com/intake/?api_key=<api_key>"
2022-05-10T02:30:37.239 app[d8351925] sjc [info] 2022-05-10 02:30:37 UTC | TRACE | INFO | (pkg/trace/info/stats.go:104 in LogStats) | No data received

The "No data received" line is by far the most frequent.

The app logs show all the calls being made to it, but I don’t see any log lines about Datadog.

In the app instrumentation, I call Start on the tracer with the service name and version, defer Stop on the tracer, and add the Datadog gin middleware, the same way I’ve done in other services that are successfully sending metrics.

Maybe you also need to set DD_APM_ENABLED=true?

From https://docs.datadoghq.com/agent/docker/apm/?tab=linux#docker-network

Thanks for the follow-up. According to that same doc, that env var is already true by default.

In any case, I opened a support ticket with Datadog, and they’ve been able to tell from the logs that my traces aren’t being received by the agent. I’m currently working on enabling debug logging for the Datadog tracer in my Go app. That should give more insight into why the traces aren’t being sent.

Also worth noting: the internal network uses IPv6, so both the server and the client need to take this into account.
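On the client side, one place where IPv6 matters is building the agent address. A minimal sketch, assuming the host comes from DD_AGENT_HOST and the agent listens on port 8126 as in the fly.toml above; agentAddr is a hypothetical helper and the dd-agent.internal fallback is only for illustration:

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// agentAddr (hypothetical helper) builds a "host:port" dial address.
// net.JoinHostPort wraps IPv6 literals in brackets, which plain string
// concatenation would not do.
func agentAddr(host, port string) string {
	return net.JoinHostPort(host, port)
}

func main() {
	host := os.Getenv("DD_AGENT_HOST")
	if host == "" {
		host = "dd-agent.internal" // illustrative fallback, not a recommendation
	}
	fmt.Println(agentAddr(host, "8126"))
}
```

With a hostname the result is unchanged, while an IPv6 literal such as fdaa:0:1::3 becomes [fdaa:0:1::3]:8126. The agent side also has to listen on an IPv6 address, which is not covered by this sketch.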

I finally figured out what was wrong. I was calling tracer.Start through a wrapper function, and that wrapper only starts the tracer when the env var “USE_DATADOG_APM” is set to exactly “true”. In fly.toml I had written the value as true without quotes, which resulted in the env var USE_DATADOG_APM=t. That failed the check in my wrapper function, so it returned without starting the tracer.

Once I put quotes around “true” in my fly.toml, the tracer started and began sending traces as expected.