
feat(correctness): add agent telemetry correctness test #1637

Draft
thieman wants to merge 6 commits into main from thieman/agenttelemetry-correctness-test

Conversation

@thieman
Contributor

@thieman thieman commented May 13, 2026

Human Summary / Notes

This PR adds two correctness tests that attempt to validate both the COAT and the customer-facing metrics emitted by Agent Telemetry. See https://datadoghq.atlassian.net/jira/software/c/projects/DADP/boards/25544?search=71&selectedIssue=DADP-71

I was able to use these to convince myself that the changes in #1638 and DataDog/datadog-agent#50750 successfully get the ADP versions of datadog.agent.point.sent and datadog.agent.point.dropped working in both the COAT and non-COAT paths. However, this PR is not currently mergeable because it relies on an unreleased Agent version containing the changes from DataDog/datadog-agent#50750. Additionally, the COAT test is very finicky about timing and I haven't figured out how to get it to pass consistently. The main issue is getting the DDA and ADP flush intervals to align: both are based on 15-second timers that start at process start, and the two processes don't start simultaneously.

I hope to return to this and get it merged once we have a released Agent version with the above fixes, but I'm putting it on the back burner for now to focus on getting the functional changes out before the next release code freeze later today.

Agentic Summary

Adds a correctness test for the Datadog Agent's agenttelemetry component. The agenttelemetry component collects internal Prometheus metrics from the agent and periodically ships them to /api/v2/apmtelemetry at instrumentation-telemetry-intake.<site>. This endpoint is completely separate from the regular metrics pipeline — dd_url has no effect on it — so no existing correctness infrastructure captured it.

The test surfaces behavioral gaps where ADP intercepts agent traffic (DogStatsD, distributions) without updating the corresponding Go forwarder telemetry. The initial failure mode is point.sent being ~80× higher on the baseline than on the comparison, because ADP handles DogStatsD forwarding through its own HTTP client and never increments the Go forwarder's PointCountTelemetry.

Key changes

datadog-intake

  • New agent_telemetry module: POST /api/v2/apmtelemetry stores raw JSON payloads; GET /agent-telemetry/dump returns them for analysis.
  • Self-signed TLS cert generated at startup via rcgen; a TCP-level HTTPS proxy on port 2050 decrypts and forwards to the existing HTTP intake on port 2049. Required because the agent's agenttelemetry sender hardcodes https:// regardless of the configured URL scheme.

panoramic

  • New AnalysisMode::AgentTelemetry: compares (metric_name, tags) → value contexts between baseline and comparison. Handles both agent-metrics and message-batch envelope types. Reports context mismatches and value mismatches separately.
  • ±1 point gauge tolerance with a WARN log when values differ within tolerance. The sole source of residual between-run gauge variance is datadog.agent.running, which is appended unconditionally to every 15-second aggregator flush in pkg/aggregator/aggregator.go:appendDefaultSeries. Whether its last firing falls just before or just after the start_after: 67 snapshot boundary produces a ±1 point swing. There is no agent config to disable it; it is hardcoded. Within a single run both agents start at the same second and always hit the same flush count, so the intra-run comparison is always exact.
  • New flush_wait_secs config field (default: 32s). The agent-telemetry test uses 90s to give the agenttelemetry component time to fire after all traffic is flushed.
  • CollectedData extended with agent telemetry payload collection.
  • Diagnostic logging: per-flush series point breakdown (grouped by timestamp) with metric-name labels on small-context buckets, used to identify non-DSD background sources.
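The comparison described in the panoramic bullets can be sketched roughly as follows. This is an illustrative Python sketch, not the analyzer's actual code (panoramic itself is not written in Python); the function name `compare_contexts`, the report keys, and the `gauge_names` parameter are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A context is (metric_name, sorted tag tuple); the value is the reported number.
Context = Tuple[str, Tuple[str, ...]]


def compare_contexts(
    baseline: Dict[Context, float],
    comparison: Dict[Context, float],
    gauge_names: Set[str],
    tolerance: float = 1.0,
) -> Dict[str, List]:
    """Compare two context maps, reporting context and value mismatches separately."""
    report: Dict[str, List] = {
        "missing_in_comparison": sorted(baseline.keys() - comparison.keys()),
        "missing_in_baseline": sorted(comparison.keys() - baseline.keys()),
        "value_mismatches": [],
        "warnings": [],  # gauges that differ but stay within tolerance
    }
    for ctx in sorted(baseline.keys() & comparison.keys()):
        b, c = baseline[ctx], comparison[ctx]
        if b == c:
            continue
        if ctx[0] in gauge_names and abs(b - c) <= tolerance:
            # Within tolerance: not a failure, but surfaced so it stays visible.
            report["warnings"].append((ctx, b, c))
        else:
            report["value_mismatches"].append((ctx, b, c))
    return report
```

Only gauges receive the tolerance in this sketch; counters must match exactly, which follows the description above.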

test/correctness/agent-telemetry

  • Custom agenttelemetry profile scoped to forwarder metrics, routed to https://datadog-intake:2050 with skip_ssl_validation: true.
  • iterations: 1, start_after: 67 — chosen to land between 15-second aggregator flush boundaries.
  • apm_config.enabled: false — removes ~60 internal trace-agent DogStatsD contexts (datadog.trace_agent.*, datadog.dogstatsd.client.*) from every flush bucket, reducing per-bucket context count from 160→101.
  • Count-only millstone traffic eliminates sketches_v2 as a confounding variable.
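To make the timing argument concrete, here is a small arithmetic sketch of why `start_after: 67` sits between flush boundaries, and how a shifted flush phase changes the number of flushes seen by the snapshot. It assumes flush ticks are strictly periodic with the first tick 15 seconds after process start; both helper names are hypothetical.

```python
import math


def flushes_before(snapshot_s: float, period_s: float = 15.0, first_flush_s: float = 15.0) -> int:
    """Count flush ticks (first_flush_s + k * period_s) that occur at or before snapshot_s."""
    if snapshot_s < first_flush_s:
        return 0
    return math.floor((snapshot_s - first_flush_s) / period_s) + 1


def margin_to_nearest_flush(snapshot_s: float, period_s: float = 15.0, first_flush_s: float = 15.0) -> float:
    """Distance in seconds from the snapshot to the closest flush tick."""
    phase = (snapshot_s - first_flush_s) % period_s
    return min(phase, period_s - phase)
```

Under these assumptions a snapshot at 67 s sees the ticks at 15, 30, 45 and 60 s and is 7 s away from the nearest one. If the first tick were instead at 23 s, the same snapshot would see only three ticks, which is the kind of one-flush swing the gauge tolerance is meant to absorb.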

Agent telemetry metrics observed

All contexts present on both baseline and comparison unless noted. Profiling is scoped to forwarder metrics only.

| Metric | Type | Tags | Baseline | Comparison | Notes |
| --- | --- | --- | --- | --- | --- |
| point.sent | gauge | (none) | ~918 | ~11 | Primary gap. ADP forwards DogStatsD through its own HTTP client; Go forwarder never increments PointCountTelemetry for those points. |
| point.dropped | gauge | (none) | 0 | 0 | Matches. |
| transactions.input_count | counter | (none) | 29 | 28 | Baseline +1 from sketches_v2 (ADP intercepts distributions before they reach Go forwarder). |
| transactions.dropped | counter | (none) | 10 | 10 | Matches: both hit 403s on process.datadoghq.com. |
| transactions.success | counter | domain, endpoint | check_run_v1=4, intake=4, metadata_v1=6, series_v2=4 | same (no sketches_v2) | Both agents make identical Go-forwarder transactions for non-DogStatsD endpoints. |
| transactions.http_errors | counter | code, endpoint | 403/container=7, 403/process_discovery=3 | same | Both attempt process agent endpoints that return 403. |

Series flush structure (baseline, with trace agent disabled)

Each flush bucket contains 101 contexts: 100 from millstone count metrics (with zero-value continuation after first bucket) + 1 from ntp.offset or similar. The datadog.agent.running metric fires at every 15-second flush tick with its exact wall-clock timestamp (not DSD-bucket-aligned), producing 3–4 single-context off-boundary timestamps per run.
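The per-flush breakdown used to find these off-boundary timestamps could look something like the sketch below. It is a hypothetical Python illustration of grouping series points by timestamp and labelling small buckets (three contexts or fewer) with their metric names; the function name, input shape, and log line format are assumptions, not the diagnostic logger's real output.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

# Each point is (timestamp, metric_name, tags); this shape is assumed for the sketch.
Point = Tuple[int, str, Tuple[str, ...]]


def flush_breakdown(points: Iterable[Point], small_bucket_max: int = 3) -> List[str]:
    """Group points by timestamp and describe each bucket in one line."""
    buckets = defaultdict(set)
    for ts, name, tags in points:
        buckets[ts].add((name, tuple(sorted(tags))))

    lines = []
    for ts in sorted(buckets):
        contexts = buckets[ts]
        line = f"ts={ts} contexts={len(contexts)}"
        if len(contexts) <= small_bucket_max:
            # Small buckets are the interesting ones: name their metrics.
            names = ",".join(sorted({name for name, _ in contexts}))
            line += f" metrics=[{names}]"
        lines.append(line)
    return lines
```

With 100 count contexts plus `ntp.offset` on a flush boundary and a lone `datadog.agent.running` point a few seconds later, this yields one 101-context line and one labelled single-context line.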

Test plan

make build-correctness-tools-image
make test-correctness-case CASE=agent-telemetry

Expected: test fails with point.sent value mismatch (baseline ~918, comparison ~11) until ADP is updated to route DogStatsD point accounting through the Go forwarder's PointCountTelemetry.

@dd-octo-sts dd-octo-sts Bot added the area/test All things testing: unit/integration, correctness, SMP regression, etc. label May 13, 2026
@pr-commenter

pr-commenter Bot commented May 13, 2026

Binary Size Analysis (Agent Data Plane)

Target: a7a8935 (baseline) vs d38bf07 (comparison) diff
Analysis Type: Stripped binaries (debug symbols excluded)
Baseline Size: 37.15 MiB
Comparison Size: 37.15 MiB
Size Change: +0 B (+0.00%)
Pass/Fail Threshold: +5%
Result: PASSED ✅

Changes by Module

| Module | File Size | Symbols |
| --- | --- | --- |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.1.llvm.8940699831907001114 | +129 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.1.llvm.7700950232534575509 | -129 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.4.llvm.8940699831907001114 | +114 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.4.llvm.7700950232534575509 | -114 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.3.llvm.8940699831907001114 | +108 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.3.llvm.7700950232534575509 | -108 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.0.llvm.8940699831907001114 | +96 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.0.llvm.7700950232534575509 | -96 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.2.llvm.8940699831907001114 | +94 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.2.llvm.7700950232534575509 | -94 B | 1 |

Detailed Symbol Changes

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW]    +129  [NEW]     +40    anon.4f8fd67d74ae1f1600187cfeb0121be9.1.llvm.8940699831907001114
  [NEW]    +114  [NEW]     +25    anon.4f8fd67d74ae1f1600187cfeb0121be9.4.llvm.8940699831907001114
  [NEW]    +108  [NEW]     +19    anon.4f8fd67d74ae1f1600187cfeb0121be9.3.llvm.8940699831907001114
  [NEW]     +96  [NEW]      +7    anon.4f8fd67d74ae1f1600187cfeb0121be9.0.llvm.8940699831907001114
  [NEW]     +94  [NEW]      +5    anon.4f8fd67d74ae1f1600187cfeb0121be9.2.llvm.8940699831907001114
  [DEL]     -94  [DEL]      -5    anon.4f8fd67d74ae1f1600187cfeb0121be9.2.llvm.7700950232534575509
  [DEL]     -96  [DEL]      -7    anon.4f8fd67d74ae1f1600187cfeb0121be9.0.llvm.7700950232534575509
  [DEL]    -108  [DEL]     -19    anon.4f8fd67d74ae1f1600187cfeb0121be9.3.llvm.7700950232534575509
  [DEL]    -114  [DEL]     -25    anon.4f8fd67d74ae1f1600187cfeb0121be9.4.llvm.7700950232534575509
  [DEL]    -129  [DEL]     -40    anon.4f8fd67d74ae1f1600187cfeb0121be9.1.llvm.7700950232534575509
  [ = ]       0  [ = ]       0    TOTAL

@pr-commenter

pr-commenter Bot commented May 13, 2026

Regression Detector (Agent Data Plane)

Run ID: 3f19fb0c-82be-45d5-a01a-f8a77179ba65
Baseline: a7a89359 · Comparison: d38bf078 · Diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

| experiment | goal | Δ mean % | links |
| --- | --- | --- | --- |
| dsd_uds_1mb_3k_contexts_cpu (erratic) | cpu | ⚪ +14.92 | metrics profiles logs |
| quality_gates_rss_idle | memory | ⚪ +1.12 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_throughput | throughput | ⚪ -0.98 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic) | cpu | ⚪ +0.67 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_cpu (erratic) | cpu | ⚪ +0.55 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_memory | memory | ⚪ +0.55 | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory | ⚪ +0.50 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_memory | memory | ⚪ +0.48 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_cpu (erratic) | cpu | ⚪ +0.30 | metrics profiles logs |
| quality_gates_rss_dsd_low | memory | ⚪ +0.20 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_memory | memory | ⚪ +0.15 | metrics profiles logs |
| otlp_ingest_traces_5mb_memory | memory | ⚪ +0.14 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_throughput | throughput | ⚪ -0.09 | metrics profiles logs |
| otlp_ingest_traces_5mb_throughput | throughput | ⚪ -0.03 | metrics profiles logs |
| quality_gates_rss_dsd_heavy | memory | ⚪ +0.01 | metrics profiles logs |
| otlp_ingest_metrics_5mb_throughput | throughput | ⚪ -0.01 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| otlp_ingest_logs_5mb_throughput (ignored) | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_throughput | throughput | ⚪ +0.00 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_throughput | throughput | ⚪ +0.02 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_memory | memory | ⚪ -0.04 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_memory | memory | ⚪ -0.05 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_throughput | throughput | ⚪ +0.11 | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory | ⚪ -0.23 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_memory | memory | ⚪ -0.25 | metrics profiles logs |
| otlp_ingest_logs_5mb_cpu (ignored) | cpu | ⚪ -0.25 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_memory | memory | ⚪ -0.33 | metrics profiles logs |
| otlp_ingest_metrics_5mb_memory | memory | ⚪ -0.71 | metrics profiles logs |
| otlp_ingest_metrics_5mb_cpu (erratic) | cpu | ⚪ -0.87 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_cpu (erratic) | cpu | ⚪ -1.05 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_cpu (erratic) | cpu | ⚪ -1.94 | metrics profiles logs |
| otlp_ingest_traces_5mb_cpu (erratic) | cpu | ⚪ -2.52 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_cpu (erratic) | cpu | ⚪ -4.95 | metrics profiles logs |
| otlp_ingest_logs_5mb_memory (ignored) | memory | ⚪ -15.19 | metrics profiles logs |

Bounds Checks: ✅ Passed (5)

| experiment | check | replicates | observed | links |
| --- | --- | --- | --- | --- |
| quality_gates_rss_dsd_heavy | memory_usage | 10/10 | ✅ 121 MiB ≤ 140 MiB | metrics profiles logs |
| quality_gates_rss_dsd_low | memory_usage | 10/10 | ✅ 39.6 MiB ≤ 50 MiB | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory_usage | 10/10 | ✅ 60.2 MiB ≤ 75 MiB | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 | ✅ 178 MiB ≤ 200 MiB | metrics profiles logs |
| quality_gates_rss_idle | memory_usage | 10/10 | ✅ 27 MiB ≤ 40 MiB | metrics profiles logs |

Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
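The decision rule in the explanation can be written out as a short sketch. This is an illustration only, not the regression detector's code: the function name and return labels are hypothetical, and the single `smp_flagged` argument stands in for SMP's `is_regression` flag (and for whatever equivalent signal is used in the improving direction, which the text above does not name).

```python
LOWER_IS_BETTER = {"cpu", "memory"}  # a reduction in these goals is an improvement
THRESHOLD_PCT = 5.0


def classify(goal: str, delta_mean_pct: float, smp_flagged: bool, configured_erratic: bool = False) -> str:
    """Return 'ignored', 'neutral', 'regression' or 'improvement' for one experiment."""
    if configured_erratic:
        # Experiments configured erratic: true are skipped outright.
        return "ignored"
    if abs(delta_mean_pct) <= THRESHOLD_PCT or not smp_flagged:
        # Both conditions must hold before a change is flagged.
        return "neutral"
    got_lower = delta_mean_pct < 0
    improved = got_lower if goal in LOWER_IS_BETTER else not got_lower
    return "improvement" if improved else "regression"
```

This is consistent with the table above: the -15.19% memory row is tagged (ignored), and the +14.92% CPU row exceeds the threshold but stays neutral, which under this rule means SMP did not flag it.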

thieman added 2 commits May 14, 2026 11:05
Adds a new correctness test for the Datadog Agent's agenttelemetry
component, which collects internal Prometheus metrics and periodically
ships them to an intake endpoint. This test surfaces gaps where ADP
intercepts traffic (DogStatsD, distributions) without updating the
corresponding Go forwarder telemetry.

Changes:
- datadog-intake: new /api/v2/apmtelemetry handler (stores raw JSON
  payloads) + /agent-telemetry/dump endpoint; HTTPS proxy on port 2050
  using a self-signed rcgen cert so the agent's hardcoded-https sender
  can reach the local intake (requires skip_ssl_validation: true)
- panoramic: AgentTelemetry analysis mode comparing (metric, tags) ->
  value contexts between baseline and comparison; configurable
  flush_wait_secs per test to accommodate the agenttelemetry schedule
- test/correctness/agent-telemetry: test case using a custom
  agenttelemetry profile (forwarder metrics only, iterations: 1,
  start_after: 67) routed to the local intake; count-only millstone
  traffic to eliminate sketches_v2 as a confounding variable
- Disable trace agent (apm_config.enabled: false) in the test's
  datadog.yaml to eliminate ~60 internal DogStatsD contexts
  (datadog.trace_agent.*, datadog.dogstatsd.client.*) from every flush
  bucket, reducing per-bucket context count from 160 to 101 and making
  the series point structure far simpler to reason about.

- Add ±1 gauge tolerance to the AgentTelemetryAnalyzer. The sole
  source of residual between-run variance is datadog.agent.running,
  which is appended unconditionally to every 15-second aggregator flush
  in pkg/aggregator/aggregator.go:appendDefaultSeries. Whether its last
  firing lands just before or just after the start_after: 67 snapshot
  boundary produces a ±1 point swing. There is no agent config to
  disable it; it is hardcoded. Within a single run both agents start at
  the same second and always capture the same flush count so the
  intra-run comparison is always exact. A WARN is emitted if values
  differ within tolerance so any unexpected intra-run delta is visible.

- Add per-metric-name labels to small-context (≤3) flush buckets in
  the series breakdown log to aid debugging non-DSD metric sources.
@thieman thieman force-pushed the thieman/agenttelemetry-correctness-test branch from e440d21 to 4eb2f96 Compare May 14, 2026 09:56
@dd-octo-sts dd-octo-sts Bot added area/core Core functionality, event model, etc. area/io General I/O and networking. area/components Sources, transforms, and destinations. encoder/datadog-events Datadog events encoder. encoder/datadog-logs Datadog Logs encoder. encoder/datadog-metrics Datadog Metrics encoder. encoder/datadog-service-checks Datadog Service Checks encoder. encoder/datadog-stats Datadog APM Stats encoder. encoder/datadog-traces Datadog Traces encoder. forwarder/datadog Datadog forwarder. labels May 14, 2026
thieman added 4 commits May 14, 2026 12:11
Add a `focus_metrics` field to the correctness test config that, when
non-empty, replaces the standard internal-telemetry filter with an
allowlist. Only metrics whose names appear in the list are kept for
comparison; everything else (including other `datadog.*` metrics and
all user DSD traffic) is discarded.

This enables correctness tests that validate specific agent-emitted
metrics such as `datadog.agent.point.sent` and
`datadog.agent.point.dropped`, which would otherwise be stripped by
the default filter.

Also fix a test reference to `MetricsEndpoint::Series` that was
renamed to `MetricsEndpoint::SeriesV1` in #1646.
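
The `focus_metrics` behaviour described in this commit message can be sketched as below. This is a hypothetical Python illustration: the function name, the dictionary shape of a metric, and the assumption that the standard filter strips names by a `datadog.` prefix are mine, not taken from the test harness.

```python
from typing import Dict, Iterable, List, Sequence


def filter_metrics(
    metrics: Iterable[Dict],
    focus_metrics: Sequence[str] = (),
    internal_prefixes: Sequence[str] = ("datadog.",),  # assumed default filter
) -> List[Dict]:
    """Apply the allowlist when focus_metrics is non-empty, else the default filter."""
    focus = set(focus_metrics)
    if focus:
        # Allowlist mode: keep only the named metrics, drop everything else.
        return [m for m in metrics if m["name"] in focus]
    # Default mode: strip internal telemetry, keep user traffic.
    prefixes = tuple(internal_prefixes)
    return [m for m in metrics if not m["name"].startswith(prefixes)]
```

In allowlist mode the result contains only the focused agent metrics, while the default mode keeps only user traffic; the two modes replace each other rather than compose.
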
Agent images >= v112974386 already include a built-in `data-plane`
s6 service that starts the ADP binary. Copying our own s6-services
entry alongside it caused a double-start crash. Only cont-init.d is
copied now; the built-in service manages ADP lifecycle.
- Lower start_after from 67s to 37s so the COAT snapshot lands after
  both sides' first user DSD flush but before their second, giving each
  side exactly one flush cycle to compare. Both Go and ADP flush every
  ~15s but are offset by ~4s; t=37s sits safely in the (30s, 44s)
  window between first and second flushes on each side.
- Lower flush_wait_secs from 90s to 60s to match the shorter window.
- Remove transactions.success from the profile: baseline sends
  user+internal metrics in one Go payload per flush cycle while
  comparison splits them across separate ADP and Go payloads, so
  transaction counts are structurally different by design.
- Remove transactions.errors from the profile: ADP eagerly registers
  per-error-type counters that have no equivalent in the Go-only
  baseline, so these contexts will never line up.
- Add dogstatsd_context_expiry_seconds: 120 to keep DSD contexts
  alive across the test window for consistent flush behavior.
Add a new correctness test that validates the customer-facing
`datadog.agent.point.sent` and `datadog.agent.point.dropped` metrics
emitted by the Go agent's telemetry check.

The test uses the new `focus_metrics` allowlist to discard all user
DSD and other internal metrics, comparing only these two names between
baseline (Go-only DSD) and comparison (ADP DSD). Both sides run the
telemetry check via a mounted `conf.d/telemetry.d/conf.yaml` (not yet
bundled in the agent image) and have `data_plane.telemetry_enabled:
true` so ADP registers its TelemetryProvider with the Go agent. This
allows the telemetry check's `collectMergeMetrics` to pull ADP's
`point__sent` gauge from the non-default registry via RAR and sum it
with the Go forwarder's contribution, producing a merged
`datadog.agent.point.sent` that should match the baseline.
@thieman thieman force-pushed the thieman/agenttelemetry-correctness-test branch from 4eb2f96 to d38bf07 Compare May 14, 2026 10:11
@dd-octo-sts dd-octo-sts Bot removed area/core Core functionality, event model, etc. area/io General I/O and networking. area/components Sources, transforms, and destinations. encoder/datadog-events Datadog events encoder. encoder/datadog-logs Datadog Logs encoder. encoder/datadog-metrics Datadog Metrics encoder. encoder/datadog-service-checks Datadog Service Checks encoder. encoder/datadog-stats Datadog APM Stats encoder. encoder/datadog-traces Datadog Traces encoder. labels May 14, 2026
@dd-octo-sts dd-octo-sts Bot removed the forwarder/datadog Datadog forwarder. label May 14, 2026
gh-worker-dd-mergequeue-cf854d Bot pushed a commit to DataDog/datadog-agent that referenced this pull request May 14, 2026
## Human Summary

We are working with a handful of design partners to start using the Agent Data Plane (ADP) to handle DogStatsD metrics in customer orgs. This will shift custom metric payloads from the core agent to ADP, which will then forward them along to the Datadog backend. This will affect the `datadog.agent.point.sent` and `datadog.agent.point.dropped` metrics which are currently sent both to the customer org and to Datadog via Cross-Org Agent Telemetry (COAT). These can be sourced from ADP via the Remote Agent Registry (RAR) and ADP's TelemetryProvider.

However, the sources of these two metrics will not be _entirely_ shifted to ADP. Core Agent will still submit _some_ points on its own from checks and other internal metrics such as `datadog.agent.running`. This presents a new functional requirement where we need to be able to _merge_ both the Core Agent and ADP versions of these metrics before forwarding them along.

This PR does three things:

1. Within the customer-facing `telemetry` check flow, "regular" metrics are now gathered. This includes RAR-sourced metrics from remote agents. Selected regular metrics (currently just these two) are merged into the existing "default" metric set before being forwarded.

2. Within the internal COAT flow (`agenttelemetry`), we perform a similar merge. A big difference here is that `agenttelemetry` is already sourcing "regular" metrics, but previously there were no cases where a single metric came from both the core agent _and_ RAR. Now that's supported and metrics are merged.

3. The COAT versions of these two metrics are updated to include their `domain` and `remote_agent` tags. This will allow us to differentiate Core Agent traffic from ADP traffic for Datadog internal telemetry.
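
As a rough illustration of the coalescing idea in the COAT flow, the sketch below merges metric families gathered from two registries so that a family appearing in both keeps all of its samples instead of one overwriting the other. It is a Python sketch with hypothetical names and data shapes (the Agent code itself is Go); treating "compatible" as "same metric type" and keeping the first family on a type conflict are assumptions.

```python
from typing import Dict, Iterable, List


def coalesce_families(*registries: Iterable[Dict]) -> Dict[str, Dict]:
    """Merge families by name across registries, concatenating their samples."""
    merged: Dict[str, Dict] = {}
    for families in registries:
        for fam in families:
            existing = merged.get(fam["name"])
            if existing is None:
                merged[fam["name"]] = {
                    "name": fam["name"],
                    "type": fam["type"],
                    "samples": list(fam["samples"]),
                }
            elif existing["type"] == fam["type"]:
                # Same name and type: keep both sets of samples, tags intact.
                existing["samples"].extend(fam["samples"])
            # Assumption: on a type conflict the first family seen is kept.
    return merged
```

Because samples keep their own tags, a `remote_agent` tag on the ADP sample still distinguishes it from the Core Agent sample after coalescing.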

See [DADP-71](https://datadoghq.atlassian.net/jira/software/c/projects/DADP/boards/25544?search=71&selectedIssue=DADP-71) for further rationale.

### Rationale on Required Labels and QA Actions

This is my first Agent PR so it's entirely possible I misunderstood some of the intention behind these, but here's my first pass:

- Applied `changelog/no-changelog`. This change is intended to maintain compatibility with existing behavior when enabling the Agent Data Plane, so user-facing functionality is unaffected. Datadog-internal COAT metrics have some tagging changes, which do not seem changelog-worthy.
- Applied `qa/done`. See this PR on Saluki, where behavior was verified using differential tests comparing the Agent with and without Agent Data Plane enabled: DataDog/saluki#1637
- Applied `need-change/agenttelemetry-governance` and commented on existing governance card [ASUP-31](https://datadoghq.atlassian.net/jira/software/c/projects/ASUP/boards/8889/?selectedIssue=ASUP-31) pinging owner @carlosroman 

## Agentic Summary

### What does this PR do?

Adds Core Agent support for ADP point telemetry from Remote Agent Registry in both Agent telemetry paths:

- COAT preserves `domain` and `remote_agent` tags for `point.sent` / `point.dropped` by updating the default `logs-and-metrics` profile.
- COAT coalesces compatible metric families gathered from the regular and default telemetry registries before profile aggregation. This prevents duplicate metric families with the same name, such as ADP `point.sent` and Core Agent `point.sent`, from overwriting each other in the Agent telemetry payload map.
- The customer-facing telemetry core check merges allowlisted regular-registry metrics into the existing default telemetry output. Initially this covers `point.sent` and `point.dropped`, grouped by `domain`.
- The customer-facing metric names and tag shape remain compatible:
  - `datadog.agent.point.sent{domain:...}`
  - `datadog.agent.point.dropped{domain:...}`
- If gathering regular/RAR telemetry fails in the customer telemetry path, the check logs a warning and continues with Core Agent values only.
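
The customer-path behaviour in the list above (allowlisted merge, grouping by `domain`, and falling back to Core Agent values on failure) can be sketched as follows. This is a Python illustration with hypothetical names and a simplified `(name, domain, value)` tuple shape; the real check is written in Go and its internals are not shown in this PR.

```python
import logging
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

log = logging.getLogger("telemetry_check_sketch")

MERGE_ALLOWLIST = {"point.sent", "point.dropped"}
Sample = Tuple[str, str, float]  # (metric name, domain, value)


def merge_by_domain(
    default_metrics: List[Sample],
    gather_regular: Callable[[], List[Sample]],
) -> Dict[Tuple[str, str], float]:
    """Sum allowlisted regular-registry values into the default output per (name, domain)."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for name, domain, value in default_metrics:
        totals[(name, domain)] += value
    try:
        regular = gather_regular()
    except Exception as exc:
        # Failure is non-fatal: warn and continue with Core Agent values only.
        log.warning("gathering regular/RAR telemetry failed: %s", exc)
        regular = []
    for name, domain, value in regular:
        if name in MERGE_ALLOWLIST:  # never expose arbitrary remote-agent telemetry
            totals[(name, domain)] += value
    return dict(totals)
```

The output keeps the same metric names and `domain` grouping as before, so the customer-facing shape is unchanged; only the values for the allowlisted names grow by the remote agent's contribution.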

### Motivation

DADP-71: when Agent Data Plane forwards metrics, Core Agent forwarder telemetry alone undercounts `point.sent` and `point.dropped`. ADP exposes equivalent point counts through RAR; Core Agent needs to include them in customer-facing Agent telemetry while preserving customer metric compatibility, and in COAT while retaining `remote_agent` attribution.

### Describe how you validated your changes

- `dda inv install-tools`
- `dda inv test --targets=./pkg/collector/corechecks/telemetry`
- `dda inv test --targets=./comp/core/agenttelemetry/impl`
- `dda inv test --targets=./comp/core/agenttelemetry/impl --test-run-name TestRun`
- Verified `TestCoalescesDefaultAndNoDefaultMetricFamiliesBeforeAggregation` fails without the COAT metric-family coalescing change and passes with it.

### Additional Notes

The customer path intentionally keeps RAR/regular-registry metrics allowlisted. It does not expose arbitrary remote-agent telemetry to customer orgs.


[DADP-71]: https://datadoghq.atlassian.net/browse/DADP-71?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

Co-authored-by: jesse.szwedko <jesse.szwedko@datadoghq.com>

Labels

area/test All things testing: unit/integration, correctness, SMP regression, etc.
