TESTING Make DogStatsD stats collection lazy #1727

Draft
aqian01 wants to merge 1 commit into main from andrewq/lazy-dsd-stats-collector

aqian01 commented May 22, 2026

Summary

Replace the always-connected DogStatsD stats destination with a shared lazy collector. The /dogstatsd/stats API remains available, but normal DogStatsD ingest no longer fans out every metric batch into a stats-only topology destination.

What changed

  • Added DogStatsDStatsCollector with an atomic inactive fast path and mutex-protected active collection state.
  • Kept /dogstatsd/stats response behavior, max-duration validation, 429 AlreadyRunning, and cancellation cleanup.
  • Wired the collector into the DogStatsD source so decoded metric events are recorded only when a stats request is active.
  • Removed the permanent dsd_stats_out destination and topology connection.
  • Added collector unit tests and DogStatsD source wiring coverage.
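
The collector design described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `LazyStatsCollector`, `Session`, `record`, `start`, and `finish` are hypothetical names, and per-metric counts stand in for whatever the real `DogStatsDStatsCollector` tracks. It shows the inactive atomic fast path, the mutex-protected active state, the single-session rule behind the 429 response, and cleanup when a request is cancelled.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Returned when a collection session is already in progress
/// (the API layer would map this to HTTP 429).
#[derive(Debug, PartialEq)]
pub struct AlreadyRunning;

#[derive(Default)]
struct Inner {
    /// Checked on every record call; false means "nobody is collecting".
    active: AtomicBool,
    /// Only touched while a session is active (hypothetical payload).
    counts: Mutex<HashMap<String, u64>>,
}

/// Hypothetical stand-in for the PR's `DogStatsDStatsCollector`.
/// Cloning shares the same state between the source and the API handler.
#[derive(Clone, Default)]
pub struct LazyStatsCollector {
    inner: Arc<Inner>,
}

/// Held by the API handler for the duration of one stats request.
pub struct Session {
    inner: Arc<Inner>,
}

impl LazyStatsCollector {
    /// Hot path: a single atomic load when no request is active.
    pub fn record(&self, metric_name: &str) {
        if !self.inner.active.load(Ordering::Acquire) {
            return;
        }
        let mut counts = self.inner.counts.lock().unwrap();
        *counts.entry(metric_name.to_owned()).or_insert(0) += 1;
    }

    /// Starts a session, or fails if one is already running.
    pub fn start(&self) -> Result<Session, AlreadyRunning> {
        self.inner
            .active
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .map_err(|_| AlreadyRunning)?;
        // Discard anything a late writer left behind after the last session.
        self.inner.counts.lock().unwrap().clear();
        Ok(Session { inner: Arc::clone(&self.inner) })
    }
}

impl Session {
    /// Stops collecting and returns what was gathered.
    pub fn finish(self) -> HashMap<String, u64> {
        self.inner.active.store(false, Ordering::Release);
        // The lock guard is released at the end of this statement,
        // before `self` is dropped and `Drop` locks again.
        let counts = std::mem::take(&mut *self.inner.counts.lock().unwrap());
        counts
    }
}

impl Drop for Session {
    /// Runs on normal completion and on cancellation (handler future dropped).
    fn drop(&mut self) {
        self.inner.active.store(false, Ordering::Release);
        self.inner.counts.lock().unwrap().clear();
    }
}
```

In this sketch the source would call `record` for each decoded metric, and the request handler would hold a `Session` for the requested duration before calling `finish`. Tying deactivation to `Drop` is one way to get the cancellation cleanup mentioned above without an explicit cancel path.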

Why

The previous topology connected dsd_stats_out as a second consumer of dsd_in.metrics, which forced dispatcher fanout cloning on normal DogStatsD traffic even when no stats request was collecting. This keeps the API available while reducing normal hot-path work to an inactive atomic check.

Validation

  • cargo check --workspace && cargo check --workspace --tests
  • cargo nextest run -p saluki-components (584 passed, 1 skipped)
  • make fmt
  • git diff --check
  • make check-all passed formatting and Clippy, then stopped because Vale is not installed locally: Please install Vale: https://vale.sh/docs/install

dd-octo-sts Bot added the area/components, source/dogstatsd, and destination/dogstatsd-stats labels May 22, 2026
aqian01 changed the title from "[codex] Make DogStatsD stats collection lazy" to "TESTING Make DogStatsD stats collection lazy" May 22, 2026

datadog-datadog-prod-us1 Bot commented May 22, 2026

Pipelines

⚠️ Warnings

🚦 1 Pipeline job failed

Semantic PR Title Check | Check For Semantic PR Title (GitHub Actions)

🛟 This job is unlikely to succeed on retry. Please review your pipeline configuration. No release type found in pull request title. Add a prefix to indicate the type of release.

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 2175437

pr-commenter Bot commented May 22, 2026

Binary Size Analysis (Agent Data Plane)

Baseline: 54dc37e · Comparison: 2175437 · diff
Analysis Configuration: stripped binaries · Pass/Fail Threshold: +5%
Sizes: 37.68 MiB (baseline) vs 37.64 MiB (comparison)
Size Change: -37.46 KiB (-0.10%)

✅ Binary size difference within threshold

Changes by Module

| Module | File Size | Symbols |
| --- | --- | --- |
| hyper | -62.07 KiB | 218 |
| hyper_util | +41.92 KiB | 57 |
| alloc | -31.02 KiB | 445 |
| anyhow | -26.92 KiB | 445 |
| core | +20.28 KiB | 2176 |
| prost | +19.97 KiB | 236 |
| serde_with | -17.60 KiB | 20 |
| tonic | -17.06 KiB | 196 |
| saluki_components::transforms::dogstatsd_mapper | +15.87 KiB | 8 |
| figment | -15.33 KiB | 144 |
| http_body_util | +14.79 KiB | 41 |
| [sections] | -13.52 KiB | 7 |
| rustls | +12.68 KiB | 139 |
| h2 | +12.11 KiB | 126 |
| std | +12.04 KiB | 130 |
| tokio | -11.74 KiB | 1137 |
| saluki_components::forwarders::datadog | +8.86 KiB | 6 |
| saluki_components::common::datadog | -8.24 KiB | 77 |
| saluki_components::destinations::prometheus | -8.21 KiB | 8 |
| tokio_rustls | -7.93 KiB | 12 |

Detailed Symbol Changes
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW] +66.6Ki  [NEW] +66.4Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::h1b011ad4ddc53a6d
  +0.2% +18.5Ki  +0.6% +56.0Ki    [11708 Others]
  [NEW] +25.1Ki  [NEW] +24.9Ki    saluki_components::sources::dogstatsd::drive_stream::_{{closure}}::h5c096335e4534134
  [NEW] +22.4Ki  [NEW] +22.2Ki    _<saluki_components::sources::dogstatsd::DogStatsDConfiguration as saluki_core::components::sources::builder::SourceBuilder>::build::_{{closure}}::hf2af4e9852048fee
  [NEW] +15.3Ki  [NEW] +15.2Ki    saluki_components::transforms::trace_sampler::TraceSampler::process_trace::h83e3ece73dd4ddf4
  [NEW] +14.3Ki  [NEW] +14.2Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::hf0b9bc13d27eb0ad
 +28e2% +13.8Ki +33e2% +13.8Ki    prost::message::Message::encode_to_vec::h38c2d0079ee57f8f
  [NEW] +13.5Ki  [NEW] +13.4Ki    saluki_components::transforms::apm_stats::span_concentrator::SpanConcentrator::new_stat_span_from_span::h87cd34edb479fa6c
 +28e2% +13.4Ki +33e2% +13.4Ki    prost::message::Message::encode::hf34018aeb1a7a39a
 +53e2% +12.2Ki +90e2% +12.2Ki    std::sys::backtrace::__rust_begin_short_backtrace::h770c6768cbe27e37
 -31.8% -11.5Ki -32.0% -11.5Ki    _<saluki_components::transforms::apm_stats::ApmStats as saluki_core::components::transforms::Transform>::run::_{{closure}}::h84b9d860b81c1dae
 -93.5% -11.7Ki -94.3% -11.7Ki    agent_data_plane::state::metrics::rules::get_datadog_agent_remappings::h377928bc659a9572
  -1.2% -12.8Ki  -1.2% -12.8Ki    [section .gcc_except_table]
  [DEL] -13.2Ki  [DEL] -13.1Ki    saluki_components::sources::dogstatsd::replay::writer::run_capture_loop::h03554b186c4c1ae5
  [DEL] -13.8Ki  [DEL] -13.6Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::h59e4ebbb57eb7f91
 -98.3% -14.2Ki -99.7% -14.2Ki    _<saluki_components::transforms::trace_sampler::TraceSampler as saluki_core::components::transforms::SynchronousTransform>::transform_buffer::h95266a14ee7bdae6
  [DEL] -19.5Ki  [DEL] -19.3Ki    _<saluki_components::sources::dogstatsd::DogStatsDConfiguration as saluki_core::components::sources::builder::SourceBuilder>::build::_{{closure}}::hf04f3aa1c0fae428
  [DEL] -21.2Ki  [DEL] -21.0Ki    _<figment::value::de::ConfiguredValueDe<I> as serde_core::de::Deserializer>::deserialize_struct::h1495d50bfb28ee14
  [DEL] -22.0Ki  [DEL] -21.9Ki    saluki_components::sources::dogstatsd::drive_stream::_{{closure}}::h22208f5cc9094b54
  [DEL] -44.5Ki  [DEL] -44.4Ki    saluki_components::transforms::apm_stats::ApmStats::process_trace::h319e78c858c0338a
  [DEL] -68.2Ki  [DEL] -68.1Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::h41fee3c69708d7c8
  -0.1% -37.5Ki  +0.0%    +120    TOTAL

pr-commenter Bot commented May 22, 2026

Regression Detector (Agent Data Plane)

Run ID: e40dc9d8-a48e-4254-b856-386dbd95c4ea
Baseline: 54dc37e7 · Comparison: 21754377 · diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

| experiment | goal | Δ mean % | links |
| --- | --- | --- | --- |
| otlp_ingest_traces_ottl_transform_5mb_cpu (erratic) | cpu | ⚪ +1.36 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_cpu (erratic) | cpu | ⚪ +0.78 | metrics profiles logs |
| otlp_ingest_traces_5mb_cpu (erratic) | cpu | ⚪ +0.70 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_memory | memory | ⚪ +0.53 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic) | cpu | ⚪ +0.53 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_throughput | throughput | ⚪ -0.45 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_memory | memory | ⚪ +0.23 | metrics profiles logs |
| quality_gates_rss_dsd_low | memory | ⚪ +0.15 | metrics profiles logs |
| quality_gates_rss_dsd_heavy | memory | ⚪ +0.15 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_memory | memory | ⚪ +0.13 | metrics profiles logs |
| otlp_ingest_traces_5mb_throughput | throughput | ⚪ -0.10 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_cpu (erratic) | cpu | ⚪ +0.07 | metrics profiles logs |
| otlp_ingest_metrics_5mb_throughput | throughput | ⚪ -0.01 | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory | ⚪ +0.01 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_throughput | throughput | ⚪ +0.00 | metrics profiles logs |
| otlp_ingest_logs_5mb_throughput (ignored) | throughput | ⚪ +0.01 | metrics profiles logs |
| otlp_ingest_logs_5mb_memory (ignored) | memory | ⚪ -0.02 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_throughput | throughput | ⚪ +0.03 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_memory | memory | ⚪ -0.13 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_memory | memory | ⚪ -0.14 | metrics profiles logs |
| quality_gates_rss_idle | memory | ⚪ -0.34 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_throughput | throughput | ⚪ +0.34 | metrics profiles logs |
| otlp_ingest_traces_5mb_memory | memory | ⚪ -0.38 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_memory | memory | ⚪ -0.42 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_cpu (erratic) | cpu | ⚪ -0.67 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_memory | memory | ⚪ -0.89 | metrics profiles logs |
| otlp_ingest_logs_5mb_cpu (ignored) | cpu | ⚪ -0.89 | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory | ⚪ -1.04 | metrics profiles logs |
| otlp_ingest_metrics_5mb_cpu (erratic) | cpu | ⚪ -2.52 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_throughput | throughput | ⚪ +2.59 | metrics profiles logs |
| otlp_ingest_metrics_5mb_memory | memory | ⚪ -2.88 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_cpu (erratic) | cpu | 🟢 -5.68 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_cpu (erratic) | cpu | 🟢 -7.84 | metrics profiles logs |

Bounds Checks: ✅ Passed (5)

| experiment | check | replicates | observed | links |
| --- | --- | --- | --- | --- |
| quality_gates_rss_dsd_heavy | memory_usage | 10/10 | ✅ 123 MiB ≤ 140 MiB | metrics profiles logs |
| quality_gates_rss_dsd_low | memory_usage | 10/10 | ✅ 40.2 MiB ≤ 50 MiB | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory_usage | 10/10 | ✅ 59.9 MiB ≤ 75 MiB | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 | ✅ 179 MiB ≤ 200 MiB | metrics profiles logs |
| quality_gates_rss_idle | memory_usage | 10/10 | ✅ 26.9 MiB ≤ 40 MiB | metrics profiles logs |

Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
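
The decision rule in this explanation can be written out as a small function. This is only a sketch of the rule as stated, not the detector's actual implementation: the type and field names are hypothetical, and `smp_is_improvement` is an assumed counterpart to the `is_regression` flag, since the text only says improvements use "the matching criteria".

```rust
/// What the experiment is trying to optimize.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Goal {
    Cpu,
    Memory,
    Throughput,
}

/// Maps to the 🟢 / 🔴 / ⚪ markers in the table.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Verdict {
    Improvement,
    Regression,
    Neutral,
}

/// Hypothetical per-experiment record; field names are illustrative.
pub struct Experiment {
    pub goal: Goal,
    pub delta_mean_pct: f64,
    /// Configured `erratic: true`, shown as "(ignored)" in the table.
    pub configured_erratic: bool,
    /// SMP's `is_regression` flag.
    pub smp_is_regression: bool,
    /// Assumed counterpart flag for improvements.
    pub smp_is_improvement: bool,
}

/// Threshold quoted in the explanation: |Δ mean %| must exceed 5.00%.
const THRESHOLD_PCT: f64 = 5.0;

pub fn classify(e: &Experiment) -> Verdict {
    // Experiments configured as erratic are skipped outright.
    if e.configured_erratic {
        return Verdict::Neutral;
    }
    if e.delta_mean_pct.abs() <= THRESHOLD_PCT {
        return Verdict::Neutral;
    }
    // Lower CPU or memory is better; lower throughput is worse.
    let lower_is_better = matches!(e.goal, Goal::Cpu | Goal::Memory);
    let got_better = (e.delta_mean_pct < 0.0) == lower_is_better;
    match (got_better, e.smp_is_improvement, e.smp_is_regression) {
        (true, true, _) => Verdict::Improvement,
        (false, _, true) => Verdict::Regression,
        _ => Verdict::Neutral,
    }
}
```

Under these assumptions, a CPU experiment at -7.84% that SMP also marks as an improvement is classified as an improvement, while a throughput experiment at +2.59% stays neutral because it is under the threshold. Experiments tagged (erratic) at runtime need no special handling here, since the text says their signal still counts.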
