SymDB: extraction performance benchmark #5698
Draft
p-datadog wants to merge 3 commits into
8e4b46b to 6341a20
Generates 2500 user-code classes in a tmpdir, requires them, then runs
Extractor#extract_all once and captures memory + CPU + wall time. Outputs
symbol_database_extraction-results.json.
Wires into existing benchmark infrastructure:
- new symbol_database_ prefix added to benchmarks/README.md
- symbol_database_extraction.rb added to the &other group in execution.yml
so it runs in dtr CI
- validate spec at spec/datadog/symbol_database/validate_benchmarks_spec.rb
runs the harness with VALIDATE_BENCHMARK=true (10 classes) to catch bitrot
Verifies the performance requirements in projects/symdb/requirements.md
(memory < 50 MB overhead during extraction; CPU < 5%). Plan in
projects/symdb/testing/performance-test-plan.md.
Local run on Ruby 3.2.3, x86_64-linux: 269 ms wall, 27 MB peak overhead.
Stdlib only (Process.times, /proc/self/status, GC.stat) — no new gem deps.
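The measurement approach described above can be sketched with the same stdlib calls. This is a minimal illustration, not the harness in this PR: `generate_classes`, `rss_kb`, `measure`, and the `BenchGeneratedClass` names are hypothetical, and an `ObjectSpace` traversal stands in for `Extractor#extract_all`, which is not reproduced here.

```ruby
require "tmpdir"
require "json"

# Hypothetical helper: write `count` trivial class files into `dir` and require them.
def generate_classes(dir, count)
  count.times do |i|
    path = File.join(dir, "bench_generated_class_#{i}.rb")
    File.write(path, <<~RUBY)
      class BenchGeneratedClass#{i}
        def call(x)
          x + #{i}
        end
      end
    RUBY
    require path
  end
end

# Resident set size in KB from /proc; nil where /proc is unavailable (assumption of this sketch).
def rss_kb
  return nil unless File.readable?("/proc/self/status")
  File.read("/proc/self/status")[/^VmRSS:\s+(\d+)/, 1].to_i
end

# Run the block once; return wall time, CPU time (user + system), and memory deltas.
def measure
  GC.start
  rss_before = rss_kb
  alloc_before = GC.stat(:total_allocated_objects)
  cpu_before = Process.times
  wall_before = Process.clock_gettime(Process::CLOCK_MONOTONIC)

  yield

  wall = Process.clock_gettime(Process::CLOCK_MONOTONIC) - wall_before
  cpu_after = Process.times
  rss_after = rss_kb
  {
    wall_seconds: wall,
    cpu_seconds: (cpu_after.utime - cpu_before.utime) + (cpu_after.stime - cpu_before.stime),
    rss_delta_kb: (rss_before && rss_after) ? rss_after - rss_before : nil,
    allocated_objects: GC.stat(:total_allocated_objects) - alloc_before
  }
end

Dir.mktmpdir do |dir|
  generate_classes(dir, 10)
  # Stand-in workload; the real benchmark would call the extractor here.
  result = measure { ObjectSpace.each_object(Class).count }
  puts JSON.pretty_generate(result)
end
```

An RSS delta taken this way is only a coarse proxy for peak overhead, since it samples before and after rather than during the run.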
6341a20 to 9e875c6
Measures how SymDB extraction running on a background thread impacts a concurrent main-thread workload. Runs three arms (baseline / treatment / baseline_post) and reports the p99 latency ratio as the headline statistic.
Addresses the deferred sub-item of the SymDB performance plan: "Extraction must not block the application's request handling" (requirements.md item 23).
Originally scoped to require Rails; a synthetic non-allocating CPU workload is sufficient since the impact vector is GVL hold time during ObjectSpace traversal.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
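The three-arm design can be sketched roughly as follows. This is an illustration under stated assumptions rather than the benchmark added in this commit: `sample_latencies`, `percentile`, and `run_arm` are hypothetical names, the iteration counts are arbitrary, and a repeated `ObjectSpace` traversal stands in for SymDB extraction on the background thread.

```ruby
# Time `iterations` runs of a fixed CPU-bound, non-allocating loop; return latencies in seconds.
def sample_latencies(iterations)
  Array.new(iterations) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    acc = 0
    i = 0
    while i < 20_000
      acc ^= i
      i += 1
    end
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end
end

# Nearest-rank percentile; `pct` is an integer between 1 and 100.
def percentile(values, pct)
  sorted = values.sort
  rank = (pct * sorted.length / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Run one arm: optionally keep a background thread busy while the main thread is sampled.
# Returns the p99 latency of the main-thread workload.
def run_arm(iterations, background: nil)
  stop = false
  thread = background && Thread.new { background.call until stop }
  latencies = sample_latencies(iterations)
  stop = true
  thread&.join
  percentile(latencies, 99)
end

# Stand-in for extraction (assumption): a full ObjectSpace traversal.
traversal = -> { ObjectSpace.each_object(Module).count }

baseline      = run_arm(200)
treatment     = run_arm(200, background: traversal)
baseline_post = run_arm(200)

puts format("p99 baseline=%.6fs treatment=%.6fs baseline_post=%.6fs ratio=%.2f",
            baseline, treatment, baseline_post, treatment / baseline)
```

Running the baseline again after the treatment arm gives a rough check that any difference comes from the background work and not from drift on the host.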
Contributor
🎉 All green! ❄️ No new flaky tests detected. 🎯 Code Coverage (details) 🔗 Commit SHA: 8d1101b
The benchmark read /proc/self/status to get VmRSS, which doesn't exist on macOS. Test failed on all seven macOS CI configs with Errno::ENOENT @ rb_sysopen - /proc/self/status.
Fall back to `ps -o rss=` on platforms without /proc. Both return RSS in KB. Linux fast-path (file read, no fork) preserved.
Verified VALIDATE_BENCHMARK=true on x86_64-linux-gnu Ruby 3.2.3: both benchmarks pass and produce the same fields. macOS execution to be verified by CI re-run since this host has no macOS.
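A fallback of this shape might look like the sketch below. The method name `rss_kb` and the `-p` argument passed to `ps` are assumptions for illustration; the commit itself only states that `ps -o rss=` is used where /proc is absent.

```ruby
# Resident set size of the current process, in KB.
def rss_kb
  if File.readable?("/proc/self/status")
    # Linux fast path: a plain file read, no fork.
    File.read("/proc/self/status")[/^VmRSS:\s+(\d+)\s+kB/, 1].to_i
  else
    # Platforms without /proc (for example macOS): shell out to ps.
    # `rss=` prints the value without a header line.
    `ps -o rss= -p #{Process.pid}`.strip.to_i
  end
end

puts "RSS: #{rss_kb} KB"
```

The availability check runs on every call so the method stays self-contained; a harness that samples frequently could cache the choice once at startup.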
JIRA: DEBUG-5671
What does this PR do?
Adds a Symbol Database extraction performance benchmark. The harness generates 2500 user-code classes in a tmpdir, requires them, then runs `Extractor#extract_all` once and captures memory + CPU + wall time, writing `symbol_database_extraction-results.json`.
Wires into existing benchmark infrastructure:
- `symbol_database_` prefix in `benchmarks/README.md`
- `symbol_database_extraction.rb` added to the `&other` group in `benchmarks/execution.yml` so it runs in dtr CI
- `spec/datadog/symbol_database/validate_benchmarks_spec.rb` runs the harness with `VALIDATE_BENCHMARK=true` (10 classes) to catch bitrot
Stdlib only (`Process.times`, `/proc/self/status`, `GC.stat`); no new gem deps.
Motivation:
Verifies the SymDB performance requirements: memory overhead < 50 MB and CPU overhead < 5% during extraction. Backlog item "Performance testing" in the symdb project.
Change log entry
None.
Additional Notes:
Local run on Ruby 3.2.3, x86_64-linux: 269 ms wall, 27 MB peak memory overhead. The CPU% emitted by the harness is single-core utilisation (near 100% by construction for a one-shot CPU-bound operation); the `< 5%` requirement is interpretable only when amortized over a long-running process. The harness emits raw CPU time and wall time; the results doc decides PASS/FAIL.
Branched off `symbol-database-upload` (#5431) so the perf measurement targets the same code as the main tracer PR.
How to test the change?