Overhaul set benchmarks: split Immutable / SingleThreaded, add Set.copyOf by dougqh · Pull Request #11721 · DataDog/dd-trace-java

dougqh · 2026-06-23T21:46:47Z

What This Does

Overhauls the internal-api set membership benchmarks, mirroring the map-benchmark overhaul (#11679). Replaces the single SetBenchmark with two classes that each pick the correct threading model for their use case (@State scope can't vary by @Param, so one class can't host both):

ImmutableSetBenchmark — fixed, read-only membership shared across threads (@State(Scope.Benchmark); sharing is realistic and contention-free since nothing mutates). Compares array / sortedArray / HashSet / TreeSet / Set.copyOf (the JDK's compact, array-backed ImmutableCollections.SetN, via CollectionUtils.tryMakeImmutableSet — what the agent actually uses for fixed config sets). hit/miss split, per-thread lookup cursor. Sets in the tracer skew strongly toward this shape.
SingleThreadedSetBenchmark — per-thread mutable lifecycle (@State(Scope.Thread)): create/clone + contains/iterate, plus a Collections.synchronizedSet case for the uncontended synchronization tax (each thread owns its set → monitor only ever locked by one thread → the biased-locking story, read across JVM versions). Unsynchronized HashSet is the in-harness baseline.
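The five read-only membership strategies compared by ImmutableSetBenchmark can be sketched with plain JDK collections. This is an illustration, not the benchmark itself: the class name `MembershipStrategies`, the sample keys, and the direct call to `Set.copyOf` (instead of `CollectionUtils.tryMakeImmutableSet`) are assumptions made for the example, and it needs Java 10 or newer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

final class MembershipStrategies {
  // Hypothetical sample keys; the real benchmark uses its own STRINGS array.
  static final String[] KEYS = {"service", "env", "version", "host", "region"};

  static final String[] SORTED = sortedCopy(KEYS);
  static final Set<String> HASH_SET = new HashSet<>(Arrays.asList(KEYS));
  static final Set<String> TREE_SET = new TreeSet<>(Arrays.asList(KEYS));
  // Assumption: Set.copyOf stands in for CollectionUtils.tryMakeImmutableSet (Java 10+).
  static final Set<String> COPY_OF = Set.copyOf(Arrays.asList(KEYS));

  static String[] sortedCopy(String[] in) {
    String[] out = in.clone();
    Arrays.sort(out);
    return out;
  }

  // array: linear scan with equals
  static boolean containsArray(String key) {
    for (String k : KEYS) {
      if (k.equals(key)) {
        return true;
      }
    }
    return false;
  }

  // sortedArray: binary search over a pre-sorted copy
  static boolean containsSortedArray(String key) {
    return Arrays.binarySearch(SORTED, key) >= 0;
  }

  // True when all five strategies give the same answer for the key.
  static boolean allAgree(String key) {
    boolean expected = containsArray(key);
    return expected == containsSortedArray(key)
        && expected == HASH_SET.contains(key)
        && expected == TREE_SET.contains(key)
        && expected == COPY_OF.contains(key);
  }

  public static void main(String[] args) {
    System.out.println("hit agrees:  " + allAgree("env"));
    System.out.println("miss agrees: " + allAgree("not-a-key"));
  }
}
```

Because nothing mutates these structures after construction, they can be shared freely across threads, which is the property the `Scope.Benchmark` choice relies on.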

Motivation

The old SetBenchmark used a shared mutable rotation counter under @Threads(8) (turning fast structures into a contention measurement) and had a contains_treeSet that actually queried HASH_SET. The split fixes both, and the Set.copyOf case answers a real question: for our ~10 fixed static final HashSet config sites, is the JDK's compact immutable set better than HashSet on speed/footprint?
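The contention problem described above comes from every thread bumping one shared counter. A minimal sketch of the alternative, a lookup cursor owned by a single thread, might look like the following. The class name `LookupCursor` and its shape are hypothetical; in a JMH harness an object like this would presumably live in per-thread state, so its index needs no atomics or locks.

```java
final class LookupCursor {
  private final String[] keys;
  // Plain int: safe only because each thread gets its own LookupCursor instance.
  private int index;

  LookupCursor(String[] keys) {
    if (keys.length == 0) {
      throw new IllegalArgumentException("keys must not be empty");
    }
    this.keys = keys;
  }

  // Returns the next key to look up, wrapping around at the end of the array.
  String next() {
    String key = keys[index];
    index = (index + 1 == keys.length) ? 0 : index + 1;
    return key;
  }

  public static void main(String[] args) {
    LookupCursor cursor = new LookupCursor(new String[] {"a", "b", "c"});
    for (int i = 0; i < 4; i++) {
      System.out.println(cursor.next());
    }
  }
}
```

With one cursor per thread, the measured cost is the `contains` call itself rather than cache-line traffic on a shared counter.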

Additional Notes

Run at default JVM flags, across versions. Set.copyOf only materializes the compact SetN on Java 10+ (falls back to HashSet pre-10); the synchronizedSet biased-locking delta shows across Java 11 → 17. Result blocks are intentionally empty pending a fresh multi-JVM run.
StringIndex (Add StringIndex: a generic open-addressed string set #11660) rows fold into these later — kept out so this lands independent of that data structure.
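The version-dependent behavior of `Set.copyOf` noted above can be illustrated with a reflective lookup that falls back to `HashSet` when the method is absent. This is only a sketch of the general technique: `ImmutableSetFallback` and `make` are hypothetical names, and the actual `CollectionUtils.tryMakeImmutableSet` implementation may differ.

```java
import java.lang.reflect.Method;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

final class ImmutableSetFallback {
  // Resolved once; null when running on a JDK without Set.copyOf (pre-Java 10).
  private static final Method COPY_OF = lookupCopyOf();

  private static Method lookupCopyOf() {
    try {
      return Set.class.getMethod("copyOf", Collection.class);
    } catch (NoSuchMethodException e) {
      return null;
    }
  }

  @SuppressWarnings("unchecked")
  static <T> Set<T> make(Collection<? extends T> source) {
    if (COPY_OF != null) {
      try {
        // Static interface method, so the receiver argument is null.
        return (Set<T>) COPY_OF.invoke(null, source);
      } catch (ReflectiveOperationException | RuntimeException e) {
        // For example, Set.copyOf rejects null elements; fall through to HashSet.
      }
    }
    // Assumption for this sketch: the fallback is a plain (mutable) HashSet.
    return new HashSet<>(source);
  }

  public static void main(String[] args) {
    Set<String> set = make(java.util.Arrays.asList("env", "service", "env"));
    System.out.println(set.getClass().getName() + " size=" + set.size());
  }
}
```

Printing the runtime class on different JDKs is a quick way to confirm which representation a given benchmark run actually measured.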

🤖 Generated with Claude Code

…pyOf Mirror the map-benchmark overhaul for sets. Replace the single SetBenchmark (shared mutable counter under @Threads(8); contains_treeSet bug that queried HASH_SET) with two classes that each pick the right threading model: - ImmutableSetBenchmark: fixed read-only membership shared across threads (@State(Scope.Benchmark)); array / sortedArray / HashSet / TreeSet / Set.copyOf (the JDK compact SetN the agent actually uses for config sets, via CollectionUtils.tryMakeImmutableSet). hit/miss split, per-thread cursor. - SingleThreadedSetBenchmark: per-thread mutable lifecycle (@State(Scope.Thread)); create/clone + contains/iterate, plus a Collections.synchronizedSet case for the uncontended synchronization tax (per-thread => bias never revoked; biased-locking story across JVMs). StringIndex rows fold in later. Result blocks empty pending a fresh multi-JVM run. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…edMapBenchmark StringIndex's benchmark integration is moving to the dedicated benchmark PRs (set overhaul #11721, map overhaul #11679) and will be folded in there later. Revert both benchmark files to master so this PR is purely the StringIndex data structure + tests. Avoids the #11679/#11721 deletions-vs-edits conflicts too. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

dd-octo-sts · 2026-06-23T22:09:58Z

🟢 Java Benchmark SLOs — All performance SLOs passed

Suite	Status
Startup	🟢 pass

SLO thresholds are defined here based on automatically generated metrics. A warning is raised when results are within 5% of the threshold.

PR vs. master results

Scenario	Candidate	master	Δ (95% CI of mean)
startup:insecure-bank:iast:Agent	14.02 s	13.94 s	[-0.4%; +1.4%] (no difference)
startup:insecure-bank:tracing:Agent	12.90 s	12.96 s	[-1.2%; +0.4%] (no difference)
startup:petclinic:appsec:Agent	16.79 s	16.70 s	[-0.3%; +1.4%] (no difference)
startup:petclinic:iast:Agent	16.84 s	16.88 s	[-0.9%; +0.5%] (no difference)
startup:petclinic:profiling:Agent	16.70 s	16.35 s	[-2.6%; +6.8%] (no difference)
startup:petclinic:sca:Agent	16.88 s	16.77 s	[-0.4%; +1.8%] (no difference)
startup:petclinic:tracing:Agent	15.96 s	16.14 s	[-1.9%; -0.3%] (maybe better)

Commit: e1a6de34 · CI Pipeline · Benchmarking Platform UI

Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion.

ImmutableSetBenchmark: HashSet fastest; Set.copyOf (SetN) ~10% behind on hit, the compact form the agent uses for fixed config sets. SingleThreadedSetBenchmark: uncontended synchronizedSet tax ~37% on contains (biased locking off, Java 17), near-zero on iterate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

dd-octo-sts · 2026-06-24T14:39:36Z

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

bric3 · 2026-06-24T14:42:06Z

+    Arrays.sort(sortedArray);
+    hashSet = new HashSet<>(Arrays.asList(STRINGS));
+    treeSet = new TreeSet<>(Arrays.asList(STRINGS));
+    copyOfSet = CollectionUtils.tryMakeImmutableSet(Arrays.asList(STRINGS));


suggestion: Maybe a better name instead of copyOf, e.g. tracerImmutableSet

Yeah, I debated that. It was so named because of the underlying mechanism, but tracerImmutableSet might be better.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5a14e62e10

chatgpt-codex-connector · 2026-06-24T14:43:33Z

+  public Set<String> clone_synchronizedSet() {
+    return Collections.synchronizedSet(new HashSet<>(hashSet));


Clone the synchronized set under test

In the clone benchmark for the synchronized-set variant, this copies the plain hashSet instead of the prebuilt synchronizedSet, unlike the other clone benchmarks that copy their corresponding structure. This means the reported clone_synchronizedSet result only measures a HashSet copy plus wrapper allocation, not the cost of cloning the synchronized-set variant that the benchmark name and setup imply.

Per bric3 review: copyOf named the mechanism; tracerImmutableSet names the role (the agent's fixed-config-set representation, i.e. Set.copyOf / SetN). Prose keeps the Set#copyOf reference for the mechanism. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…n one) Per Codex review: it copied `hashSet`, unlike the other clone_* methods which copy their own structure. Copy `synchronizedSet` so it faithfully measures cloning the synchronized variant. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
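The before and after shape of this fix can be sketched as follows. The field and method names mirror the ones quoted in the review, but the surrounding class `CloneSynchronizedSketch` and the sample contents are assumptions made for illustration, and the JMH annotations are omitted.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class CloneSynchronizedSketch {
  // Stand-ins for the benchmark's prebuilt state (hypothetical contents).
  final Set<String> hashSet = new HashSet<>(Arrays.asList("a", "b", "c"));
  final Set<String> synchronizedSet = Collections.synchronizedSet(new HashSet<>(hashSet));

  // Before: copies the plain HashSet, then wraps the copy.
  Set<String> cloneBefore() {
    return Collections.synchronizedSet(new HashSet<>(hashSet));
  }

  // After: copies from the synchronized wrapper, matching the other clone_* methods,
  // which each copy their own structure.
  Set<String> cloneAfter() {
    return Collections.synchronizedSet(new HashSet<>(synchronizedSet));
  }

  public static void main(String[] args) {
    CloneSynchronizedSketch s = new CloneSynchronizedSketch();
    System.out.println(s.cloneBefore().equals(s.cloneAfter()));
  }
}
```

Both variants produce sets with the same contents; the difference is only which source structure the copy reads through, which is what the benchmark name is meant to describe.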

dougqh marked this pull request as ready for review June 24, 2026 14:39

dougqh requested a review from a team as a code owner June 24, 2026 14:39

dougqh requested a review from PerfectSlayer June 24, 2026 14:39

dd-octo-sts Bot added the tag: ai generated Largely based on code generated by an AI or LLM label Jun 24, 2026

dougqh added comp: core Tracer core type: refactoring tag: performance Performance related changes labels Jun 24, 2026

bric3 approved these changes Jun 24, 2026

chatgpt-codex-connector Bot reviewed Jun 24, 2026

dougqh and others added 2 commits June 24, 2026 13:52

dougqh wants to merge 4 commits into master from dougqh/set-benchmark
