bench(sirun): benchmark harness foundation #8721
Draft
BridgeAR wants to merge 4 commits into
Lower the report threshold from 5% to 2%. The analyzer only flags a change when its confidence interval is entirely outside the threshold, so a tight (low variance) interval is required to surface a 2% move while noisy intervals stay under the bar. Also add BENCHMARK_DASHBOARD_URL (empty by default) which the reporter renders as a link in the PR comment when set.
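The flagging rule described above can be sketched as follows. This is a minimal illustration, not the analyzer's code: the function name `classifyChange`, its return labels, and the assumption that the confidence interval is expressed as a relative change in percent are all hypothetical.

```javascript
// Sketch only: `classifyChange` and its labels are illustrative names,
// and the interval is assumed to be a relative change in percent.
function classifyChange (ciLowPct, ciHighPct, thresholdPct = 2) {
  if (ciLowPct > ciHighPct) {
    throw new RangeError('ciLowPct must not exceed ciHighPct')
  }
  // Flag only when the whole interval sits beyond the threshold.
  if (ciLowPct > thresholdPct) return 'increase'
  if (ciHighPct < -thresholdPct) return 'decrease'
  // Any overlap with [-threshold, +threshold] stays unreported.
  return 'within-threshold'
}

console.log(classifyChange(2.4, 3.1)) // tight interval above 2%
console.log(classifyChange(-1, 6)) // noisy interval straddling the bar
console.log(classifyChange(2.4, 3.1, 5)) // same interval under the old 5% bar
```

Under this reading, a noisy interval such as [-1%, 6%] is never reported even though its midpoint exceeds 2%, which matches the stated intent of the lower threshold.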
…uite
With BENCHMARKS_FROM=candidate the baseline runs this PR's benchmark code on the older source. A baseline failure is skipped only when the same variant passed on the candidate run, confirming the failure is specific to the older source rather than a broken benchmark. The run still fails when the PR also changes non-benchmark source (docs, CODEOWNERS, CI config and tests excluded) -- the A/B comparison is then incomplete and the benchmark should land on its own first.
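The decision for a variant that failed on the baseline might look like the sketch below. The real logic lives in runall.sh; the function name, the outcome labels, and every path pattern here are assumptions made for illustration only.

```javascript
// Hypothetical path patterns: the real benchmark and exclusion lists may differ.
const BENCHMARK_PATH = /^benchmark\//
const EXCLUDED_PATHS = [
  /^docs\//, // docs
  /(^|\/)CODEOWNERS$/, // CODEOWNERS
  /^\.gitlab\//, // CI config
  /^\.github\//, // CI config
  /(^|\/)test\//, // tests
  /\.spec\.js$/ // tests
]

// Called only for a variant that already failed on the baseline run.
function baselineFailureOutcome ({ candidatePassed, changedFiles }) {
  // Failing on both sides points at a broken benchmark, not the older source.
  if (!candidatePassed) return 'fail-broken-benchmark'

  const touchesSource = changedFiles.some(file =>
    !BENCHMARK_PATH.test(file) && !EXCLUDED_PATHS.some(re => re.test(file))
  )
  // Source changes alongside a new benchmark leave the A/B comparison incomplete.
  return touchesSource ? 'fail-incomplete-comparison' : 'skip'
}

console.log(baselineFailureOutcome({
  candidatePassed: true,
  changedFiles: ['benchmark/sirun/startup-guard.js', '.github/CODEOWNERS']
}))
```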
Overall package size

Self size: 6.06 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| import-in-the-middle | 3.0.1 | 82.56 kB | 817.39 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| dc-polyfill | 0.1.11 | 25.74 kB | 25.74 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe
Benchmarks

Benchmark execution time: 2026-05-31 19:35:56
Comparing candidate commit b1d2306 in PR branch
Found 1 performance improvements and 1 performance regressions! Performance is the same for 1485 metrics, 106 unstable metrics.
scenario:plugin-bluebird-with-tracer-20
scenario:scope-manager-scope_enabled-20
…hip)
Foundation the per-bench PRs stack on:
1. startup-guard.js: the load-vs-loop share assertion every bench calls.
2. runall.sh: auto-shard from variant count x available cores, failing with the exact SPLITS to configure.
3. CODEOWNERS: route each benchmark directory to its owning team.
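The auto-shard check in item 2 could work roughly as sketched here. This is not the runall.sh implementation: it assumes one variant per core per shard, and the function names and message wording are hypothetical. Only the SPLITS variable name comes from the commit message.

```javascript
// Assumption: each shard runs at most `cores` variants, one per core.
function requiredSplits (variantCount, cores) {
  if (!Number.isInteger(variantCount) || variantCount < 0) {
    throw new RangeError('variantCount must be a non-negative integer')
  }
  if (!Number.isInteger(cores) || cores < 1) {
    throw new RangeError('cores must be a positive integer')
  }
  return Math.max(1, Math.ceil(variantCount / cores))
}

// Fails with the exact value to configure when the matrix is too small.
function checkSplits (variantCount, cores, configuredSplits) {
  const needed = requiredSplits(variantCount, cores)
  if (configuredSplits < needed) {
    throw new Error(
      `${variantCount} variants on ${cores} cores need SPLITS=${needed}, ` +
      `but the matrix provides ${configuredSplits}`
    )
  }
  return needed
}

console.log(requiredSplits(10, 4)) // 3
```

Putting the required value in the error text means a failing job tells the author what to configure without any manual arithmetic.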
Their benches still run on this branch, so narrowing the PLUGINS install set must pair with removing the benches -- which happens in the integrations PR, not the harness. Restores the master plugin list to avoid a missing-module failure.
Summary
Foundation for the split-up sirun benchmark work; the per-bench PRs stack on this and should merge after it.
- startup-guard.js: the load-vs-loop share assertion every bench calls.
- runall.sh: tolerate a new benchmark failing on the older baseline source (only when it passed on the candidate), and auto-shard from variant count × available cores with an exact fix message when the matrix is too small.
- .gitlab/benchmarks.yml: report changes at 2% instead of 5%, and add the dashboard-link variable.
- .github/CODEOWNERS: route each benchmark directory to its owning team.

Test plan