Skip to content

fix: preserve hand-written notes in query benchmark reports#551

Merged
carlos-alm merged 2 commits into
mainfrom
fix/query-benchmark-notes-preservation
Mar 20, 2026
Merged

fix: preserve hand-written notes in query benchmark reports#551
carlos-alm merged 2 commits into
mainfrom
fix/query-benchmark-notes-preservation

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

  • Add <!-- NOTES_START --> / <!-- NOTES_END --> preservation to update-query-report.js, matching the existing pattern in update-benchmark-report.js
  • Add footnote explaining that 3.3.0 → 3.3.1 fnDeps/fnImpact deltas are not comparable due to hub target change (PR perf: reduce query latency regression from 3.1.4 to 3.3.0 #528 pinned hub from auto-selected src/types.ts to buildGraph)

Context

PR #548 showed ↑157-192% regressions in fnDeps/fnImpact — but this is an artifact of the hub target changing, not an engine regression. diffImpact actually improved 20-44% in the same release.

Without the notes preservation fix, adding a footnote to the report would be overwritten on the next benchmark run.

Test plan

  • Verify update-query-report.js preserves notes between <!-- NOTES_START --> and <!-- NOTES_END -->
  • Verify footnote renders correctly in QUERY-BENCHMARKS.md

Add NOTES_START/NOTES_END block preservation to update-query-report.js,
matching the existing pattern in update-benchmark-report.js. Without
this, any hand-written footnotes in QUERY-BENCHMARKS.md get overwritten
on the next benchmark run.

Add note explaining that 3.3.0→3.3.1 fnDeps/fnImpact deltas are not
comparable due to hub target change from auto-selected src/types.ts to
pinned buildGraph (PR #528).
@claude
Copy link
Copy Markdown

claude Bot commented Mar 20, 2026

Claude finished @carlos-alm's task —— View job


Root Cause Analysis

Problem Being Solved: The update-query-report.js script was overwriting hand-written footnotes in QUERY-BENCHMARKS.md on every benchmark run, losing valuable context about non-comparable version deltas (specifically 3.3.0→3.3.1 hub target changes).

Why This Approach: Mirrors the existing pattern in update-benchmark-report.js:261-272 which already preserves / blocks. This is a proven, consistent solution across the benchmark reporting ecosystem.

Risk Assessment: Very low risk - this is a pure additive change that preserves existing functionality while adding note preservation capability.

Backlog Compliance

  • Zero-dep: ✓ — No new dependencies, uses existing Node.js fs APIs
  • Foundation-aligned: ✓ — Improves documentation quality without violating any principles
  • Problem-fit: 3 — Useful for maintaining benchmark context but doesn't directly address core "lost AI" problem
  • Breaking: No — Purely additive, existing behavior unchanged
  • Tier: 1

Code Quality Assessment

Pattern Consistency

The implementation exactly matches the established pattern in update-benchmark-report.js:261-272:

  • Same regex: /\n([\s\S]*?)/
  • Same preservation logic and variable naming
  • Same conditional output formatting
  • Correctly placed between latest results and JSON data comment

Implementation Quality

// lines 149-160 in update-query-report.js
let notes = '';
if (fs.existsSync(reportPath)) {
    const existing = fs.readFileSync(reportPath, 'utf8');
    const notesMatch = existing.match(/\n([\s\S]*?)/);
    if (notesMatch) {
        notes = notesMatch[1];
    }
}
if (notes) {
    md += `\n${notes}\n\n`;
}

The code is:

  • Safe: Gracefully handles missing files and no-match cases
  • Correct: Preserves exact whitespace and formatting within notes
  • Efficient: Only reads file once, already checked for existence above in the function

Footnote Content Quality

The added footnote is well-written and factually accurate:

  • Clearly explains the 157-192% apparent regression is measurement artifact, not performance degradation
  • References specific PR perf: reduce query latency regression from 3.1.4 to 3.3.0 #528 that caused the hub target change
  • Notes that diffImpact actually improved 20-44% in same release
  • Sets expectation for future version comparisons being stable

Critical Concerns

None identified. This is a straightforward, low-risk fix that follows established patterns.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: Clean implementation that follows existing patterns, solves a real problem (lost documentation), and includes valuable context about benchmark interpretation. The only reason it's not 5⭐ is that it's a relatively minor maintenance fix rather than substantial feature work.

Minor Suggestions

  1. Consider adding a comment in the code similar to update-benchmark-report.js about the purpose: // ── Preserve hand-written notes from existing file ────────────────────
  2. The footnote could potentially be shortened, but the current length provides helpful context for future readers

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 20, 2026

Greptile Summary

This PR adds two closely related improvements: notes preservation to update-query-report.js (matching the existing pattern in update-benchmark-report.js), and a hand-written footnote to QUERY-BENCHMARKS.md explaining that the 3.3.1 fnDeps/fnImpact regression spike is an artifact of the hub target change (PR #528) rather than an engine regression.

  • scripts/update-query-report.js: Adds a notes-preservation block that reads <!-- NOTES_START -->/<!-- NOTES_END --> from the existing report before overwriting it, then re-injects the captured content into the freshly generated file. The logic is an exact copy of the pattern already in update-benchmark-report.js (lines 261–272), keeping the two scripts consistent.
  • generated/benchmarks/QUERY-BENCHMARKS.md: Adds the first use of the new preservation mechanism — a single-line note warning readers that the 3.3.1 fnDeps/fnImpact deltas are not engine-comparable due to the hub target pin, while noting that diffImpact actually improved 20–44% in the same release.
  • The note will survive future benchmark runs because the script now reads it back before regenerating the file.
  • No logic changes to benchmarking, regression detection, or history management.

Confidence Score: 5/5

  • Safe to merge — straightforward, additive change with no impact on production logic.
  • The notes-preservation logic is a direct copy of the already-proven pattern in update-benchmark-report.js, with no changes to benchmarking, regression detection, or history management. The markdown footnote is documentation-only. No risk of data loss or behavioral regression.
  • No files require special attention.

Important Files Changed

Filename Overview
scripts/update-query-report.js Adds notes-preservation block (lines 149–160) that reads back hand-written notes from the existing file before overwriting it; implementation is a direct copy of the pattern in update-benchmark-report.js and is correct.
generated/benchmarks/QUERY-BENCHMARKS.md Adds a NOTES_START/END block with a footnote explaining why the 3.3.1 fnDeps/fnImpact deltas are not directly comparable to 3.3.0 due to the hub target change in PR #528.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Run update-query-report.js] --> B{reportPath exists?}
    B -- Yes --> C[Read existing file]
    B -- No --> E[notes = empty string]
    C --> D{Match NOTES_START...NOTES_END?}
    D -- Yes --> F[Capture notes content]
    D -- No --> E
    F --> G{notes non-empty?}
    E --> G
    G -- Yes --> H[Append NOTES_START block to md]
    G -- No --> I[Skip notes block]
    H --> J[Append QUERY_BENCHMARK_DATA block]
    I --> J
    J --> K[Write QUERY-BENCHMARKS.md]
Loading

Last reviewed commit: "fix: preserve hand-w..."

@carlos-alm carlos-alm merged commit a792652 into main Mar 20, 2026
15 checks passed
@carlos-alm carlos-alm deleted the fix/query-benchmark-notes-preservation branch March 20, 2026 10:01
@github-actions github-actions Bot locked and limited conversation to collaborators Mar 20, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant