Clarify PR-B cache and measurement telemetry by ictechgy · Pull Request #126 · ictechgy/context-guard

ictechgy · 2026-06-04T04:17:26Z

Summary

Bump audit feasibility schema to contextguard.metric-feasibility.v1.1 for additive cache-friendliness fields.
Emit explicit cache-friendliness overlap telemetry and partial confidence when prefix/tail windows overlap or evidence is partial.
Memoize build_cache_friendliness through shared audit output paths to avoid redundant recomputation.
Add regressions for cache-discount recommendations staying separate from token reduction.
Clarify benchmark/provider-cache caveats and the guidance that the example report is shape-only.
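
For readers unfamiliar with the overlap case, here is a minimal sketch of how prefix/tail window overlap could be detected and memoized. It is not the code in this PR: `OverlapTelemetry`, `overlap_telemetry`, the field names, and the use of `functools.lru_cache` are all assumptions made for illustration, and the real `build_cache_friendliness` may compute and cache its result differently.

```python
from functools import lru_cache
from typing import NamedTuple


class OverlapTelemetry(NamedTuple):
    """Hypothetical telemetry shape; not the real audit output schema."""
    prefix_records: int   # records actually covered by the prefix window
    tail_records: int     # records actually covered by the tail window
    overlap_records: int  # records counted by both windows
    confidence: str       # "observed" or "partial"


@lru_cache(maxsize=None)  # assumed memoization strategy, keyed on the arguments
def overlap_telemetry(record_count: int, prefix_window: int, tail_window: int) -> OverlapTelemetry:
    # Clamp each window to the number of records that exist.
    prefix = min(prefix_window, record_count)
    tail = min(tail_window, record_count)
    # The windows overlap when together they cover more records than exist.
    overlap = max(0, prefix + tail - record_count)
    # This sketch only downgrades confidence for the overlap case.
    confidence = "partial" if overlap > 0 else "observed"
    return OverlapTelemetry(prefix, tail, overlap, confidence)
```

With 5 records and two windows of 4, the prefix covers records 0 to 3 and the tail covers records 1 to 4, so 3 records are shared and the sketch reports partial confidence. A repeated call with the same arguments is served from the cache rather than recomputed.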

Verification

python3 -m unittest discover -s tests -p 'test_context_guard_kit.py' -k cache_friendliness
python3 -m unittest discover -s tests -p 'test_context_guard_kit.py' -k transcript_audit
python3 -m unittest discover -s tests -p 'test_context_guard_kit.py' -k benchmark
python3 -m unittest discover -s tests -p 'test_context_guard_kit.py' -k benchmark_report
python3 -m unittest discover -s tests -p 'test_context_guard_kit.py' -k benchmark_runner_rejects_incompatible_existing_csv_schema
python3 -m py_compile context-guard-kit/claude_transcript_cost_audit.py plugins/context-guard/bin/context-guard-audit context-guard-kit/benchmark_runner.py plugins/context-guard/bin/context-guard-bench tests/test_context_guard_kit.py
cmp -s context-guard-kit/claude_transcript_cost_audit.py plugins/context-guard/bin/context-guard-audit
cmp -s context-guard-kit/benchmark_runner.py plugins/context-guard/bin/context-guard-bench
python3 -m json.tool docs/benchmark-report.example.json >/dev/null
git diff --check
python3 scripts/release_smoke.py
python3 scripts/prepublish_check.py

Notes

G010 PR-B only: no artifact receipt, diet scanner, or context packer follow-ups included.
OMX team startup was attempted first for G002 but produced no worker completion evidence; fallback was recorded in the ultragoal ledger and implementation was completed with native sidecar review.

Constraint: G010 PR-B scope is limited to PR #118/#122 cache and measurement interpretation follow-ups.
Rejected: Changing benchmark claim semantics beyond provider-cache caveats | Current behavior already separates paired token/cost claims and only needed clearer guardrails.
Confidence: high
Scope-risk: moderate
Directive: Treat provider cache fields as diagnostic telemetry; do not fold cache discounts into token-reduction claims.
Tested: python3 -m unittest discover -s tests -p 'test_context_guard_kit.py' -k cache_friendliness; python3 -m unittest discover -s tests -p 'test_context_guard_kit.py' -k transcript_audit; python3 -m unittest discover -s tests -p 'test_context_guard_kit.py' -k benchmark; python3 -m unittest discover -s tests -p 'test_context_guard_kit.py' -k benchmark_report; python3 -m unittest discover -s tests -p 'test_context_guard_kit.py' -k benchmark_runner_rejects_incompatible_existing_csv_schema; python3 -m py_compile context-guard-kit/claude_transcript_cost_audit.py plugins/context-guard/bin/context-guard-audit context-guard-kit/benchmark_runner.py plugins/context-guard/bin/context-guard-bench tests/test_context_guard_kit.py; cmp -s context-guard-kit/claude_transcript_cost_audit.py plugins/context-guard/bin/context-guard-audit; cmp -s context-guard-kit/benchmark_runner.py plugins/context-guard/bin/context-guard-bench; python3 -m json.tool docs/benchmark-report.example.json; git diff --check; python3 scripts/release_smoke.py; python3 scripts/prepublish_check.py
Not-tested: GitHub CI before PR creation

Constraint: PR-B bumps feasibility JSON to contextguard.metric-feasibility.v1.1 while existing Mac consumer enforces exact schema compatibility.
Rejected: Accept every future v1.x schema automatically | unsupported future additive fields still need an explicit compatibility decision.
Confidence: high
Scope-risk: narrow
Directive: Update contextGuardSupportedFeasibilitySchemaVersions whenever the audit feasibility contract gets another compatible minor version.
Tested: swift package clean && swift test; PR-B targeted cache_friendliness/transcript_audit/benchmark unittest slices; py_compile/cmp/json/diff-check; release_smoke.py; prepublish_check.py.
Not-tested: Live macOS app UI manual launch.
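
The explicit allow-list described in this commit message can be sketched as follows. The actual consumer is a Swift package, so this Python analogue is illustrative only; `SUPPORTED_FEASIBILITY_SCHEMA_VERSIONS`, `is_supported_schema`, and the `schema_version` key are hypothetical names, and the real list presumably also contains earlier versions that are not shown here.

```python
# Hypothetical analogue of contextGuardSupportedFeasibilitySchemaVersions.
# Only the version named in this PR is listed; earlier entries are omitted.
SUPPORTED_FEASIBILITY_SCHEMA_VERSIONS = frozenset({
    "contextguard.metric-feasibility.v1.1",
})


def is_supported_schema(payload: dict) -> bool:
    """Accept a payload only if its schema version is explicitly listed."""
    # "schema_version" is an assumed key name for this sketch.
    version = payload.get("schema_version")
    # Exact membership, not a "v1." prefix match: a future minor version
    # is rejected until someone adds it to the allow-list on purpose.
    return isinstance(version, str) and version in SUPPORTED_FEASIBILITY_SCHEMA_VERSIONS
```

Under this design a made-up future version such as `contextguard.metric-feasibility.v1.2` would be rejected until it is added deliberately, which matches the rejected alternative of accepting every v1.x schema automatically.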

Constraint: PR-B review found status=partial could still report observed confidence for low-record cache-friendliness evidence.
Rejected: Leave as advisory-only MEDIUM | PR-B explicitly owns partial-evidence confidence framing, so fixing now reduces claim ambiguity.
Confidence: high
Scope-risk: narrow
Directive: Keep cache_friendliness status and confidence semantics aligned when adding new partial-evidence conditions.
Tested: unittest cache_friendliness 10 tests; unittest transcript_audit 41 tests; unittest benchmark 21 tests; swift test; py_compile/cmp/json/diff-check; release_smoke.py; prepublish_check.py 392 tests.
Not-tested: Manual transcript review in a live Claude Code session.
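
One way to keep status and confidence aligned, as the directive above asks, is to derive both from a single list of partial-evidence conditions. The sketch below assumes this approach; `classify_evidence`, the `min_records` threshold of 3, the reason strings, and the `"ok"` status value are hypothetical and do not come from the audit code.

```python
def classify_evidence(record_count: int, windows_overlap: bool, min_records: int = 3) -> dict:
    """Derive status and confidence from one shared list of reasons."""
    reasons = []
    if record_count < min_records:   # assumed low-record threshold
        reasons.append("low_record_count")
    if windows_overlap:
        reasons.append("window_overlap")
    # Both fields read the same list, so a new partial-evidence condition
    # appended above can never leave confidence at "observed".
    partial = bool(reasons)
    return {
        "status": "partial" if partial else "ok",  # "ok" is a placeholder value
        "confidence": "partial" if partial else "observed",
        "partial_reasons": reasons,
    }
```

Because the two fields share one source of truth, the mismatch found in review (partial status with observed confidence) cannot occur in this shape.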

ictechgy added 3 commits June 4, 2026 13:04

ictechgy merged commit 677a0e5 into main Jun 4, 2026
3 checks passed

ictechgy deleted the feature/g010-pr-b-cache-measurement-hardening branch June 4, 2026 05:17

ictechgy merged 3 commits into main from feature/g010-pr-b-cache-measurement-hardening
