Mutation testing, cost visibility, CI, and docs #40
- storage_json: tests for refresh_summary, get_event_stats (empty/single/by_repo), prune boundary
- docs/mutation-testing.md: how to run, equivalent mutants, pre-push vs CI, file globs
- CI: mutation job on *storage_json* with timeout and missed-count baseline (15)
- README/CLAUDE: pre-push vs CI mutation note; optional --no-verify
- Analytics: show est. cost (reviews) from useEventStats when available
- web: cost.test.ts for formatCost, estimateCost, totalCost

Made-with: Cursor
PR Summary
Medium Risk
Overview
Improves GitHub workflow reliability by publishing … Tightens …
Written by Cursor Bugbot for commit 767a059.
    let removed = backend.prune(max_age, 1000).await.unwrap();
    assert_eq!(
        removed, 0,
        "exactly at boundary (now - max_age) should not be pruned"
    );

    let list = backend.list_reviews(10, 0).await.unwrap();
    assert_eq!(list.len(), 1);
    assert_eq!(list[0].id, "boundary");
}
This comment was marked as outdated.
Addressed: extracted `prune_at(max_age_secs, max_count, now_secs)` so the test can pass a single `now` and avoid the second-boundary race. `prune()` calls `prune_at(..., SystemTime::now())`; the boundary test calls `backend.prune_at(max_age, 1000, now).await` with one `now_ts()`.
…available
- publish-image.yml: on push to main, build and push ghcr.io/evalops/diffscope:latest and :sha-<sha>
- diffscope.yml: check image available before Run DiffScope; skip gracefully with notice if pull fails (job no longer fails)
Made-with: Cursor
…ilure never fails job Made-with: Cursor
…re-pull failing job GitHub pre-pulls container actions before any steps run; that pull was failing the job. Run DiffScope via 'docker run' only when image is already available from Check image step. Made-with: Cursor
… single timestamp
- Extract prune_at(max_age_secs, max_count, now_secs) so test and production share logic
- prune() calls prune_at(..., SystemTime::now()); test calls prune_at(..., now_ts()) once
Made-with: Cursor
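The image-availability commits above describe a check step that pulls the image and skips with a notice instead of failing the job. A minimal sketch of such a step body follows. The function name `check_image` and the output key `available` are hypothetical and not taken from diffscope.yml; `GITHUB_OUTPUT` and the `::notice::` workflow command are standard GitHub Actions features.

```bash
# Hypothetical helper: the name and the "available" output key are illustrative.
check_image() {
  local image="$1"
  # Outside GitHub Actions GITHUB_OUTPUT is unset, so fall back to /dev/null.
  local out_file="${GITHUB_OUTPUT:-/dev/null}"

  if docker pull "$image" >/dev/null 2>&1; then
    echo "available=true" >> "$out_file"
    return 0
  fi

  # ::notice:: annotates the run without marking the step as failed.
  echo "::notice::${image} is not available yet; skipping DiffScope"
  echo "available=false" >> "$out_file"
  # Return success on purpose so a failed pull never fails the job.
  return 0
}

# Example call inside a workflow step (assumed image tag from the commit message):
#   check_image ghcr.io/evalops/diffscope:latest
```

A later step could then run `docker run` only when the recorded output is `true`, which keeps the job away from GitHub's automatic pre-pull of container actions.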
    - name: Mutation test (storage_json)
      run: |
        timeout 900 cargo mutants -f '*storage_json*' 2>&1 | tee mutation.log || true
        MISSED=$(grep -E '[0-9]+ missed' mutation.log | tail -1 | grep -oE '[0-9]+' | head -1 || echo "0")
Grep extracts wrong number from cargo-mutants summary
High Severity
The grep chain parses the wrong number from the cargo-mutants summary line. The summary format is e.g. `14 mutants tested in 0:08: 2 missed, 9 caught, 3 unviable`. The first `grep -E '[0-9]+ missed'` matches the full line, then `grep -oE '[0-9]+'` extracts all numbers (14, 0, 08, 2, 9, 3), and `head -1` picks "14" (total mutants) instead of "2" (the actual missed count). This causes `MISSED` to be set to the total mutant count, making the baseline check nearly always fail spuriously.
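One way to address this, sketched under the assumption that the summary line keeps the format quoted above: let the first `grep` emit only the `<N> missed` fragment with `-o`, so the digit match that follows sees a single number. `extract_missed` is a hypothetical helper name, not something defined in the PR.

```bash
# Hypothetical helper: reads a cargo-mutants log on stdin and prints the
# missed count from the last "<N> missed" fragment it finds.
extract_missed() {
  local missed
  # -o keeps only the matched "<N> missed" text, so the second grep cannot
  # pick up the total mutant count or the elapsed time.
  missed=$(grep -oE '[0-9]+ missed' | tail -1 | grep -oE '[0-9]+' || true)
  # Print 0 when no summary fragment was present in the log.
  echo "${missed:-0}"
}

# Example use in the CI step (mutation.log is the file written by tee):
#   MISSED=$(extract_missed < mutation.log)
```

Note that this sketch prints 0 when the log has no summary line, for example if the run was cut off by the timeout, so a caller may want to treat a missing summary as a separate case rather than as a pass.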


Summary
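- docs/mutation-testing.md: how to run, equivalent mutants, pre-push vs CI, file globs with `-f`.
- CI: `mutation` job runs `cargo mutants -f '*storage_json*'` with 15min timeout; fails if missed count > 15 (baseline documented in doc).
- README/CLAUDE: pre-push vs CI mutation note; optional `git push --no-verify` for quick pushes.
- Analytics: show est. cost (reviews) from `useEventStats().total_cost_estimate` when available.
- web: `cost.test.ts` for formatCost, estimateCost (server preferred), totalCost.

Test plan
- `cargo test` (all pass)
- `cd web && npm run test` (all pass)

Made with Cursor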