Generated 2026-03-29 by punishing audit. Updated 2026-04-04 after CRG C blitz.

CRG B achieved 2026-04-04: scanned 6 diverse external repos with real output.
| Target Repo | Language | What Was Tested | Result |
|---|---|---|---|
| gossamer | Gleam/Rust/Idris2 | Full hypatia scan: security, policy, workflow audit | 70 findings (12 CRITICAL, 7 HIGH, 47 MEDIUM) |
| protocol-squisher | Rust (29 crates) | Full hypatia scan: code safety, banned languages | 209 findings (18 CRITICAL, 16 HIGH, 175 MEDIUM) |
| burble | Elixir/ReScript/Idris2 | Full hypatia scan: believe_me detection, deprecated APIs | 59 findings (2 CRITICAL, 5 HIGH, 48 MEDIUM) |
| stapeln | Idris2/Zig/Rust/ReScript | Full hypatia scan: container system, multi-language | 492 findings (34 CRITICAL, 209 HIGH, 249 MEDIUM) |
| boj-server | ReScript/Deno/Idris2 | Full hypatia scan: MCP server, workflow audit | 73 findings (11 CRITICAL, 7 HIGH, 51 MEDIUM) |
| standards | Mixed (Rust/ReScript/Nickel/TS) | Full hypatia scan: monorepo, multi-standard | 163 findings (30 CRITICAL, 46 HIGH, 87 MEDIUM) |
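The per-repo figures can be cross-checked with a little arithmetic. The sketch below copies the counts from the table and computes, for each repo, how many findings fall outside the CRITICAL/HIGH/MEDIUM breakdown, plus the grand totals. The `ScanRow` type and its `unlisted` helper are hypothetical names used only for illustration; they are not part of hypatia.

```rust
/// One row of the results table: total findings plus the three severities it lists.
struct ScanRow {
    repo: &'static str,
    total: u32,
    critical: u32,
    high: u32,
    medium: u32,
}

impl ScanRow {
    /// Findings not covered by the CRITICAL/HIGH/MEDIUM breakdown.
    /// saturating_sub avoids a panic if a row were ever entered inconsistently.
    fn unlisted(&self) -> u32 {
        self.total
            .saturating_sub(self.critical + self.high + self.medium)
    }
}

/// Figures copied verbatim from the results table.
fn rows() -> Vec<ScanRow> {
    vec![
        ScanRow { repo: "gossamer", total: 70, critical: 12, high: 7, medium: 47 },
        ScanRow { repo: "protocol-squisher", total: 209, critical: 18, high: 16, medium: 175 },
        ScanRow { repo: "burble", total: 59, critical: 2, high: 5, medium: 48 },
        ScanRow { repo: "stapeln", total: 492, critical: 34, high: 209, medium: 249 },
        ScanRow { repo: "boj-server", total: 73, critical: 11, high: 7, medium: 51 },
        ScanRow { repo: "standards", total: 163, critical: 30, high: 46, medium: 87 },
    ]
}

fn main() {
    let rows = rows();
    for r in &rows {
        println!("{:<18} total={:>3} unlisted={}", r.repo, r.total, r.unlisted());
    }
    let total: u32 = rows.iter().map(|r| r.total).sum();
    let critical: u32 = rows.iter().map(|r| r.critical).sum();
    println!("all repos: {} findings, {} CRITICAL", total, critical);
}
```

Across the six repos this gives 1066 findings, 107 of them CRITICAL. gossamer, burble and boj-server each have 4 findings outside the listed breakdown; since the scans ran with `--severity low`, these are presumably lower-severity findings, though the table does not say so.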
1. gossamer (Gleam/Rust — window manager)
- Command: `hypatia scan /var/mnt/eclipse/repos/gossamer --format text --severity low`
- Key findings: 7 banned TypeScript test files, 4 `believe_me` in Idris2 ABI proofs, `unwrap_or(0)` in Rust bindings, missing scorecard/dependabot workflows
2. protocol-squisher (Rust — 29-crate workspace)
- Command: `hypatia scan /var/mnt/eclipse/repos/protocol-squisher --format text --severity low`
- Key findings: 9 banned Python files (PyO3 integration), 2 proof holes (Admitted + sorry), 8 `unwrap_or(0)` across crates, 16 `unwrap()` DoS risks in examples/benches
3. burble (Elixir — WebRTC comms)
- Command: `hypatia scan /var/mnt/eclipse/repos/burble --format text --severity low`
- Key findings: 3 `believe_me` in Idris2 ABI, deprecated `Js.Array2`/`Js.Dict`/`Js.String2` in ReScript signaling module, missing scorecard workflow
4. stapeln (Idris2/Zig — container system)
- Command: `hypatia scan /var/mnt/eclipse/repos/stapeln --format text --severity low`
- Key findings: 8 banned TypeScript files, 8 `believe_me` + 5 `unsafePerformIO` across cerro-torre/vordr verification, 14 `getExn` in svalinn ReScript, 15 `unwrap_or(0)` in Rust runtime
5. boj-server (ReScript/Deno — MCP server)
- Command: `hypatia scan /var/mnt/eclipse/repos/boj-server --format text --severity low`
- Key findings: 5 banned TypeScript test files, 6 `believe_me` in safety ABI (SafeHTTP, SafeCORS, etc.), 78 `unwrap()` in cartridge-minter tool
6. standards (Mixed — multi-standard monorepo)
- Command: `hypatia scan /var/mnt/eclipse/repos/standards --format text --severity low`
- Key findings: 4 banned Python files (avow/consent-aware), 7 banned TypeScript (a2ml/axel/k9-svc/lol/rsr-certifier), 7 JSON decode without validation in lol crawlers, HTTP URL in Nickel config
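All six scans use the same invocation shape, so they could be driven from one small runner. The following is a minimal sketch, assuming `hypatia` is on `PATH` and the repos live under the root shown in the commands above. The `scan_args` helper and the `RUN_HYPATIA` opt-in variable are hypothetical and exist only for this example; the subcommand and flags are the ones quoted in this report.

```rust
use std::process::Command;

/// Checkout root taken from the commands listed above.
const REPO_ROOT: &str = "/var/mnt/eclipse/repos";

/// The six external repos scanned for CRG B.
const REPOS: [&str; 6] = [
    "gossamer",
    "protocol-squisher",
    "burble",
    "stapeln",
    "boj-server",
    "standards",
];

/// Builds the argument list for one scan (hypothetical helper).
fn scan_args(repo: &str) -> Vec<String> {
    vec![
        "scan".to_string(),
        format!("{}/{}", REPO_ROOT, repo),
        "--format".to_string(),
        "text".to_string(),
        "--severity".to_string(),
        "low".to_string(),
    ]
}

fn main() {
    // RUN_HYPATIA is an assumed opt-in switch: without it the sketch only
    // prints the commands, so it stays runnable where hypatia is absent.
    let run = std::env::var_os("RUN_HYPATIA").is_some();
    for repo in REPOS.iter() {
        let args = scan_args(repo);
        println!("hypatia {}", args.join(" "));
        if run {
            match Command::new("hypatia").args(&args).status() {
                Ok(status) => println!("  exit: {}", status),
                Err(err) => eprintln!("  could not start hypatia: {}", err),
            }
        }
    }
}
```

Keeping the argument construction in a separate function makes the invocation testable without executing anything.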
| Category | Count | Notes |
|---|---|---|
| Unit tests | 32 | Elixir: recipe_matcher, triangle_router, fleet_dispatcher, neural, vcl, safety, etc. |
| Integration | 7 | Rust: ci_simulation, arangodb, fleet, forge, hooks, registry + adapter/seam tests |
| E2E | 15 | test/e2e_pipeline_test.exs — all 6 pipeline stages wired with real verisim-data |
| P2P (property-based) | 26 | test/p2p_recipe_vql_test.exs — RecipeMatcher invariants, VCL parser totality, dispatch_strategy monotonicity |
| Concurrency | 16 | test/concurrency_test.exs — parallel recipe evaluation, RateLimiter/Quarantine/VCL under load |
| Smoke | 13 | test/zig_ffi_smoke_test.exs — Zig FFI source integrity, data path structure, dispatch strategy spec |
| Benchmarks | 1 file (7 real benches) | fixer/benches/hypatia_bench.rs — catalog construction/lookup, SHA pins, scan result summary, fixer construction |
Source modules: ~66 Elixir modules (15 rules, 7 neural networks, VCL, safety, data, fleet) + Rust adapters + 10 Idris2 ABI + 5 Zig FFI.
Total new tests added (2026-04-04): 70 tests across 4 new files. All pass.
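The figure of 70 is consistent with the category table if the four new files are taken to be the E2E, P2P, Concurrency and Smoke files named there. That mapping is an inference from the table, not something the report states outright. A short sketch of the tally, with benchmarks excluded because they are not tests:

```rust
/// (category, test count, counted as new on 2026-04-04).
/// Counts are copied from the category table; the `true` flags encode the
/// assumption that the four single-file categories are the four new files.
const CATEGORIES: [(&str, u32, bool); 6] = [
    ("Unit", 32, false),
    ("Integration", 7, false),
    ("E2E", 15, true),
    ("P2P (property-based)", 26, true),
    ("Concurrency", 16, true),
    ("Smoke", 13, true),
];

/// Sum of every category in the table.
fn total_tests() -> u32 {
    CATEGORIES.iter().map(|c| c.1).sum()
}

/// Sum of the categories flagged as new.
fn new_tests() -> u32 {
    CATEGORIES.iter().filter(|c| c.2).map(|c| c.1).sum()
}

fn main() {
    println!("total tests: {}", total_tests());
    println!("new tests:   {}", new_tests());
}
```

Under that assumption the new tests sum to 15 + 26 + 16 + 13 = 70, and the suite as a whole to 109.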
Achieved 2026-04-04:
- Unit tests: 32 (existing, Elixir)
- Integration tests: 7 (Rust adapters)
- E2E tests: 15 (all 6 pipeline stages)
- P2P property-based tests: 26 (RecipeMatcher + VCL invariants)
- Concurrency tests: 16 (parallel evaluation, GenServer safety)
- Smoke tests: 13 (Zig FFI, data path, ABI files)
- Benchmarks: real (7 benches in fixer/benches/hypatia_bench.rs)
- No mocks — all tests use real ExUnit contexts + verisim-data on disk
Benchmark note: The root `benches/hypatia_bench.rs` stub is intentional (workspace has no `[package]`). Real benchmarks run via `cargo bench -p hypatia-fixer`.
- `recipe_matcher_test.exs`: 3 failures; `recipe-shell-quote-vars.json` is empty/corrupted in verisim-data
- `triangle_router_test.exs`: 1 failure; depends on the corrupted recipe above
- `safety_test.exs`: 1 flaky failure; RateLimiter enqueue timing under OTP load
- Neural network outputs: property tests for bounded confidence scores (needs mock training data)
- Pattern registry: invariant tests for deduplication correctness at scale
- Fleet dispatch multi-bot round-trip (needs PAT with repo scope)
- Learning scheduler full training cycle (too slow for CI)
- Performance: No in-process latency measurement (Elixir has no criterion equivalent in test suite)
- Error handling: No tests for ArangoDB connection failure (integration dependency)
- Zig FFI runtime: No tests calling compiled .so (Zig build not in CI yet)
- `mix test` runner verification: passes (minus pre-existing data quality failures)
- `cargo test` for fixer: passes
- Workspace `cargo test`: blocked by pre-existing `bincode 3.0.0` compile error
- Zig FFI integration tests: `zig build test` not wired to CI
- Container integration test (compose.test.yaml exists but needs validation)
- Rule module self-validation (15 rules, none self-tested)
- Neural model loading verification
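One of the gaps above, invariant tests for pattern-registry deduplication, can be illustrated without any project code. The sketch below is not hypatia's registry: `dedup_patterns` is a hypothetical stand-in that keeps the first occurrence of each pattern key, and the hand-rolled xorshift generator replaces a property-testing library so the example needs no external crates. It checks three invariants over generated inputs: deduplication is idempotent, leaves no duplicates, and neither loses nor invents a pattern.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for registry deduplication:
/// keep the first occurrence of each key, preserving order.
fn dedup_patterns(patterns: &[String]) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for p in patterns {
        if seen.insert(p.as_str()) {
            out.push(p.clone());
        }
    }
    out
}

/// Small deterministic xorshift generator, used in place of a
/// property-testing crate.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Returns true when all three invariants hold for one input.
fn check_invariants(input: &[String]) -> bool {
    let once = dedup_patterns(input);
    let twice = dedup_patterns(&once);
    let unique: HashSet<&String> = once.iter().collect();
    let original: HashSet<&String> = input.iter().collect();
    once == twice                      // idempotent
        && unique.len() == once.len()  // no duplicates remain
        && unique == original          // nothing lost, nothing invented
}

fn main() {
    let mut rng = XorShift(0x9E37_79B9_7F4A_7C15);
    for case in 0..200 {
        // A key space of 50 against up to 499 entries forces many collisions.
        let len = (rng.next_u64() % 500) as usize;
        let input: Vec<String> = (0..len)
            .map(|_| format!("pattern-{}", rng.next_u64() % 50))
            .collect();
        assert!(check_invariants(&input), "invariant violated in case {}", case);
    }
    println!("all generated cases satisfied the invariants");
}
```

In the Elixir suite the same invariants would more naturally be expressed with a property-testing library against the real registry module; the structure of the checks carries over.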