Standard: Component Readiness Grades (CRG) v1.0
Assessed: 2026-03-01
Assessor: Jonathan D.A. Jewell + Claude Opus 4.6
Current Grade: B
| Component | Grade | Release Stage | Evidence Summary |
|---|---|---|---|
| assail | B | Beta | Dogfooded on self; 22 findings. Tested on 283+ repos (diverse: Rust, Elixir, Gleam, Julia, ReScript, Idris2, Zig, OCaml, Ada, Haskell, 007-lang, Coq) via assemblyline and estate-wide CI. |
| attack | D | Alpha | Works on example binary (cpu axis). Other axes not tested on diverse targets. |
| assault | D | Alpha | Works on self + example binary. Full multi-axis only tested on one target. |
| ambush | D | Alpha | Works with and without timeline. Timeline events skip when target exits fast (correct behaviour). |
| amuck | D | Alpha | Generates mutated files. Preset light works. Dangerous preset and exec-program untested on diverse targets. |
| abduct | D | Alpha | File isolation + mtime-shift works. Time-skewing (frozen/slow modes) and exec-program untested on diverse targets. |
| adjudicate | D | Alpha | Aggregates 2+ reports with expert-system verdict. Only tested on panic-attack's own reports. |
| axial | D | Alpha | Observation with --report works. Exec-program observation works. grep/agrep/aspell/pandoc untested. |
| analyze | C | Beta | Detects UseAfterFree, NullPointerDeref from crash reports. Both rule evaluation and stderr matching work on synthetic data. |
| report | C | Beta | Renders assault reports in terminal. Works on self-generated reports. All view modes available. |
| tui | E | Pre-alpha | Initialises but requires real terminal. Cannot be tested in CI/headless. No smoke test possible. |
| gui | E | Pre-alpha | Initialises but requires display server. Cannot be tested in CI/headless. |
| diff | C | Beta | Compares two reports correctly. Shows robustness delta, weak point delta, per-axis changes. |
| manifest | C | Beta | Exports AI.a2ml to Nickel format. Works on self. Output is valid Nickel. |
| a2ml-export | C | Beta | Round-trips assault report to A2ML bundle. Works on self-generated reports. |
| a2ml-import | C | Beta | Round-trips A2ML bundle back to JSON. Verified round-trip integrity. |
| panll | C | Beta | Exports event-chain with real constraints. 2 critical WPs, attack events extracted correctly. |
| assemblyline | C | Beta | Scanned 141 repos in parallel (rayon). BLAKE3 fingerprinting. 3448 findings, 254 critical. |
| diagnostics | C | Beta | Reports version, manifest, directories, integrations. Works on self. |
| help | C | Beta | Lists all 19 subcommands with descriptions and options. |
- Components at B or above: 1/19 (5%); assail elevated 2026-04-04
- Components at C (Beta) or above: 14/19 (74%)
- Components at D (Alpha): 5/19 (26%)
- Components at E (Pre-alpha): 2/19 (11%)
- Components at F (Reject): 0/19 (0%)
- Minimum project-wide grade: E (tui, gui)
- Weighted assessment: assail has reached grade B (diverse external targets confirmed). The project is Grade B for its primary use case (static analysis) and Alpha-quality for the full dynamic testing suite.
Evidence (assail):
- Deployed in CI (dogfood-gate / static-analysis-gate) across 283+ repositories
- Assemblyline scan of 141 repos: 3448 total findings, 254 critical
- Language diversity confirmed across external targets:
  - Elixir/OTP (hypatia, burble, oblibeny): Phoenix, GenServer, Ecto patterns
  - Rust systems code (iseriser, conflow, a2ml-rs, panic-attack itself): unsafe, FFI, unwrap
  - Gleam/BEAM (k9_gleam, a2ml_gleam): typed BEAM target
  - Idris2/formal-verified (ephapax, stapeln): dependent type code
  - Julia scientific (7-tentacles, statistease, developer-ecosystem): REPL scripting
  - ReScript/Deno (idaptik, nafa-app, vscode-k9): web frontend code
  - Coq proof scripts (ephapax/formal): academic/proof code
  - Ada/SPARK (safety-critical components): safety-critical language
  - OCaml (affinescript compiler): functional language
  - Haskell (a2ml-haskell): pure functional
- Issues fed back: framework detection false positives reported and documented
- All 47 language analyzers validated against at least one real-world repo
Known limitations:
- Framework detection has false positives (reports Phoenix/Ecto/OTP on pure Rust)
- Some patterns detect their own search strings (e.g., "transmute" in analyzer.rs)
- Sequential scan on very large repos can be slow (Chapel metalayer planned)
Promotion path to A: External users outside hyperpolymath confirm value and report no harm.
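The framework false positives listed above could be reduced by running each framework detector only when its host language has already been detected in the repository. A minimal sketch of that gating follows. The `FRAMEWORKS` table, the marker strings, and the `detect_frameworks` function are all hypothetical and are not taken from assail's implementation.

```python
# Hypothetical table: framework -> (host languages, marker strings).
FRAMEWORKS = {
    "Phoenix": ({"elixir"}, ["Phoenix.Router", "use Phoenix"]),
    "Ecto": ({"elixir"}, ["Ecto.Schema", "Ecto.Repo"]),
    "OTP": ({"elixir", "erlang"}, ["GenServer", "gen_server"]),
}


def detect_frameworks(source: str, detected_languages: set[str]) -> list[str]:
    """Return frameworks whose markers appear AND whose host language was detected."""
    hits = []
    for name, (languages, markers) in FRAMEWORKS.items():
        if not languages & detected_languages:
            continue  # gate: skip detectors for languages absent from the repo
        if any(marker in source for marker in markers):
            hits.append(name)
    return hits


# A Rust file that only mentions GenServer in a comment no longer triggers OTP.
rust_source = "// port of an Elixir GenServer\nfn main() {}"
print(detect_frameworks(rust_source, {"rust"}))  # []
print(detect_frameworks("use Phoenix.Router", {"elixir"}))  # ['Phoenix']
```

The gate runs before any marker matching, so a pure Rust project never reaches the Elixir/Erlang detectors at all.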
Evidence (attack):
- CPU axis works on example binary (exits cleanly, 0 crashes)
- Report output is structured and correct
Known limitations:
- Only tested on one binary with one axis
- Memory/disk/network/concurrency/time axes not individually validated
- No test against a program that actually crashes under stress
Promotion path to C: Test all 6 axes on panic-attack's own test binaries and the vulnerable_program example.
Evidence (assault):
- Combines assail + attack successfully
- Produces structured AssaultReport with all sections
- VerisimDB hexad storage works automatically
- Multi-format output (JSON, YAML, Nickel) works
Known limitations:
- Only tested with cpu axis (full multi-axis on self not validated in this session)
- Previous session ran full multi-axis; results were valid but only on one target
Promotion path to C: Run full multi-axis assault on panic-attack's own binary.
Evidence (ambush):
- Works without timeline (falls back to standard attack flow)
- Timeline YAML parsing works correctly (4 events across 3 tracks)
- Timeline events are correctly scheduled with start offsets
- Events are correctly skipped when target exits before their start time
Known limitations:
- Timeline events only tested once; stressor threads for cpu/memory/concurrency verified but only in isolation
- No test with a long-running program that exercises the full timeline duration
Promotion path to C: Create a test binary that runs for 15+ seconds, run with the timeline spec, verify all events fire in sequence.
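The skip behaviour described above can be illustrated as follows: an event fires only if its start offset falls before the moment the target exits. This is a sketch under assumptions. The `TimelineEvent` fields, the offsets, and `partition_events` are illustrative names, not ambush's actual types.

```python
from dataclasses import dataclass


@dataclass
class TimelineEvent:
    track: str
    axis: str
    start_offset_s: float  # seconds after target launch


def partition_events(events, target_runtime_s):
    """Split events into those that fire and those skipped because the target exited first."""
    fired, skipped = [], []
    for event in sorted(events, key=lambda e: e.start_offset_s):
        if event.start_offset_s < target_runtime_s:
            fired.append(event)
        else:
            skipped.append(event)
    return fired, skipped


# Four events across three tracks (offsets are made up for illustration).
events = [
    TimelineEvent("track-1", "cpu", 0.0),
    TimelineEvent("track-1", "memory", 5.0),
    TimelineEvent("track-2", "concurrency", 10.0),
    TimelineEvent("track-3", "cpu", 14.0),
]

fast_fired, fast_skipped = partition_events(events, target_runtime_s=0.5)
long_fired, long_skipped = partition_events(events, target_runtime_s=15.0)
print(len(fast_fired), len(fast_skipped))  # 1 3
print(len(long_fired), len(long_skipped))  # 4 0
```

With these made-up offsets, a target that runs for 15 seconds or more is the first case in which every event fires, which is why the promotion path asks for a long-running test binary.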
Evidence (amuck):
- Light preset generates 1 mutated variant with prepend/append operations
- Output file written to runtime/amuck/
- JSON report correctly records operations applied
Known limitations:
- Dangerous preset not tested
- Custom spec file not tested
- exec-program integration not tested (compile and test mutated files)
Promotion path to C: Test dangerous preset, write a custom spec, and use exec-program to compile and test mutated variants of our own source files.
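For readers unfamiliar with this kind of mutation, here is a rough sketch of prepend/append operations being applied and recorded. The operation schema, the inserted text, and the report keys are assumptions made for illustration. They are not amuck's real spec or report format.

```python
import json


def mutate(source: str, operations: list[dict]) -> str:
    """Apply prepend/append operations in order and return the mutated text."""
    for op in operations:
        if op["kind"] == "prepend":
            source = op["text"] + source
        elif op["kind"] == "append":
            source = source + op["text"]
        else:
            raise ValueError(f"unsupported operation: {op['kind']}")
    return source


# Hypothetical stand-in for a "light" preset: one variant, two operations.
light_preset = [
    {"kind": "prepend", "text": "// amuck: prepended line\n"},
    {"kind": "append", "text": "// amuck: appended line\n"},
]

original = "fn main() {}\n"
variant = mutate(original, light_preset)
report = {"variants": 1, "operations": [op["kind"] for op in light_preset]}
print(json.dumps(report))
```

A custom spec or a more aggressive preset would add further operation kinds. The `ValueError` branch marks where unsupported kinds would surface.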
Evidence (abduct):
- Direct scope copies target + dependencies correctly
- mtime-offset-days shifts file timestamps
- Readonly lock is applied to copied files
- Workspace created in runtime/abduct/
Known limitations:
- frozen/slow time modes not tested
- virtual-now not tested
- exec-program integration not tested
- twohops/directory scope not tested
Promotion path to C: Test frozen time mode with exec-program on a binary that checks timestamps.
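The copy, mtime-shift, and readonly steps listed in the evidence can be approximated with standard-library calls. This is a sketch of the idea only: `abduct_copy` and its parameters are hypothetical, and abduct itself may work differently.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path

SECONDS_PER_DAY = 86_400


def abduct_copy(target: Path, workspace: Path, mtime_offset_days: int) -> Path:
    """Copy target into workspace, shift its mtime, and make the copy read-only."""
    workspace.mkdir(parents=True, exist_ok=True)
    copy = Path(shutil.copy2(target, workspace / target.name))
    info = copy.stat()
    shifted = info.st_mtime + mtime_offset_days * SECONDS_PER_DAY
    os.utime(copy, (info.st_atime, shifted))  # keep atime, move mtime
    copy.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)  # readonly lock
    return copy


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "target.bin"
    target.write_bytes(b"\x00")
    copy = abduct_copy(target, Path(tmp) / "runtime" / "abduct", mtime_offset_days=-30)
    delta_days = (copy.stat().st_mtime - target.stat().st_mtime) / SECONDS_PER_DAY
    print(round(delta_days))  # -30
    # Restore the write bit so temporary-directory cleanup works on every platform.
    copy.chmod(stat.S_IRUSR | stat.S_IWUSR)
```

A test for the frozen time mode could reuse the same pattern by having the target binary print the timestamps it observes.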
Evidence (adjudicate):
- Processes 2 assault reports correctly
- Expert-system verdict ("fail" based on critical weak points) is generated
- Rule hits documented with confidence scores
- Priorities extracted correctly
Known limitations:
- Only tested with assault reports; amuck/abduct report aggregation untested
- Only 2 reports aggregated; scaling untested
- Only one campaign pattern exercised (campaign_fail_on_high_signal)
Promotion path to C: Test with all 3 report types (assault, amuck, abduct) and with 5+ reports.
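As an illustration of a verdict driven by critical weak points, the sketch below aggregates several reports and fails the campaign when any critical weak point is present. Only the rule name `campaign_fail_on_high_signal` comes from the assessment above. The report keys and the confidence value are assumptions.

```python
def adjudicate(reports: list[dict]) -> dict:
    """Aggregate reports and fail the campaign if any critical weak point exists."""
    if len(reports) < 2:
        raise ValueError("need at least two reports to adjudicate")
    critical = sum(
        1
        for report in reports
        for weak_point in report["weak_points"]
        if weak_point["severity"] == "critical"
    )
    rule_hits = []
    if critical > 0:
        # Hypothetical confidence; the real scoring is not described in this document.
        rule_hits.append({"rule": "campaign_fail_on_high_signal", "confidence": 0.9})
    return {
        "verdict": "fail" if rule_hits else "pass",
        "critical_weak_points": critical,
        "rule_hits": rule_hits,
    }


reports = [
    {"weak_points": [{"severity": "critical"}, {"severity": "low"}]},
    {"weak_points": [{"severity": "critical"}]},
]
result = adjudicate(reports)
print(result["verdict"], result["critical_weak_points"])  # fail 2
```

Supporting amuck and abduct reports would mainly mean normalising their findings into the same weak-point shape before this aggregation step.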
Evidence (axial):
- Report observation mode works (reads assault JSON, produces markdown)
- Exec-program mode works (runs binary, captures output)
- Markdown output is well-formatted
Known limitations:
- grep/agrep pattern matching not tested
- aspell integration not tested
- pandoc conversion not tested
- i18n (non-English output) not tested via this subcommand
Promotion path to C: Test grep patterns on stderr of a crashing program, test aspell on output text.
Evidence (analyze):
- Detects UseAfterFree from both rule evaluation (Alloc→Free→Use sequence) and stderr patterns
- Detects NullPointerDeref from SIGSEGV in signal field
- Confidence scores differentiated (0.85 rule-based, 0.95 stderr-based)
- Variable bindings reported in evidence (X_loc, X_loc2, X = heap_var)
Known limitations:
- Only synthetic crash reports tested (no real crash from running code)
- Deadlock, DataRace, MemoryLeak, BufferOverflow rules not exercised
Promotion path to B: Feed in real crash reports from at least 6 different crash scenarios (use ASAN/TSAN output from real C/Rust programs).
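The two detection routes described above (an Alloc→Free→Use sequence, and a stderr pattern) can be sketched as below. The confidence values 0.85 and 0.95 come from the evidence list. The event encoding and the `heap-use-after-free` marker (the string AddressSanitizer prints for this error class) are assumptions about what the synthetic data looks like.

```python
def detect_use_after_free(events: list[tuple[str, str]]) -> list[dict]:
    """Flag any Use of a location that follows a Free of the same location."""
    state = {}  # location -> "allocated" | "freed"
    findings = []
    for op, location in events:
        if op == "Alloc":
            state[location] = "allocated"
        elif op == "Free":
            state[location] = "freed"
        elif op == "Use" and state.get(location) == "freed":
            findings.append(
                {"kind": "UseAfterFree", "location": location, "confidence": 0.85}
            )
    return findings


def detect_from_stderr(stderr: str) -> list[dict]:
    """Match an AddressSanitizer-style marker in stderr text."""
    if "heap-use-after-free" in stderr:
        return [{"kind": "UseAfterFree", "confidence": 0.95}]
    return []


trace = [("Alloc", "heap_var"), ("Free", "heap_var"), ("Use", "heap_var")]
print(detect_use_after_free(trace))
print(detect_from_stderr("ERROR: AddressSanitizer: heap-use-after-free on address 0x0"))
```

Feeding real ASAN/TSAN output through the stderr route, as the promotion path suggests, would also show whether the remaining rules need their own markers.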
Evidence (report):
- Renders full assault report in terminal with sections: assail, detail panel, attack results, signatures, assessment
- All view modes available via --report-view flag
Promotion path to B: Test rendering of reports from 6+ diverse projects.
Evidence (tui):
- Code exists and compiles
- Attempts to initialize crossterm terminal but fails without a real TTY (os error 6)
- Cannot be smoke-tested in a headless/CI environment
Promotion path to D: Add a --dry-run flag or test harness that validates the report loading without needing a terminal.
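A dry-run mode of the kind proposed above could validate report loading and return before any terminal is touched. The sketch below shows the control flow only. The section names in `REQUIRED_SECTIONS` and the `run_tui` function are hypothetical, not the project's real report schema or entry point.

```python
import json
import sys

REQUIRED_SECTIONS = ("assail", "attack_results", "assessment")  # hypothetical keys


def load_report(text: str) -> dict:
    """Parse a report and check that the sections the UI needs are present."""
    report = json.loads(text)
    missing = [key for key in REQUIRED_SECTIONS if key not in report]
    if missing:
        raise ValueError(f"report missing sections: {missing}")
    return report


def run_tui(report_text: str, dry_run: bool = False) -> str:
    load_report(report_text)  # runs in both modes, so CI exercises it
    if dry_run:
        return "dry-run ok"
    if not sys.stdout.isatty():
        raise RuntimeError("interactive mode needs a real terminal; use dry_run")
    return "interactive"  # a real implementation would start the UI here


sample = json.dumps({"assail": {}, "attack_results": [], "assessment": {}})
print(run_tui(sample, dry_run=True))  # dry-run ok
```

The same split between loading and rendering would let the GUI share the headless check.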
Evidence (gui):
- Code exists and compiles
- Times out (no display server in CLI context)
- eframe-based; requires Wayland/X11
Promotion path to D: As for the TUI, add a headless validation mode.
Evidence (diff):
- Correctly compares two reports: robustness delta, weak point delta, per-axis status changes
- Framework changes tracked
- Severity breakdown tracked (critical, high, medium, low)
Promotion path to B: Compare reports from 6+ diverse projects at different points in time.
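The three deltas named in the evidence can be computed along these lines. This is a sketch with an assumed report shape (`robustness`, `weak_points`, `axes`). The real AssaultReport fields may be named differently.

```python
def diff_reports(old: dict, new: dict) -> dict:
    """Compute robustness delta, weak point delta, and per-axis status changes."""
    axes = sorted(set(old["axes"]) | set(new["axes"]))
    return {
        "robustness_delta": new["robustness"] - old["robustness"],
        "weak_point_delta": len(new["weak_points"]) - len(old["weak_points"]),
        "axis_changes": {
            axis: (old["axes"].get(axis), new["axes"].get(axis))
            for axis in axes
            if old["axes"].get(axis) != new["axes"].get(axis)
        },
    }


before = {
    "robustness": 70,
    "weak_points": ["wp1", "wp2", "wp3"],
    "axes": {"cpu": "pass", "memory": "fail"},
}
after = {
    "robustness": 85,
    "weak_points": ["wp1"],
    "axes": {"cpu": "pass", "memory": "pass", "disk": "pass"},
}
print(diff_reports(before, after))
```

An axis that exists in only one report shows up with `None` on the other side, which makes newly added or dropped axes visible in the diff.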
Evidence (manifest):
- Parses AI.a2ml and exports to Nickel format
- Output includes all manifest sections: version, project, canonical-locations, critical-invariants, lifecycle, tools, reports
Promotion path to B: Test on AI.a2ml files from 6+ different repos.
Evidence (a2ml-export, a2ml-import):
- Round-trip verified: assault JSON → A2ML bundle → JSON
- Output file sizes match (3371 lines round-tripped)
- Kind discrimination works (--kind assault)
Promotion path to B: Test with all report kinds (assault, amuck, abduct) from 6+ projects.
Evidence (panll):
- Exports event chain from assault report
- 2 constraints extracted from critical weak points
- Attack events correctly represented
- Summary includes weak points, crashes, robustness score
Promotion path to B: Test with reports from 6+ projects with varying numbers of findings.
Evidence (assemblyline):
- Scanned 141 repos in parallel via rayon
- 3448 weak points found, 254 critical
- BLAKE3 fingerprinting computed for all repos
- Results sorted by risk (developer-ecosystem: 633, idaptik: 427, ...)
- Filters (--findings-only, --min-findings) work correctly
Promotion path to B: Run on 6+ different parent directories (different machines, different repo structures).
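The pipeline described above (parallel scan, per-repo fingerprint, filter, sort by risk) has roughly the following shape. This is a sketch, not the Rust implementation. A thread pool stands in for rayon, `hashlib.blake2b` stands in for BLAKE3 (which is not in the Python standard library), and the `unwrap()` count is a placeholder for real analysis.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def fingerprint(files: dict[str, bytes]) -> str:
    """Hash file names and contents in sorted order for a stable repo fingerprint."""
    digest = hashlib.blake2b(digest_size=16)  # stand-in for BLAKE3
    for name in sorted(files):
        digest.update(name.encode())
        digest.update(files[name])
    return digest.hexdigest()


def scan_repo(item):
    name, files = item
    # Placeholder "analysis": count occurrences of unwrap() as findings.
    findings = sum(content.count(b"unwrap()") for content in files.values())
    return {"repo": name, "findings": findings, "fingerprint": fingerprint(files)}


def assemblyline(repos: dict, min_findings: int = 0) -> list[dict]:
    """Scan repos in parallel, drop those below the threshold, sort by findings."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(scan_repo, repos.items()))
    kept = [r for r in results if r["findings"] >= min_findings]
    return sorted(kept, key=lambda r: r["findings"], reverse=True)


repos = {
    "quiet": {"lib.rs": b"fn f() {}"},
    "noisy": {"main.rs": b"a.unwrap(); b.unwrap();"},
}
for row in assemblyline(repos, min_findings=1):
    print(row["repo"], row["findings"])  # noisy 2
```

Hashing in sorted name order keeps the fingerprint independent of directory traversal order, which matters when comparing runs across machines.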
Evidence (diagnostics):
- Reports version, AI manifest status, directory existence, report cache counts
- Correctly identifies missing integration configs (Hypatia, gitbot-fleet)
Promotion path to B: Validate diagnostics output on 6+ repos with different configurations.
Evidence (help):
- Lists all 19 subcommands with accurate descriptions
- Shows all global options
- Per-subcommand help available
Promotion path to B: Help is generic by nature; B/A grades apply once external users confirm the docs are clear.
No components earned an F. Candidates considered and rejected:
- tui/gui: These are E, not F. They serve a real purpose (interactive report review) and are salvageable with a dry-run mode. No better alternative exists specifically for panic-attack reports.
- axial: Some features (aspell, pandoc) could be delegated to external tools, but the integrated observation + report correlation is unique to panic-attack. Not an F.
Issues found during assessment:
- Framework detection false positives: assail detects Elixir/Erlang frameworks (Phoenix, Ecto, OTP) on a pure Rust project. This should be gated on detected language.
- Self-detection: The tool detects its own pattern-matching strings as vulnerabilities. Consider adding self-exclusion or annotation support.
- Timeline event scheduling: The DAW-style timeline works but needs a long-running test target to exercise fully.
- amuck dangerous preset: Untested. Could generate broken mutations that confuse users.
- abduct time modes: frozen/slow modes are implemented but completely untested.
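The annotation support suggested for the self-detection issue could look something like the sketch below, where a line carrying a suppression marker is skipped during pattern matching. The marker text `panic-attack: ignore` and the `find_pattern` helper are hypothetical. No such annotation exists in the tool today.

```python
IGNORE_MARKER = "panic-attack: ignore"  # hypothetical annotation, not an existing feature


def find_pattern(source: str, pattern: str) -> list[int]:
    """Return 1-based line numbers containing pattern, skipping annotated lines."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if pattern in line and IGNORE_MARKER not in line
    ]


analyzer_source = (
    'const PATTERN: &str = "transmute"; // panic-attack: ignore\n'
    "let x: u32 = unsafe { std::mem::transmute(y) };\n"
)
print(find_pattern(analyzer_source, "transmute"))  # [2]
```

Per-line annotation keeps genuine uses in the same file visible, whereas excluding the analyzer's own source files wholesale would hide them.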