Skip to content

feat: data-driven report/metrics config, list-overrides & C/C++/Go/C#/Markdown plugins#14

Merged
ffedoroff merged 28 commits into
mainfrom
feat/report-list-overrides
Jun 22, 2026
Merged

feat: data-driven report/metrics config, list-overrides & C/C++/Go/C#/Markdown plugins#14
ffedoroff merged 28 commits into
mainfrom
feat/report-list-overrides

Conversation

@ffedoroff

@ffedoroff ffedoroff commented Jun 18, 2026

Copy link
Copy Markdown
Owner

What this PR does

Makes the metric/report pipeline fully data-driven from project config, adds five new language plugins, and extends the viewer.

Config

  • List-override DSL for inherited lists (add/remove/replace/after/before/prepend/clear), generic in deep_merge.
  • [report] overrides for columns/card/stats/size/filter at language and project level (layered catalog → language → project); surfaces custom metrics in the UI.
  • [rules.thresholds.file] accepts custom [metrics.<key>] keys, validated at load; check fires on them.
  • [presets.<ID>] project Prompt-Generator presets merged over the plugin catalog.

Metrics

  • MetricDef gains warning/info two-tier severity and a calc for the live-derivation line.
  • top<N> / top<N>_<reducer> aggregate reducers.
  • Emit Halstead/AST base counts on every node so each formula renders as "formula = numbers".

New language plugins

  • go — import graph from the go.mod module path
  • c / cpp#include dependency graph via a shared cfamily text-scan module; fixes cognitive complexity dropping &&/|| (tree-sitter declares binary_expression on two symbols)
  • csharpusing/namespace dependency graph
  • markdown — grammar-free: link graph + measured doc metrics with a broken_links gate

Engine: adds Dialect::args and Dialect::unit_name hooks. config.rs is split into a config/ facade (parse/views/specs/lookup).

Viewer

  • Data-driven SVG map: size-modes and filters by any metric (buttons built per render, nothing hardcoded in HTML).
  • Content-sized tooltip with long-formula wrapping.

Tests & docs

  • Workspace coverage 94.79%; e2e goldens regenerated for all languages.
  • PRD/DESIGN/e2e/config docs synced; new docs/customization/ and contrib/unit-tests.md.

…c thresholds, transparent formulas

Make the metric/report pipeline fully data-driven from project config and
render every metric's derivation in the viewer.

Config (code-ranker.toml):
- list-override DSL for inherited lists (add/remove/replace/after/before/
  prepend/clear), generic in deep_merge and used for the report views.
- [report] columns/card/stats overrides at both language and project level,
  layered catalog -> language -> project; surfaces custom metrics in the UI.
- [rules.thresholds.file] now accepts custom [metrics.<key>] keys, validated at
  load; `check` fires on them with the metric's concern group.
- [presets.<ID>] project Prompt-Generator presets, merged over the plugin
  catalog (same id overrides, new id appends); drives --preset / scorecard.

Metrics:
- MetricDef gains warning/info two-tier severity thresholds and a `calc`
  (defaulted from the CEL formula for node-scope) for the live derivation line.
- top<N> / top<N>_<reducer> aggregate reducers.
- emit the Halstead/AST base counts (eta1/eta2/n1/n2/spaces/branches/span_sloc)
  on every node so every derived formula renders a "formula = numbers" line;
  formula_js added/rewritten for effort/length/vocabulary/cyclomatic/mi/mi_sei.
- per-language [specs.eta1..n2] descriptions enumerating the exact operator/
  operand tokens each grammar counts (shared ecmascript_metric_specs for JS/TS).

Viewer:
- tooltip box is content-sized (max-width min(560px,92vw)); long formulas /
  token lists wrap instead of clipping; position() clamps to the viewport.

Docs: new docs/customization/ (README + custom-field-example.toml worked
example); config.md, node_schema.md, metric-tiers.md, PRD, viewer DESIGN synced.
Goldens regenerated for all four languages.
Implements the two deferred map asks from §3 of the customization guide: size the
SVG nodes by any metric, and filter the map to a metric's population — both fully
data-driven, with the built-in loc/hk size modes and the cycle filter no longer
hardcoded in the HTML.

Config:
- [report] gains `size` and `filter` list-overrides (same DSL as columns/card/
  stats); built-in defaults live in builtin.toml `[mapview]` (size = sloc/hk,
  filter = cycle). build_ui emits `ui.size_metrics` / `ui.filter_metrics`, pruned
  to keys present on an internal node.

Viewer:
- size/filter buttons are built per render (renderMapControls) from those ui
  lists — index.html no longer hardcodes SLOC/HK/cycle; clicks handled by
  delegation on .size-controls.
- metricNodeVal/metricNodeDiam are generic over the active attribute key; a
  custom metric scales against the median of the rendered population
  (metricSizeBase, cached in window._sizeBaseCache); loc/hk keep calibrated bases.
- the cycle filter generalises to window.nodeFilter: `cycle` uses the cycle set,
  any other key keeps nodes whose value is non-zero.

Docs: customization §3 rewritten as implemented (+ §1.6/§2.2/quick-ref), viewer
PRD/DESIGN updated for the data-driven controls. Goldens regenerated (ui block now
carries size_metrics/filter_metrics). custom-field-example.toml renamed from
code-ranker.toml and wires size=tsr / filter=tsr_big.
…family cognitive

Five new language plugins, registered in the CLI and exercised by e2e goldens:
- go        — import graph from the go.mod module path
- c, cpp    — #include dependency graph via a shared `cfamily` text-scan module;
              metrics via the generic engine (params/name nest under the
              function_declarator, handled by `args`/`unit_name` Dialect hooks)
- csharp    — using/namespace dependency graph
- markdown  — grammar-free: link graph + measured doc metrics (headings,
              max_depth, code_lines, links, broken_links) with a broken_links gate

Engine: add `Dialect::args` and `Dialect::unit_name` hooks (default to the prior
behaviour) so languages whose parameters/name nest under a declarator can override
them without touching the shared walks.

Fix C/C++ cognitive complexity dropping `&&`/`||`: tree-sitter-c/-cpp declare the
name `binary_expression` on two symbols (ordinary + preprocessor `#if`), so the
single-id `[roles.one]` lookup resolved to a symbol absent from normal-code trees
and the boolean-run branch was dead. Resolve it as a `[roles.group]` SET instead.
Regression tests + c/cpp goldens updated (cognitive 3->4, stats 2->2.5).

Split `config.rs` (the per-language config reader, every plugin's fan_in hub, HK
~1.15M) into a `config/` facade over parse/views/specs/lookup submodules; the
resolver follows the `pub use` re-exports to the defining file, so fan_in now lands
per-concern (hottest config file ~25K). Lower the dogfood HK ceiling 2M -> 1M.

Tests: +unit tests across the new plugins and config (skip-node guards, non-UTF8
skips, scanner edge forms, resolution fallbacks); deliberate gaps marked inline
with `// COVERAGE:` notes. Workspace coverage 94.79%.

Docs: sync PRD/DESIGN/e2e and the CLI DESIGN to nine plugins + eight wired
grammars; add `contrib/unit-tests.md` raise-coverage playbook and link it from
the `/ud` command.
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown

code-ranker

Verdict vs main: ➖ neutral

No new violations vs the baseline. 🎉

📦 Full HTML report: see the code-ranker-report artifact on this run.

@codecov

codecov Bot commented Jun 18, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 99.12321% with 57 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.50%. Comparing base (0c6d58b) to head (ef81c2a).

Files with missing lines Patch % Lines
crates/code-ranker-cli/src/report.rs 91.72% 12 Missing ⚠️
crates/code-ranker-cli/src/check.rs 93.57% 9 Missing ⚠️
crates/code-ranker-cli/src/main.rs 90.00% 4 Missing ⚠️
crates/code-ranker-cli/src/templates.rs 98.05% 4 Missing ⚠️
crates/code-ranker-graph/src/builtin.rs 93.22% 4 Missing ⚠️
crates/code-ranker-cli/src/compose.rs 97.93% 3 Missing ⚠️
crates/code-ranker-cli/src/config/load.rs 94.00% 3 Missing ⚠️
crates/code-ranker-cli/src/pipeline.rs 95.16% 3 Missing ⚠️
crates/code-ranker-cli/src/recommend/scorecard.rs 98.97% 2 Missing ⚠️
crates/code-ranker-plugin-api/src/level.rs 33.33% 2 Missing ⚠️
... and 9 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #14      +/-   ##
==========================================
+ Coverage   96.30%   97.50%   +1.19%     
==========================================
  Files          63      119      +56     
  Lines        9070    13932    +4862     
==========================================
+ Hits         8735    13584    +4849     
- Misses        335      348      +13     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

ffedoroff added 24 commits June 18, 2026 11:31
…t mod.rs

Audit every language against languages/README.md and resolve the findings.
All behaviour-preserving — goldens unchanged; make all green (coverage 94.73%).

go      - go.mod `module ` directive + `go.mod` filename -> [structure]
          (module_directive/module_file); drop dead `import_spec` key.
markdown- code-fence + link markers -> [markers] (fences/link_open/url_scheme/
          mailto_prefix); drop unused detect_markers.
ecmascript - `@/` source-root import alias -> `alias_prefix` config; fix stale
          `ecmascript.toml` comments (file is config.toml) in js/ts/dialect.
cfamily - `#include` keyword -> [structure].include_directive (c & cpp);
          drop unused detect_markers in c/cpp/csharp.

rust:
- mod.rs is now wiring-only (445 -> 213 lines): extract toolchain probing
  (toolchain.rs), the cargo-metadata driver (analyze.rs), the test-attr
  predicate (test_attr.rs, shared with module_graph/walk.rs -- was duplicated)
  and the test-stripper + file metrics (strip.rs).
- crate-root tie-break filename `lib.rs` -> config (`crate_root_file`).
- resolve.rs dedup keys on the EdgeKind enum (now Hash) instead of its Debug
  string.

c/cpp/csharp: delegate metrics()/function_units() to file_metrics/function_nodes
free fns (match the go/python pattern; was inline in the trait methods).

Verified: cargo test -p code-ranker-plugins (197) + --test e2e (37) + make all.
…ete gate

File-walk ignore (the user-facing `[ignore]` config):
- Add `gitignore` / `ignore_files` / `hidden` to `[ignore]` (CLI IgnoreConfig +
  `--set` overrides), all default true, threaded via PluginInput.
- New shared `crate::walk` (the `ignore` crate) replaces six duplicated
  `walkdir` loops across go/python/markdown/csharp/c-cpp/ecmascript; honours
  .gitignore (+ global + .git/info/exclude) / .ignore / hidden, scoped to the
  analyzed root (`parents(false)`) so an enclosing repo's rules never leak in.
  Git-faithful (only inside a repo) → e2e samples (non-git) unaffected, goldens
  unchanged. The Rust plugin resolves files via cargo metadata, so it is exempt.
- +4 unit tests for the walker (gitignore / .ignore / hidden / skip_dirs).

JS/TS ext->grammar cleanup:
- JavaScript: every extension maps to one grammar, so the per-ext match is
  redundant — collapse to a constant; the `extensions` config is the single
  source of truth. Incidentally fixes `.cjs` being dropped in `analyze` (it was
  in `extensions` + metrics but missing from the analyze grammar match).
- TypeScript: dedupe the three identical match arms into one `grammar_for`; the
  only literal left is `tsx` (it alone selects a different grammar TYPE).

Dependency hygiene:
- Remove unused deps surfaced by cargo-machete: `walkdir` (workspace + cli +
  plugins), `which` (cli), `anyhow` (graph).
- Add a `machete` target to `make lint` (hence `make all`) and a `cargo machete`
  step to CI `Test & lint` — fails on any unused dependency. Detect-only;
  `make machete-fix` removes. False positives whitelist under
  `[package.metadata.cargo-machete]`.

Docs: config.md ([ignore] keys), PRD §5 + DESIGN (shared walker + .gitignore),
contrib/ci.md (machete in the lint chain + CI job).

Verified: make all green (workspace 201 + e2e 37, clippy, machete, lychee,
markdownlint, self-check, coverage 94.66%).
…lf-registering plugins

Config defaults now live entirely in embedded TOML and are deep-merged with the
user's config, so a project overrides only what it spells out and inherits the
rest. No default value is hardcoded in Rust.

- CLI: new embedded config/defaults.toml is the single source of project-config
  defaults; load() deep-merges discovered/--config over it. Config::default() is
  just defaults.toml parsed; dropped the hardcoded IgnoreConfig/CycleRules Default
  impls and the report.rs path consts.
- Plugins: explicit inheritance chain defaults.toml ⊕ [base] ⊕ <lang> via
  load_chain — ecmascript is now the real base for js/ts, new cfamily/config.toml
  the base for c/cpp (each carries only its diffs).
- Graph/plugins: field-omission defaults (value_type/omit_at, roles one_named)
  sourced from [defaults] in builtin.toml / defaults.toml instead of Rust literals.
- New `report --export-full-config PATH`: dumps the full effective config
  ([project] = defaults ⊕ --config, [plugin] = merged language config).

Plugins self-register via inventory::submit!; the CLI works only through
code_ranker_plugin_api::registry() + the LanguagePlugin trait and never names a
language. Moved the generic TOML utilities (deep_merge → toml_merge, the
list-override DSL) into code-ranker-plugin-api so nothing reaches into the plugins
crate; its only mention in the CLI is the `extern crate code_ranker_plugins as _;`
link-anchor that lets the linker collect the submissions.

Docs (PRD/DESIGN, cli CLI/config/DESIGN, customization) synced to all of the above.
…nalyze level arg, PluginInput.options

The trait's is_test_path was never called in production: test-file dropping
happens inside each plugin's analyze (via free *_is_test_path predicates used by
the walk; Rust drops tests/benches via cargo-metadata target kinds). Remove the
trait method + 8 impls; keep the free functions (still tested directly). Rust's
orphaned test_dirs config + its test go too.

analyze's level param was always "files" and ignored by every impl (functions
are served by function_units), so drop it. PluginInput.options + the Options
alias were never read (only constructed empty), so drop them.

Tidy: ecmascript_is_test_path no longer re-exported (now pub(super)); stale doc
refs updated in CLI config + root PRD/DESIGN.
Overview reveal-depth now reaches a files level one step past the deepest
folder grouping: the deepest single folder level is kept as clusters
(subgraph cluster_files_N) but each is drawn with its real file nodes
inside, edges remapped file-to-file. Driven by maxUnderCrateDepth /
filesDig / isFilesDig; the dig ceiling auto-extends to it. DIG_MAX 6->8.

Fan-in/out connector arrows now land on a circular node's ellipse border
via nodeEdgePoint (measuring the <ellipse>, not the group bbox) instead of
the circumscribed square's corner; box nodes keep the bbox-edge behaviour.

Docs: viewer DESIGN/PRD updated; the P2 'individual files inline in the
overview' item is now implemented.
plugin.rs mixed audiences: the parser contract (trait + PluginInput + its
self-registration wiring) and two foreign concerns — prompt-generator data
(Preset) and a project-detection helper (detect_with_marker). That made it the
HK hub: 19 modules reached in, 6 only for Preset, never to parse.

Split off the foreign concerns along the audience seam:
  plugin.rs    -> trait LanguagePlugin + PluginInput + PluginRegistration/registry
  preset.rs    -> Preset            (prompt-generator domain; read by reporting)
  detection.rs -> detect_with_marker (generic project detection)

PluginInput and the registry stay with the trait: PluginInput is the contract's
own argument type, and PluginRegistration/registry() are typed on dyn
LanguagePlugin — the trait's own discovery wiring, same actors. All re-exported
at the crate root, so call sites read code_ranker_plugin_api::{Preset,
detect_with_marker, PluginRegistration, registry}.

HK(plugin.rs): 613.7K -> 311.0K (sloc 68->60, fan_in 19->12). preset.rs has
fan_in 18 but fan_out 0, so its HK is 0 — the coupling dissolves, not relocates.

Docs: DESIGN.md (module layout, Preset location, trait contract); ai-skill.md
(absolute link to the HK principle); principles/rust/henry-kafura-coupling.md
(diagnose-and-split workflow: measure one file, list fan_in/fan_out, find mixed
scenarios, split, verify with a before/after diff report). No behaviour change;
goldens unchanged.
… TOML

Hardcoded human text in the CLI violated 'all vocab/prompt text lives in
config'. Move it to the data-driven catalogs, behind a new spec field, and have
the CLI read it from the snapshot — so the CLI names no metric and stores no
prose, and a language can reword diagnostics with no Rust change.

- New `remediation` spec field (the check `fix` line) on AttributeSpec /
  CycleKindSpec / FieldDef / MetricDef.
- builtin.toml gains the metric `why`/`fix` (remediation on cyclomatic/cognitive),
  the coupling specs (fan_in/fan_out/hk/cycle, migrated out of coupling_specs()
  in lib.rs), a [cycles.*] vocab catalog, and a [prompt] scaffolding block.
- The orchestrator overlays cycle vocab onto each level's cycle_kinds; structural
  metrics (loc/items/unsafe/headings/links/…) carry description+remediation in
  their language configs.
- CLI: delete the RULES prose catalog; rule_doc() resolves why/fix/title from the
  snapshot's node_attributes + cycle_kinds (works on snapshot input too); threaded
  into human + SARIF diagnostics. Concern codes (CPX/CPL/SIZ) + rule_tuning stay.
- Prompt scaffolding: a PromptTemplate carried in the snapshot, sourced from
  builtin.toml [prompt], read by BOTH the CLI compose_prompt and the viewer JS
  composePrompt — single source, {id} substituted at render.
- Regenerated all 9-language e2e goldens; synced DESIGN/PRD/node_schema/ERRORS/
  viewer/customization docs.

make all + make e2e green (coverage 94.99%); behaviour unchanged.
Add [rules.checks.<id>] — a CEL boolean `when` predicate evaluated per file as a
second pass over the fully-built graph, producing `check.<id>` violations like a
threshold/cycle rule. The predicate sees node attributes, derived path fields
(path/name/stem/ext/dir), graph edges (deps/rdeps + depends_on/depended_on_by),
file collections (files/siblings + file_exists), string fns (ends_with/…/matches)
and CEL list macros. [rules.defs] adds reusable named helpers expanded into `when`
(cycle-checked). No engine/CEL change for callers — rides the existing check path.

Unify the report/ui vocabulary end to end: LevelUi/JSON fields drop the `_metrics`
suffix (card/size/filter/sort/summary) so config [report] keys = JSON ui keys = JS;
the metric catalog merges [tableview]/[cardview]/[mapview]/[report.json.aggregate]
into one [report] (+ [report.stats]) mirroring the project override, and `featured`
becomes `card`. The SVG status bar now reads ui.card (was hardcoded hk/sloc); dead
tooltip renderers removed. Snapshot schema_version 2 → 3 (field rename is breaking).

Fix: external-crate dependency labels resolved to their cargo-registry path, so
depends_on("ext:<crate>") never matched — external nodes now keep their ext:<name>
id for matching.

Docs: customization/README §1.8, cli config.md/ERRORS.md/PRD.md, root PRD/DESIGN,
node_schema, e2e synced; runnable examples linters-example.toml + architecture-linters.toml.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
…/types/traits)

The Rust plugin now emits, per file node, six comma-joined string attributes from
its production AST (test items excluded): `derives` / `macros` / `attrs` /
`imports` ("uses X" sets) and `types` / `traits` (names defined in the file). The
module-graph syn walk already visited derives/uses/attributes; a FactsCollector
gathers them (imports reuses the bare-path collector) and collapse emits them.

This makes the symbol-level dylint rules — which edges alone could not reach —
expressible as config [rules.checks] via contains()/matches(): no serde/ToSchema
derive in a layer (DE0101/0102), no api_dto attr (DE0104), no schema_for! macro
(DE0110), no print macros (DE1301), DTO/Client naming (DE0201/0503). Added to
docs/customization/architecture-linters.toml.

String-valued so they stay off the numeric table/scorecard surfaces; specced in
rust/config.toml [node_attributes]. Rust-only (like unsafe/items) — rust golden
regenerated; other languages unaffected.

Docs: node_schema.md, cli config.md, customization/README.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
…r deps

Make the two CEL contexts (metric formulas, check predicates) speak the same
language, and document it.

Engine:
- Drop the custom string stdlib (ends_with/starts_with/contains/matches) — cel-rust
  ships these natively (contains/startsWith/endsWith/matches regex, as method or
  function). Removes the direct `regex` dependency (kept transitively via `cel`).
- Math host functions (pow/log2/sqrt/…) now registered in checks too, not just
  metrics — shared `register_math`.
- Path fields (path/name/stem/ext/dir) now available in metric formulas, not just
  checks — so a formula can branch on location (`path.contains("/generated/") ? 0 : hk`).
  Shared path logic extracted to a `nodepath` module.
- `agg(metric, reducer, population)` now usable inside check predicates for
  relative thresholds (node vs project distribution, e.g.
  `cyclomatic.double() > agg('cyclomatic','p90','not_empty')`), memoized per run
  so each distinct aggregate is reduced once, not once per file.

Docs:
- New docs/customization/cel-reference.md — full agent-oriented CEL reference
  (language, cel-rust stdlib, what's in scope in metrics vs checks).
- Remove the linter example configs (architecture-linters/linters-example); the
  config-only-linter direction is not being pursued. README/config.md updated to
  native CEL names and the new reference.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
`--config <file>` is repeatable: previously only the first file was read and the
rest silently ignored (`files.first()`). Now every file is deep-merged over the
built-in defaults in command-line order, last wins; inline `KEY=VALUE` applies
after all files; passing any file still disables auto-discovery of
`code-ranker.toml`. `discover_user_table` → `discover_user_tables` returns the
ordered layers; the log line shows the merge order (`a.toml ⊕ b.toml`).

Docs:
- config.md — "Multiple --config files" section; a "full layer stack (with
  sources)" table linking each built-in layer on GitHub (plugin base ⊕ family ⊕
  <lang>, metric catalog, project defaults); the big example renamed to
  `code-ranker-rust-example.toml` with a note that auto-discovery needs the literal
  name `code-ranker.toml` and that a real config is partial.
- CLI.md — `--config` flag + precedence list updated for repeat/order.

Coverage: the new explicit-multi-file branch is unit-tested
(`load_layers_multiple_config_files_in_order_last_wins`). The retyped discovery
lines (cwd/workspace `code-ranker.toml`, `Cargo.toml` metadata) stay uncovered by
design — they read the process cwd / real filesystem and can't be isolated in a
unit test (the repo's own ./code-ranker.toml wins cwd discovery first).

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
…-lens presets

- check: add --focus <PATH> — gate a subset of files (whole project still analyzed)
- check: add --output-format prompt — emit an AI fix-prompt from the gate's own
  violations (one command gates and, on failure, prints the prompt)
- report: replace --preset with --metric (narrow the scorecard by ranking axis);
  --output.prompt is auto-targeted and requires --top 1
- presets: remove the 4 metric-lens presets (HK/SLOC/FANIN/FANOUT) and the `slug`
  field — a preset's doc_url now resolves from its `id`
- principles: per-metric fix-prompt docs (HK/Fan-in/Fan-out/Cognitive/Cyclomatic/
  Cycles) with industry-case filenames; merge mutual/chain → Cycles; remediation
  now points to the metric's principle doc
- docs: move what-is-cycle → docs/cycles.md; delete redundant metric-thresholds &
  module-size; add USE-CASES.md; sync CLI/PRD/DESIGN/config/viewer/customization

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
…splits

Tighten the dogfood gate in code-ranker.toml (hk, cyclomatic, cognitive) and
bring the codebase under it without behaviour change:

- Extract large inline `#[cfg(test)]` blocks into sibling `*_test.rs` files
  (wired via `#[cfg(test)] #[path = …] mod tests;`), satisfying the new
  `inline_tests_too_large` custom check.
- Split files over the per-file cyclomatic budget by relocating cohesive
  function groups into sibling submodules:
    registry.rs  -> registry/model.rs (leaf data types) + registry/eval.rs
    check.rs     -> check/values.rs
    pipeline.rs  -> pipeline/helpers.rs
    ecmascript/structure.rs -> ecmascript/imports.rs
  Shared types/statics moved DOWN into leaf modules (model.rs; MODULE/
  file_to_mod_path into imports.rs) so the parent depends on the leaf one-way —
  avoiding the parent<->child ADP cycle a naive extraction would create.

Docs:
- principles/rust/HK.md: add "When a hub is legitimate (accept, don't game)".
- principles/rust/Cyclomatic.md & Cognitive.md: were stray copies of KISS.md;
  rewrite as the real metric principles, incl. the split/cycle-trap workflow
  that the cyclomatic/cognitive `remediation` links now point to usefully.

make all green: build, test (cli/graph/plugins/e2e), clippy, lint-md,
self-check (0 violations), coverage >=90%.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
…g; tighten gate

Viewer — restore the per-level switcher (was a `updateFilesTab(){}` stub):
- `[levels] functions = true` already produces a `functions` graph level in the
  JSON; the HTML viewer now exposes it. `updateFilesTab` builds a `.report-switch`
  tab row + clones the `files` `.view` per extra level (rendering is already
  level-agnostic). Hidden for single-level reports, so they look unchanged.

check — `--suggest-config` is now fully data-driven: the threshold metrics come
from `config::metrics` × the snapshot's attribute specs (numeric, not
higher-better), in report-column order — no hardcoded metric array.

Bring the codebase under the tightened dogfood gate (hk=500K, cyclomatic=100,
cognitive=70, sloc=400) via behaviour-preserving, cycle-safe module splits:
  cognitive: scorecard / collapse / python structure (lift nested blocks)
  cyclomatic: config/load→load/overrides, walk→visitors, check→check/sarif
  sloc:       pipeline→pipeline/assemble, walk→visitors, checks→checks/text,
              python/structure→python/imports (shared MODULE moved down to a leaf)

Docs: viewer PRD/DESIGN document the level switcher; config.md notes the viewer
surfaces the `functions` level.

make all + make e2e green; self-check 0 violations; coverage ≥90%.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
…--focus-path

Advisory tiers now derive from the `check` gate, not language calibration:
- Remove the language-calibrated `[thresholds.*]` system (plugin `thresholds()`,
  `ThresholdCfg`, rust `[thresholds]`). `AttributeSpec.thresholds` is now built
  from `[rules.thresholds.file]`: the limit is the `warning` tier, a `[metrics.<k>]`
  `info` is kept only when below it. Scorecard/viewer/prompt show exactly what the
  gate enforces; an unconfigured metric has no breaches (drop the `{0,0}`/HK-fallback
  that read "everything breaches").
- Tune the project's own `hk` gate 500K→350K (ratchet just above plugin.rs + ~1
  language of growth); document the rationale.

Recommendation focus, mirroring `check`'s flag family:
- Replace report `--metric`/`--focus` with `--focus-rule` (a metric `hk`, a full
  rule id `threshold.file.hk`, or a principle id `LSP`; metrics matched by value)
  and add `--focus-path` (restrict ranked modules to a subtree; cycles stay global).
  Applies to both scorecard and prompt.
- A metric focus frames output by the metric itself (no SOLID wrapper) via a
  synthesized preset; fix the duplicate metric description in the metric-lens prompt.

Docs & principles:
- Delete `principles/rust/Cycles.md` (a verbatim copy of `ADP.md`); repoint the
  cycle metric remediation → `ADP.md`; drop stale cross-refs.
- Add `docs/installation.md` (extracted from README) with a "which channel" guide;
  slim the README install section; sync CLI/USE-CASES/PRD/DESIGN/config/ai-skill.

Move pipeline tests to `pipeline_test.rs` (inline block exceeded the TST gate).
Regenerate goldens; add unit tests for resolve_focus / synth_metric_preset /
in_focus / metric-lens prompt / principle-focus scorecard.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
… doc inheritance

Rename the principle/metric doc corpus principles/ → languages/ and add a
shared base/ corpus that languages without their own docs inherit, mirroring
the defaults.toml ⊕ <lang>.toml config inheritance.

- specs.rs: doc_url resolves to {doc_base}/{doc_lang}/{id}.md for the ids a
  language overrides (doc_overrides = "*" or [ids]), else {doc_base}/base/{id}.md
- defaults.toml: doc_base → .../main/languages; rust/python/typescript/javascript
  declare doc_overrides="*"; go/c/cpp/csharp/markdown inherit base/ (fixes their
  previously dead /principles/<lang>/ doc links)
- builtin.toml remediation URLs → languages/base/ (neutral metric docs)
- base/ seeded from rust/, then fully de-Rust-ified into language-neutral prose
- goldens, dialect/config tests, docs (PRD/DESIGN/CLI/viewer/node_schema/
  metric-correctness/config-resolution), and Makefile/markdownlint/CI globs updated

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
…_cel/formula_pretty/formula_js

Both config surfaces named the same three concepts differently: built-in
cel/formula_human/formula_js vs user-config formula/formula_pretty/calc, and the
bare `formula` key meant the executable CEL while the wire `formula` meant the
display string. Collapse to one family on both surfaces:

  formula_cel    executable CEL formula
  formula_pretty human-readable display formula
  formula_js     JS the viewer re-runs for the live derivation line

The wire encoding (AttributeSpec.formula/calc) is intentionally unchanged, so JSON
reports, the viewer, and golden fixtures are untouched. Pure rename — behaviour and
emitted values are identical.
…c inheritance

Two related strands on the tier-2/doc tooling.

hk → CEL (TIER1 → graph → TIER2):
- Move Henry–Kafura from a hardcoded Rust pass to a data-driven `[fields.hk]`
  `formula_cel = "sloc * pow(fan_in * fan_out, 2.0)"`. `hk.rs::annotate_hk` →
  `annotate_coupling` (fan_in/fan_out/fan_out_external only); a new post-graph
  `write_derived` stage evaluates graph-derived `[fields.*]` over node attrs once
  coupling is present. Engines split pre-graph (`DERIVED`, raw tier-1) vs post-graph
  (`GRAPH_DERIVED`), partitioned by whether a formula references a graph key.
- Values are bit-identical (e2e goldens unchanged except markdown, which has no
  `sloc` → no `hk`, regenerated surgically).
- Cycle-safe split: the metric-writing machinery moves to `builtin/write.rs` so
  `builtin.rs` drops back under the `hk` regression-ratchet gate.

Manifest-based doc inheritance:
- `compose.rs` reworked: a `languages/<lang>/<ID>.md` manifest lists sections in
  order — `<!-- doc:base "X" [from "P1"] [to "P2"] -->` pulls (a slice of) a base
  section, inline `## ` sections are written verbatim, and the manifest may carry
  its own H1/TL;DR head (else inherits base, suffixed `(in <Lang>)`). Replaces the
  `_overlay/` patch model. A doc is a manifest iff it contains a `doc:base` include.
- All `rust/` principle+metric docs migrated to manifests; `python/DRY`,
  `typescript/OCP` too. Theory/boilerplate sections reuse base; language-specific
  sections (examples, crate-level, extra refs) stay inline.
- Prompt scaffolding prose now lives in `metrics/prompt.md` (internal, not a
  published corpus doc).

Docs synced: templates.md, DESIGN.md, CLI.md, metric-tiers/correctness. make all +
e2e green; self-check passes.
…↔doc alignment

Prompt scaffolding & rendering (CLI `recommend/prompt.rs` + viewer `export-popup.js`,
kept in parity):
- doc-note now points at the offline `code-ranker report --doc {id}` command instead
  of a network URL — works without a connection. `{id}` substituted in `doc_note`
  too (`metrics/prompt.md`).
- Connection lists filter to FLOW edges only (`uses`) — `contains`/`reexports` are
  noise in the crossroads and don't drive HK/coupling (matches the viewer's
  `edgeIsFlow`).
- Single-focus (`--top 1`): drop the repeated focus path — `in` shows `dependant:line`,
  `out` shows `target (uses, line N)`; heading becomes `## Target module`. Several
  foci keep the full `source:line → target`.
- Metric values print exact integers (e.g. `295488`), never abbreviated — `abbreviate`
  is now a viewer-only display flag. The per-module `**Formula:**` line is dropped
  (it lives in `--doc`).

prompt ↔ doc consistency:
- HK doc H1 aligned to the preset title `HK — Henry–Kafura` (base + rust manifest).
- HK `description` reworded to echo the doc TL;DR, so the prompt Summary and the doc
  correspond.
- HK doc workflow uses the CLI (`--output.scorecard --focus-rule hk`, `--prompt HK`)
  instead of `jq`-parsing JSON.
- Drop the duplicated own-head from manifests whose TL;DR equalled base (HK, Cognitive,
  DRY, Fan-in, Fan-out) — they inherit the base head + `(in <Lang>)`.

Docs synced (templates.md, ai-skill.md, DESIGN.md). Goldens patched (doc_note +
description). make all + e2e green.
…flows

- code-ranker-cli/{CLI,DESIGN,ERRORS}.md: document the `--prompt <ID>` / `--doc <ID>`
  named prompt/doc output and the maintenance-only `docs` subcommand; note the
  offline `--doc` doc-read path; tighten the `hk` description.
- languages/base/{Cyclomatic,Cognitive}.md: replace the stale "dump a JSON snapshot
  and inspect it" step with the native `report --output.scorecard --focus-rule
  <metric>` triage (no jq), plus the function-level + HTML-viewer path for the
  per-function breakdown — consistent with HK.md's "no snapshot to parse" workflow.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
`collapse.rs` showed 0% line coverage because it is only exercised by the
subprocess-based e2e tests (llvm-cov can't see a spawned binary). Add direct
unit tests for the pure `collapse_to_files` transform — module/inline folding,
external-crate nodes, edge re-pointing (drop crate→crate and self edges), the
file-backed source-of-truth rule, and reexport visibility — lifting it to ~92%.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
- Cargo.toml/Cargo.lock: workspace version 3.0.0-alpha.1 → 3.0.0 (via `make bump`;
  all crates inherit `version.workspace`).
- docs: drop the stale `--version 1.1.0` pin from the cargo install commands
  (README + installation.md) and point Docker pulls at `:latest`, so the install
  docs don't go stale on every bump; refresh the example-snapshot version
  (`1.0.0-alpha.4` → `3.0.0`) in PRD/DESIGN; drop the now-inconsistent
  "Pin a specific version" line from the README status.
- Makefile: `make bump` now lists doc files that mention a version (cargo
  `--version`, alpha/beta/rc, and `code-ranker@`/`:` npm+docker tags) so version
  refs the bump can't auto-edit are surfaced for manual review.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
`node-popup.js` interpolated `esc(n.id)` / `esc(node.id)` into `data-diag-node`
/ `data-node-id` attribute values, but `esc` (escHtml) does not escape `"` — a
node whose id contains a double quote (e.g. an oddly-named analyzed file) could
break out of the attribute. Use `escA` (escAttr), which also escapes `"`, for
both attributes. Clears CodeQL js/incomplete-html-attribute-sanitization (#18, #19).

(The high js/xss-through-dom on tooltip.js is a verified false positive — every
input is escHtml-escaped before innerHTML — dismissed in code scanning.)

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
Add unit tests for previously-uncovered branches:
- templates.rs: resolve_doc (base/manifest/override/metric/error), build_corpus, lang_display
- config/load.rs: code-ranker.toml + Cargo.toml metadata auto-discovery
- overrides.rs: templates.languages/prompt inline keys + remaining ignore flags
- assemble.rs: default-Level fallback, grouping-with-function
- plugins/specs.rs: selective doc_overrides array form
- recommend/prompt.rs: single-focus in/out edge abbreviation
- check/values.rs + check.rs: early-return/false branches in print helpers
@@ -0,0 +1,338 @@
//! The embedded doc corpus and per-file overrides.
The dogfood gate flags >N inline test lines per file; move the new
report.rs / templates.rs test modules into report_test.rs /
templates_test.rs (matching check_test.rs et al.).
@ffedoroff ffedoroff merged commit 0279d6b into main Jun 22, 2026
5 checks passed
@ffedoroff ffedoroff deleted the feat/report-list-overrides branch June 22, 2026 00:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants