feat: data-driven report/metrics config, list-overrides & C/C++/Go/C#/Markdown plugins by ffedoroff · Pull Request #14 · ffedoroff/code-ranker

ffedoroff · 2026-06-18T07:52:44Z

What this PR does

Makes the metric/report pipeline fully data-driven from project config, adds five new language plugins, and extends the viewer.

Config

List-override DSL for inherited lists (add/remove/replace/after/before/prepend/clear), generic in deep_merge.
[report] overrides for columns/card/stats/size/filter at language and project level (layered catalog → language → project); surfaces custom metrics in the UI.
[rules.thresholds.file] accepts custom [metrics.<key>] keys, validated at load; check fires on them.
[presets.<ID>] project Prompt-Generator presets merged over the plugin catalog.
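The list-override operations named above can be pictured with a small model. This is a minimal sketch only: the `ListOp` enum, the `apply` and `insert_at` helpers, and the exact semantics of each operation (for example, appending when an anchor is missing) are assumptions made for illustration, not the PR's actual `deep_merge` implementation.

```rust
/// Hypothetical model of the list-override operations named in the PR.
/// The names mirror the DSL keys; the semantics are assumed.
#[derive(Debug, Clone)]
enum ListOp {
    Add(Vec<String>),
    Remove(Vec<String>),
    Replace(Vec<String>),
    After(String, Vec<String>),
    Before(String, Vec<String>),
    Prepend(Vec<String>),
    Clear,
}

/// Apply one override to an inherited list and return the resulting list.
fn apply(mut base: Vec<String>, op: &ListOp) -> Vec<String> {
    match op {
        // Append only the items that are not already present.
        ListOp::Add(items) => {
            for item in items {
                if !base.contains(item) {
                    base.push(item.clone());
                }
            }
            base
        }
        ListOp::Remove(items) => {
            base.retain(|x| !items.contains(x));
            base
        }
        ListOp::Replace(items) => items.clone(),
        ListOp::After(anchor, items) => insert_at(base, anchor, items, 1),
        ListOp::Before(anchor, items) => insert_at(base, anchor, items, 0),
        ListOp::Prepend(items) => {
            let mut out = items.clone();
            out.extend(base);
            out
        }
        ListOp::Clear => Vec::new(),
    }
}

/// Insert `items` relative to `anchor`; a missing anchor falls back to
/// appending at the end (an assumption, not confirmed by the source).
fn insert_at(mut base: Vec<String>, anchor: &str, items: &[String], offset: usize) -> Vec<String> {
    let pos = base
        .iter()
        .position(|x| x == anchor)
        .map(|i| i + offset)
        .unwrap_or(base.len());
    base.splice(pos..pos, items.iter().cloned());
    base
}
```

Under these assumptions, applying `After("hk", ["tsr"])` to an inherited `["hk", "sloc", "fan_in"]` yields `["hk", "tsr", "sloc", "fan_in"]`.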

Metrics

MetricDef gains warning/info two-tier severity and a calc for the live-derivation line.
top<N> / top<N>_<reducer> aggregate reducers.
Emit Halstead/AST base counts on every node so each formula renders as "formula = numbers".

New language plugins

go — import graph from the go.mod module path
c / cpp — #include dependency graph via a shared cfamily text-scan module; fixes cognitive complexity dropping &&/|| (tree-sitter declares binary_expression on two symbols)
csharp — using/namespace dependency graph
markdown — grammar-free: link graph + measured doc metrics with a broken_links gate

Engine: adds Dialect::args and Dialect::unit_name hooks. config.rs is split into a config/ facade (parse/views/specs/lookup).

Viewer

Data-driven SVG map: size-modes and filters by any metric (buttons built per render, nothing hardcoded in HTML).
Content-sized tooltip with long-formula wrapping.

Tests & docs

Workspace coverage 94.79%; e2e goldens regenerated for all languages.
PRD/DESIGN/e2e/config docs synced; new docs/customization/ and contrib/unit-tests.md.

…c thresholds, transparent formulas

Make the metric/report pipeline fully data-driven from project config and render every metric's derivation in the viewer.

Config (code-ranker.toml):
- list-override DSL for inherited lists (add/remove/replace/after/before/prepend/clear), generic in deep_merge and used for the report views.
- [report] columns/card/stats overrides at both language and project level, layered catalog -> language -> project; surfaces custom metrics in the UI.
- [rules.thresholds.file] now accepts custom [metrics.<key>] keys, validated at load; `check` fires on them with the metric's concern group.
- [presets.<ID>] project Prompt-Generator presets, merged over the plugin catalog (same id overrides, new id appends); drives --preset / scorecard.

Metrics:
- MetricDef gains warning/info two-tier severity thresholds and a `calc` (defaulted from the CEL formula for node-scope) for the live derivation line.
- top<N> / top<N>_<reducer> aggregate reducers.
- emit the Halstead/AST base counts (eta1/eta2/n1/n2/spaces/branches/span_sloc) on every node so every derived formula renders a "formula = numbers" line; formula_js added/rewritten for effort/length/vocabulary/cyclomatic/mi/mi_sei.
- per-language [specs.eta1..n2] descriptions enumerating the exact operator/operand tokens each grammar counts (shared ecmascript_metric_specs for JS/TS).

Viewer:
- tooltip box is content-sized (max-width min(560px,92vw)); long formulas / token lists wrap instead of clipping; position() clamps to the viewport.

Docs: new docs/customization/ (README + custom-field-example.toml worked example); config.md, node_schema.md, metric-tiers.md, PRD, viewer DESIGN synced.

Goldens regenerated for all four languages.

Implements the two deferred map asks from §3 of the customization guide: size the SVG nodes by any metric, and filter the map to a metric's population. Both are fully data-driven, with the built-in loc/hk size modes and the cycle filter no longer hardcoded in the HTML.

Config:
- [report] gains `size` and `filter` list-overrides (same DSL as columns/card/stats); built-in defaults live in builtin.toml `[mapview]` (size = sloc/hk, filter = cycle). build_ui emits `ui.size_metrics` / `ui.filter_metrics`, pruned to keys present on an internal node.

Viewer:
- size/filter buttons are built per render (renderMapControls) from those ui lists; index.html no longer hardcodes SLOC/HK/cycle; clicks handled by delegation on .size-controls.
- metricNodeVal/metricNodeDiam are generic over the active attribute key; a custom metric scales against the median of the rendered population (metricSizeBase, cached in window._sizeBaseCache); loc/hk keep calibrated bases.
- the cycle filter generalises to window.nodeFilter: `cycle` uses the cycle set, any other key keeps nodes whose value is non-zero.

Docs: customization §3 rewritten as implemented (+ §1.6/§2.2/quick-ref), viewer PRD/DESIGN updated for the data-driven controls.

Goldens regenerated (ui block now carries size_metrics/filter_metrics). custom-field-example.toml renamed from code-ranker.toml and wires size=tsr / filter=tsr_big.
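The two viewer rules in this commit (a custom metric scales against the median of the rendered population, and a non-`cycle` filter keeps nodes whose value is non-zero) can be sketched as follows. The viewer itself is JavaScript; this Rust version is an illustration only, and the function names `median_base` and `passes_filter` are hypothetical.

```rust
use std::collections::HashSet;

/// Median of the rendered population, used as the scaling base for a custom
/// size metric. Returns None when there is nothing finite to scale against.
fn median_base(mut values: Vec<f64>) -> Option<f64> {
    values.retain(|v| v.is_finite());
    if values.is_empty() {
        return None;
    }
    values.sort_by(f64::total_cmp);
    let mid = values.len() / 2;
    Some(if values.len() % 2 == 1 {
        values[mid]
    } else {
        (values[mid - 1] + values[mid]) / 2.0
    })
}

/// Filter rule as described in the commit: `cycle` consults the cycle set,
/// any other key keeps a node only when its value is present and non-zero.
fn passes_filter(key: &str, node_id: &str, value: Option<f64>, cycle_set: &HashSet<String>) -> bool {
    if key == "cycle" {
        cycle_set.contains(node_id)
    } else {
        value.map_or(false, |v| v != 0.0)
    }
}
```

How the median base is turned into a node diameter is not described in the commit message, so the sketch stops at computing the base.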

…family cognitive

Five new language plugins, registered in the CLI and exercised by e2e goldens:
- go: import graph from the go.mod module path
- c, cpp: #include dependency graph via a shared `cfamily` text-scan module; metrics via the generic engine (params/name nest under the function_declarator, handled by `args`/`unit_name` Dialect hooks)
- csharp: using/namespace dependency graph
- markdown: grammar-free; link graph + measured doc metrics (headings, max_depth, code_lines, links, broken_links) with a broken_links gate

Engine: add `Dialect::args` and `Dialect::unit_name` hooks (default to the prior behaviour) so languages whose parameters/name nest under a declarator can override them without touching the shared walks.

Fix C/C++ cognitive complexity dropping `&&`/`||`: tree-sitter-c/-cpp declare the name `binary_expression` on two symbols (ordinary + preprocessor `#if`), so the single-id `[roles.one]` lookup resolved to a symbol absent from normal-code trees and the boolean-run branch was dead. Resolve it as a `[roles.group]` SET instead. Regression tests + c/cpp goldens updated (cognitive 3->4, stats 2->2.5).

Split `config.rs` (the per-language config reader, every plugin's fan_in hub, HK ~1.15M) into a `config/` facade over parse/views/specs/lookup submodules; the resolver follows the `pub use` re-exports to the defining file, so fan_in now lands per-concern (hottest config file ~25K). Lower the dogfood HK ceiling 2M -> 1M.

Tests: +unit tests across the new plugins and config (skip-node guards, non-UTF8 skips, scanner edge forms, resolution fallbacks); deliberate gaps marked inline with `// COVERAGE:` notes. Workspace coverage 94.79%.

Docs: sync PRD/DESIGN/e2e and the CLI DESIGN to nine plugins + eight wired grammars; add `contrib/unit-tests.md` raise-coverage playbook and link it from the `/ud` command.

github-actions · 2026-06-18T07:53:40Z

code-ranker

Verdict vs main: ➖ neutral

No new violations vs the baseline. 🎉

📦 Full HTML report: see the code-ranker-report artifact on this run.

codecov · 2026-06-18T07:53:50Z

Codecov Report

❌ Patch coverage is 99.12321% with 57 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.50%. Comparing base (0c6d58b) to head (ef81c2a).

Files with missing lines	Patch %	Lines
crates/code-ranker-cli/src/report.rs	91.72%	12 Missing ⚠️
crates/code-ranker-cli/src/check.rs	93.57%	9 Missing ⚠️
crates/code-ranker-cli/src/main.rs	90.00%	4 Missing ⚠️
crates/code-ranker-cli/src/templates.rs	98.05%	4 Missing ⚠️
crates/code-ranker-graph/src/builtin.rs	93.22%	4 Missing ⚠️
crates/code-ranker-cli/src/compose.rs	97.93%	3 Missing ⚠️
crates/code-ranker-cli/src/config/load.rs	94.00%	3 Missing ⚠️
crates/code-ranker-cli/src/pipeline.rs	95.16%	3 Missing ⚠️
crates/code-ranker-cli/src/recommend/scorecard.rs	98.97%	2 Missing ⚠️
crates/code-ranker-plugin-api/src/level.rs	33.33%	2 Missing ⚠️
... and 9 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #14      +/-   ##
==========================================
+ Coverage   96.30%   97.50%   +1.19%     
==========================================
  Files          63      119      +56     
  Lines        9070    13932    +4862     
==========================================
+ Hits         8735    13584    +4849     
- Misses        335      348      +13


…t mod.rs

Audit every language against languages/README.md and resolve the findings. All behaviour-preserving: goldens unchanged; make all green (coverage 94.73%).

go:
- go.mod `module ` directive + `go.mod` filename -> [structure] (module_directive/module_file); drop dead `import_spec` key.

markdown:
- code-fence + link markers -> [markers] (fences/link_open/url_scheme/mailto_prefix); drop unused detect_markers.

ecmascript:
- `@/` source-root import alias -> `alias_prefix` config; fix stale `ecmascript.toml` comments (file is config.toml) in js/ts/dialect.

cfamily:
- `#include` keyword -> [structure].include_directive (c & cpp); drop unused detect_markers in c/cpp/csharp.

rust:
- mod.rs is now wiring-only (445 -> 213 lines): extract toolchain probing (toolchain.rs), the cargo-metadata driver (analyze.rs), the test-attr predicate (test_attr.rs, shared with module_graph/walk.rs -- was duplicated) and the test-stripper + file metrics (strip.rs).
- crate-root tie-break filename `lib.rs` -> config (`crate_root_file`).
- resolve.rs dedup keys on the EdgeKind enum (now Hash) instead of its Debug string.

c/cpp/csharp: delegate metrics()/function_units() to file_metrics/function_nodes free fns (match the go/python pattern; was inline in the trait methods).

Verified: cargo test -p code-ranker-plugins (197) + --test e2e (37) + make all.

…ete gate

File-walk ignore (the user-facing `[ignore]` config):
- Add `gitignore` / `ignore_files` / `hidden` to `[ignore]` (CLI IgnoreConfig + `--set` overrides), all default true, threaded via PluginInput.
- New shared `crate::walk` (the `ignore` crate) replaces six duplicated `walkdir` loops across go/python/markdown/csharp/c-cpp/ecmascript; honours .gitignore (+ global + .git/info/exclude) / .ignore / hidden, scoped to the analyzed root (`parents(false)`) so an enclosing repo's rules never leak in. Git-faithful (only inside a repo) → e2e samples (non-git) unaffected, goldens unchanged. The Rust plugin resolves files via cargo metadata, so it is exempt.
- +4 unit tests for the walker (gitignore / .ignore / hidden / skip_dirs).

JS/TS ext->grammar cleanup:
- JavaScript: every extension maps to one grammar, so the per-ext match is redundant; collapse to a constant; the `extensions` config is the single source of truth. Incidentally fixes `.cjs` being dropped in `analyze` (it was in `extensions` + metrics but missing from the analyze grammar match).
- TypeScript: dedupe the three identical match arms into one `grammar_for`; the only literal left is `tsx` (it alone selects a different grammar TYPE).

Dependency hygiene:
- Remove unused deps surfaced by cargo-machete: `walkdir` (workspace + cli + plugins), `which` (cli), `anyhow` (graph).
- Add a `machete` target to `make lint` (hence `make all`) and a `cargo machete` step to CI `Test & lint`; it fails on any unused dependency. Detect-only; `make machete-fix` removes. False positives whitelist under `[package.metadata.cargo-machete]`.

Docs: config.md ([ignore] keys), PRD §5 + DESIGN (shared walker + .gitignore), contrib/ci.md (machete in the lint chain + CI job).

Verified: make all green (workspace 201 + e2e 37, clippy, machete, lychee, markdownlint, self-check, coverage 94.66%).

…lf-registering plugins

Config defaults now live entirely in embedded TOML and are deep-merged with the user's config, so a project overrides only what it spells out and inherits the rest. No default value is hardcoded in Rust.

- CLI: new embedded config/defaults.toml is the single source of project-config defaults; load() deep-merges discovered/--config over it. Config::default() is just defaults.toml parsed; dropped the hardcoded IgnoreConfig/CycleRules Default impls and the report.rs path consts.
- Plugins: explicit inheritance chain defaults.toml ⊕ [base] ⊕ <lang> via load_chain; ecmascript is now the real base for js/ts, new cfamily/config.toml the base for c/cpp (each carries only its diffs).
- Graph/plugins: field-omission defaults (value_type/omit_at, roles one_named) sourced from [defaults] in builtin.toml / defaults.toml instead of Rust literals.
- New `report --export-full-config PATH`: dumps the full effective config ([project] = defaults ⊕ --config, [plugin] = merged language config).

Plugins self-register via inventory::submit!; the CLI works only through code_ranker_plugin_api::registry() + the LanguagePlugin trait and never names a language. Moved the generic TOML utilities (deep_merge → toml_merge, the list-override DSL) into code-ranker-plugin-api so nothing reaches into the plugins crate; its only mention in the CLI is the `extern crate code_ranker_plugins as _;` link-anchor that lets the linker collect the submissions.

Docs (PRD/DESIGN, cli CLI/config/DESIGN, customization) synced to all of the above.

…nalyze level arg, PluginInput.options The trait's is_test_path was never called in production: test-file dropping happens inside each plugin's analyze (via free *_is_test_path predicates used by the walk; Rust drops tests/benches via cargo-metadata target kinds). Remove the trait method + 8 impls; keep the free functions (still tested directly). Rust's orphaned test_dirs config + its test go too. analyze's level param was always "files" and ignored by every impl (functions are served by function_units), so drop it. PluginInput.options + the Options alias were never read (only constructed empty), so drop them. Tidy: ecmascript_is_test_path no longer re-exported (now pub(super)); stale doc refs updated in CLI config + root PRD/DESIGN.

Overview reveal-depth now reaches a files level one step past the deepest folder grouping: the deepest single folder level is kept as clusters (subgraph cluster_files_N) but each is drawn with its real file nodes inside, edges remapped file-to-file. Driven by maxUnderCrateDepth / filesDig / isFilesDig; the dig ceiling auto-extends to it. DIG_MAX 6->8. Fan-in/out connector arrows now land on a circular node's ellipse border via nodeEdgePoint (measuring the <ellipse>, not the group bbox) instead of the circumscribed square's corner; box nodes keep the bbox-edge behaviour. Docs: viewer DESIGN/PRD updated; the P2 'individual files inline in the overview' item is now implemented.

plugin.rs mixed audiences: the parser contract (trait + PluginInput + its self-registration wiring) and two foreign concerns: prompt-generator data (Preset) and a project-detection helper (detect_with_marker). That made it the HK hub: 19 modules reached in, 6 only for Preset, never to parse.

Split off the foreign concerns along the audience seam:

plugin.rs -> trait LanguagePlugin + PluginInput + PluginRegistration/registry
preset.rs -> Preset (prompt-generator domain; read by reporting)
detection.rs -> detect_with_marker (generic project detection)

PluginInput and the registry stay with the trait: PluginInput is the contract's own argument type, and PluginRegistration/registry() are typed on dyn LanguagePlugin (the trait's own discovery wiring, same actors). All re-exported at the crate root, so call sites read code_ranker_plugin_api::{Preset, detect_with_marker, PluginRegistration, registry}.

HK(plugin.rs): 613.7K -> 311.0K (sloc 68->60, fan_in 19->12). preset.rs has fan_in 18 but fan_out 0, so its HK is 0: the coupling dissolves, not relocates.

Docs: DESIGN.md (module layout, Preset location, trait contract); ai-skill.md (absolute link to the HK principle); principles/rust/henry-kafura-coupling.md (diagnose-and-split workflow: measure one file, list fan_in/fan_out, find mixed scenarios, split, verify with a before/after diff report).

No behaviour change; goldens unchanged.
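The HK figures in this commit can be checked by hand. The sketch below uses the formula a later commit in this PR quotes for `[fields.hk]`, `sloc * pow(fan_in * fan_out, 2.0)`. The commit gives sloc and fan_in but not fan_out; the fan_out values 5 (before) and 6 (after) are inferred here because they reproduce the reported totals, so treat them as an assumption.

```rust
/// Henry-Kafura score per the formula quoted in this PR:
/// sloc * pow(fan_in * fan_out, 2.0)
fn hk(sloc: u32, fan_in: u32, fan_out: u32) -> f64 {
    f64::from(sloc) * (f64::from(fan_in) * f64::from(fan_out)).powi(2)
}

/// plugin.rs before the split: sloc 68, fan_in 19, fan_out 5 (inferred).
/// 68 * (19 * 5)^2 = 68 * 9025 = 613_700, reported as 613.7K.
fn hk_plugin_before() -> f64 {
    hk(68, 19, 5)
}

/// plugin.rs after the split: sloc 60, fan_in 12, fan_out 6 (inferred).
/// 60 * (12 * 6)^2 = 60 * 5184 = 311_040, reported as 311.0K.
fn hk_plugin_after() -> f64 {
    hk(60, 12, 6)
}
```

The same formula shows why preset.rs scores 0: with fan_out 0 the product `fan_in * fan_out` is 0 regardless of its fan_in of 18.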

… TOML

Hardcoded human text in the CLI violated 'all vocab/prompt text lives in config'. Move it to the data-driven catalogs, behind a new spec field, and have the CLI read it from the snapshot, so the CLI names no metric and stores no prose, and a language can reword diagnostics with no Rust change.

- New `remediation` spec field (the check `fix` line) on AttributeSpec / CycleKindSpec / FieldDef / MetricDef.
- builtin.toml gains the metric `why`/`fix` (remediation on cyclomatic/cognitive), the coupling specs (fan_in/fan_out/hk/cycle, migrated out of coupling_specs() in lib.rs), a [cycles.*] vocab catalog, and a [prompt] scaffolding block.
- The orchestrator overlays cycle vocab onto each level's cycle_kinds; structural metrics (loc/items/unsafe/headings/links/…) carry description+remediation in their language configs.
- CLI: delete the RULES prose catalog; rule_doc() resolves why/fix/title from the snapshot's node_attributes + cycle_kinds (works on snapshot input too); threaded into human + SARIF diagnostics. Concern codes (CPX/CPL/SIZ) + rule_tuning stay.
- Prompt scaffolding: a PromptTemplate carried in the snapshot, sourced from builtin.toml [prompt], read by BOTH the CLI compose_prompt and the viewer JS composePrompt: single source, {id} substituted at render.
- Regenerated all 9-language e2e goldens; synced DESIGN/PRD/node_schema/ERRORS/viewer/customization docs.

make all + make e2e green (coverage 94.99%); behaviour unchanged.

Add [rules.checks.<id>] — a CEL boolean `when` predicate evaluated per file as a second pass over the fully-built graph, producing `check.<id>` violations like a threshold/cycle rule. The predicate sees node attributes, derived path fields (path/name/stem/ext/dir), graph edges (deps/rdeps + depends_on/depended_on_by), file collections (files/siblings + file_exists), string fns (ends_with/…/matches) and CEL list macros. [rules.defs] adds reusable named helpers expanded into `when` (cycle-checked). No engine/CEL change for callers — rides the existing check path. Unify the report/ui vocabulary end to end: LevelUi/JSON fields drop the `_metrics` suffix (card/size/filter/sort/summary) so config [report] keys = JSON ui keys = JS; the metric catalog merges [tableview]/[cardview]/[mapview]/[report.json.aggregate] into one [report] (+ [report.stats]) mirroring the project override, and `featured` becomes `card`. The SVG status bar now reads ui.card (was hardcoded hk/sloc); dead tooltip renderers removed. Snapshot schema_version 2 → 3 (field rename is breaking). Fix: external-crate dependency labels resolved to their cargo-registry path, so depends_on("ext:<crate>") never matched — external nodes now keep their ext:<name> id for matching. Docs: customization/README §1.8, cli config.md/ERRORS.md/PRD.md, root PRD/DESIGN, node_schema, e2e synced; runnable examples linters-example.toml + architecture-linters.toml. Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7

…/types/traits) The Rust plugin now emits, per file node, six comma-joined string attributes from its production AST (test items excluded): `derives` / `macros` / `attrs` / `imports` ("uses X" sets) and `types` / `traits` (names defined in the file). The module-graph syn walk already visited derives/uses/attributes; a FactsCollector gathers them (imports reuses the bare-path collector) and collapse emits them. This makes the symbol-level dylint rules — which edges alone could not reach — expressible as config [rules.checks] via contains()/matches(): no serde/ToSchema derive in a layer (DE0101/0102), no api_dto attr (DE0104), no schema_for! macro (DE0110), no print macros (DE1301), DTO/Client naming (DE0201/0503). Added to docs/customization/architecture-linters.toml. String-valued so they stay off the numeric table/scorecard surfaces; specced in rust/config.toml [node_attributes]. Rust-only (like unsafe/items) — rust golden regenerated; other languages unaffected. Docs: node_schema.md, cli config.md, customization/README. Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7

…r deps

Make the two CEL contexts (metric formulas, check predicates) speak the same language, and document it.

Engine:
- Drop the custom string stdlib (ends_with/starts_with/contains/matches); cel-rust ships these natively (contains/startsWith/endsWith/matches regex, as method or function). Removes the direct `regex` dependency (kept transitively via `cel`).
- Math host functions (pow/log2/sqrt/…) now registered in checks too, not just metrics (shared `register_math`).
- Path fields (path/name/stem/ext/dir) now available in metric formulas, not just checks, so a formula can branch on location (`path.contains("/generated/") ? 0 : hk`). Shared path logic extracted to a `nodepath` module.
- `agg(metric, reducer, population)` now usable inside check predicates for relative thresholds (node vs project distribution, e.g. `cyclomatic.double() > agg('cyclomatic','p90','not_empty')`), memoized per run so each distinct aggregate is reduced once, not once per file.

Docs:
- New docs/customization/cel-reference.md: full agent-oriented CEL reference (language, cel-rust stdlib, what's in scope in metrics vs checks).
- Remove the linter example configs (architecture-linters/linters-example); the config-only-linter direction is not being pursued. README/config.md updated to native CEL names and the new reference.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
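The "memoized per run" behaviour of `agg` can be pictured as a cache keyed by the aggregate's arguments. This is a minimal sketch under stated assumptions: the `AggCache` type is hypothetical, the `population` argument is left out, and only `max` and `mean` reducers are modelled (the PR's `p90` is omitted because its exact percentile definition is not given).

```rust
use std::collections::HashMap;

/// Hypothetical per-run cache: each distinct (metric, reducer) pair is
/// reduced once, however many files ask for it.
struct AggCache<'a> {
    values: &'a HashMap<String, Vec<f64>>, // metric -> one value per file
    memo: HashMap<(String, String), f64>,
    reductions: usize, // how many real reductions ran (for illustration)
}

impl<'a> AggCache<'a> {
    fn new(values: &'a HashMap<String, Vec<f64>>) -> Self {
        AggCache { values, memo: HashMap::new(), reductions: 0 }
    }

    fn agg(&mut self, metric: &str, reducer: &str) -> Option<f64> {
        let key = (metric.to_string(), reducer.to_string());
        if let Some(hit) = self.memo.get(&key) {
            return Some(*hit); // cache hit: no reduction
        }
        let vals = self.values.get(metric)?;
        if vals.is_empty() {
            return None;
        }
        let out = match reducer {
            "max" => vals.iter().cloned().fold(f64::NEG_INFINITY, f64::max),
            "mean" => vals.iter().sum::<f64>() / vals.len() as f64,
            _ => return None, // reducers not modelled in this sketch
        };
        self.reductions += 1;
        self.memo.insert(key, out);
        Some(out)
    }
}
```

Evaluating the same predicate for every file then costs one reduction plus a map lookup per file, instead of one reduction per file.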

`--config <file>` is repeatable: previously only the first file was read and the rest silently ignored (`files.first()`). Now every file is deep-merged over the built-in defaults in command-line order, last wins; inline `KEY=VALUE` applies after all files; passing any file still disables auto-discovery of `code-ranker.toml`. `discover_user_table` → `discover_user_tables` returns the ordered layers; the log line shows the merge order (`a.toml ⊕ b.toml`).

Docs:
- config.md: "Multiple --config files" section; a "full layer stack (with sources)" table linking each built-in layer on GitHub (plugin base ⊕ family ⊕ <lang>, metric catalog, project defaults); the big example renamed to `code-ranker-rust-example.toml` with a note that auto-discovery needs the literal name `code-ranker.toml` and that a real config is partial.
- CLI.md: `--config` flag + precedence list updated for repeat/order.

Coverage: the new explicit-multi-file branch is unit-tested (`load_layers_multiple_config_files_in_order_last_wins`). The retyped discovery lines (cwd/workspace `code-ranker.toml`, `Cargo.toml` metadata) stay uncovered by design: they read the process cwd / real filesystem and can't be isolated in a unit test (the repo's own ./code-ranker.toml wins cwd discovery first).

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
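The "deep-merged in command-line order, last wins" rule can be illustrated with a toy value type. This is a sketch, not the project's `toml_merge`: the `Value` enum, `deep_merge`, `merge_layers`, and the `table`/`scalar` helpers are hypothetical, and the list-override DSL that the real merge also handles is left out.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a TOML value: either a leaf or a nested table.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Scalar(String),
    Table(BTreeMap<String, Value>),
}

/// Merge `over` into `base`: tables merge key by key, anything else is
/// replaced, so a later layer wins only for the keys it spells out.
fn deep_merge(base: &mut Value, over: Value) {
    match (base, over) {
        (Value::Table(b), Value::Table(o)) => {
            for (k, v) in o {
                match b.get_mut(&k) {
                    Some(slot) => deep_merge(slot, v),
                    None => {
                        b.insert(k, v);
                    }
                }
            }
        }
        (slot, v) => *slot = v,
    }
}

/// Fold the layers over the defaults in the order they were given.
fn merge_layers(defaults: Value, layers: Vec<Value>) -> Value {
    let mut out = defaults;
    for layer in layers {
        deep_merge(&mut out, layer);
    }
    out
}

/// Small constructors to keep examples readable.
fn table(entries: Vec<(&str, Value)>) -> Value {
    Value::Table(entries.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

fn scalar(s: &str) -> Value {
    Value::Scalar(s.to_string())
}
```

With defaults `[ignore] gitignore = true, hidden = true`, a first file setting both to `false` and a second file setting only `hidden = true`, the result keeps `gitignore = false` from the first file and `hidden = true` from the second.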

…-lens presets

- check: add --focus <PATH> to gate a subset of files (whole project still analyzed)
- check: add --output-format prompt to emit an AI fix-prompt from the gate's own violations (one command gates and, on failure, prints the prompt)
- report: replace --preset with --metric (narrow the scorecard by ranking axis); --output.prompt is auto-targeted and requires --top 1
- presets: remove the 4 metric-lens presets (HK/SLOC/FANIN/FANOUT) and the `slug` field; a preset's doc_url now resolves from its `id`
- principles: per-metric fix-prompt docs (HK/Fan-in/Fan-out/Cognitive/Cyclomatic/Cycles) with industry-case filenames; merge mutual/chain → Cycles; remediation now points to the metric's principle doc
- docs: move what-is-cycle → docs/cycles.md; delete redundant metric-thresholds & module-size; add USE-CASES.md; sync CLI/PRD/DESIGN/config/viewer/customization

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7

…splits

Tighten the dogfood gate in code-ranker.toml (hk, cyclomatic, cognitive) and bring the codebase under it without behaviour change:

- Extract large inline `#[cfg(test)]` blocks into sibling `*_test.rs` files (wired via `#[cfg(test)] #[path = …] mod tests;`), satisfying the new `inline_tests_too_large` custom check.
- Split files over the per-file cyclomatic budget by relocating cohesive function groups into sibling submodules:

registry.rs -> registry/model.rs (leaf data types) + registry/eval.rs
check.rs -> check/values.rs
pipeline.rs -> pipeline/helpers.rs
ecmascript/structure.rs -> ecmascript/imports.rs

Shared types/statics moved DOWN into leaf modules (model.rs; MODULE/file_to_mod_path into imports.rs) so the parent depends on the leaf one-way, avoiding the parent<->child ADP cycle a naive extraction would create.

Docs:
- principles/rust/HK.md: add "When a hub is legitimate (accept, don't game)".
- principles/rust/Cyclomatic.md & Cognitive.md: were stray copies of KISS.md; rewrite as the real metric principles, incl. the split/cycle-trap workflow that the cyclomatic/cognitive `remediation` links now point to usefully.

make all green: build, test (cli/graph/plugins/e2e), clippy, lint-md, self-check (0 violations), coverage >=90%.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7

…g; tighten gate

Viewer: restore the per-level switcher (was a `updateFilesTab(){}` stub):
- `[levels] functions = true` already produces a `functions` graph level in the JSON; the HTML viewer now exposes it. `updateFilesTab` builds a `.report-switch` tab row + clones the `files` `.view` per extra level (rendering is already level-agnostic). Hidden for single-level reports, so they look unchanged.

check: `--suggest-config` is now fully data-driven: the threshold metrics come from `config::metrics` × the snapshot's attribute specs (numeric, not higher-better), in report-column order, with no hardcoded metric array.

Bring the codebase under the tightened dogfood gate (hk=500K, cyclomatic=100, cognitive=70, sloc=400) via behaviour-preserving, cycle-safe module splits:

cognitive: scorecard / collapse / python structure (lift nested blocks)
cyclomatic: config/load→load/overrides, walk→visitors, check→check/sarif
sloc: pipeline→pipeline/assemble, walk→visitors, checks→checks/text, python/structure→python/imports (shared MODULE moved down to a leaf)

Docs: viewer PRD/DESIGN document the level switcher; config.md notes the viewer surfaces the `functions` level.

make all + make e2e green; self-check 0 violations; coverage ≥90%.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7

…--focus-path

Advisory tiers now derive from the `check` gate, not language calibration:
- Remove the language-calibrated `[thresholds.*]` system (plugin `thresholds()`, `ThresholdCfg`, rust `[thresholds]`). `AttributeSpec.thresholds` is now built from `[rules.thresholds.file]`: the limit is the `warning` tier, a `[metrics.<k>]` `info` is kept only when below it. Scorecard/viewer/prompt show exactly what the gate enforces; an unconfigured metric has no breaches (drop the `{0,0}`/HK-fallback that read "everything breaches").
- Tune the project's own `hk` gate 500K→350K (ratchet just above plugin.rs + ~1 language of growth); document the rationale.

Recommendation focus, mirroring `check`'s flag family:
- Replace report `--metric`/`--focus` with `--focus-rule` (a metric `hk`, a full rule id `threshold.file.hk`, or a principle id `LSP`; metrics matched by value) and add `--focus-path` (restrict ranked modules to a subtree; cycles stay global). Applies to both scorecard and prompt.
- A metric focus frames output by the metric itself (no SOLID wrapper) via a synthesized preset; fix the duplicate metric description in the metric-lens prompt.

Docs & principles:
- Delete `principles/rust/Cycles.md` (a verbatim copy of `ADP.md`); repoint the cycle metric remediation → `ADP.md`; drop stale cross-refs.
- Add `docs/installation.md` (extracted from README) with a "which channel" guide; slim the README install section; sync CLI/USE-CASES/PRD/DESIGN/config/ai-skill.

Move pipeline tests to `pipeline_test.rs` (inline block exceeded the TST gate). Regenerate goldens; add unit tests for resolve_focus / synth_metric_preset / in_focus / metric-lens prompt / principle-focus scorecard.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
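The tier rule described above (the gate limit becomes the `warning` tier, an `info` value survives only when it sits below that limit, and an unconfigured metric has no tiers at all) fits in a few lines. The `Tiers` struct and `tiers_from_gate` function are hypothetical names for illustration; only the rule itself comes from the commit message.

```rust
/// Hypothetical shape of the advisory tiers shown in scorecard/viewer/prompt.
#[derive(Debug, PartialEq)]
struct Tiers {
    warning: f64,
    info: Option<f64>,
}

/// `limit` is the `[rules.thresholds.file]` value for the metric, `info` the
/// optional `[metrics.<k>]` info tier. No limit means no tiers, so an
/// unconfigured metric can never breach.
fn tiers_from_gate(limit: Option<f64>, info: Option<f64>) -> Option<Tiers> {
    let warning = limit?;
    Some(Tiers {
        warning,
        // Keep `info` only when it is strictly below the enforced limit.
        info: info.filter(|i| *i < warning),
    })
}
```

For the project's own `hk` gate of 350K, an illustrative `info` of 200K is kept, while an `info` of 400K would be dropped because it is not below the limit.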

… doc inheritance

Rename the principle/metric doc corpus principles/ → languages/ and add a shared base/ corpus that languages without their own docs inherit, mirroring the defaults.toml ⊕ <lang>.toml config inheritance.

- specs.rs: doc_url resolves to {doc_base}/{doc_lang}/{id}.md for the ids a language overrides (doc_overrides = "*" or [ids]), else {doc_base}/base/{id}.md
- defaults.toml: doc_base → .../main/languages; rust/python/typescript/javascript declare doc_overrides="*"; go/c/cpp/csharp/markdown inherit base/ (fixes their previously dead /principles/<lang>/ doc links)
- builtin.toml remediation URLs → languages/base/ (neutral metric docs)
- base/ seeded from rust/, then fully de-Rust-ified into language-neutral prose
- goldens, dialect/config tests, docs (PRD/DESIGN/CLI/viewer/node_schema/metric-correctness/config-resolution), and Makefile/markdownlint/CI globs updated

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
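The doc_url resolution rule in the first bullet can be sketched directly from its description. The `DocOverrides` enum and the function signature are assumptions for illustration, and the base URL in the usage note is a placeholder because the real `doc_base` is elided in the commit message.

```rust
/// Hypothetical model of a language's `doc_overrides` setting.
enum DocOverrides {
    /// `doc_overrides = "*"`: the language ships its own doc for every id.
    All,
    /// `doc_overrides = [ids]`: only these ids are overridden.
    Ids(Vec<String>),
    /// Nothing declared: every id inherits the shared base/ corpus.
    Inherit,
}

/// Resolve the doc URL for `id`: the language's own directory when the id
/// is overridden, otherwise the shared `base` directory.
fn doc_url(doc_base: &str, doc_lang: &str, overrides: &DocOverrides, id: &str) -> String {
    let own = match overrides {
        DocOverrides::All => true,
        DocOverrides::Ids(ids) => ids.iter().any(|i| i == id),
        DocOverrides::Inherit => false,
    };
    let dir = if own { doc_lang } else { "base" };
    format!("{doc_base}/{dir}/{id}.md")
}
```

With a placeholder base of `https://example.invalid/languages`, `rust` with `All` resolves `HK` to `.../rust/HK.md`, while `go` with `Inherit` resolves it to `.../base/HK.md`.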

…_cel/formula_pretty/formula_js

Both config surfaces named the same three concepts differently: built-in cel/formula_human/formula_js vs user-config formula/formula_pretty/calc, and the bare `formula` key meant the executable CEL while the wire `formula` meant the display string. Collapse to one family on both surfaces:

formula_cel     executable CEL formula
formula_pretty  human-readable display formula
formula_js      JS the viewer re-runs for the live derivation line

The wire encoding (AttributeSpec.formula/calc) is intentionally unchanged, so JSON reports, the viewer, and golden fixtures are untouched. Pure rename: behaviour and emitted values are identical.

…c inheritance

Two related strands on the tier-2/doc tooling.

hk → CEL (TIER1 → graph → TIER2):
- Move Henry–Kafura from a hardcoded Rust pass to a data-driven `[fields.hk]` `formula_cel = "sloc * pow(fan_in * fan_out, 2.0)"`. `hk.rs::annotate_hk` → `annotate_coupling` (fan_in/fan_out/fan_out_external only); a new post-graph `write_derived` stage evaluates graph-derived `[fields.*]` over node attrs once coupling is present. Engines split pre-graph (`DERIVED`, raw tier-1) vs post-graph (`GRAPH_DERIVED`), partitioned by whether a formula references a graph key.
- Values are bit-identical (e2e goldens unchanged except markdown, which has no `sloc` → no `hk`, regenerated surgically).
- Cycle-safe split: the metric-writing machinery moves to `builtin/write.rs` so `builtin.rs` drops back under the `hk` regression-ratchet gate.

Manifest-based doc inheritance:
- `compose.rs` reworked: a `languages/<lang>/<ID>.md` manifest lists sections in order: a `doc:base` include pulls (a slice of) a base section, inline `## ` sections are written verbatim, and the manifest may carry its own H1/TL;DR head (else inherits base, suffixed `(in <Lang>)`). Replaces the `_overlay/` patch model. A doc is a manifest iff it contains a `doc:base` include.
- All `rust/` principle+metric docs migrated to manifests; `python/DRY`, `typescript/OCP` too. Theory/boilerplate sections reuse base; language-specific sections (examples, crate-level, extra refs) stay inline.
- Prompt scaffolding prose now lives in `metrics/prompt.md` (internal, not a published corpus doc).

Docs synced: templates.md, DESIGN.md, CLI.md, metric-tiers/correctness.

make all + e2e green; self-check passes.
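The pre-graph/post-graph partition ("by whether a formula references a graph key") can be approximated with a plain identifier scan. This is a sketch only: the `Stage` enum, the `GRAPH_KEYS` list (taken from the coupling attributes named above) and the tokenizer are assumptions, and a real implementation would presumably inspect the parsed CEL expression and follow fields that depend on other graph-derived fields.

```rust
/// Graph-produced attributes, as named in the commit message. The exact set
/// the engine treats as graph keys is an assumption.
const GRAPH_KEYS: [&str; 3] = ["fan_in", "fan_out", "fan_out_external"];

#[derive(Debug, PartialEq)]
enum Stage {
    /// Evaluated before the graph exists (raw tier-1 inputs only).
    Derived,
    /// Evaluated after coupling has been annotated on the nodes.
    GraphDerived,
}

/// Split a formula into identifier-like tokens (letters, digits, `_`,
/// not starting with a digit). String literals are not handled.
fn identifiers(formula: &str) -> Vec<&str> {
    formula
        .split(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .filter(|t| t.chars().next().map_or(false, |c| c.is_ascii_alphabetic() || c == '_'))
        .collect()
}

/// A formula is post-graph as soon as it mentions any graph key.
fn stage_of(formula: &str) -> Stage {
    if identifiers(formula).iter().any(|t| GRAPH_KEYS.contains(t)) {
        Stage::GraphDerived
    } else {
        Stage::Derived
    }
}
```

The `hk` formula quoted above mentions `fan_in` and `fan_out`, so it lands in the post-graph stage; a formula over the Halstead base counts alone, such as an illustrative `eta1 + eta2`, stays pre-graph.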

…↔doc alignment

Prompt scaffolding & rendering (CLI `recommend/prompt.rs` + viewer `export-popup.js`, kept in parity):
- doc-note now points at the offline `code-ranker report --doc {id}` command instead of a network URL, so it works without a connection. `{id}` substituted in `doc_note` too (`metrics/prompt.md`).
- Connection lists filter to FLOW edges only (`uses`); `contains`/`reexports` are noise in the crossroads and don't drive HK/coupling (matches the viewer's `edgeIsFlow`).
- Single-focus (`--top 1`): drop the repeated focus path; `in` shows `dependant:line`, `out` shows `target (uses, line N)`; heading becomes `## Target module`. Several foci keep the full `source:line → target`.
- Metric values print exact integers (e.g. `295488`), never abbreviated; `abbreviate` is now a viewer-only display flag. The per-module `**Formula:**` line is dropped (it lives in `--doc`).

prompt ↔ doc consistency:
- HK doc H1 aligned to the preset title `HK — Henry–Kafura` (base + rust manifest).
- HK `description` reworded to echo the doc TL;DR, so the prompt Summary and the doc correspond.
- HK doc workflow uses the CLI (`--output.scorecard --focus-rule hk`, `--prompt HK`) instead of `jq`-parsing JSON.
- Drop the duplicated own-head from manifests whose TL;DR equalled base (HK, Cognitive, DRY, Fan-in, Fan-out); they inherit the base head + `(in <Lang>)`.

Docs synced (templates.md, ai-skill.md, DESIGN.md). Goldens patched (doc_note + description). make all + e2e green.

…flows

- code-ranker-cli/{CLI,DESIGN,ERRORS}.md: document the `--prompt <ID>` / `--doc <ID>` named prompt/doc output and the maintenance-only `docs` subcommand; note the offline `--doc` doc-read path; tighten the `hk` description.
- languages/base/{Cyclomatic,Cognitive}.md: replace the stale "dump a JSON snapshot and inspect it" step with the native `report --output.scorecard --focus-rule <metric>` triage (no jq), plus the function-level + HTML-viewer path for the per-function breakdown. This is consistent with HK.md's "no snapshot to parse" workflow.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7

`collapse.rs` showed 0% line coverage because it is only exercised by the subprocess-based e2e tests (llvm-cov can't see a spawned binary). Add direct unit tests for the pure `collapse_to_files` transform (module/inline folding, external-crate nodes, edge re-pointing that drops crate→crate and self edges, the file-backed source-of-truth rule, and reexport visibility), lifting it to ~92%.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
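Because `collapse_to_files` is described as a pure transform, it can be called directly from a unit test without spawning the binary. The sketch below illustrates only the edge re-pointing step with self-edge removal. `repoint_edges` and the owner map are hypothetical stand-ins, not the real signature, and dropping edges with an unknown endpoint is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Re-point item-level edges onto the files that own their endpoints,
/// dropping edges that become self edges after folding.
/// Assumption: an edge whose endpoint has no owner is dropped.
fn repoint_edges<'a>(
    owner: &HashMap<&'a str, &'a str>,
    edges: &[(&'a str, &'a str)],
) -> Vec<(&'a str, &'a str)> {
    edges
        .iter()
        .filter_map(|&(from, to)| {
            let from_file = *owner.get(from)?;
            let to_file = *owner.get(to)?;
            // Both ends folded into the same file: a self edge, so skip it.
            (from_file != to_file).then_some((from_file, to_file))
        })
        .collect()
}

fn main() {
    let owner = HashMap::from([
        ("a::f", "a.rs"),
        ("a::g", "a.rs"),
        ("b::h", "b.rs"),
    ]);
    let edges = [("a::f", "a::g"), ("a::f", "b::h")];
    println!("{:?}", repoint_edges(&owner, &edges));
}
```

Calling the function in-process like this is what makes the lines visible to the coverage tool, unlike the subprocess-based e2e path.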

- Cargo.toml/Cargo.lock: workspace version 3.0.0-alpha.1 → 3.0.0 (via `make bump`; all crates inherit `version.workspace`).
- docs: drop the stale `--version 1.1.0` pin from the cargo install commands (README + installation.md) and point Docker pulls at `:latest`, so the install docs don't go stale on every bump; refresh the example-snapshot version (`1.0.0-alpha.4` → `3.0.0`) in PRD/DESIGN; drop the now-inconsistent "Pin a specific version" line from the README status.
- Makefile: `make bump` now lists doc files that mention a version (cargo `--version`, alpha/beta/rc, and `code-ranker@`/`:` npm+docker tags) so version refs the bump can't auto-edit are surfaced for manual review.

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7

`node-popup.js` interpolated `esc(n.id)` / `esc(node.id)` into `data-diag-node` / `data-node-id` attribute values, but `esc` (escHtml) does not escape `"`, so a node whose id contains a double quote (e.g. an oddly-named analyzed file) could break out of the attribute. Use `escA` (escAttr), which also escapes `"`, for both attributes. Clears CodeQL js/incomplete-html-attribute-sanitization (#18, #19).

(The high js/xss-through-dom on tooltip.js is a verified false positive: every input is escHtml-escaped before innerHTML. Dismissed in code scanning.)

Claude-Session: https://claude.ai/code/session_013R7NfedZh9uEkUQrSRGWR7
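The difference between the two escapers can be shown with a small Rust rendering of the idea. The viewer's actual helpers are JavaScript and their bodies are not shown here, so `esc_html` and `esc_attr` below are assumed equivalents, not the real `escHtml` / `escAttr`.

```rust
/// Escape for HTML text content: enough between tags,
/// but it leaves `"` untouched.
fn esc_html(s: &str) -> String {
    s.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// Escape for a double-quoted attribute value: everything
/// `esc_html` does, plus the double quote itself.
fn esc_attr(s: &str) -> String {
    esc_html(s).replace('"', "&quot;")
}

fn main() {
    // A node id containing a double quote, as in the oddly-named file case.
    let id = "weird\"name.rs";
    // With esc_html the quote survives and would close the attribute early.
    println!("<div data-node-id=\"{}\">", esc_html(id));
    // With esc_attr the quote is neutralised.
    println!("<div data-node-id=\"{}\">", esc_attr(id));
}
```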

Add unit tests for previously-uncovered branches:
- templates.rs: resolve_doc (base/manifest/override/metric/error), build_corpus, lang_display
- config/load.rs: code-ranker.toml + Cargo.toml metadata auto-discovery
- overrides.rs: templates.languages/prompt inline keys + remaining ignore flags
- assemble.rs: default-Level fallback, grouping-with-function
- plugins/specs.rs: selective doc_overrides array form
- recommend/prompt.rs: single-focus in/out edge abbreviation
- check/values.rs + check.rs: early-return/false branches in print helpers


ffedoroff added 3 commits June 17, 2026 17:06

ffedoroff added 24 commits June 18, 2026 11:31

github-advanced-security AI found potential problems Jun 22, 2026


Comment thread crates/code-ranker-cli/src/templates.rs

@@ -0,0 +1,338 @@

//! The embedded doc corpus and per-file overrides.

test: split report/templates inline tests into sibling _test.rs files

4e7d81c

The dogfood gate flags >N inline test lines per file; move the new report.rs / templates.rs test modules into report_test.rs / templates_test.rs (matching check_test.rs et al.).

ffedoroff merged commit 0279d6b into main Jun 22, 2026
5 checks passed

ffedoroff deleted the feat/report-list-overrides branch June 22, 2026 00:56

ffedoroff merged 28 commits into main from feat/report-list-overrides
