I build deterministic, local-first tools for governing AI agents: policy engines, MCP runtime enforcement, PR-time drift scanners, transcript review, and evidence-backed reports. I also apply the same design style to repository analysis and conservative health/training decision-support tools.
What I build
- AI-agent governance — policy decisions, runtime MCP enforcement, PR-time drift detection, transcript audits, and consolidated verdicts.
- Repository & supply-chain analysis — codebase orientation, stale-repo autopsies, capability scanning, and dependency provenance.
- Evidence-backed health/training workflows — conservative decision support that exposes inputs, rules, and evidence. Not diagnosis, not treatment.
Pasadena, CA · TypeScript · Rust · Python · React · Node · FastAPI
GitHub @Conalh · X @conalhck · dev.to/conalh · conal.hg@gmail.com
Open to AI-agent infrastructure/safety, developer-tooling, and full-stack roles.
| Project | What it does |
|---|---|
| warden | Rust policy DSL engine that decides allow / deny / ask for agent actions. Zero-dependency core. Live playground. |
| barbican | MCP stdio proxy that binds warden's verdicts before a tool call reaches the server. |
| CapabilityEcho | PR-time scanner for executable capability drift — new network, subprocess, eval, and lifecycle signals on the exact added lines. |
| project-autopsy | Evidence-backed autopsy reports for stale repos: findings, stall hypotheses, and revival tasks. Full-stack TypeScript. |
| recovery-trail | 100% client-side recovery briefing from an Apple Health export, with ACSM-aligned verdicts and full rule traces. Live demo. |
| fit-ontology | Client intelligence layer for personal trainers — wearables, intake, and ACSM guidelines unified into one explainable, rules-based ontology. Live demo. |
A local-first stack with one job per tool and one shared schema, so the pieces compose instead of overlap: decide → enforce → detect → consolidate → observe. There is no LLM in the decision path.
Full walkthrough: AGENT_GOVERNANCE_STACK.md — the whole suite end to end, with a diagram, a failure-mode map, and an adoption path.
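To make the decide and enforce steps concrete, here is a minimal Python sketch of the pattern. It is not warden's DSL or barbican's code: the `Rule` shape, the glob-style tool names, first-match-wins ordering, and the fall-back to `ask` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from fnmatch import fnmatchcase


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"


@dataclass(frozen=True)
class Rule:
    tool_pattern: str  # glob over a tool name, e.g. "shell.*" (hypothetical naming)
    verdict: Verdict
    reason: str


def decide(rules, tool):
    """First matching rule wins; an unmatched tool falls back to ASK (assumed default)."""
    for rule in rules:
        if fnmatchcase(tool, rule.tool_pattern):
            return rule.verdict, rule.reason
    return Verdict.ASK, "no rule matched"


def enforce(rules, tool, call):
    """Forward the call only on ALLOW; otherwise return the reason and never run it."""
    verdict, reason = decide(rules, tool)
    if verdict is Verdict.ALLOW:
        return verdict, call()
    return verdict, reason


RULES = [
    Rule("fs.read", Verdict.ALLOW, "read-only access"),
    Rule("shell.*", Verdict.DENY, "no subprocesses"),
]
```

The decision is a pure function of the rules and the tool name, so the same input always yields the same verdict, and the enforcement step never invokes the callable unless that verdict is `allow`.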
| Tool | Role |
|---|---|
| warden | decide — allow / deny / ask policy engine (Rust, zero-dependency). |
| barbican | enforce — binds verdicts on the MCP wire before the call lands. |
| ScopeTrail | config diff — what changed in agent config files. |
| PolicyMesh | current policy contradictions — across MCP, Claude, Cursor, VS Code, Codex, Aider. |
| CapabilityEcho | executable capability drift — new network / subprocess / eval / lifecycle signals on added lines. |
| TaskBound | task-vs-diff scope creep — stated task compared to the actual change. |
| SessionTrail | runtime transcript audit — Cursor / Claude Code / Codex sessions for risky behavior. |
| GovVerdict | merge/dedupe verdicts — one consolidated PR result from the detector suite. |
| AgentPulse | live trajectory observation — converging, exploring, stuck, drifting, done, idle. |
| agent-gov-core | shared schema/parsers — canonical Finding schema and JSONC/TOML/MCP/shell/transcript parsers. |
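A shared schema is what lets several detectors feed one consolidated verdict. The sketch below shows one way that could work. The `Finding` fields, the severity levels, and the dedupe key are hypothetical stand-ins, not the actual agent-gov-core schema or GovVerdict's merge logic.

```python
from dataclasses import dataclass

# Hypothetical severity levels, ordered from least to most severe.
SEVERITY_ORDER = {"info": 0, "warn": 1, "block": 2}


@dataclass(frozen=True)
class Finding:
    tool: str       # which detector produced the finding
    rule_id: str
    path: str
    line: int
    severity: str   # one of the SEVERITY_ORDER keys
    message: str


def consolidate(findings):
    """Dedupe on (rule_id, path, line), keep the most severe copy, derive one verdict."""
    best = {}
    for f in findings:
        key = (f.rule_id, f.path, f.line)
        kept = best.get(key)
        if kept is None or SEVERITY_ORDER[f.severity] > SEVERITY_ORDER[kept.severity]:
            best[key] = f
    # Sort so the output does not depend on detector ordering.
    merged = sorted(best.values(), key=lambda f: (f.path, f.line, f.rule_id))
    worst = max((SEVERITY_ORDER[f.severity] for f in merged), default=0)
    verdict = "fail" if worst >= SEVERITY_ORDER["block"] else "pass"
    return verdict, merged
```

Because every detector emits the same record type, consolidation reduces to a keyed merge plus a maximum over severities, with no detector-specific handling.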
Architecture diagram
flowchart TB
subgraph runtime["Runtime · on the MCP wire"]
direction LR
warden["warden — decide<br/>allow · deny · ask"] --> barbican["barbican — enforce<br/>before the call reaches the server"]
end
subgraph detect["PR time · detect"]
ScopeTrail
PolicyMesh
CapabilityEcho
TaskBound
SessionTrail
end
detect --> GovVerdict["GovVerdict — consolidate<br/>one PR verdict"]
barbican -.->|audit findings| GovVerdict
AgentPulse["AgentPulse — observe<br/>live session trajectory"]
core["agent-gov-core — one Finding schema, shared by every tool above"]
classDef hl fill:#0c4a6e,stroke:#0369a1,color:#e0f2fe
class warden,barbican,GovVerdict,AgentPulse,core hl
Field-tested. Beyond the synthetic agent-gov-demo, I ran the whole stack against a real open-source background-agent coding platform I did not write — runtime MCP enforcement, credential-broker authorization, and a pre-PR capability gate — and fixed a cross-component bug it surfaced. Anonymized writeup: agent-gov-fieldtest.
Standalone tools for understanding codebases, reviewing risky changes, finding documentation drift, and verifying dependency provenance.
- repo-brief — orientation layer for unfamiliar repos: architecture map, key files, hotspots, run commands, and where to start.
- project-autopsy — evidence-backed autopsy reports for stale repositories, over a deterministic, CI-tested core; CLI plus a Next.js report UI.
- docs-debt-radar — scans repositories for stale, missing, and drifting documentation claims.
- overreach — Rust capability scanner for diffs, files, and repos: network calls, subprocesses, sensitive-file reads, `curl | sh`, disabled TLS, hardcoded secrets.
- tofulock — Go. Locks and verifies Terraform/OpenTofu module sources by commit digest.
- cpan-integ — Perl. Consumer-side, install-time artifact-hash verification for CPAN distributions. Experimental.
- timecal — cross-agent time-calibration corpus served over MCP, to counter the engineer-weeks estimation prior that agents inherit. On PyPI.
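Capability scanning on a diff, as overreach and CapabilityEcho are described above, can be sketched as a pass over the added lines of a unified diff. This is an illustration in Python, not the Rust implementation: the `SIGNALS` table and its regexes are hypothetical and far cruder than a real scanner would need.

```python
import re

# Hypothetical signal table; real detectors would be language-aware.
SIGNALS = {
    "network": re.compile(r"\b(requests\.(get|post)|urllib\.request|fetch\()"),
    "subprocess": re.compile(r"\b(subprocess\.|os\.system\(|child_process)"),
    "eval": re.compile(r"\b(eval|exec)\("),
    "pipe-to-shell": re.compile(r"curl[^|\n]*\|\s*(ba)?sh\b"),
}

HUNK_HEADER = re.compile(r"@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def scan_added_lines(diff_text):
    """Return (path, new_file_line, signal) for each signal found on an added line."""
    hits = []
    path = None
    new_line = 0
    for raw in diff_text.splitlines():
        # Simplification: content lines that themselves start with "+++ " or
        # "--- " would be misread as file headers.
        if raw.startswith("+++ "):
            path = raw[4:].removeprefix("b/")
            continue
        if raw.startswith("--- "):
            continue
        hunk = HUNK_HEADER.match(raw)
        if hunk:
            new_line = int(hunk.group(1))
            continue
        if raw.startswith("+"):
            for name, pattern in SIGNALS.items():
                if pattern.search(raw[1:]):
                    hits.append((path, new_line, name))
            new_line += 1
        elif raw.startswith(" "):
            new_line += 1
        # Removed lines ("-") do not advance the new-file line counter.
    return hits
```

Only lines beginning with `+` are matched, so a capability that already existed or was removed produces no hit, and each hit carries the line number in the new file so a report can point at the exact line.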
Conservative decision-support tools — not diagnosis, not treatment recommendation. Each one exposes its inputs, the rules that fired, confidence limits, and the raw evidence, so a human stays in the loop on every call.
- fit-ontology — trainer-facing client intelligence: unifies wearables, intake, and ACSM guidelines into a queryable model with explainable rules traceable back to the metric rows that fired them.
- recovery-trail — athlete-facing recovery briefing from an Apple Health export. Runs 100% client-side; shows HRV, RHR, sleep, load, ACSM-aligned verdicts, and rule traces. Live demo.
- nutrition-experiment-lab — personal n-of-1 nutrition experiment notebook with adherence tracking, confounder notes, confidence, and transparent next-test suggestions.
- injury-return-to-play-tracker — clinician- and coach-facing workflow for phase progress, functional-test evidence, workload tolerance, and human clearance decisions.
- academic-load-burnout-monitor — student workload planner with explainable pressure signals, check-ins, and recovery-aware next actions.
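The pattern these tools share, a verdict that ships with its inputs and the rules that fired, can be sketched as follows. The rule shape, the metric name, and the threshold are hypothetical and illustrative only. They are not ACSM values or clinical guidance, and none of this is taken from the tools' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DropRule:
    rule_id: str
    metric: str
    max_drop_pct: float  # illustrative threshold only, not a clinical or ACSM value


def evaluate(rules, today, baseline):
    """Return a conservative verdict plus a trace entry for every rule checked.

    Assumes baseline values are nonzero.
    """
    trace = []
    for rule in rules:
        if rule.metric not in today or rule.metric not in baseline:
            trace.append({"rule": rule.rule_id, "fired": False, "note": "missing input"})
            continue
        now, base = today[rule.metric], baseline[rule.metric]
        drop_pct = 100.0 * (base - now) / base
        trace.append({
            "rule": rule.rule_id,
            "fired": drop_pct > rule.max_drop_pct,
            "inputs": {"today": now, "baseline": base},
            "drop_pct": round(drop_pct, 1),
        })
    if any(t["fired"] for t in trace):
        verdict = "review"
    elif any(t.get("note") == "missing input" for t in trace):
        verdict = "insufficient data"
    else:
        verdict = "no flags"
    return verdict, trace
```

Missing data produces an explicit "insufficient data" verdict rather than a silent pass, and the trace records the raw inputs next to each rule so a person can check the arithmetic before acting on it.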
- Deterministic first — the important decisions are reproducible and inspectable, not model-dependent.
- Local-only when possible — tools run on your machine; no data leaves it unless you opt in.
- Evidence-backed reports — every verdict traces back to the inputs, rules, and lines that produced it.
- No LLM in governance decision paths unless explicitly opt-in.



