Conal Hickey Conalh

Conal (Connor) Hickey

I build deterministic, local-first tools for governing AI agents: policy engines, MCP runtime enforcement, PR-time drift scanners, transcript review, and evidence-backed reports. I also apply the same design style to repository analysis and conservative health/training decision-support tools.

What I build

AI-agent governance — policy decisions, runtime MCP enforcement, PR-time drift detection, transcript audits, and consolidated verdicts.
Repository & supply-chain analysis — codebase orientation, stale-repo autopsies, capability scanning, and dependency provenance.
Evidence-backed health/training workflows — conservative decision support that exposes inputs, rules, and evidence. Not diagnosis, not treatment.

Pasadena, CA · TypeScript · Rust · Python · React · Node · FastAPI
GitHub @Conalh · X @conalhck · dev.to/conalh · conal.hg@gmail.com
Open to AI-agent infrastructure/safety, developer-tooling, and full-stack roles.

Start here

Project	What it does
warden	Rust policy DSL engine that decides allow / deny / ask for agent actions. Zero-dependency core. Live playground.
barbican	MCP stdio proxy that enforces warden's verdicts before a tool call reaches the server.
CapabilityEcho	PR-time scanner for executable capability drift — new network, subprocess, eval, and lifecycle signals on the exact added lines.
project-autopsy	Evidence-backed autopsy reports for stale repos: findings, stall hypotheses, and revival tasks. Full-stack TypeScript.
recovery-trail	100% client-side recovery briefing from an Apple Health export, with ACSM-aligned verdicts and full rule traces. Live demo.
fit-ontology	Client intelligence layer for personal trainers — wearables, intake, and ACSM guidelines unified into one explainable, rules-based ontology. Live demo.

Agent governance stack

A local-first stack with one job per tool and one shared schema, so the pieces compose instead of overlap: decide → enforce → detect → consolidate → observe. There is no LLM in the decision path.
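The decide → enforce split can be illustrated with a minimal sketch. This is not warden's DSL or barbican's code: the `Rule` shape, the `decide` and `enforce` names, the glob matching, and the default-to-ask fallback are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from fnmatch import fnmatchcase
from typing import Callable, Optional


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape: the first matching rule wins."""
    tool_pattern: str  # glob over the tool name, e.g. "shell.*"
    verdict: Verdict


def decide(rules: list[Rule], tool: str, default: Verdict = Verdict.ASK) -> Verdict:
    """Deterministic decision: the same rules and tool name always give the same verdict."""
    for rule in rules:
        if fnmatchcase(tool, rule.tool_pattern):
            return rule.verdict
    return default


def enforce(rules: list[Rule], tool: str, call: Callable[[], object],
            approve: Optional[Callable[[str], bool]] = None) -> object:
    """Gate a tool call on the verdict; `call` runs only if the verdict permits it."""
    verdict = decide(rules, tool)
    if verdict is Verdict.DENY:
        raise PermissionError(f"denied: {tool}")
    if verdict is Verdict.ASK and not (approve and approve(tool)):
        raise PermissionError(f"not approved: {tool}")
    return call()


# Example policy (assumed tool names).
RULES = [
    Rule("fs.read", Verdict.ALLOW),
    Rule("shell.*", Verdict.DENY),
    Rule("net.*", Verdict.ASK),
]
```

Keeping `decide` free of side effects is what makes the verdict reproducible and testable on its own; only `enforce` touches the call.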

Full walkthrough: AGENT_GOVERNANCE_STACK.md — the whole suite end to end, with a diagram, a failure-mode map, and an adoption path.

Tool	Role
warden	decide — allow / deny / ask policy engine (Rust, zero-dependency).
barbican	enforce — binds verdicts on the MCP wire before the call lands.
ScopeTrail	config diff — what changed in agent config files.
PolicyMesh	current policy contradictions — across MCP, Claude, Cursor, VS Code, Codex, Aider.
CapabilityEcho	executable capability drift — new network / subprocess / eval / lifecycle signals on added lines.
TaskBound	task-vs-diff scope creep — stated task compared to the actual change.
SessionTrail	runtime transcript audit — Cursor / Claude Code / Codex sessions for risky behavior.
GovVerdict	merge/dedupe verdicts — one consolidated PR result from the detector suite.
AgentPulse	live trajectory observation — converging, exploring, stuck, drifting, done, idle.
agent-gov-core	shared schema/parsers — canonical Finding schema and JSONC/TOML/MCP/shell/transcript parsers.
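As a rough sketch of how one shared schema lets a consolidator merge detector output: the `Finding` fields, the severity scale, and the dedupe key below are hypothetical, not the actual agent-gov-core schema or GovVerdict logic.

```python
from dataclasses import dataclass

# Assumed severity scale; the real schema may differ.
SEVERITY_ORDER = {"info": 0, "warn": 1, "block": 2}


@dataclass(frozen=True)
class Finding:
    tool: str      # detector that produced the finding
    rule_id: str
    path: str
    line: int
    severity: str  # one of SEVERITY_ORDER's keys


def consolidate(findings: list[Finding]) -> tuple[str, list[Finding]]:
    """Dedupe findings on (rule_id, path, line) and return the worst severity."""
    unique: dict[tuple[str, str, int], Finding] = {}
    for f in findings:
        key = (f.rule_id, f.path, f.line)
        kept = unique.get(key)
        # Keep the more severe copy when two detectors report the same location.
        if kept is None or SEVERITY_ORDER[f.severity] > SEVERITY_ORDER[kept.severity]:
            unique[key] = f
    # Sort so the consolidated output is stable across runs.
    merged = sorted(unique.values(), key=lambda f: (f.path, f.line, f.rule_id))
    verdict = max((f.severity for f in merged),
                  key=SEVERITY_ORDER.__getitem__, default="info")
    return verdict, merged
```

Because every detector emits the same record type, the consolidator needs no per-tool adapters; it only compares keys and severities.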

Architecture diagram

flowchart TB
    subgraph runtime["Runtime · on the MCP wire"]
      direction LR
      warden["warden — decide<br/>allow · deny · ask"] --> barbican["barbican — enforce<br/>before the call reaches the server"]
    end
    subgraph detect["PR time · detect"]
      ScopeTrail
      PolicyMesh
      CapabilityEcho
      TaskBound
      SessionTrail
    end
    detect --> GovVerdict["GovVerdict — consolidate<br/>one PR verdict"]
    barbican -.->|audit findings| GovVerdict
    AgentPulse["AgentPulse — observe<br/>live session trajectory"]
    core["agent-gov-core — one Finding schema, shared by every tool above"]

    classDef hl fill:#0c4a6e,stroke:#0369a1,color:#e0f2fe
    class warden,barbican,GovVerdict,AgentPulse,core hl

Field-tested. Beyond the synthetic agent-gov-demo, I ran the whole stack against a real open-source background-agent coding platform I did not write — runtime MCP enforcement, credential-broker authorization, and a pre-PR capability gate — and fixed a cross-component bug it surfaced. Anonymized writeup: agent-gov-fieldtest.

Repository-analysis tools

Standalone tools for understanding codebases, reviewing risky changes, finding documentation drift, and verifying dependency provenance.

repo-brief — orientation layer for unfamiliar repos: architecture map, key files, hotspots, run commands, and where to start.
project-autopsy — evidence-backed autopsy reports for stale repositories, over a deterministic, CI-tested core; CLI plus a Next.js report UI.
docs-debt-radar — scans repositories for stale, missing, and drifting documentation claims.
overreach — Rust capability scanner for diffs, files, and repos: network calls, subprocesses, sensitive-file reads, curl | sh, disabled TLS, hardcoded secrets.
tofulock — Go. Locks and verifies Terraform/OpenTofu module sources by commit digest.
cpan-integ — Perl. Consumer-side, install-time artifact-hash verification for CPAN distributions. Experimental.
timecal — cross-agent time-calibration corpus served over MCP, countering the prior, inherited by agents, that tasks take engineer-weeks. On PyPI.
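A minimal sketch of scanning only the added lines of a unified diff for capability signals, the general idea behind a PR-time capability scanner. The regex patterns, signal names, and `scan_added_lines` function are illustrative assumptions, not the rule set of CapabilityEcho or overreach, and a real scanner would need language-aware parsing to keep false positives down.

```python
import re

# Illustrative patterns only; a real rule set would be far larger.
SIGNALS = {
    "network": re.compile(r"\b(requests\.(get|post)|urllib\.request|fetch\()"),
    "subprocess": re.compile(r"\b(subprocess\.|os\.system\(|child_process)"),
    "eval": re.compile(r"\b(eval|exec)\("),
    "pipe-to-shell": re.compile(r"curl[^|\n]*\|\s*(ba)?sh\b"),
}

# Hunk header, e.g. "@@ -1,3 +1,5 @@"; group 1 is the first new-file line number.
HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def scan_added_lines(diff: str) -> list[tuple[str, int, str]]:
    """Return (path, new_line_number, signal) for each signal found on an added line."""
    hits = []
    path, lineno = None, 0
    for raw in diff.splitlines():
        if raw.startswith("+++ "):
            # File header; a sketch-level shortcut that ignores added lines
            # whose own content happens to begin with "++ ".
            path = raw[4:].removeprefix("b/")
        elif (m := HUNK.match(raw)):
            lineno = int(m.group(1))
        elif raw.startswith("+"):
            for name, pattern in SIGNALS.items():
                if pattern.search(raw[1:]):
                    hits.append((path, lineno, name))
            lineno += 1
        elif raw.startswith(" "):
            lineno += 1  # context lines advance the new-file counter
        # Removed lines ("-") do not exist in the new file: no scan, no increment.
    return hits
```

Restricting the scan to added lines means a pre-existing `eval` elsewhere in the file does not get blamed on the current change.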

Health/training decision-support

Conservative decision-support tools — not diagnosis, not treatment recommendation. Each one exposes its inputs, the rules that fired, confidence limits, and the raw evidence, so a human stays in the loop on every call.

fit-ontology — trainer-facing client intelligence: unifies wearables, intake, and ACSM guidelines into a queryable model with explainable rules traceable back to the metric rows that fired them.
recovery-trail — athlete-facing recovery briefing from an Apple Health export. Runs 100% client-side; shows HRV, RHR, sleep, load, ACSM-aligned verdicts, and rule traces. Live demo.
nutrition-experiment-lab — personal n-of-1 nutrition experiment notebook with adherence tracking, confounder notes, confidence, and transparent next-test suggestions.
injury-return-to-play-tracker — clinician- and coach-facing workflow for phase progress, functional-test evidence, workload tolerance, and human clearance decisions.
academic-load-burnout-monitor — student workload planner with explainable pressure signals, check-ins, and recovery-aware next actions.
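To illustrate what exposing inputs, fired rules, and raw evidence can look like, here is a minimal sketch of a rule evaluator that returns a trace alongside every verdict. The metric names, thresholds, and verdict labels are placeholders invented for the example; they are not ACSM values and not the rules used by fit-ontology or recovery-trail.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    rule_id: str
    metric: str                     # key into the metrics dict
    fires: Callable[[float], bool]  # predicate over the metric value
    note: str


# Placeholder thresholds for illustration only; not clinical guidance.
RULES = [
    Rule("sleep-short", "sleep_hours", lambda v: v < 6.0, "sleep below placeholder floor"),
    Rule("rhr-elevated", "rhr_delta_bpm", lambda v: v > 5.0, "resting HR above baseline"),
]


def evaluate(metrics: dict[str, float]) -> dict:
    """Return a verdict plus the exact rules and input values that produced it."""
    fired = []
    missing = []
    for rule in RULES:
        if rule.metric not in metrics:
            missing.append(rule.metric)  # surface gaps instead of guessing
            continue
        value = metrics[rule.metric]
        if rule.fires(value):
            fired.append({"rule": rule.rule_id, "metric": rule.metric,
                          "value": value, "note": rule.note})
    if missing:
        verdict = "insufficient-data"
    elif fired:
        verdict = "review"
    else:
        verdict = "no-flags"
    return {"verdict": verdict, "fired": fired, "missing": missing}
```

Returning the fired rules and the values they saw, rather than a bare label, is what lets a human check the call before acting on it.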

Working style

Deterministic first — the important decisions are reproducible and inspectable, not model-dependent.
Local-only when possible — tools run on your machine; no data leaves it unless you opt in.
Evidence-backed reports — every verdict traces back to the inputs, rules, and lines that produced it.
No LLM in governance decision paths unless explicitly opt-in.
