Skip to content

Epic: Project-specific harness toolkit for large-codebase agent work #35

@sungjunlee

Description

@sungjunlee

Context

Anthropic's large-codebase Claude Code guidance frames successful agent adoption as a project-specific harness problem: lean root context, local conventions, on-demand skills, hooks/scripts, plugins, MCPs, LSP, and subagents where they actually earn their keep.

This likely belongs near CraftKit because the job is not dev-backlog-specific. It is about helping a project design its own harness: what belongs in CLAUDE.md / AGENTS.md, what belongs in repo-local skills, what should become hooks/scripts, and when the package deserves plugin shape.

Initial craft-skill-spec judgment

  • Artifact class: start as a skill suite design, not a new plugin immediately.
  • Why not smaller: one single skill would overload context mapping, skill authoring, hook/script selection, and reassess cadence into one muddy trigger.
  • Escalation note: promote to plugin only if the toolkit needs bundled hooks, MCP config, installable repo templates, or subagents as first-class surfaces.

Artifact thesis

  • Job: help a project design and maintain a project-specific agent harness for large-codebase work.
  • Non-goals: replacing dev-backlog, relay, or CraftKit's existing skill-design workflows; auto-editing project specs without a human gate.
  • Target users/agents: maintainers building agent workflows for real codebases, especially multi-skill or multi-repo systems.
  • Success condition: the output tells a maintainer exactly which artifacts to add, which to avoid, and how to evaluate whether the harness improves agent work.

Scope Candidates

  • Context map guidance: what belongs in root CLAUDE.md / AGENTS.md vs subdirectory files vs skills.
  • Project-local skill authoring: when a convention deserves a skill instead of more root prose.
  • Hook/script guidance: deterministic checks, advisory reflections, and no silent self-editing.
  • Periodic harness reassess: model/tool release and 3-6 month review cadence.
  • Eval plan: prompts that prove the harness helps navigation, planning, review, and handoff.

Non-goals

  • No live-source browsing on every run.
  • No provider-specific core unless isolated in references.
  • No new repo until the skill suite vs plugin boundary is proven.
  • No automatic project rewrites.

Acceptance Criteria

  • Child issues cover radar/source update, artifact-class spec, and MVP file/package shape.
  • The design applies CraftKit radar: concise spine, progressive disclosure, explicit evals, calibrated autonomy.
  • The first implementation path starts smaller than a plugin unless hooks/MCP/subagents are required.

Metadata

Metadata

Assignees

No one assigned

    Labels

    area: opsOperational safety and eval workflowarea: skillsSkill authoring assetscraft-reviewWork derived from CraftKit skill reviewpriority: mediumMedium prioritytype: epicEpic tracking issue

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions