
Commit 8cfb28a

Feat user skills (#139)
* fix: marked v15 + marked-terminal v7 incompat in markdown-renderer

  marked v15's use() iterates 'for (prop in pack.renderer)' and validates every enumerable key against its known renderer method list, throwing "renderer 'o' does not exist" at module init. The legacy 'new TerminalRenderer(opts)' route assigns config to own enumerable properties (this.o, this.tab, ...), so the first iteration hits an unknown key and crashes. This broke the agent on every 'just start' since PR #135 landed; CI never noticed because no test imports the module.

  Switch to the modern markedTerminal() factory which returns a clean MarkedExtension containing only renderer method keys, and add a regression test that import-loads the module and smoke-tests rendering so a future bump can't reintroduce this class of crash.

  Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

* feat: user-generated skills from session learnings

  Lets users persist what the agent learned in a session as a reusable skill at ~/.hyperagent/skills/<name>/SKILL.md, surviving upgrades and overriding system skills with the same name.

  Triggered via:
  - /save-skill [name] - slash command that builds a synthetic prompt from session context (tool history, MCP servers, modules registered, recent errors) and asks the LLM to call generate_skill()
  - 'save this as a skill' (natural language) - system message documents the generate_skill tool so the LLM can call it directly

  Components added:
  - src/agent/skill-writer.ts: validation + CRUD for user skills, with HYPERAGENT_USER_SKILLS_DIR env override for tests
  - src/agent/session-context.ts: pure extractor that rolls up tool history, MCP servers, modules registered, and recent errors into a prompt-ready string
  - generate_skill tool: registered in all three gating points (tools[], ALLOWED_TOOLS, availableTools[]) with interactive approval
  - /skills enhanced with 'info <name>', 'edit <name>', 'delete <name>', override-detection badge for user skills
  - skill-loader now supports loading from multiple directories with override semantics (later dirs win)
  - state.ts tracks toolCallHistory (capped FIFO), mcpServersUsed, modulesRegistered, pendingPrompt; populated by onPostToolUse hook and registerModuleImpl
  - system-message.ts documents the saving workflow for the LLM
  - docs/SKILLS.md adds 'User Skills (Persist What You Learn)' section

  Tests: 39 new (skill-writer 22, session-context 9, skill-loader +8). All 2443 TS tests pass; 124 Rust tests pass; lint clean.

  Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

* docs: add hand-off test plan for user-generated skills

  Standalone walkthrough at docs/TESTING-USER-SKILLS.md covering smoke test, full workout, override behaviour, boundary cases, and likely failure modes. Intended to be passed to reviewers / testers who want to exercise the feature without reading the implementation.

  Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

* fix: address PR #139 review feedback (18 issues)

  Security & correctness
  - skill-writer: cap on UTF-8 byte length (not String.length) so a multi-byte payload can't bypass the 64 KB limit
  - skill-writer: reject reserved /skills subcommand names (info, edit, delete, list) to prevent shadowing the CLI surface
  - skill-writer: reject description/triggers containing newlines or a bare '---' line so they can't break out of YAML frontmatter
  - slash-commands /skills info|edit|delete: validate <name> via validateSkillName before any filesystem join — closes the path traversal vector pointed out by the reviewer

  UX correctness
  - index.ts generate_skill: surface an 'Overwrite existing user skill?' confirmation when overwrite=true and the file already exists
  - slash-commands /save-skill: pass skipAutoSuggest=true so the synthetic prompt's scaffolding terms don't trigger unrelated skills via runSuggestApproach
  - slash-commands /new: also reset currentUserPrompt + lastGuidance
  - slash-commands /resume: reset toolCallHistory, mcpServersUsed, modulesRegistered, currentUserPrompt, lastGuidance — local session-learning state can't be reconstructed from a resumed remote session
  - slash-commands /save-skill: fix 'distinct tools' status line to count the full tool history, not the bounded topTools view
  - session-context: truncate currentUserPrompt to 2000 chars with an ellipsis so a giant paste can't dominate the prompt

  MCP session-learning correctness
  - mcp/plugin-adapter: add optional onCall observer; agent wires it to state.mcpServersUsed so calls made from inside execute_javascript via host:mcp-<name> imports are now tracked
  - state.ts: add skipNextAutoSuggest flag (consumed in onUserPromptSubmitted)

  Documentation
  - docs/TESTING-USER-SKILLS.md: drop branch-name reference, switch override example from non-existent 'code-review' to bundled 'kql-expert', clarify '/skills edit' prints a path (no $EDITOR), describe the now-correct overwrite confirmation flow, note that the override badge surfaces in '/skills' list view, fix approval prompt wording (summary, not full content)

  Tests
  - Reserved-name rejection
  - YAML-unsafe newline rejection (description + trigger)
  - UTF-8 byte-length cap (32 KB of 4-byte chars)
  - User-prompt truncation contract

  Quality gate: 2448 TS tests pass (+5), 124 Rust tests pass.

  Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

---------

Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>
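Two of the "Security & correctness" items in the commit message (the UTF-8 byte cap and the frontmatter newline check) are easy to get subtly wrong, so here is a minimal TypeScript sketch of the idea. It is not the code in `src/agent/skill-writer.ts`, which this page does not show: `MAX_SKILL_BYTES`, `utf8ByteLength`, `fitsSizeCap`, and `isFrontmatterSafe` are hypothetical names, and only the 64 KB figure and the rejection rules come from the commit message.

```typescript
// Hypothetical names throughout; only the 64 KB limit and the two
// rejection rules are taken from the commit message.
const MAX_SKILL_BYTES = 64 * 1024;

/** Size of `text` once encoded as UTF-8, i.e. the size that lands on disk. */
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

/** True when the content fits the cap measured in bytes, not UTF-16 code units. */
function fitsSizeCap(content: string): boolean {
  return utf8ByteLength(content) <= MAX_SKILL_BYTES;
}

/** True when a one-line frontmatter value cannot break out of the YAML block. */
function isFrontmatterSafe(value: string): boolean {
  // A newline would let the value continue onto a second line, which could
  // itself be a bare '---' terminator; a value that is only '---' is also refused.
  return !/[\r\n]/.test(value) && value.trim() !== "---";
}

// Each emoji is 2 UTF-16 code units but 4 bytes in UTF-8, so a check based
// on String.length would undercount this payload by half.
const payload = "😀".repeat(20000);
console.log({
  codeUnits: payload.length,
  utf8Bytes: utf8ByteLength(payload),
  fitsCap: fitsSizeCap(payload),
  safeDescription: isFrontmatterSafe("Fetch a page title"),
  unsafeDescription: isFrontmatterSafe("line one\n---\nname: other"),
});
```

The payload is under 64 K when measured with `String.length` but over the cap when measured in bytes, which is the bypass the review feedback describes.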
1 parent 2e17ac2 commit 8cfb28a

18 files changed

Lines changed: 2288 additions & 48 deletions

docs/SKILLS.md

Lines changed: 59 additions & 0 deletions
@@ -304,6 +304,65 @@ They work well together:
 hyperagent --skill web-scraper --profile web-research
 ```
 
+## User Skills (Persist What You Learn)
+
+Hyperagent ships with a curated set of *system skills* (the table above) — and
+also lets you grow your own library of *user skills* during real work. When you
+and the agent have just spent ten minutes figuring out how to do something
+non-obvious, save the lesson so the next session starts where you left off.
+
+### Saving a Skill
+
+In any REPL session, run:
+
+```
+/save-skill                          # let the LLM pick a name
+/save-skill teams-transcript-finder
+```
+
+What happens:
+
+1. Hyperagent collects a structured summary of the session — tool calls,
+   MCP servers used, modules registered, recent errors.
+2. That summary plus instructions is sent to the LLM as a synthetic user
+   turn.
+3. The LLM calls the `generate_skill` tool with a proposed SKILL.md (and
+   optionally a companion module). You approve before anything is written.
+4. The skill is persisted to `~/.hyperagent/skills/<name>/SKILL.md`.
+
+User skills are loaded automatically on every startup. If a user skill has
+the same name as a system skill, the user version wins (overrides are
+flagged with `👤 (overrides built-in)` in `/skills`).
+
+### Managing User Skills
+
+```
+/skills                 # list system + user skills (👤 = user)
+/skills info <name>     # show the SKILL.md contents
+/skills edit <name>     # print the path for your $EDITOR
+/skills delete <name>   # remove a user skill (system ones are immutable)
+```
+
+User skills live in `~/.hyperagent/skills/<name>/SKILL.md` and follow the
+same format as system skills documented above. Edit them in your editor of
+choice — changes apply on the next `/suggest_approach` invocation.
+
+### When to Save vs Not Save
+
+Save a skill when:
+
+- The workflow took non-trivial effort to figure out and is likely to recur.
+- The lesson would be lost between sessions (modules + skill capture it
+  together).
+- A few well-chosen triggers will reliably match future prompts on the same
+  topic.
+
+Don't save a skill when:
+
+- The task was one-off (no expected recurrence).
+- The lesson is generic ("use the right tool"). Skills are for *specific*
+  domain knowledge.
+
 ## See Also
 
 - [PATTERNS.md](PATTERNS.md) - Code generation patterns

docs/TESTING-USER-SKILLS.md

Lines changed: 201 additions & 0 deletions
@@ -0,0 +1,201 @@
+# Testing the User-Generated Skills Feature
+
+A walkthrough for verifying the **user skills** feature end-to-end. The
+feature lets a user persist what HyperAgent learned in a session as a
+reusable skill at `~/.hyperagent/skills/<name>/SKILL.md`, surviving
+upgrades and overriding system skills with the same name.
+
+---
+
+## Prerequisites
+
+- A working HyperAgent checkout
+- `just setup` already run (Rust addons built, deps installed) — see the
+  project [README](../README.md) and [DEVELOPMENT.md](DEVELOPMENT.md)
+- A terminal where `just start` launches the agent successfully
+- A working GitHub Copilot login for the agent's LLM calls
+
+---
+
+## 1. Smoke Test (~2 minutes)
+
+This is the minimum bar — if this works, the feature is wired up
+end-to-end.
+
+```bash
+# Use a throwaway skills dir so you don't pollute ~/.hyperagent/skills/
+export HYPERAGENT_USER_SKILLS_DIR=/tmp/ha-skills-test
+mkdir -p "$HYPERAGENT_USER_SKILLS_DIR"
+
+just start
+```
+
+In the agent REPL:
+
+```text
+> /skills
+```
+
+Confirms baseline — only **system** skills should appear, none with the
+👤 (user) badge.
+
+Now do some work the agent will remember:
+
+```text
+> use the fetch plugin to grab https://example.com and tell me the title
+```
+
+Let it run to completion. Then ask the agent to save what it learned:
+
+```text
+> /save-skill fetch-page-title
+```
+
+**Expected behaviour:**
+
+1. The agent receives a synthetic prompt summarising the session
+   context (tools used, MCP servers, modules registered, recent errors)
+2. The LLM calls the `generate_skill(...)` tool
+3. You see an interactive approval prompt showing a **summary** — the
+   skill name, the one-line description, a preview of the first few
+   triggers, the allowed-tools list, and a byte count for the guidance
+   body. (The full content is *not* echoed to stdout.)
+4. Hit `y` to approve
+
+Verify the file landed on disk:
+
+```bash
+cat /tmp/ha-skills-test/fetch-page-title/SKILL.md
+```
+
+You should see a valid SKILL.md with YAML frontmatter (`name`,
+`description`, `triggers`, etc.) and a markdown guidance body.
+
+If that file exists, **the feature works.** 🎉
+
+---
+
+## 2. Full Workout
+
+Exercise every command path. From a fresh `just start`:
+
+```text
+> /skills                        # list both system + user skills
+> /skills info kql-expert        # show full detail for a bundled system skill
+> /save-skill                    # no name → LLM picks one
+> /skills                        # user skill now shows with 👤
+> /skills info fetch-page-title  # user skill detail
+> /skills edit fetch-page-title  # prints the user-skill path; open it in your editor
+> exit
+```
+
+> `/skills edit <name>` does **not** spawn `$EDITOR`. It just prints
+> the absolute path to the user-skill `SKILL.md` so you can open it
+> in your own editor of choice. Save the file, then restart (or run
+> `/suggest_approach`) and the change takes effect.
+
+Then restart the agent and repeat the original task — the matching
+`/suggest_approach` should surface the saved skill via its triggers.
+
+---
+
+## 3. Override Test
+
+User skills must override system skills with the same name. Drop a user
+skill that shadows an existing system one (pick any skill that `ls
+skills/` shows — here we use `kql-expert`):
+
+```bash
+mkdir -p "$HYPERAGENT_USER_SKILLS_DIR/kql-expert"
+cat > "$HYPERAGENT_USER_SKILLS_DIR/kql-expert/SKILL.md" << 'EOF'
+---
+name: kql-expert
+description: My customised KQL skill
+triggers: [kql, kusto, query]
+allowed-tools: [execute_javascript]
+---
+This overrides the system version.
+EOF
+
+just start
+```
+
+In the REPL:
+
+```text
+> /skills
+```
+
+**Expected:** the `kql-expert` row appears with the **`👤 (overrides
+built-in)`** badge in the list view. Running `/skills info kql-expert`
+then shows the **user** description ("My customised KQL skill").
+
+---
+
+## 4. Negative / Boundary Tests
+
+Validation should reject bad input cleanly without crashing the agent:
+
+| Input | Expected outcome |
+|-------|------------------|
+| `/save-skill BadName` | Rejected — not kebab-case |
+| `/save-skill ../escape` | Rejected — path traversal |
+| `/save-skill thisnameisreallylongandshouldfailitsbeyondsixtyfourcharactersnowforsure` | Rejected — exceeds 64 chars |
+| `/save-skill info` | Rejected — reserved subcommand name |
+| `/save-skill fetch-page-title` (second time, fresh session) | `generate_skill` first errors with "already exists — set overwrite=true"; the LLM retries with `overwrite=true`, and you get an **"Overwrite existing user skill?"** confirmation before the file is replaced |
+
+---
+
+## 5. Cleanup
+
+```bash
+rm -rf /tmp/ha-skills-test
+unset HYPERAGENT_USER_SKILLS_DIR
+```
+
+---
+
+## Verification Checklist
+
+| Symptom | Confirms |
+|---------|----------|
+| `generate_skill` appears in the tool log after `/save-skill` | LLM picked up the system-message guidance ✅ |
+| Approval prompt shows a skill preview | Tool handler validation working ✅ |
+| `.md` file lands on disk under `$HYPERAGENT_USER_SKILLS_DIR` | `writeUserSkill()` working ✅ |
+| `/skills` shows the 👤 badge for the new skill | Multi-dir loader + `source` field working ✅ |
+| `/skills` shows `👤 (overrides built-in)` for shadowed system skills | Name-collision detection working ✅ |
+| Restarting the agent matches the skill on similar prompts | `loadSkillsFromDirs` + boot wiring working ✅ |
+
+---
+
+## Likely Failure Modes & Where to Look
+
+- **`/save-skill` runs but the LLM never calls `generate_skill`** — the
+  synthetic prompt from `submitToLLM` may be too weak. See
+  [src/agent/slash-commands.ts](../src/agent/slash-commands.ts) (the
+  `/save-skill` handler) and
+  [src/agent/system-message.ts](../src/agent/system-message.ts)
+  ("SAVING WHAT YOU LEARN" section).
+- **Tool not allowed** — every new tool needs registration at THREE
+  points: `tools[]` array, `ALLOWED_TOOLS` in
+  [src/agent/tool-gating.ts](../src/agent/tool-gating.ts), and
+  `availableTools[]` in the session config. Triple-check.
+- **File written but `/skills` doesn't list it**
+  `loadSkillsFromDirs()` in
+  [src/agent/skill-loader.ts](../src/agent/skill-loader.ts) may not be
+  reading the user dir. Verify `skillDirectories` in
+  [src/agent/index.ts](../src/agent/index.ts) includes
+  `getUserSkillsDir()`.
+
+---
+
+## Reporting Results
+
+If something doesn't work, please capture:
+
+1. The full agent REPL transcript
+2. Contents of `$HYPERAGENT_USER_SKILLS_DIR` after the test (`ls -laR`)
+3. The agent's debug log (`~/.hyperagent/logs/debug-*.log`)
+4. The output of `just check` from the same checkout
+
+…and share with the implementer. Good hunting. 🎯
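The boundary cases in section 4 of the test plan hint at what name validation has to cover. The sketch below is one way those rules could be expressed in TypeScript. It is an illustration only: the real `validateSkillName` is not shown on this page, so `checkSkillName`, the regular expression, and the result type are assumptions, with only the 64-character limit, the kebab-case requirement, and the reserved names (info, edit, delete, list) taken from the commit message and test plan.

```typescript
// Assumed rules inferred from the boundary table; the project's own
// validateSkillName may be structured differently.
const MAX_NAME_LENGTH = 64;
const RESERVED_NAMES = new Set(["info", "edit", "delete", "list"]);
// Lowercase alphanumeric words joined by single hyphens. Because '/', '.'
// and '\' never match, traversal attempts such as '../escape' fail here too.
const KEBAB_CASE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

type NameCheck = { ok: true } | { ok: false; reason: string };

function checkSkillName(name: string): NameCheck {
  if (name.length > MAX_NAME_LENGTH) {
    return { ok: false, reason: "exceeds 64 chars" };
  }
  if (!KEBAB_CASE.test(name)) {
    return { ok: false, reason: "not kebab-case" };
  }
  if (RESERVED_NAMES.has(name)) {
    return { ok: false, reason: "reserved subcommand name" };
  }
  return { ok: true };
}

// The inputs from the boundary table, plus one valid name.
const candidates = [
  "fetch-page-title",
  "BadName",
  "../escape",
  "thisnameisreallylongandshouldfailitsbeyondsixtyfourcharactersnowforsure",
  "info",
];
for (const candidate of candidates) {
  console.log(candidate, checkSkillName(candidate));
}
```

In this sketch the path-traversal case is rejected by the kebab-case pattern rather than by a dedicated check. Running such a check before any filesystem join matches the approach the commit message describes for `/skills info|edit|delete`.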

src/agent/approach-resolver.ts

Lines changed: 9 additions & 4 deletions
@@ -8,7 +8,7 @@
 import type { Skill } from "./skill-loader.js";
 import type { Pattern } from "./pattern-loader.js";
 import { matchIntent } from "./intent-matcher.js";
-import { loadSkills } from "./skill-loader.js";
+import { loadSkills, loadSkillsFromDirs } from "./skill-loader.js";
 import { loadPatterns } from "./pattern-loader.js";
 import { loadModule, type ModuleHints } from "./module-store.js";
 
@@ -323,20 +323,25 @@ export function formatGuidance(guidance: MaterialisedGuidance): string {
  *
  * @param prompt - The user's prompt text
  * @param preLoadedSkills - Pre-loaded skill names (from --skill flag)
- * @param skillsDir - Path to skills/ directory
+ * @param skillsDir - Path to skills directory(ies). A single string loads
+ * only system skills; pass an array of `{ dir, source }` records to load
+ * user skills alongside system skills (user wins on name collision).
  * @param patternsDir - Path to patterns/ directory
  * @param debugLog - Optional debug logger
  */
 export function runSuggestApproach(
   prompt: string,
   preLoadedSkills: string[],
-  skillsDir: string,
+  skillsDir: string | Array<{ dir: string; source: "system" | "user" }>,
   patternsDir: string,
   debugLog?: (msg: string) => void,
 ): SuggestApproachResult {
   const log = debugLog ?? (() => {});
 
-  const skills = loadSkills(skillsDir);
+  const skills =
+    typeof skillsDir === "string"
+      ? loadSkills(skillsDir)
+      : loadSkillsFromDirs(skillsDir);
   const patterns = loadPatterns(patternsDir);
 
   let matchedSkillNames: string[];
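The body of `loadSkillsFromDirs` is not part of this page, so the following is only a sketch of the "later dirs win" override semantics that the commit message and the JSDoc above describe. `mergeSkillDirs`, the `DirReader` callback (a stand-in for reading SKILL.md files from disk), and the `overridesBuiltIn` field are hypothetical; only the `{ dir, source }` record shape comes from the diff.

```typescript
type SkillSource = "system" | "user";

interface SkillDir {
  dir: string;
  source: SkillSource;
}

interface LoadedSkill {
  name: string;
  description: string;
  source: SkillSource;
  overridesBuiltIn: boolean; // hypothetical flag backing the override badge
}

// Hypothetical stand-in for parsing the SKILL.md files found in a directory.
type DirReader = (dir: string) => Array<{ name: string; description: string }>;

function mergeSkillDirs(dirs: SkillDir[], readDir: DirReader): LoadedSkill[] {
  const byName = new Map<string, LoadedSkill>();
  // Directories are processed in order, so a later entry replaces an
  // earlier one that has the same skill name.
  for (const { dir, source } of dirs) {
    for (const skill of readDir(dir)) {
      const previous = byName.get(skill.name);
      byName.set(skill.name, {
        name: skill.name,
        description: skill.description,
        source,
        overridesBuiltIn: source === "user" && previous?.source === "system",
      });
    }
  }
  return Array.from(byName.values());
}

// In-memory "disk" so the sketch runs without touching the filesystem.
const fakeDisk: Record<string, Array<{ name: string; description: string }>> = {
  "skills": [
    { name: "kql-expert", description: "Bundled KQL skill" },
    { name: "pptx-expert", description: "Bundled PPTX skill" },
  ],
  "~/.hyperagent/skills": [
    { name: "kql-expert", description: "My customised KQL skill" },
  ],
};

const merged = mergeSkillDirs(
  [
    { dir: "skills", source: "system" },
    { dir: "~/.hyperagent/skills", source: "user" },
  ],
  (dir) => fakeDisk[dir] ?? [],
);
console.log(merged);
```

Keying the map by skill name keeps one entry per name, and listing the user directory last gives the user version precedence, which is the behaviour the override test in the test plan exercises.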

src/agent/commands.ts

Lines changed: 18 additions & 2 deletions
@@ -488,12 +488,28 @@ const COMMANDS: readonly CommandEntry[] = Object.freeze([
     help: "List and invoke available skills",
     group: "General",
     detail:
-      "/skills — list all available skills\n" +
+      "/skills — list system + user skills (user-authored marked 👤)\n" +
       "/skills <name> — invoke a skill (injects domain expertise)\n" +
+      "/skills info <name> — show the full SKILL.md\n" +
+      "/skills edit <name> — print path of user skill for $EDITOR\n" +
+      "/skills delete <name> — remove a user skill (system ones are immutable)\n" +
       "Skills are SKILL.md files in the skills/ directory.\n" +
-      "Invoke a skill to get specialised instructions for a task.\n" +
+      "User skills live in ~/.hyperagent/skills/ and override system ones.\n" +
       "Example: /skills pptx-expert — expert at building PPTX presentations.",
   },
+  {
+    completion: "/save-skill",
+    help: "Save session learnings as a reusable skill",
+    group: "General",
+    detail:
+      "/save-skill — capture what we learned this session as a SKILL.md\n" +
+      "/save-skill <name> — same, but suggest a name to the LLM\n" +
+      "Sends a structured summary of the session's tool activity, MCP\n" +
+      "servers used, modules registered, and errors hit to the LLM,\n" +
+      "which then calls the generate_skill tool to write SKILL.md to\n" +
+      "~/.hyperagent/skills/<name>/. The skill is loaded on next start\n" +
+      "and triggered automatically by /suggest_approach.",
+  },
   {
     completion: "/help",
     help: "Show this help (or /help <topic> for details)",
