Merged
16 commits
5 changes: 4 additions & 1 deletion .gitignore
@@ -7,4 +7,7 @@ rustfox.db*
.worktrees/

# Playwright config and cache
-.playwright/
+.playwright/
+
+# Opencode
+.opencode/package-lock.json
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -191,7 +191,7 @@ tags: [tag1, tag2] # optional: for organization
2. The skill is auto-loaded at startup; no code changes needed
3. Configure the skills directory in `config.toml`: `[skills] directory = "skills"`

-All skills are represented in the system prompt by **metadata only** (name + description). **Instruction skills** (no `model` in frontmatter) have their full content loaded by the agent via `read_skill_file(skill_name="...", relative_path="SKILL.md")` when relevant. **Subagent skills** (`model` set) are invoked via `invoke_subagent(skill="name", prompt="...")`. The orchestration skill teaches the agent when to call which subagent and when to override the model (e.g. `model="anthropic/claude-sonnet-4-6"` for thread-writer-hk).
+All skills are represented in the system prompt by **metadata only** (name + description). **Instruction skills** (no `model` in frontmatter) have their full content loaded by the agent via `read_skill_file(skill_name="...", relative_path="SKILL.md")` when relevant. **Subagent skills** (`model` set) are invoked via `invoke_agent(agent="name", prompt="...")`. The orchestration skill teaches the agent when to call which subagent and when to override the model (e.g. `model="anthropic/claude-sonnet-4-6"` for thread-writer-hk).

**Subagent tool whitelist:** For subagent skills, the frontmatter `tools:` list must use the **exact** tool names as seen by the agent. MCP tools are named `mcp_{server_name}_{tool_name}` (e.g. `mcp_google-workspace_query_gmail_emails`). These names are logged at startup when MCP servers connect (`MCP server 'X' provides N tools`). A mismatch (e.g. declaring `search_gmail_messages` when the server exposes `query_gmail_emails`) causes the subagent to have no access to that tool.
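
The naming rule and the whitelist mismatch described above can be sketched as follows. This is a minimal illustration, not the project's code: `mcp_tool_name` and `unmatched_tools` are hypothetical helper names, and the real lookup in `mcp.rs` may work differently.

```rust
use std::collections::HashSet;

/// Builds the agent-visible name for an MCP tool: `mcp_{server_name}_{tool_name}`.
fn mcp_tool_name(server_name: &str, tool_name: &str) -> String {
    format!("mcp_{server_name}_{tool_name}")
}

/// Returns the whitelist entries that match no registered tool name.
/// Hypothetical helper: a subagent would have no access to these tools.
fn unmatched_tools<'a>(whitelist: &[&'a str], registered: &HashSet<String>) -> Vec<&'a str> {
    whitelist
        .iter()
        .copied()
        .filter(|name| !registered.contains(*name))
        .collect()
}
```

Running a check like `unmatched_tools` over a skill's `tools:` list at startup would surface a wrong name such as `search_gmail_messages` before the subagent is ever invoked.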

3 changes: 2 additions & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

18 changes: 15 additions & 3 deletions Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "rustfox"
-version = "0.1.0"
+version = "1.0.1"
edition = "2021"

[dependencies]
@@ -55,9 +55,12 @@ pulldown-cmark = "0.12"
# SQLite vector search extension
sqlite-vec = "0.1"

-# Setup wizard web server (used only by src/bin/setup.rs)
+# Setup wizard web server (used by rustfox --setup)
axum = "0.8"

+# Embed bundled skills/agents into the binary for cargo install
+include_dir = "0.7"
+
# OCR (pure Rust, neural-network based)
ocrs = "0.12"
rten = { version = "0.24", features = ["rten_format"] }
@@ -75,7 +78,7 @@ infer = "0.19"
# Base64 for vision API content parts and OAuth PKCE helpers
base64 = "0.22"

-# OAuth 2.0 / PKCE helpers (used only by src/bin/setup.rs)
+# OAuth 2.0 / PKCE helpers (used by setup wizard)
rand = "0.8"
sha2 = "0.10"

@@ -84,5 +87,14 @@ regex = "1"

# OS home-directory resolution for the persistent home dir (~/.rustfox)
dirs = "5"
+
+[lib]
+name = "rustfox"
+path = "src/lib.rs"
+
+[[bin]]
+name = "rustfox"
+path = "src/main.rs"
+
[dev-dependencies]
tempfile = "3"
424 changes: 61 additions & 363 deletions README.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion config.example.toml
@@ -11,7 +11,7 @@ allowed_user_ids = [123456789]
# Get your API key from https://openrouter.ai/keys
api_key = "YOUR_OPENROUTER_API_KEY"
# Model to use (see https://openrouter.ai/models)
-model = "moonshotai/kimi-k2.5"
+model = "moonshotai/kimi-k2.6"
# API base URL (usually no need to change)
base_url = "https://openrouter.ai/api/v1"
# Alternative using local ollama
103 changes: 103 additions & 0 deletions docs/ARCHITECTURE.md
@@ -0,0 +1,103 @@
# Architecture

## Source Tree

```
src/
├── main.rs             # Entry point, config loading, MCP setup, bot launch
├── config.rs           # TOML config parsing (all sections)
├── home.rs             # Persistent home directory resolution (~/.rustfox)
├── agent.rs            # Agentic loop, tool dispatch, skills/agents layer
├── agent_prompt.rs     # Prompt preparation, compaction, recovery nudges
├── tools.rs            # Built-in tool definitions + sandbox path validation
├── llm.rs              # OpenRouter API client with tool calling
├── mcp.rs              # MCP client manager for external tool servers
├── file_processor/     # File/attachment processing (OCR, vision, PDF, DOCX)
├── memory/             # SQLite persistence, vector embeddings, RAG, summarizer
├── scheduler/          # Cron/one-shot task scheduler with DB persistence
├── skills/             # Skill loader, registry, embed/seeding, update engine
├── learning.rs         # Post-task skill extraction, user model persistence
├── langsmith.rs        # Optional LangSmith observability client
├── supervisor/         # Autopilot v2: autonomous task runner
│   ├── mod.rs          # Facade (submit, execute_now, pause, resume, state)
│   ├── task.rs         # Task, TaskType, RiskLevel enums
│   ├── job.rs          # Job, JobType, JobStatus enums
│   ├── state.rs        # Transition-allowed state machine
│   ├── store.rs        # CRUD over sup_tasks / sup_jobs / sup_transitions
│   ├── intake.rs       # Raw text → Task normalization
│   ├── classifier.rs   # Heuristic / LLM-backed / Skill-aware classifiers
│   ├── policy.rs       # PolicyEngine: auto-execute, clarify, approve gates
│   ├── planner.rs      # Task → Plan with parallel job groups
│   ├── workflow.rs     # Fast / Standard / Rigorous workflow templates
│   ├── orchestrator.rs # Plan executor with fallback + parallel + subjobs
│   ├── verification.rs # Evidence-gated verification engine
│   ├── artifact.rs     # ArtifactManager with secret redaction
│   ├── workspace.rs    # Per-task git worktree management
│   ├── reporter.rs     # Human-readable job summary
│   ├── redact.rs       # Secret scrubber for api_key / password / token
│   └── backend/        # Backends (reasoning, shell, MCP, claude-code, codex, script)
├── platform/           # Telegram bot handler + tool notifier
├── setup/              # Setup wizard (web + CLI) + service management
└── utils/              # String utilities, markdown-to-entities conversion

skills/                 # Bundled skills (15+): code-interpreter, problem-solver,
│                       # soul, news-fetcher, sup-* workflow packs, etc.
agents/                 # Agent definitions (AGENT.md per agent)
└── verifier/           # Zero-trust verifier (read-only sandbox)
setup/                  # Setup wizard HTML
```
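
The tree describes `redact.rs` as a secret scrubber for `api_key` / `password` / `token`. The diff does not show its implementation, so the following is only a rough sketch of what line-based scrubbing could look like; the `redact` function, the `SENSITIVE_KEYS` constant, and the `[REDACTED]` placeholder are all assumptions.

```rust
/// Key names treated as sensitive, taken from the comment on `redact.rs`.
const SENSITIVE_KEYS: [&str; 3] = ["api_key", "password", "token"];

/// Replaces the value on any `key = value` or `key: value` line whose key
/// contains a sensitive name. Hypothetical sketch, not the project's scrubber.
fn redact(text: &str) -> String {
    text.lines()
        .map(|line| {
            // Split at the first '=' or ':' so the key itself stays visible.
            match line.find(|c: char| c == '=' || c == ':') {
                Some(idx) => {
                    let key = line[..idx].to_lowercase();
                    if SENSITIVE_KEYS.iter().any(|k| key.contains(*k)) {
                        format!("{} [REDACTED]", &line[..=idx])
                    } else {
                        line.to_string()
                    }
                }
                None => line.to_string(),
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

A line-based approach like this misses secrets embedded in free text or JSON, and it drops a trailing newline; a production scrubber would likely need pattern matching on the values as well.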

## Data Flow

```
User ──Telegram──▶ bot.rs ──▶ Agent.process_message()
                                         │
                                         ▼
                                 LlmClient.chat()
                                 (OpenRouter API)
                                         │
                                ┌────────┴────────┐
                                │                 │
                            Tool call        Text reply
                                │                 │
                                ▼                 ▼
                         execute_tool()     Telegram send
                                │
                    ┌───────────┼───────────┐
                    ▼           ▼           ▼
                Built-in    MCP tool  Skills/Agents
               (tools.rs)   (mcp.rs)   (agent.rs)
                    │           │           │
                    └───────────┴───────────┘
                                │
                                ▼
                   Result appended to history
                                │
                                ▼
                        Loop back to LLM
                     (up to max_iterations)
```

## Key Components

| Component | File | Role |
|-----------|------|------|
| **Agent** | `agent.rs` | Orchestrates the agentic loop: calls LLM, dispatches tools, manages conversation state |
| **LlmClient** | `llm.rs` | Stateless HTTP client for OpenRouter `/chat/completions` with tool-calling support |
| **McpManager** | `mcp.rs` | Manages stdio-based MCP child processes; tools namespaced `mcp_{server}_{tool}` |
| **SkillRegistry** | `skills/mod.rs` | Loads and manages skills/agents from the home directory with compile-time embedded fallback |
| **Memory** | `memory/` | SQLite-backed persistence, vector embeddings, hybrid search (FTS5 + vector), query rewriting, summarization |
| **Scheduler** | `scheduler/` | Cron and one-shot task scheduler with DB persistence; supports add/remove/list at runtime |
| **Supervisor** | `supervisor/` | Generic autonomous task runner: intake → classify → plan → execute → verify → report |
| **FileProcessor** | `file_processor/` | Handles image OCR, vision API calls, PDF/DOCX text extraction |
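
The Supervisor row lists a fixed stage order. A small sketch of that order as a linear state machine is shown here; the `Stage` enum and `run_pipeline` function are illustrative names, and the real `supervisor/state.rs` is described only as a "transition-allowed state machine", so it may permit other transitions (pause, resume, failure) that this sketch leaves out.

```rust
/// Stages named in the Supervisor row of the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    Intake,
    Classify,
    Plan,
    Execute,
    Verify,
    Report,
}

impl Stage {
    /// Returns the stage that follows `self`, or `None` after `Report`.
    fn next(self) -> Option<Stage> {
        match self {
            Stage::Intake => Some(Stage::Classify),
            Stage::Classify => Some(Stage::Plan),
            Stage::Plan => Some(Stage::Execute),
            Stage::Execute => Some(Stage::Verify),
            Stage::Verify => Some(Stage::Report),
            Stage::Report => None,
        }
    }
}

/// Walks the pipeline from `Intake`, collecting every stage visited.
fn run_pipeline() -> Vec<Stage> {
    let mut current = Stage::Intake;
    let mut visited = vec![current];
    while let Some(next) = current.next() {
        visited.push(next);
        current = next;
    }
    visited
}
```

Encoding the allowed successor in `next` means an illegal jump, such as going from `Intake` straight to `Execute`, cannot be expressed through this API.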

## Agentic Loop

The core loop in `Agent::process_message()` (`agent.rs`):

1. **Prepare**: Inject the system prompt with skill/agent context, conversation history, and relevant RAG results
2. **Call LLM**: Send to OpenRouter with the available tool definitions
3. **Check response type**:
   - **Tool call(s)** → Execute each tool via `execute_tool()`, append results to the conversation, check max iterations, go to step 2
   - **Text response** → Send to the user via Telegram, update conversation state, run post-task learning
4. **Error recovery**: If the LLM returns an error or malformed response, append a recovery nudge and retry (up to max iterations)
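
The four steps above can be sketched as a synchronous loop. This is a simplified illustration under stated assumptions, not the code in `agent.rs`: the `LlmResponse` enum, the closure parameters standing in for the LLM client and tool dispatcher, and the string-based history are all hypothetical, and the real `Agent::process_message()` is presumably async and also handles RAG context, Telegram delivery, and post-task learning.

```rust
/// What the model returned for one turn (a simplified, hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
enum LlmResponse {
    ToolCall { name: String, args: String },
    Text(String),
    Malformed,
}

/// Runs the loop: call the model, execute tool calls, feed results back,
/// and stop at the first text reply or when `max_iterations` is reached.
fn process_message(
    user_input: &str,
    max_iterations: usize,
    mut call_llm: impl FnMut(&[String]) -> LlmResponse,
    mut execute_tool: impl FnMut(&str, &str) -> String,
) -> Result<String, String> {
    // Step 1: prepare the conversation (system prompt and RAG context omitted).
    let mut history = vec![format!("user: {user_input}")];

    for _ in 0..max_iterations {
        // Step 2: call the model with the conversation so far.
        match call_llm(&history) {
            // Step 3a: run the tool, append its result, then loop again.
            LlmResponse::ToolCall { name, args } => {
                let result = execute_tool(&name, &args);
                history.push(format!("tool {name}: {result}"));
            }
            // Step 3b: a text reply ends the loop.
            LlmResponse::Text(reply) => return Ok(reply),
            // Step 4: append a recovery nudge and retry within the same budget.
            LlmResponse::Malformed => {
                history.push("system: previous response was malformed, please retry".to_string());
            }
        }
    }
    Err(format!("no text reply after {max_iterations} iterations"))
}
```

In this sketch, tool calls and recovery retries share one iteration budget, which matches the "up to max iterations" wording in steps 3 and 4; whether the real loop counts them separately is not visible in the diff.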