Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions .claude/commands/implement-feature.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
You will be implementing a new feature in this codebase

$ARGUMENTS

IMPORTANT: Only do this for front-end features.
Once this feature is built, make sure to write the changes you made to file called frontend-changes.md
Do not ask for permissions to modify this file, assume you can always do it.
6 changes: 6 additions & 0 deletions .claude/settings.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
{
"permissions": {
"allow": [],
"deny": []
}
}
44 changes: 44 additions & 0 deletions .github/workflows/claude-code-review.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
name: Claude Code Review

on:
pull_request:
types: [opened, synchronize, ready_for_review, reopened]
# Optional: Only run on specific file changes
# paths:
# - "src/**/*.ts"
# - "src/**/*.tsx"
# - "src/**/*.js"
# - "src/**/*.jsx"

jobs:
claude-review:
# Optional: Filter by PR author
# if: |
# github.event.pull_request.user.login == 'external-contributor' ||
# github.event.pull_request.user.login == 'new-developer' ||
# github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR'

runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: read
issues: read
id-token: write

steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 1

- name: Run Claude Code Review
id: claude-review
uses: anthropics/claude-code-action@v1
with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
plugin_marketplaces: 'https://github.com/anthropics/claude-code.git'
plugins: 'code-review@claude-code-plugins'
prompt: '/code-review:code-review ${{ github.repository }}/pull/${{ github.event.pull_request.number }}'
# See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
# or https://code.claude.com/docs/en/cli-reference for available options

50 changes: 50 additions & 0 deletions .github/workflows/claude.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
name: Claude Code

on:
issue_comment:
types: [created]
pull_request_review_comment:
types: [created]
issues:
types: [opened, assigned]
pull_request_review:
types: [submitted]

jobs:
claude:
if: |
(github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
(github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) ||
(github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) ||
(github.event_name == 'issues' && (contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude')))
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: read
issues: read
id-token: write
actions: read # Required for Claude to read CI results on PRs
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 1

- name: Run Claude Code
id: claude
uses: anthropics/claude-code-action@v1
with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}

# This is an optional setting that allows Claude to read CI results on PRs
additional_permissions: |
actions: read

# Optional: Give a custom prompt to Claude. If this is not specified, Claude will perform the instructions specified in the comment that tagged it.
# prompt: 'Update the pull request description to include a summary of changes.'

# Optional: Add claude_args to customize behavior and configuration
# See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
# or https://code.claude.com/docs/en/cli-reference for available options
# claude_args: '--allowed-tools Bash(gh pr *)'

3 changes: 3 additions & 0 deletions .playwright-mcp/console-2026-05-24T13-11-58-926Z.log
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
[ 845ms] [LOG] Loading course stats... @ http://localhost:8000/script.js?v=9:166
[ 851ms] [LOG] Course data received: {total_courses: 4, course_titles: Array(4)} @ http://localhost:8000/script.js?v=9:171
[ 852ms] [ERROR] Failed to load resource: the server responded with a status of 404 (Not Found) @ http://localhost:8000/favicon.ico:0
14 changes: 14 additions & 0 deletions .playwright-mcp/page-2026-05-24T13-11-59-810Z.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
- generic [ref=e3]:
- complementary [ref=e4]:
- button "NEW CHAT" [ref=e6] [cursor=pointer]
- group [ref=e8]:
- generic "▶ Courses" [ref=e9] [cursor=pointer]
- group [ref=e11]:
- generic "▶ Try asking:" [ref=e12] [cursor=pointer]
- main [ref=e13]:
- generic [ref=e14]:
- paragraph [ref=e18]: Welcome to the Course Materials Assistant! I can help you with questions about courses, lessons and specific content. What would you like to know?
- generic [ref=e19]:
- textbox "Ask about courses, lessons, or specific content..." [ref=e20]
- button [ref=e21] [cursor=pointer]:
- img [ref=e22]
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
75 changes: 75 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands

**Install dependencies:**
```bash
uv sync
```

**Run the server** (from the `backend/` directory):
```bash
cd backend
uv run uvicorn app:app --reload --port 8000
```

The web UI is at `http://localhost:8000` and the auto-generated API docs are at `http://localhost:8000/docs`.

**Environment setup:** Copy `.env` and set `ANTHROPIC_API_KEY`.

There are no tests in this codebase.

## Architecture

This is a full-stack RAG chatbot. The backend is a FastAPI app (`backend/app.py`) that serves both the REST API and the static frontend (`frontend/`). All backend modules run from within the `backend/` directory, so relative imports and paths (e.g. `../docs`, `./chroma_db`) are relative to that directory.

### Request flow

1. The browser (`frontend/script.js`) POSTs a query to `POST /api/query`.
2. `app.py` delegates to `RAGSystem.query()` (`rag_system.py`).
3. `RAGSystem` calls `AIGenerator.generate_response()` (`ai_generator.py`), passing the Claude API client, conversation history (from `SessionManager`), and the registered `search_course_content` tool definition.
4. If Claude decides to search, it calls the tool; `AIGenerator._handle_tool_execution()` routes this to `ToolManager.execute_tool()` → `CourseSearchTool.execute()` (`search_tools.py`), which queries `VectorStore` (`vector_store.py`).
5. Search results are injected back into the Claude conversation as a `tool_result` message, and Claude generates the final answer.
6. Sources collected by `CourseSearchTool` are returned to the browser alongside the answer.

### Key components

- **`RAGSystem`** (`rag_system.py`) — top-level orchestrator; the only component that coordinates all others.
- **`VectorStore`** (`vector_store.py`) — wraps ChromaDB with two collections: `course_catalog` (course titles/metadata, used for fuzzy course-name resolution) and `course_content` (chunked text, used for semantic search). Embeddings are generated locally via `sentence-transformers` (`all-MiniLM-L6-v2`). The ChromaDB store is persisted at `backend/chroma_db/`.
- **`DocumentProcessor`** (`document_processor.py`) — parses `.txt`/`.pdf`/`.docx` files from `docs/` into `Course` + `CourseChunk` objects. Expects a specific header format (see below) but falls back to flat chunking if no `Lesson N:` markers are found.
- **`AIGenerator`** (`ai_generator.py`) — thin wrapper around `anthropic.Anthropic`. Uses `tool_choice: auto` and handles one round of tool use (search → final answer). Model and token limits are configured here.
- **`ToolManager` / `CourseSearchTool`** (`search_tools.py`) — extensible tool registry. Adding a new tool means subclassing `Tool` and calling `tool_manager.register_tool()`.
- **`SessionManager`** (`session_manager.py`) — in-memory conversation history, keyed by `session_id`. History is serialized as a plain string and injected into the system prompt.

### Document format

Documents in `docs/` must follow this structure for full metadata extraction:

```
Course Title: <title> ← used as the unique ID in ChromaDB
Course Link: <url> ← optional
Course Instructor: <name> ← optional

Lesson 0: <lesson title>
Lesson Link: <url> ← optional, must immediately follow lesson header
<lesson content>

Lesson 1: <next lesson title>
...
```

If no `Lesson N:` markers are present, the entire file body is chunked as a single flat document.

### Adding a new document source

1. Drop `.txt`, `.pdf`, or `.docx` files into `docs/`.
2. Delete `backend/chroma_db/` to clear stale embeddings.
3. Restart the server — `startup_event()` in `app.py` re-indexes everything.

To clear and re-index programmatically, call `rag_system.add_course_folder(path, clear_existing=True)`.

### Configuration

All tuneable parameters are in `backend/config.py` via the `Config` dataclass: chunk size/overlap, max search results, conversation history length, ChromaDB path, and the Anthropic model name.
3 changes: 3 additions & 0 deletions backend/.playwright-mcp/console-2026-05-24T12-58-16-051Z.log
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
[ 365ms] [LOG] Loading course stats... @ http://127.0.0.1:8000/script.js?v=9:166
[ 371ms] [LOG] Course data received: {total_courses: 4, course_titles: Array(4)} @ http://127.0.0.1:8000/script.js?v=9:171
[ 373ms] [ERROR] Failed to load resource: the server responded with a status of 404 (Not Found) @ http://127.0.0.1:8000/favicon.ico:0
5 changes: 5 additions & 0 deletions backend/.playwright-mcp/console-2026-05-24T13-00-22-107Z.log
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
[ 22ms] [LOG] Loading course stats... @ http://127.0.0.1:8000/script.js?v=9:166
[ 27ms] [LOG] Course data received: {total_courses: 4, course_titles: Array(4)} @ http://127.0.0.1:8000/script.js?v=9:171
[ 32365ms] [LOG] Loading course stats... @ http://127.0.0.1:8000/script.js?v=9:166
[ 32369ms] [LOG] Course data received: {total_courses: 4, course_titles: Array(4)} @ http://127.0.0.1:8000/script.js?v=9:171
[ 147538ms] [ERROR] Failed to load resource: the server responded with a status of 500 (Internal Server Error) @ http://127.0.0.1:8000/api/query:0
14 changes: 14 additions & 0 deletions backend/.playwright-mcp/page-2026-05-24T12-58-16-450Z.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
- generic [ref=e3]:
- complementary [ref=e4]:
- button "+ NEW CHAT" [ref=e6] [cursor=pointer]
- group [ref=e8]:
- generic "▶ Courses" [ref=e9] [cursor=pointer]
- group [ref=e11]:
- generic "▶ Try asking:" [ref=e12] [cursor=pointer]
- main [ref=e13]:
- generic [ref=e14]:
- paragraph [ref=e18]: Welcome to the Course Materials Assistant! I can help you with questions about courses, lessons and specific content. What would you like to know?
- generic [ref=e19]:
- textbox "Ask about courses, lessons, or specific content..." [ref=e20]
- button [ref=e21] [cursor=pointer]:
- img [ref=e22]
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
14 changes: 14 additions & 0 deletions backend/.playwright-mcp/page-2026-05-24T13-00-22-145Z.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
- generic [ref=e3]:
- complementary [ref=e4]:
- button "▶ NEW CHAT" [ref=e6] [cursor=pointer]
- group [ref=e8]:
- generic "▶ Courses" [ref=e9] [cursor=pointer]
- group [ref=e11]:
- generic "▶ Try asking:" [ref=e12] [cursor=pointer]
- main [ref=e13]:
- generic [ref=e14]:
- paragraph [ref=e18]: Welcome to the Course Materials Assistant! I can help you with questions about courses, lessons and specific content. What would you like to know?
- generic [ref=e19]:
- textbox "Ask about courses, lessons, or specific content..." [ref=e20]
- button [ref=e21] [cursor=pointer]:
- img [ref=e22]
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading