AskDocs 📄🔎

A small, polished RAG (Retrieval-Augmented Generation) document Q&A service built with FastAPI and the OpenAI API. Upload PDFs, text, or markdown, then ask questions and get answers grounded in your documents — with inline citations back to the source passages.

This is a deliberately compact project meant to teach the whole RAG loop end to end without hiding anything behind a heavyweight vector database. Every moving part — chunking, embedding, similarity search, prompt construction — is plain, readable Python you can step through.

Stack: FastAPI · Pydantic v2 · OpenAI embeddings + chat · NumPy vector search · zero external services.

✨ Features

Upload & index .pdf, .txt, and .md files via a REST endpoint or web UI.
Cited answers — every response lists the exact passages it used, with cosine-similarity scores.
Transparent retrieval — a ~150-line NumPy vector store you can actually read, persisted to a single JSON file.
Clean architecture — ingestion, embeddings, storage, retrieval, and the LLM call are each isolated and independently testable.
Auto docs — interactive OpenAPI explorer at /docs for free.
Tested without an API key — the pure logic (chunking, vector math, HTTP layer) is covered by pytest with the OpenAI calls mocked.
Dockerized and ready to deploy.

🏗️ How it works

                upload                          ask
  ┌──────────┐  ─────►  ┌───────────┐         ┌───────────┐
  │ document │          │  chunk +  │         │  embed    │
  │ (.pdf…)  │          │  embed    │         │  question │
  └──────────┘          └─────┬─────┘         └─────┬─────┘
                              │                     │
                              ▼                     ▼
                        ┌───────────────────────────────┐
                        │   JSON vector store (NumPy)    │
                        │   cosine-similarity search     │
                        └───────────────┬───────────────┘
                                        │ top-k chunks
                                        ▼
                          ┌──────────────────────────┐
                          │  build numbered context   │
                          │  → OpenAI chat completion  │
                          │  → answer + citations      │
                          └──────────────────────────┘

Layer	File	Responsibility
Config	`app/config.py`	Typed settings from `.env` (pydantic-settings)
Ingestion	`app/ingestion.py`	Parse files, normalize, chunk with overlap
Embeddings	`app/embeddings.py`	Thin OpenAI embeddings wrapper (batched)
Vector store	`app/vectorstore.py`	Persisted cosine-similarity search
LLM	`app/llm.py`	Grounded chat completion + system prompt
RAG	`app/rag.py`	Orchestrates ingest & ask
API	`app/main.py`	FastAPI routes, DI, error handling
UI	`app/static/index.html`	Single-file upload + chat front end

🚀 Quickstart

# 1. Clone & install
pip install -r requirements.txt

# 2. Configure
cp .env.example .env
# edit .env and set OPENAI_API_KEY=sk-...

# 3. Run
uvicorn app.main:app --reload

Then open:

http://localhost:8000 — the web UI (upload a doc, ask a question)
http://localhost:8000/docs — interactive API docs

Try it with the included sample:

curl -F "file=@sample_docs/refund_policy.md" http://localhost:8000/documents

curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How long do I have to request a refund?"}'

{
  "answer": "You may request a full refund within 30 days of purchase, as long as the product is unused and in its original packaging [1].",
  "citations": [
    { "filename": "refund_policy.md", "chunk_index": 0, "score": 0.83, "snippet": "Customers may request a full refund within 30 days…" }
  ]
}

🔌 API

Method	Path	Description
`POST`	`/documents`	Upload & index a file (multipart `file`)
`GET`	`/documents`	List indexed documents & chunk counts
`DELETE`	`/documents/{id}`	Remove a document from the index
`POST`	`/ask`	Ask a question → cited answer
`GET`	`/health`	Liveness check

🧪 Tests & lint

make test       # pytest — no API key needed; OpenAI calls are faked
make lint       # ruff check + format --check
make format     # ruff format + autofix
make ci         # lint + test (what CI runs)

Coverage includes chunking edge cases, cosine-similarity ranking, store persistence, the RAG orchestration (with stubbed embedder + LLM), and the HTTP layer (via FastAPI dependency overrides). CI runs on Python 3.11 and 3.12 — see .github/workflows/ci.yml.

🐳 Docker

docker build -t askdocs .
docker run --rm -p 8000:8000 --env-file .env -v $(pwd)/data:/app/data askdocs

⚙️ Configuration

All settings come from environment variables / .env (see .env.example):

Variable	Default	Meaning
`OPENAI_API_KEY`	—	Required. Your OpenAI key
`EMBEDDING_MODEL`	`text-embedding-3-small`	Embedding model
`CHAT_MODEL`	`gpt-4o-mini`	Answer-generation model
`CHUNK_SIZE`	`800`	Target characters per chunk
`CHUNK_OVERLAP`	`150`	Overlap between chunks
`TOP_K`	`4`	Chunks retrieved per question
`DATA_DIR`	`./data`	Where the index persists

🗺️ Ideas for extending it

Swap the JSON store for pgvector, FAISS, or Chroma (only vectorstore.py changes).
Add streaming answers with Server-Sent Events.
Re-rank retrieved chunks with a cross-encoder.
Per-user document collections + auth.
Evaluation harness (faithfulness / answer-relevance scoring).

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.claude		.claude
.github		.github
app		app
sample_docs		sample_docs
tests		tests
.env.example		.env.example
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AskDocs 📄🔎

✨ Features

🏗️ How it works

🚀 Quickstart

🔌 API

🧪 Tests & lint

🐳 Docker

⚙️ Configuration

🗺️ Ideas for extending it

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

AskDocs 📄🔎

✨ Features

🏗️ How it works

🚀 Quickstart

🔌 API

🧪 Tests & lint

🐳 Docker

⚙️ Configuration

🗺️ Ideas for extending it

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages