SriHV/askdocs

AskDocs 📄🔎

A small, polished RAG (Retrieval-Augmented Generation) document Q&A service built with FastAPI and the OpenAI API. Upload PDFs, text, or markdown, then ask questions and get answers grounded in your documents — with inline citations back to the source passages.

This is a deliberately compact project meant to teach the whole RAG loop end to end without hiding anything behind a heavyweight vector database. Every moving part — chunking, embedding, similarity search, prompt construction — is plain, readable Python you can step through.

Stack: FastAPI · Pydantic v2 · OpenAI embeddings + chat · NumPy vector search · zero external services.


✨ Features

  • Upload & index .pdf, .txt, and .md files via a REST endpoint or web UI.
  • Cited answers — every response lists the exact passages it used, with cosine-similarity scores.
  • Transparent retrieval — a ~150-line NumPy vector store you can actually read, persisted to a single JSON file.
  • Clean architecture — ingestion, embeddings, storage, retrieval, and the LLM call are each isolated and independently testable.
  • Auto docs — interactive OpenAPI explorer at /docs for free.
  • Tested without an API key — the pure logic (chunking, vector math, HTTP layer) is covered by pytest with the OpenAI calls mocked.
  • Dockerized and ready to deploy.

🏗️ How it works

                upload                          ask
  ┌──────────┐  ─────►  ┌───────────┐         ┌───────────┐
  │ document │          │  chunk +  │         │  embed    │
  │ (.pdf…)  │          │  embed    │         │  question │
  └──────────┘          └─────┬─────┘         └─────┬─────┘
                              │                     │
                              ▼                     ▼
                        ┌───────────────────────────────┐
                        │   JSON vector store (NumPy)    │
                        │   cosine-similarity search     │
                        └───────────────┬───────────────┘
                                        │ top-k chunks
                                        ▼
                          ┌──────────────────────────┐
                          │  build numbered context   │
                          │  → OpenAI chat completion  │
                          │  → answer + citations      │
                          └──────────────────────────┘

Layer         File                   Responsibility
Config        app/config.py          Typed settings from .env (pydantic-settings)
Ingestion     app/ingestion.py       Parse files, normalize, chunk with overlap
Embeddings    app/embeddings.py      Thin OpenAI embeddings wrapper (batched)
Vector store  app/vectorstore.py     Persisted cosine-similarity search
LLM           app/llm.py             Grounded chat completion + system prompt
RAG           app/rag.py             Orchestrates ingest & ask
API           app/main.py            FastAPI routes, DI, error handling
UI            app/static/index.html  Single-file upload + chat front end
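
To make the retrieval layer concrete, here is a minimal sketch of a cosine-similarity search with JSON persistence. It is an illustration only, not the code in app/vectorstore.py: the class name TinyVectorStore, its method names, and the JSON layout are all assumptions.

```python
import json
from pathlib import Path

import numpy as np


class TinyVectorStore:
    """Hypothetical stand-in for a JSON-backed store; not the project's actual class."""

    def __init__(self) -> None:
        self.vectors: list[list[float]] = []
        self.payloads: list[dict] = []

    def add(self, vector: list[float], payload: dict) -> None:
        # Store plain floats so the index stays JSON-serializable.
        self.vectors.append([float(x) for x in vector])
        self.payloads.append(payload)

    def search(self, query: list[float], top_k: int = 4) -> list[tuple[float, dict]]:
        if not self.vectors:
            return []
        matrix = np.asarray(self.vectors, dtype=np.float64)
        q = np.asarray(query, dtype=np.float64)
        # Cosine similarity = dot product divided by the product of the norms.
        norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(q)
        scores = (matrix @ q) / np.where(norms == 0.0, 1.0, norms)
        best = np.argsort(-scores)[:top_k]  # highest score first
        return [(float(scores[i]), self.payloads[i]) for i in best]

    def save(self, path: Path) -> None:
        path.write_text(json.dumps({"vectors": self.vectors, "payloads": self.payloads}))

    @classmethod
    def load(cls, path: Path) -> "TinyVectorStore":
        data = json.loads(path.read_text())
        store = cls()
        store.vectors, store.payloads = data["vectors"], data["payloads"]
        return store
```

A brute-force scan like this is linear in the number of chunks, which is usually acceptable for a small personal index and keeps every step easy to inspect.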

🚀 Quickstart

# 1. Install dependencies (run from the root of your clone)
pip install -r requirements.txt

# 2. Configure
cp .env.example .env
# edit .env and set OPENAI_API_KEY=sk-...

# 3. Run
uvicorn app.main:app --reload

Then open http://localhost:8000/docs for the interactive OpenAPI explorer.

Try it with the included sample:

curl -F "file=@sample_docs/refund_policy.md" http://localhost:8000/documents

curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How long do I have to request a refund?"}'

Example response:

{
  "answer": "You may request a full refund within 30 days of purchase, as long as the product is unused and in its original packaging [1].",
  "citations": [
    { "filename": "refund_policy.md", "chunk_index": 0, "score": 0.83, "snippet": "Customers may request a full refund within 30 days…" }
  ]
}
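
The [1] marker in the answer refers to a numbered passage in the prompt. The sketch below shows one way such a numbered context and the matching citations list could be built. The RetrievedChunk type and both function names are hypothetical; only the citation field names are taken from the sample response above.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    """Hypothetical shape of one search hit; field names mirror the sample response."""

    filename: str
    chunk_index: int
    score: float
    text: str


def build_context(chunks: list[RetrievedChunk]) -> str:
    """Number each passage from 1 so the model can cite it as [n]."""
    blocks = [
        f"[{n}] ({chunk.filename}, chunk {chunk.chunk_index})\n{chunk.text}"
        for n, chunk in enumerate(chunks, start=1)
    ]
    return "\n\n".join(blocks)


def build_citations(chunks: list[RetrievedChunk], snippet_chars: int = 60) -> list[dict]:
    """Return citation entries in the same order as the numbered context."""
    return [
        {
            "filename": c.filename,
            "chunk_index": c.chunk_index,
            "score": round(c.score, 2),
            "snippet": c.text[:snippet_chars],
        }
        for c in chunks
    ]
```

Keeping the context and the citations in the same order means citation n in the response always corresponds to marker [n] in the answer.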

🔌 API

Method  Path             Description
POST    /documents       Upload & index a file (multipart form field named file)
GET     /documents       List indexed documents & chunk counts
DELETE  /documents/{id}  Remove a document from the index
POST    /ask             Ask a question → cited answer
GET     /health          Liveness check
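
For callers who prefer Python over curl, the following standard-library sketch prepares and sends a POST /ask request. The base URL assumes the local server from the Quickstart, and the helper names build_ask_request and ask are illustrative, not part of the project.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local dev address from the Quickstart


def build_ask_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST /ask request with a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = BASE_URL) -> dict:
    """Send the question and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_ask_request(question, base_url)) as response:
        return json.load(response)
```

With the server running, ask("How long do I have to request a refund?") should return a dict with the answer and citations keys shown in the Quickstart response.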

🧪 Tests & lint

make test       # pytest — no API key needed; OpenAI calls are faked
make lint       # ruff check + format --check
make format     # ruff format + autofix
make ci         # lint + test (what CI runs)

Coverage includes chunking edge cases, cosine-similarity ranking, store persistence, the RAG orchestration (with stubbed embedder + LLM), and the HTTP layer (via FastAPI dependency overrides). CI runs on Python 3.11 and 3.12 — see .github/workflows/ci.yml.
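
The idea behind testing without an API key can be sketched as follows: the orchestration takes its embedder, search, and LLM as plain callables, so a test can pass deterministic fakes. The ask function and the fake helpers here are hypothetical and do not reproduce the project's actual test suite or the signatures in app/rag.py.

```python
from typing import Callable

Embed = Callable[[str], list[float]]
Search = Callable[[list[float], int], list[str]]
Complete = Callable[[str, str], str]


def ask(question: str, embed: Embed, search: Search, complete: Complete, top_k: int = 4) -> str:
    """Hypothetical orchestration: embed the question, retrieve, then call the LLM."""
    passages = search(embed(question), top_k)
    context = "\n\n".join(f"[{n}] {p}" for n, p in enumerate(passages, start=1))
    return complete(context, question)


# Fakes: deterministic, offline replacements for the two OpenAI-backed callables.
def fake_embed(text: str) -> list[float]:
    return [float(len(text)), 1.0]


def fake_complete(context: str, question: str) -> str:
    return f"stubbed answer using {context.count('[')} passage(s)"


def test_ask_passes_retrieved_context_to_llm() -> None:
    seen: dict = {}

    def fake_search(vector: list[float], top_k: int) -> list[str]:
        # Record what the orchestration passed in so the test can inspect it.
        seen["vector"], seen["top_k"] = vector, top_k
        return ["Refunds are accepted within 30 days."]

    answer = ask("Refund window?", fake_embed, fake_search, fake_complete, top_k=1)
    assert seen == {"vector": [14.0, 1.0], "top_k": 1}
    assert answer == "stubbed answer using 1 passage(s)"
```

Because nothing in the test touches the network, it runs the same way locally and in CI.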


🐳 Docker

docker build -t askdocs .
docker run --rm -p 8000:8000 --env-file .env -v "$(pwd)/data:/app/data" askdocs

⚙️ Configuration

All settings come from environment variables / .env (see .env.example):

Variable         Default                 Meaning
OPENAI_API_KEY   (none)                  Required. Your OpenAI key
EMBEDDING_MODEL  text-embedding-3-small  Embedding model
CHAT_MODEL       gpt-4o-mini             Answer-generation model
CHUNK_SIZE       800                     Target characters per chunk
CHUNK_OVERLAP    150                     Overlap between chunks
TOP_K            4                       Chunks retrieved per question
DATA_DIR         ./data                  Where the index persists
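
To show how CHUNK_SIZE and CHUNK_OVERLAP interact, here is a minimal sliding-window chunker using the defaults above. It is a simplified sketch: the function name chunk_text is an assumption, and the real app/ingestion.py may normalize text or split on different boundaries.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 150) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks
```

With the defaults, each window advances 650 characters, so the last 150 characters of one chunk reappear at the start of the next. The overlap must stay below the chunk size, otherwise the window would never move forward.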

🗺️ Ideas for extending it

  • Swap the JSON store for pgvector, FAISS, or Chroma (only vectorstore.py changes).
  • Add streaming answers with Server-Sent Events.
  • Re-rank retrieved chunks with a cross-encoder.
  • Per-user document collections + auth.
  • Evaluation harness (faithfulness / answer-relevance scoring).
