PRISM is a Retrieval-Augmented Generation (RAG)-based intelligent learning system designed to provide personalized explanations from long-form textual content.
Instead of generating answers purely from a language model, the system:
- Retrieves relevant knowledge from a vector database (Qdrant)
- Adapts explanations based on a learner’s profile
- Uses an LLM only as a final reasoning and explanation layer
This design ensures factual grounding, explainability, and extensibility.
The system is divided into four logical layers:
[ Data Ingestion ] → [ Vector Storage (Qdrant) ] → [ Retrieval + Agent Reasoning ] → [ LLM Explanation Layer ]
Each layer is independent and loosely coupled.
Purpose: Convert raw book content into machine-understandable knowledge.
- Load text from a source file (`atomic_habits.txt`)
- Split the text into meaningful chunks
- Generate embeddings using `SentenceTransformer`
- Store embeddings with metadata in Qdrant

Key scripts: `load_book.py`, `chunk_book.py`, `embed_chunks.py`, `store_in_qdrant.py`
- Enables semantic search instead of keyword matching
- Makes the system scalable to multiple books or documents
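The chunking step above can be sketched as a small overlapping-window splitter. This is an illustrative sketch, not the actual contents of `chunk_book.py`; the function name, chunk size, and overlap are assumptions:

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping word windows (illustrative parameters).

    Overlap keeps context that spans a chunk boundary retrievable from
    both neighboring chunks. Each chunk would then be embedded, e.g. with
    sentence-transformers' model.encode(chunks), before storage in Qdrant.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunk = " ".join(words[start:start + chunk_size])
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(words):
            break
    return chunks
```

Overlapping windows are a common default for RAG ingestion; sentence- or paragraph-aware splitting would preserve meaning better at the cost of more logic.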
Purpose: Store and retrieve knowledge efficiently using vector similarity.
- Store embeddings and associated text payloads
- Perform cosine similarity search
- Return top-K relevant chunks
- Qdrant (Docker-based local deployment)
- Fast semantic retrieval
- Database is independent of LLM
- Can be replaced by any vector DB with minimal changes
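Under the hood, the top-K lookup is a cosine-similarity ranking. A dependency-free sketch of what Qdrant computes per query (toy 2-D vectors stand in for real embeddings; in production the client's search call does this server-side):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, stored, k=3):
    """Return the payload texts of the k most similar stored vectors.

    stored: list of (payload_text, vector) pairs, mirroring Qdrant's
    points, each of which carries a vector plus a text payload.
    """
    ranked = sorted(stored, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]
```

Because retrieval is just "embed the query, rank by similarity, return payloads," any vector DB exposing that operation can substitute for Qdrant, as noted above.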
Purpose: Decide what information to give and how to explain it.
This layer:
- Takes a user query
- Retrieves relevant chunks from Qdrant
- Applies user personalization logic
- Constructs a structured explanation prompt
The learner profile currently captures:
- Learning level (beginner)
- Learning style (story-based)
- Learning goal (self-improvement)

Key script: `agent_explain.py`
This layer acts as the brain of the system, separating reasoning from generation.
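The prompt-construction step can be sketched as follows, assuming a simple dict-based profile; the field names and template are illustrative, not the actual code in `agent_explain.py`:

```python
def build_prompt(query, chunks, profile):
    """Assemble a grounded, personalized prompt (illustrative template).

    chunks:  top-K passages retrieved from Qdrant
    profile: learner attributes, e.g. level / style / goal
    """
    context = "\n\n".join(f"- {c}" for c in chunks)
    return (
        f"You are a tutor for a {profile['level']} learner who prefers "
        f"{profile['style']} explanations and whose goal is "
        f"{profile['goal']}.\n"
        "Answer using ONLY the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

Constraining the model to the retrieved context is what keeps generation grounded; the personalization fields only shape tone and framing, not the facts.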
Purpose: Convert structured prompts into natural language explanations.
- LLM is used only after retrieval
- Reduces the risk of hallucination
- Keeps explanations grounded in source material
- Gemini 2.5 Flash (Free Tier)
- Fallback to mock LLM when API is unavailable
Key script: `llm_runner.py`
This modular design allows the LLM backend to be swapped for:
- OpenAI
- Local LLMs
- Other cloud providers
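The fallback behavior can be sketched with a thin wrapper; `call_gemini` here is a stand-in for whatever API client `llm_runner.py` actually uses, and the mock's output format is an assumption:

```python
def mock_llm(prompt):
    """Deterministic placeholder used when the real API is unavailable."""
    return "[mock answer] " + prompt.splitlines()[-1]

def run_llm(prompt, call_gemini=None):
    """Call the real LLM if possible; otherwise fall back to the mock.

    Any failure of the real call (quota exhausted, network error,
    missing API key) degrades gracefully to the mock response.
    """
    if call_gemini is None:
        return mock_llm(prompt)
    try:
        return call_gemini(prompt)
    except Exception:
        return mock_llm(prompt)
```

Because the rest of the pipeline only sees a `prompt -> text` function, swapping Gemini for OpenAI or a local model changes nothing outside this module.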
1. The user asks a question
2. The agent retrieves relevant chunks from Qdrant
3. The agent builds a personalized explanation prompt
4. The LLM generates the final response
5. The response is displayed to the user
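The flow above can be composed into a single pipeline function with the retriever, prompt builder, and LLM injected, which keeps every layer swappable. The function and parameter names are illustrative:

```python
def answer(query, retrieve, build_prompt, llm):
    """End-to-end flow: retrieve -> personalize -> generate."""
    chunks = retrieve(query)              # top-K chunks (e.g. from Qdrant)
    prompt = build_prompt(query, chunks)  # personalization layer
    return llm(prompt)                    # final explanation

# Example wiring with stand-in components:
reply = answer(
    "What is a habit loop?",
    retrieve=lambda q: ["cue, craving, response, reward"],
    build_prompt=lambda q, cs: f"Context: {cs[0]}\nQuestion: {q}",
    llm=lambda p: "Based on the context: " + p,
)
```

Dependency injection at this seam is what makes the layers "independent and loosely coupled": each component can be replaced or mocked without touching the others.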
The system is designed for future extensions such as:
- Multimodal explanations (audio summaries)
- Adaptive difficulty levels
- User memory and learning progress tracking
- Multiple document ingestion
- Web or mobile interface
These features can be added without changing the core architecture.
Known limitations:
- Output quality depends on the quality of the source text
- LLM availability may vary (free-tier rate limits)
Responsible AI properties:
- No personal data is stored
- Transparent knowledge sources
- Human-readable explanations
- Reduced hallucination risk via retrieval grounding
PRISM demonstrates a clean, modular implementation of a retrieval-augmented intelligent learning system with clear separation of concerns, personalization logic, and responsible AI design principles.