What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?
Updated Sep 21, 2025 - TypeScript
Compress tool outputs, logs, files, and RAG chunks before they reach the LLM: 60-95% fewer tokens, same answers. Available as a library, a proxy, or an MCP server.
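Several of the entries here center on the same idea: shrink verbose tool output before it enters the context window. A minimal, generic sketch of that technique (an illustration only, not any listed project's implementation) keeps the head and tail of long output and elides the middle:

```python
def compress_output(text: str, head: int = 5, tail: int = 5) -> str:
    """Keep the first `head` and last `tail` lines of verbose output,
    replacing the middle with a one-line elision marker."""
    lines = text.splitlines()
    if len(lines) <= head + tail:
        return text  # already short enough; pass through unchanged
    omitted = len(lines) - head - tail
    marker = f"... [{omitted} lines omitted] ..."
    return "\n".join(lines[:head] + [marker] + lines[-tail:])
```

Real tools in this list add smarter passes (deduplication, structured-log parsing, signature-only code reads), but the token savings come from this same head/tail-plus-marker shape.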
Find the ghost tokens. Fix them. Survive compaction. Avoid context quality decay.
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Give Claude Code photographic memory in ONE portable file. No database, no SQLite, no ChromaDB - just a single .mv2 file you can git commit, scp, or share. Native Rust core with sub-ms operations.
Make your OpenClaw AI agent faster, smarter, and cheaper. Speed optimization, memory architecture, context management, model selection, and one-shot development guide.
CLI proxy that reduces LLM token usage by 60-90%. Declarative YAML filters for Claude Code, Cursor, Copilot, Gemini. rtk alternative in Go.
Supercharge AI Agents, Safely
Open-source context infrastructure for AI agents. Auto-capture and share your agents' context everywhere.
Config-driven CLI tool that compresses command output before it reaches an LLM's context window.
LLM-supervised persistent memory for AI agents — graph-based recall, cross-session knowledge, single binary. Works with Claude Code, OpenClaw, and any CLI agent.
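Graph-based recall, as several of these memory layers describe it, boils down to storing facts as edges and walking a node's neighbors at query time. A toy sketch of the idea (not any listed project's design; the class and method names are invented for illustration):

```python
from collections import defaultdict

class MemoryGraph:
    """Toy knowledge graph: facts are (subject, relation, object)
    triples; recall walks the outgoing edges of an entity."""

    def __init__(self) -> None:
        self.edges: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def remember(self, subject: str, relation: str, obj: str) -> None:
        self.edges[subject].append((relation, obj))

    def recall(self, entity: str) -> list[str]:
        # Render each outgoing edge as a sentence the LLM can consume.
        return [f"{entity} {rel} {obj}" for rel, obj in self.edges[entity]]
```

Production systems layer persistence, embeddings, and multi-hop traversal on top, but the cross-session payoff is the same: recall returns only the facts connected to the entity at hand, not the whole history.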
Hook-based token compressor for 5 AI CLI hosts (Claude Code, Copilot CLI, OpenCode, Gemini CLI, Codex CLI). Up to 95% bash compression, signature-mode for code reads, cross-call dedup, MCP server, self-teaching protocol. Zero runtime deps.
A discovery and compression tool for your Python codebase. Creates a knowledge graph for an LLM context window, efficiently outlining your project's structure.
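A codebase outline of the kind this entry alludes to can be approximated with Python's standard `ast` module. The sketch below (an illustration of the technique, not the project's code) lists top-level classes and functions in a form compact enough to paste into a context window:

```python
import ast

def outline(source: str) -> list[str]:
    """Return a compact outline of top-level classes and functions,
    suitable for giving an LLM the shape of a module cheaply."""
    tree = ast.parse(source)
    entries: list[str] = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            entries.append(f"def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            entries.append(f"class {node.name}")
            for item in node.body:  # one level deep: method names only
                if isinstance(item, ast.FunctionDef):
                    entries.append(f"    def {item.name}(...)")
    return entries
```

Sending signatures instead of bodies is the core trade: the model sees what exists and how to call it, at a small fraction of the tokens of the full source.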
Local-first, permanent, persistent memory for all agents and humans.
A local-first memory layer for AI (Cursor, Zed, Claude). Persistent architectural context via semantic search.
Inject relevant documentation into your prompts: 98% savings.
A cognition-aware database engine for AI agent memory. Purpose-built in Rust with WAL, HNSW, knowledge graphs, and speculative context pre-assembly. Not a wrapper: a ground-up storage engine that thinks.
Transform and optimize your markdown documentation for Large Language Models (LLMs) and RAG systems. Generate llms.txt automatically.
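The llms.txt convention this entry generates is, roughly, a markdown index: an H1 title, a short blockquote summary, then sections of links with one-line descriptions. A hypothetical generator sketch (the function name and tuple layout are my own, not this tool's API):

```python
def make_llms_txt(title: str, summary: str,
                  docs: list[tuple[str, str, str]]) -> str:
    """Assemble an llms.txt index: H1 title, blockquote summary,
    then a '## Docs' section of markdown links.
    `docs` holds (name, url, description) triples."""
    lines = [f"# {title}", "", f"> {summary}", "", "## Docs", ""]
    for name, url, desc in docs:
        lines.append(f"- [{name}]({url}): {desc}")
    return "\n".join(lines) + "\n"
```

The point of the format is that an LLM (or RAG pipeline) can fetch one small, predictable file and decide which full documents are worth pulling into context.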
The open source, no-code MCP Server for AI-Native API Access
Analyze your Claude Code context window, detect wasted tokens, and get pasteable fix commands. Zero API calls.