Void is a local-first AI orchestration harness for turning model output into real, traceable work. It exposes OpenAI-compatible chat, then routes requests into deterministic tool execution, multimodal generation services, research, RAG, code loops, quality review, artifact manifests, and distillation-ready traces.
The project started as a "local LLM wrapper," but the current architecture is closer to an execution operating layer for AI systems: one API surface in front of many local models, media services, and verification loops.
Void exists to answer a practical question: what would it take for local and self-hosted AI to do useful multimodal work without becoming a pile of brittle scripts?
The answer in this repo is:
- Keep a familiar OpenAI-compatible entrypoint.
- Parse model/tool traffic through strict envelopes instead of ad hoc strings.
- Route semantically to tools for images, video, audio, research, RAG, code, datasets, and artifact handling.
- Stamp jobs with seeds, manifests, hashes, traces, and quality decisions.
- Preserve enough evidence to rerun, audit, improve, or distill the work later.
- Keep heavy model services modular so they can be run locally, swapped, or scaled independently.
- OpenAI-compatible chat endpoint at
/v1/chat/completions. - Direct tool execution through
/tool.run. - Long-running job lifecycle for generation pipelines.
- Deterministic seed routing and envelope normalization.
- JSON-hardened model outputs and tool envelopes.
- RAG hygiene, local search, research collection, evidence binding, and report assembly.
- Code super-loop utilities for indexing, editing, patching, and iterative local coding workflows.
- Multimodal services for image generation/edit/upscale, video/Film-2, TTS, music, voice conversion, vocals/stems, OCR, VLM, audio analysis, and media repair style workflows.
- Artifact indexing, manifests, shardable ledgers, checkpoints, and trace records for later review or training.
- Quality systems for media scores, review committees, locks, post-run review, segment QA, style packs, and refinement decisions.
Client or app
OpenAI-compatible chat, direct tool.run, admin endpoints, job endpoints
|
v
Orchestrator
planner, semantic tool selection, JSON parser, envelopes, deterministic
seeds, RAG/search, research, code loop, artifact and trace writers
|
v
Tool and media services
ComfyUI, Film-2, image tools, music, TTS, RVC, Demucs, Whisper, OCR,
VLM, upscale/interpolation, local model backends, SearXNG, pgvector
|
v
Artifacts and evidence
manifests, traces, checkpoints, logs, dataset exports, quality reviews,
hashes, seeds, parameters, selected outputs
orchestrator/app/
main.py API, orchestration, tool dispatch, job entrypoints
tools_* Image, music, TTS, and other tool implementations
film2/ Film/video planning and runtime helpers
research/ Search, collection, graphing, judgment, reports
rag/ Retrieval and hygiene
code_loop/ Local code agent loop utilities
artifacts/ Manifests, sharding, artifact index, distillation
quality/ Metrics, review, refinement, selection, locks
state/ Checkpoints, IDs, resume state
tracing/ Runtime, teacher, and training traces
services/
comfyui, music, xtts, rvc, demucs, whisper, ocr, vlm, faceid,
realesrgan_upscale, rife_vfi, hunyuan_video, chatui, searxng
void_envelopes/ Assistant/tool envelope normalization and versioning
void_artifacts/ Artifact schema helpers
void_json/ Hardened JSON parser
configs/ Runtime and quality configuration examples
db/ Raw SQL migrations and DB helpers
docker-compose.yml Full local service stack
Create a real .env from the example and fill in local values. Do not commit
secrets, tokens, or private model credentials.
cp env.example.txt .env
docker compose --env-file .env up -d --buildOn Windows PowerShell:
copy env.example.txt .env
docker compose --env-file .env up -d --buildHealth check:
curl http://localhost:8000/healthzCall the OpenAI-compatible endpoint:
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d "{\"model\":\"local\",\"messages\":[{\"role\":\"user\",\"content\":\"Create a concise research brief about local multimodal AI.\"}]}"Direct tool calls go through:
curl http://localhost:8000/tool.run \
-H "Content-Type: application/json" \
-d "{\"name\":\"capabilities.list\",\"arguments\":{}}"Live capabilities:
curl http://localhost:8000/capabilities.jsonVoid is not just a chat UI and not just a media generator. It is a harness that tries to make local AI work reproducible:
- A model decides or plans.
- The orchestrator normalizes the request.
- A tool runs with deterministic IDs/seeds where possible.
- The output becomes an artifact with a manifest.
- Quality and review systems can score or refine it.
- Traces can later become training or distillation data.
That loop is the valuable part. The attached services are replaceable.
Use /v1/chat/completions from any client that can speak OpenAI-style chat.
The orchestrator can answer directly or call tools when the request requires
real work.
Use /tool.run when you know exactly which tool should execute. This is useful
for tests, UI buttons, and automation.
Image, video, music, and voice flows are routed through the orchestrator into the relevant service containers. Long-running work should use the job endpoints:
POST /jobsGET /jobs/{id}GET /jobs/{id}/streamPOST /jobs/{id}/cancel
Research modules collect sources, normalize evidence, build timelines/reports, and bind citations into the final artifact. RAG modules focus on hygiene: deduplication, TTL, newest-first selection, and evidence-aware context.
The code loop utilities support indexing, filesystem views, patch generation, and iterative local coding work. Treat this as an agent harness component, not a replacement for review.
Void is an active experimental harness. Some services require large local models, GPUs, external assets, or host-specific paths. The repo is best read as an orchestration architecture and a set of working service adapters, not as a single small package install.
- Keep secrets in environment files or a vault, never in committed source.
- Prefer local/private model routes when privacy matters.
- Treat manifests, traces, and quality decisions as part of the output.
- Keep services modular: the orchestrator contract should survive model and backend swaps.
Proprietary / internal use unless otherwise stated by the repository owner.