Skip to content

autonomouscereal/Void-Local-LLM-Wrapper

Repository files navigation

Void Local LLM Wrapper

Void is a local-first AI orchestration harness for turning model output into real, traceable work. It exposes OpenAI-compatible chat, then routes requests into deterministic tool execution, multimodal generation services, research, RAG, code loops, quality review, artifact manifests, and distillation-ready traces.

The project started as a "local LLM wrapper," but the current architecture is closer to an execution operating layer for AI systems: one API surface in front of many local models, media services, and verification loops.

Mission

Void exists to answer a practical question: what would it take for local and self-hosted AI to do useful multimodal work without becoming a pile of brittle scripts?

The answer in this repo is:

  • Keep a familiar OpenAI-compatible entrypoint.
  • Parse model/tool traffic through strict envelopes instead of ad hoc strings.
  • Route semantically to tools for images, video, audio, research, RAG, code, datasets, and artifact handling.
  • Stamp jobs with seeds, manifests, hashes, traces, and quality decisions.
  • Preserve enough evidence to rerun, audit, improve, or distill the work later.
  • Keep heavy model services modular so they can be run locally, swapped, or scaled independently.

What It Does

  • OpenAI-compatible chat endpoint at /v1/chat/completions.
  • Direct tool execution through /tool.run.
  • Long-running job lifecycle for generation pipelines.
  • Deterministic seed routing and envelope normalization.
  • JSON-hardened model outputs and tool envelopes.
  • RAG hygiene, local search, research collection, evidence binding, and report assembly.
  • Code super-loop utilities for indexing, editing, patching, and iterative local coding workflows.
  • Multimodal services for image generation/edit/upscale, video/Film-2, TTS, music, voice conversion, vocals/stems, OCR, VLM, audio analysis, and media repair style workflows.
  • Artifact indexing, manifests, shardable ledgers, checkpoints, and trace records for later review or training.
  • Quality systems for media scores, review committees, locks, post-run review, segment QA, style packs, and refinement decisions.

Architecture At A Glance

Client or app
  OpenAI-compatible chat, direct tool.run, admin endpoints, job endpoints
        |
        v
Orchestrator
  planner, semantic tool selection, JSON parser, envelopes, deterministic
  seeds, RAG/search, research, code loop, artifact and trace writers
        |
        v
Tool and media services
  ComfyUI, Film-2, image tools, music, TTS, RVC, Demucs, Whisper, OCR,
  VLM, upscale/interpolation, local model backends, SearXNG, pgvector
        |
        v
Artifacts and evidence
  manifests, traces, checkpoints, logs, dataset exports, quality reviews,
  hashes, seeds, parameters, selected outputs

Repository Map

orchestrator/app/
  main.py              API, orchestration, tool dispatch, job entrypoints
  tools_*              Image, music, TTS, and other tool implementations
  film2/               Film/video planning and runtime helpers
  research/            Search, collection, graphing, judgment, reports
  rag/                 Retrieval and hygiene
  code_loop/           Local code agent loop utilities
  artifacts/           Manifests, sharding, artifact index, distillation
  quality/             Metrics, review, refinement, selection, locks
  state/               Checkpoints, IDs, resume state
  tracing/             Runtime, teacher, and training traces
services/
  comfyui, music, xtts, rvc, demucs, whisper, ocr, vlm, faceid,
  realesrgan_upscale, rife_vfi, hunyuan_video, chatui, searxng
void_envelopes/        Assistant/tool envelope normalization and versioning
void_artifacts/        Artifact schema helpers
void_json/             Hardened JSON parser
configs/               Runtime and quality configuration examples
db/                    Raw SQL migrations and DB helpers
docker-compose.yml     Full local service stack

Quick Start

Create a real .env from the example and fill in local values. Do not commit secrets, tokens, or private model credentials.

cp env.example.txt .env
docker compose --env-file .env up -d --build

On Windows PowerShell:

copy env.example.txt .env
docker compose --env-file .env up -d --build

Health check:

curl http://localhost:8000/healthz

Call the OpenAI-compatible endpoint:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "{\"model\":\"local\",\"messages\":[{\"role\":\"user\",\"content\":\"Create a concise research brief about local multimodal AI.\"}]}"

Direct tool calls go through:

curl http://localhost:8000/tool.run \
  -H "Content-Type: application/json" \
  -d "{\"name\":\"capabilities.list\",\"arguments\":{}}"

Live capabilities:

curl http://localhost:8000/capabilities.json

How To Think About Void

Void is not just a chat UI and not just a media generator. It is a harness that tries to make local AI work reproducible:

  • A model decides or plans.
  • The orchestrator normalizes the request.
  • A tool runs with deterministic IDs/seeds where possible.
  • The output becomes an artifact with a manifest.
  • Quality and review systems can score or refine it.
  • Traces can later become training or distillation data.

That loop is the valuable part. The attached services are replaceable.

Common Workflows

Chat Through The Orchestrator

Use /v1/chat/completions from any client that can speak OpenAI-style chat. The orchestrator can answer directly or call tools when the request requires real work.

Run A Tool Directly

Use /tool.run when you know exactly which tool should execute. This is useful for tests, UI buttons, and automation.

Generate Media

Image, video, music, and voice flows are routed through the orchestrator into the relevant service containers. Long-running work should use the job endpoints:

  • POST /jobs
  • GET /jobs/{id}
  • GET /jobs/{id}/stream
  • POST /jobs/{id}/cancel

Research And RAG

Research modules collect sources, normalize evidence, build timelines/reports, and bind citations into the final artifact. RAG modules focus on hygiene: deduplication, TTL, newest-first selection, and evidence-aware context.

Code Loop

The code loop utilities support indexing, filesystem views, patch generation, and iterative local coding work. Treat this as an agent harness component, not a replacement for review.

Documentation

Status

Void is an active experimental harness. Some services require large local models, GPUs, external assets, or host-specific paths. The repo is best read as an orchestration architecture and a set of working service adapters, not as a single small package install.

Operational Notes

  • Keep secrets in environment files or a vault, never in committed source.
  • Prefer local/private model routes when privacy matters.
  • Treat manifests, traces, and quality decisions as part of the output.
  • Keep services modular: the orchestrator contract should survive model and backend swaps.

License

Proprietary / internal use unless otherwise stated by the repository owner.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors