
Oxide Project Info

1. Project Context

Oxide is a Rust-based automation kernel that receives text commands from messaging channels (Telegram, Discord, CLI), routes each request to a Lua skill, and replies on the same channel.

The project is designed around three priorities:

  • secure execution of user-defined automations,
  • low operational cost (token-efficient AI usage),
  • channel-agnostic behavior (same skill logic across adapters).

In practice, Oxide behaves like an orchestration runtime for sandboxed Lua skills with optional AI assistance.

2. Scope (What the project currently does)

In scope

  • Multi-adapter inbound messaging:
    • Telegram adapter (allowlist by admin user IDs)
    • Discord adapter (allowlist by user IDs)
    • Local interactive CLI adapter
  • Semantic intent routing to skills using embeddings (all-MiniLM-L6-v2 via fastembed).
  • Two-stage parameter extraction:
    • Stage 1: local deterministic extraction (try_local_extract in Lua skill)
    • Stage 2: AI JSON extraction fallback only when local extraction fails
  • Skill execution in a hardened Lua 5.4 sandbox (mlua).
  • Asynchronous job scheduling and automation:
    • ad-hoc queued jobs,
    • cron-based automations,
    • chained jobs returned by skills (schedule_job).
  • SQLite persistence for queueing, automations, key-value skill state, and embedding cache.
  • Management commands from chat/CLI:
    • /enqueue, /jobs, /kill.
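The two-stage extraction strategy above can be sketched in Rust. This is a minimal, hypothetical illustration: `try_local_extract` and `ai_extract_json` are toy stand-ins, not the project's real functions (the actual Stage 1 lives inside each Lua skill).

```rust
/// Hypothetical sketch: Stage 1 is cheap and deterministic;
/// Stage 2 (AI) runs only when Stage 1 fails and AI is enabled.
fn extract_params(message: &str, ai_enabled: bool) -> Option<String> {
    // Stage 1: local deterministic extraction (mirrors try_local_extract).
    if let Some(params) = try_local_extract(message) {
        return Some(params);
    }
    // Stage 2: AI JSON extraction fallback.
    if ai_enabled {
        return ai_extract_json(message);
    }
    None
}

// Toy stand-in: treat "remind <text>" as locally extractable.
fn try_local_extract(message: &str) -> Option<String> {
    message
        .strip_prefix("remind ")
        .map(|rest| format!("{{\"text\":\"{}\"}}", rest))
}

// Toy stand-in for the AI fallback (a real call would go through the AiProvider port).
fn ai_extract_json(_message: &str) -> Option<String> {
    Some("{\"text\":\"(ai-extracted)\"}".to_string())
}

fn main() {
    assert_eq!(
        extract_params("remind buy milk", false),
        Some("{\"text\":\"buy milk\"}".to_string())
    );
    // Local extraction fails and AI is disabled -> graceful None.
    assert_eq!(extract_params("what's the weather", false), None);
}
```

The ordering is the whole point: the deterministic path costs no tokens, so the LLM is consulted only on the residue.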

Out of scope (current state)

  • No web dashboard or GUI management panel.
  • No distributed/multi-node scheduler (single-process runtime).
  • No built-in RBAC beyond adapter-level allowlists.
  • No formal plugin marketplace or remote skill registry.

3. Architectural Style

Oxide follows a Ports-and-Adapters (Hexagonal) architecture.

  • Core domain is isolated in src/core.
  • External integrations live in src/adapters and src/network.
  • Stable contracts are traits in src/ports.

Main layers

  • Driving adapters (input/output edges): Telegram, Discord, CLI.
  • Core orchestration: event handling, routing, extraction strategy, execution dispatch.
  • Runtime and execution isolation: Lua sandbox bridge.
  • Async scheduling subsystem: producer + worker pool for delayed/periodic jobs.
  • Infrastructure: SQLite persistence and HTTP AI client.

4. Key Components and Responsibilities

main startup composition

  • Loads settings (Settings.toml + env overrides where supported).
  • Builds AI provider (enabled or disabled adapter).
  • Initializes SQLite and migrations.
  • Creates Orchestrator and starts scheduler.
  • Registers all configured channel adapters and starts adapter loops.

Orchestrator (core brain)

Responsibilities:

  • receives InboundEvent from adapters,
  • resolves management commands,
  • computes embeddings and selects the best skill by cosine similarity,
  • if similarity is below threshold and AI is enabled, attempts AI-based route selection among top candidates,
  • runs local extraction first,
  • if needed, calls AI for structured JSON extraction,
  • executes skill in sandbox,
  • returns answer to originating adapter,
  • enqueues follow-up scheduled jobs if returned by skill output.
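The embedding-based routing step can be sketched as follows. This is an illustrative, std-only version: `route` and the two-skill fixture are hypothetical, and `similarity_threshold` mirrors the config key of the same name.

```rust
// Hypothetical sketch of embedding-based skill routing: rank skills by
// cosine similarity against the message embedding, fall back below threshold.

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Pick the best skill, or None when confidence is below the threshold
/// (in which case the orchestrator may ask the AI router to choose).
fn route<'a>(
    message_emb: &[f32],
    skills: &'a [(&'a str, Vec<f32>)],
    similarity_threshold: f32,
) -> Option<&'a str> {
    skills
        .iter()
        .map(|(name, emb)| (*name, cosine_similarity(message_emb, emb)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .filter(|(_, score)| *score >= similarity_threshold)
        .map(|(name, _)| name)
}

fn main() {
    let skills = vec![
        ("weather", vec![1.0, 0.0]),
        ("reminder", vec![0.0, 1.0]),
    ];
    assert_eq!(route(&[0.9, 0.1], &skills, 0.7), Some("weather"));
    assert_eq!(route(&[0.5, 0.5], &skills, 0.95), None); // low confidence
}
```

Returning `None` rather than the weak best match is what makes the AI router a strictly optional fallback.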

Lua runtime / bridge

  • Loads skills from skills/*.lua.
  • Exposes execution entrypoints (execute, on_schedule).
  • Enforces sandbox constraints and operational guards.
  • Injects execution context (adapter/platform/user metadata) into params.

Scheduler + worker pool

  • Scheduler producer scans due tasks and cron automations.
  • Worker pool executes scheduled jobs on blocking threads, so the async runtime remains responsive.
  • Handles retries, task state transitions, backpressure rescheduling, and cancellation (/kill).
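The producer/worker split can be sketched with a channel and a blocking thread. This is a simplified, hypothetical model (`Task` and `claim_due` are illustrative; the real scheduler claims rows from `scheduled_tasks` in SQLite and runs a pool rather than a single worker).

```rust
// Hypothetical sketch of the producer + worker-pool pattern: the producer
// claims due tasks and hands them to a blocking worker over a channel.

use std::sync::mpsc;
use std::thread;

#[derive(Debug)]
struct Task { id: u64, skill: String, run_at: u64 }

/// Claim tasks whose `run_at` has passed; the rest stay queued.
fn claim_due(queue: &mut Vec<Task>, now: u64) -> Vec<Task> {
    let (due, pending): (Vec<_>, Vec<_>) =
        queue.drain(..).partition(|t| t.run_at <= now);
    *queue = pending;
    due
}

fn main() {
    let mut queue = vec![
        Task { id: 1, skill: "backup".into(), run_at: 10 },
        Task { id: 2, skill: "report".into(), run_at: 99 },
    ];

    let (tx, rx) = mpsc::channel::<Task>();
    // A single blocking worker; the real pool would spawn several.
    let worker = thread::spawn(move || {
        rx.into_iter().map(|t| t.id).collect::<Vec<_>>()
    });

    for task in claim_due(&mut queue, 50) {
        tx.send(task).unwrap();
    }
    drop(tx); // close the channel so the worker loop ends

    assert_eq!(worker.join().unwrap(), vec![1]); // only task 1 was due
    assert_eq!(queue.len(), 1); // task 2 remains pending
}
```

Keeping execution on dedicated threads is what lets long-running Lua skills block without stalling adapter event loops.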

Ports

  • AiProvider: chat_with_system, extract_json_for_skill.
  • MessagingProvider: send_text, send_image.

This keeps core logic independent of concrete AI or chat platform implementations.
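As a sketch, the two ports might look like the traits below. The method names mirror the list above, but the signatures are assumptions: the real traits in src/ports are likely async and use richer error types.

```rust
// Hypothetical trait sketch of the two ports; real signatures may differ.

trait AiProvider {
    fn chat_with_system(&self, system: &str, user: &str) -> Result<String, String>;
    fn extract_json_for_skill(&self, skill_schema: &str, message: &str) -> Result<String, String>;
}

trait MessagingProvider {
    fn send_text(&self, chat_id: &str, text: &str) -> Result<(), String>;
    fn send_image(&self, chat_id: &str, image_url: &str) -> Result<(), String>;
}

/// A "disabled" AI adapter: every call degrades gracefully instead of panicking,
/// matching the ai_enabled = false behavior described later in this document.
struct DisabledAi;

impl AiProvider for DisabledAi {
    fn chat_with_system(&self, _system: &str, _user: &str) -> Result<String, String> {
        Err("ai_disabled".into())
    }
    fn extract_json_for_skill(&self, _schema: &str, _msg: &str) -> Result<String, String> {
        Err("ai_disabled".into())
    }
}

fn main() {
    // The orchestrator only sees the trait object, never the concrete adapter.
    let ai: Box<dyn AiProvider> = Box::new(DisabledAi);
    assert_eq!(ai.chat_with_system("sys", "hi"), Err("ai_disabled".to_string()));
}
```

Because the orchestrator depends only on the trait, swapping Telegram for Discord, or a live LLM for the disabled stub, requires no core changes.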

5. Runtime Flows

A) Real-time inbound message flow

  1. Adapter normalizes platform payload into InboundEvent.
  2. Orchestrator checks management commands.
  3. Orchestrator computes message embedding and ranks skills.
  4. If confidence is adequate, the selected skill is used; otherwise the AI router may choose a skill (when enabled).
  5. try_local_extract runs first (cheap deterministic path).
  6. If local extraction fails, AI extracts JSON parameters guided by the skill schema.
  7. Skill executes in Lua sandbox.
  8. Orchestrator sends resulting answer through original adapter.
  9. Optional: if result contains schedule_job, task is inserted into scheduler queue.

B) Scheduled/automation flow

  1. Producer checks cron automations and pending scheduled tasks.
  2. Due tasks are claimed and sent to worker queue.
  3. Worker executes on_schedule in sandbox.
  4. Result may enqueue additional tasks.
  5. If the execution context includes an adapter and a platform ID, the answer can be forwarded to the user/channel.

6. Data and Persistence Model (SQLite)

Main tables:

  • automations: cron-based recurring skill triggers.
  • job_queue: legacy compatibility queue/state.
  • scheduled_tasks: active delayed/priority execution queue.
  • skill_kv: per-skill key-value storage.
  • embedding_cache: hash/model keyed vector cache.

Operational notes:

  • SQLite WAL mode is enabled.
  • Foreign keys are enabled.
  • Indexed queries support status/ready-task lookups.
  • Legacy records are normalized/migrated into current scheduling model.
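The embedding_cache idea (vectors keyed by text hash and model name) can be illustrated with an in-memory stand-in. This is only a sketch of the keying scheme; the real cache is a SQLite table, and `EmbeddingCache` here is a hypothetical type.

```rust
// Hypothetical in-memory stand-in for the embedding_cache table:
// vectors keyed by (text hash, model name) so a text is embedded
// at most once per model.

use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

struct EmbeddingCache {
    inner: HashMap<(u64, String), Vec<f32>>,
}

impl EmbeddingCache {
    fn new() -> Self { Self { inner: HashMap::new() } }

    fn key(text: &str) -> u64 {
        let mut h = DefaultHasher::new();
        text.hash(&mut h);
        h.finish()
    }

    /// Return the cached vector, or compute and store it.
    fn get_or_insert_with<F: FnOnce() -> Vec<f32>>(
        &mut self, text: &str, model: &str, embed: F,
    ) -> &Vec<f32> {
        self.inner
            .entry((Self::key(text), model.to_string()))
            .or_insert_with(embed)
    }
}

fn main() {
    let mut cache = EmbeddingCache::new();
    let mut calls = 0;
    for _ in 0..3 {
        cache.get_or_insert_with("hello", "all-MiniLM-L6-v2", || {
            calls += 1;
            vec![0.1, 0.2]
        });
    }
    assert_eq!(calls, 1); // embedded once, served from cache afterwards
}
```

Keying on the model name as well as the hash matters: cached vectors are only valid for the model that produced them.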

7. AI Usage Strategy

AI is optional and explicitly gateable (ai_enabled).

When enabled, AI is used in constrained places:

  • general chat fallback when no skill path is suitable,
  • skill parameter extraction fallback,
  • low-confidence skill routing fallback.

When disabled:

  • deterministic/local skill paths continue to work,
  • AI-only operations fail gracefully with fallback responses.

This design keeps average token usage low by prioritizing local extraction and semantic routing before LLM calls.

8. Security and Safety Posture

Security controls implemented in runtime and adapters include:

  • Lua sandboxing with reduced standard library surface.
  • Runtime guards for instruction/time/resource boundaries during skill execution.
  • Adapter allowlists (Telegram admin IDs, Discord allowed users).
  • Skill path validation to prevent traversal when enqueueing scheduled jobs.
  • Safe handling of scheduler cancellation and queue backpressure.
  • Hardened outbound HTTP behavior from the Lua runtime, including payload size and time limits.

9. Configuration Surface

Primary config file: Settings.toml.

Key settings:

  • AI endpoint/model/key (litellm block),
  • ai_enabled,
  • similarity_threshold,
  • static_fallback_msg,
  • channel definitions ([[channels]] with telegram or discord).

Environment variable overrides are supported for channel tokens and user allowlists at runtime startup.
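Putting the keys above together, a Settings.toml might look roughly like this. All values, field names, and nesting details are illustrative assumptions; only the top-level keys (`ai_enabled`, `similarity_threshold`, `static_fallback_msg`, the `litellm` block, and `[[channels]]`) come from this document.

```toml
# Hypothetical Settings.toml sketch; exact schema may differ in the real project.
ai_enabled = true
similarity_threshold = 0.75
static_fallback_msg = "Sorry, I could not handle that."

[litellm]
endpoint = "http://localhost:4000/v1"
model = "gpt-4o-mini"
api_key = "sk-..."

[[channels]]
type = "telegram"
token = "TELEGRAM_BOT_TOKEN"
admin_ids = [123456789]

[[channels]]
type = "discord"
token = "DISCORD_BOT_TOKEN"
allowed_users = [987654321]
```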

10. Technology Stack

  • Language/runtime: Rust + Tokio.
  • Skill runtime: Lua 5.4 via mlua.
  • Embeddings: fastembed (AllMiniLML6V2).
  • Persistence: SQLite via sqlx.
  • Messaging: teloxide (Telegram), serenity (Discord).
  • AI connectivity: OpenAI-compatible chat endpoint (often via LiteLLM).
  • Concurrency model:
    • async IO/event loops in Tokio,
    • blocking skill execution and queue processing via crossbeam/thread workers.

11. Current Boundaries and Assumptions

  • Deployment target is a single process with local SQLite.
  • Skills are trusted project assets but executed under sandbox constraints.
  • Reliability focus is graceful degradation (fallback message, retries, queue reschedule) rather than strict exactly-once distributed semantics.
  • Existing observability is log-centric (tracing), not metrics/dashboard-centric.

12. Practical Summary for an External LLM

If you need to reason about this project quickly:

  • Think of Oxide as a secure automation runtime, not a generic chatbot.
  • The orchestration core is Rust; business automations are Lua skills.
  • AI is a fallback/extractor tool, not the primary execution path.
  • Scheduler and queueing are first-class features for delayed and periodic workflows.
  • Hexagonal architecture keeps adapters and providers replaceable without changing core logic.