A Windows-native automation service for LLM agents. .NET 9, sub-ms IPC, named-pipe protocol. Full perception stack: UIA → OCR → SendInput with divergence detection and crash recovery.
from reliable_bridge import send

# Open an app
send("open", {"path": "notepad.exe"})

# Type text
send("type", {"text": "Hello from OpenGUI"})

# Click an element by text
send("click-text", {"text": "Save"})

# Detect dialogs
send("detect-modals")
# → {"modal_detected": true, "modals": [{"title": "Save As", ...}]}

# Full picture of the desktop
send("get-actionable")
# → ranked list of interactable elements with bounds, types, confidence
Or talk over the named pipe directly:
// Send over \\.\pipe\OpenGUI.Service
{"action": "status", "args": {}}
// Response
{"ok": true, "data": {"version": "2.9.3", "screen": [1920, 1200], "platform": "Microsoft Windows NT 10.0.26200.0"}}
┌──────────────────────────────┐
│ Agent Loop (agent_loop.py)   │  Observe → Plan → Execute → Verify → Adapt
│ ─────────────────────────    │  One step at a time. Retries on failure.
│ • Loads goal.json            │  Shell action for native commands.
│ • Walks plan steps           │  Reconnects on service death (30s retry).
│ • Calls OpenGUI via pipe     │
└──────────┬───────────────────┘
           │ named pipe: \\.\pipe\OpenGUI.Service
           ▼
┌──────────────────────────────┐
│ OpenGUI Service              │  Single-instance daemon.
│ ─────────────────────────    │  Handles ~20 commands.
│ • SendInput keyboard/mouse   │  Divergence detection.
│ • UIA tree traversal         │  Self-cleaning on kill.
│ • Modal detection            │
│ • Runtime journal            │
└──────────────────────────────┘
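The loop in the diagram (one step at a time, retry on failure, reconnect after service death) can be sketched roughly as follows. This is not the code in `agent_loop.py`: the step format (`action` plus `args`), the `send` callable returning a dict with an `ok` key, the use of `ConnectionError` to signal a dead service, and all parameter names are assumptions chosen for illustration. Only the 30-second reconnect window is taken from the diagram.

```python
import time


def run_plan(steps, send, max_retries=2, reconnect_window=30.0, sleep=time.sleep):
    """Execute plan steps in order, retrying failures and waiting out restarts.

    steps: list of {"action": str, "args": dict} (hypothetical plan format).
    send:  callable(action, args) -> dict with an "ok" key (hypothetical).
    """
    results = []
    for step in steps:
        failures = 0
        deadline = None  # set when the service first appears to be down
        while True:
            try:
                reply = send(step["action"], step.get("args", {}))
            except ConnectionError:
                # Service unreachable: keep trying until the window closes.
                now = time.monotonic()
                if deadline is None:
                    deadline = now + reconnect_window
                if now >= deadline:
                    raise
                sleep(1.0)
                continue
            deadline = None  # the service answered, so reset the window
            if reply.get("ok"):
                results.append(reply)
                break
            failures += 1
            if failures > max_retries:
                raise RuntimeError(f"step failed after retries: {step}")
    return results
```

Passing `send` and `sleep` in as parameters keeps the loop testable without a running service.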
Why OpenGUI
LLMs need to interact with desktop applications — not just browsers, but Notepad, Settings, Word, Excel, file dialogs, installers. OpenGUI provides a unified layer:
Semantic — find elements by text, not coordinates
Reliable — UIA tree primary, coordinates fallback
Crash-safe — reconnects automatically after service restart
Fast — ~3ms for hotkey dispatch, ~11ms for modal scan
LLM-optimized — structured JSON with confidence, diff, divergence metadata
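To illustrate how text-based lookup and confidence metadata might combine on the agent side, here is a small sketch that picks a click target from a list like the one `get-actionable` returns. The element field names (`text`, `bounds` as `[left, top, width, height]`, `confidence`) and the helper `pick_target` are hypothetical; the README only states that elements carry bounds, types, and confidence, so adjust the keys to the real payload.

```python
def pick_target(elements, text, min_confidence=0.5):
    """Return the (x, y) center of the best element matching `text`, or None.

    Assumes each element is a dict with hypothetical keys:
      "text" (str), "bounds" ([left, top, width, height]), "confidence" (float).
    """
    wanted = text.casefold()
    matches = [
        e for e in elements
        if wanted in e.get("text", "").casefold()
        and e.get("confidence", 0.0) >= min_confidence
    ]
    if not matches:
        return None
    # Prefer the match the service is most confident about.
    best = max(matches, key=lambda e: e.get("confidence", 0.0))
    left, top, width, height = best["bounds"]
    return (left + width // 2, top + height // 2)
```

A caller could try a text-based click first and use the returned point only as a coordinate fallback, mirroring the "UIA tree primary, coordinates fallback" ordering in the list above.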