ExtensityAI/singularity

Singularity

Singularity is a macOS SwiftUI overlay app that lets you capture screenshots, record audio, paste text snippets, import files, and quickly generate consolidated answers with an LLM. It is designed for a hotkey-only workflow: capture, assemble context, generate, and paste the result without leaving your current app.

The app emphasizes:

  • Fast, hotkey-driven capture and generation
  • Profile-based prompts and per-profile settings (model, reasoning, verbosity, TTS, web search)
  • Reliable context assembly (images, audio transcripts, text snippets, imported files)
  • Observability: JSONL transcripts, TTS caching, request IDs, and reasoning metadata

Shortcuts (default)

  • ⌘H: Full-screen screenshot
  • ⇧⌘H: Area selection screenshot
  • ⌘J: Mic recording
  • ⌘U: System audio recording
  • ⇧⌘J: Mic + System audio (joint track)
  • ⇧⌘U: Mic + System audio (separate tracks)
  • ⇧⌘C: Add text snippet from clipboard (or selection)
  • ⌘M: Add files (images, audio, text/code, PDF/docx)
  • ⌘↩: Generate (or submit manual input)
  • ⌘B: Show/Hide overlay
  • ⌘G: New session (reset context)

Features

  • Profiles

    • Named prompt profiles with per-profile settings: model, reasoning effort, verbosity, TTS model/voice/auto-generate/auto-play, and web search settings (allowed domains, user location, timezone).
    • Profiles persist under Application Support and are applied consistently to generation.
  • Context Assembly

    • Screenshots and imported images are included as attachments for the LLM.
    • Recorded or imported audio is transcribed and appended as text to user messages.
    • Text snippets come from the clipboard or from imported text, code, PDF, and docx files (converted to text where possible) and are included with file headers/paths.
    • All of the above are packed into an indexed summary that the LLM can reference.
  • LLM Integration

    • Primary: OpenAI Responses API with reasoning summary capture and encrypted reasoning storage when supported.
    • Fallback: Chat Completions for multimodal and web-search models when needed.
    • Web Search: per-profile enable + allowed domains + approximate user location + timezone. Sources are logged to JSONL.
    • Output control: max_output_tokens enforced in streaming and non-streaming paths.
  • TTS

    • OpenAI TTS with configurable model + voice; optional auto-generate and auto-play after each answer.
    • Audio is cached under Application Support/Singularity/TTS, with playback controls (rate toggle, skip, resume).
  • Observability

    • JSONL logs under Application Support/Singularity/Conversations with messages, assets, reasoning summary, encrypted reasoning, and web sources.
    • Plaintext request/response logs per run under Application Support/Singularity/Logs.
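
The "indexed summary" mentioned under Context Assembly can be pictured with a minimal sketch. The `ContextItem` type, its cases, and the output format below are assumptions made for illustration; they are not taken from the app's source.

```swift
import Foundation

// Hypothetical model of one captured context item (illustrative names only).
enum ContextItem {
    case image(name: String)
    case audioTranscript(source: String, text: String)
    case textSnippet(path: String?, text: String)
}

// Builds a numbered, one-line-per-item summary so that a model can refer
// to each piece of context by its index.
func indexedSummary(of items: [ContextItem]) -> String {
    var lines: [String] = []
    for (offset, item) in items.enumerated() {
        let index = offset + 1
        switch item {
        case .image(let name):
            lines.append("[\(index)] image: \(name)")
        case .audioTranscript(let source, let text):
            lines.append("[\(index)] audio transcript (\(source)): \(text)")
        case .textSnippet(let path, let text):
            // Snippets without a file path are assumed to come from the clipboard.
            let header = path ?? "clipboard"
            lines.append("[\(index)] text (\(header)): \(text)")
        }
    }
    return lines.joined(separator: "\n")
}
```

Numbering items in capture order keeps the summary stable across a session, so a follow-up prompt can say "see item 2" without ambiguity.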

How It Works (High-Level)

  • AppModel orchestrates capture, context assembly, profile selection, and LLM calls.
  • LLMService builds typed content for the OpenAI Responses API, handling images/audio transcripts and reasoning metadata.
  • ConfigService resolves configuration from (priority order): Keychain → bundled Config.plist → Application Support Config.plist → environment variables.
  • Profiles persist to Application Support/Singularity/Profiles.json (selected + items), backfilled with defaults if new fields are added.
  • Web Search settings are respected across Responses and Chat fallbacks (allowed domains, location, timezone), with model gating for unsupported combos.
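
Backfilling defaults when new profile fields are added could be done with `Codable` and `decodeIfPresent`, as in the sketch below. The `Profile` and `ProfileStore` shapes and their field names are hypothetical; only the "selected + items" layout and the gpt-5 default come from this README.

```swift
import Foundation

// Hypothetical profile shape; field names are assumptions for illustration.
struct Profile: Codable, Equatable {
    var name: String
    var model: String
    var webSearchEnabled: Bool

    enum CodingKeys: String, CodingKey {
        case name, model, webSearchEnabled
    }

    init(name: String, model: String = "gpt-5", webSearchEnabled: Bool = false) {
        self.name = name
        self.model = model
        self.webSearchEnabled = webSearchEnabled
    }

    // Fields missing from an older Profiles.json fall back to defaults
    // instead of failing the whole decode.
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        name = try container.decode(String.self, forKey: .name)
        model = try container.decodeIfPresent(String.self, forKey: .model) ?? "gpt-5"
        webSearchEnabled = try container.decodeIfPresent(Bool.self, forKey: .webSearchEnabled) ?? false
    }
}

// Mirrors the "selected + items" layout described above.
struct ProfileStore: Codable {
    var selected: String
    var items: [Profile]
}
```

With this approach an older file that only contains `name` still loads, and re-saving it writes the new fields with their default values.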

Build & Run

Prerequisites

  • Xcode 15.2+ on macOS.
  • An OpenAI API key.

Open in Xcode and run

  • xed . or open singularity.xcodeproj
  • Select the singularity scheme and press ⌘R (choose My Mac destination).

Command line (build/test)

  • Build: xcodebuild -scheme singularity -destination 'platform=macOS' build
  • Test: xcodebuild -scheme singularity -destination 'platform=macOS' test

Configuration

API Key

  • The app reads OPENAI_API_KEY from:
    1. macOS Keychain: account OpenAIAPIKey, service Singularity
    2. bundled Config.plist
    3. ~/Library/Application Support/Singularity/Config.plist
    4. environment variable OPENAI_API_KEY
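
The lookup order above amounts to a "first source that has a value wins" chain. The sketch below models each source as a closure so it stays self-contained; `ConfigSource`, `ConfigResolver`, and `dictionarySource` are hypothetical names, and treating empty strings as missing is an assumption, not documented behavior.

```swift
import Foundation

// A source maps a key to an optional value. In the app these would be backed by
// Keychain, the two Config.plist files, and the process environment.
typealias ConfigSource = (String) -> String?

// Helper for the sketch: a source backed by an in-memory dictionary.
func dictionarySource(_ values: [String: String]) -> ConfigSource {
    return { key in values[key] }
}

struct ConfigResolver {
    // Ordered from highest to lowest priority.
    let sources: [ConfigSource]

    // Returns the value from the first source that provides a non-empty one.
    func value(for key: String) -> String? {
        for source in sources {
            if let value = source(key), !value.isEmpty {
                return value
            }
        }
        return nil
    }
}

// Example wiring in the documented order; the first three are stand-ins.
let resolver = ConfigResolver(sources: [
    dictionarySource([:]),                             // 1. Keychain (stand-in)
    dictionarySource(["OPENAI_MODEL": "gpt-5"]),       // 2. bundled Config.plist (stand-in)
    dictionarySource([:]),                             // 3. Application Support Config.plist (stand-in)
    { ProcessInfo.processInfo.environment[$0] }        // 4. environment variables
])
```

Because the environment is consulted last, a key stored in Keychain or a plist takes precedence over an exported `OPENAI_API_KEY` in this ordering.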

Other keys (read from the same sources)

  • OPENAI_MODEL (default: gpt-5)
  • OPENAI_REASONING (default: medium) — reasoning disabled for mini/nano variants
  • OPENAI_VERBOSITY (default: low)
  • OPENAI_SYSTEM_PROMPT (default: concise coding coach)
  • OPENAI_MAX_OUTPUT_TOKENS (default: 1024)
  • Timeouts: OPENAI_TIMEOUT_REQUEST_SECONDS (default: 300) and OPENAI_TIMEOUT_RESOURCE_SECONDS (default: 600)
  • No-progress stream timeout: OPENAI_NO_PROGRESS_TIMEOUT_SECONDS (default: 30)
  • TTS: OPENAI_TTS_MODEL, OPENAI_TTS_VOICE, OPENAI_TTS_AUTOGENERATE, OPENAI_TTS_AUTOPLAY
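
As a rough illustration of how these keys and defaults might be turned into typed settings, the sketch below parses a raw key/value dictionary. `GenerationSettings` is a hypothetical type, and detecting mini/nano variants by substring match on the model name is an assumption.

```swift
import Foundation

// Hypothetical typed view over the raw configuration values.
struct GenerationSettings {
    let model: String
    let reasoning: String?          // nil means reasoning is disabled
    let verbosity: String
    let maxOutputTokens: Int
    let noProgressTimeout: TimeInterval

    init(raw: [String: String]) {
        let resolvedModel = raw["OPENAI_MODEL"] ?? "gpt-5"
        // Assumption: small variants are recognized by name.
        let isSmallVariant = resolvedModel.contains("mini") || resolvedModel.contains("nano")

        model = resolvedModel
        reasoning = isSmallVariant ? nil : (raw["OPENAI_REASONING"] ?? "medium")
        verbosity = raw["OPENAI_VERBOSITY"] ?? "low"
        // Unparseable numbers fall back to the documented defaults.
        maxOutputTokens = raw["OPENAI_MAX_OUTPUT_TOKENS"].flatMap { Int($0) } ?? 1024
        noProgressTimeout = raw["OPENAI_NO_PROGRESS_TIMEOUT_SECONDS"].flatMap { TimeInterval($0) } ?? 30
    }
}
```

Falling back to the default on a malformed value is one possible policy; surfacing a configuration error instead would be an equally reasonable choice.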

Profiles

  • Stored at ~/Library/Application Support/Singularity/Profiles.json.
  • Keep multiple profiles, reorder them, and cycle via hotkey.

Logging & Cache

  • JSONL sessions: ~/Library/Application Support/Singularity/Conversations
  • TTS cache: ~/Library/Application Support/Singularity/TTS
  • Logs: ~/Library/Application Support/Singularity/Logs
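
For readers unfamiliar with JSONL, each session file holds one JSON object per line. The sketch below shows one way to append such records with Foundation; `TranscriptRecord` and its fields are hypothetical and do not reflect the app's actual log schema.

```swift
import Foundation

// Hypothetical record shape, for illustration only.
struct TranscriptRecord: Codable {
    let role: String
    let text: String
    let sources: [String]
}

// Encodes one record as a single line of JSON. JSONEncoder escapes newlines
// inside strings, so the result never spans multiple lines.
func jsonLine(for record: TranscriptRecord) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    let data = try encoder.encode(record)
    return String(decoding: data, as: UTF8.self)
}

// Appends one record to a .jsonl file, creating the file and its parent
// directory on first use.
func append(_ record: TranscriptRecord, to fileURL: URL) throws {
    let data = Data((try jsonLine(for: record) + "\n").utf8)
    let manager = FileManager.default
    if manager.fileExists(atPath: fileURL.path) {
        let handle = try FileHandle(forWritingTo: fileURL)
        defer { handle.closeFile() }
        handle.seekToEndOfFile()
        handle.write(data)
    } else {
        try manager.createDirectory(at: fileURL.deletingLastPathComponent(),
                                    withIntermediateDirectories: true)
        try data.write(to: fileURL)
    }
}
```

Append-only, line-delimited records can be tailed or filtered with ordinary text tools, and a partially written final line does not corrupt earlier entries.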

Permissions

  • Screen Recording: needed for screenshots (System Settings → Privacy & Security → Screen Recording).
  • Microphone: needed for mic recording.
  • Accessibility (optional): used to read selected text when clipboard capture is empty.

Running Tests

Xcode

  • Use the singularity scheme and run tests.

CI-friendly scheme

  • Use the coverage-enabled scheme: singularity-CI.
  • Locally: xcodebuild -scheme singularity-CI -destination 'platform=macOS' test

GitHub Actions

  • Workflow: .github/workflows/ci.yml
  • Script runner: ./ci-run-tests.sh
  • Artifacts: build/TestResults.xcresult and build/junit.xml (if xcpretty is available).
  • Codecov upload uses build/TestResults.xcresult (set secret CODECOV_TOKEN for private repos).

Troubleshooting

  • Tests failing in CI with signing error

    • ci-run-tests.sh disables code signing via flags such as CODE_SIGNING_ALLOWED=NO. If signing errors still appear, check that the workflow runs tests through that script.
  • Streaming stuck / long pauses

    • The stream watchdog aborts and falls back to Chat Completions if no progress occurs for OPENAI_NO_PROGRESS_TIMEOUT_SECONDS.
  • Web Search sources not showing

    • Ensure the profile enables Web Search and allowed domains are set correctly. Sources are logged to JSONL under sources.
  • Screenshots or audio not captured

    • Confirm permissions. For screen capture, re-open System Settings and grant access to the app.
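
The no-progress check behind the stream watchdog can be reduced to a small piece of bookkeeping: remember when the last chunk arrived and compare against the timeout. The sketch below is a simplified model under that assumption; `StreamWatchdog` is a hypothetical name, and the actual cancellation and Chat Completions fallback are left to the caller.

```swift
import Foundation

// Tracks the time of the most recent streamed chunk and reports a stall once
// the configured timeout has elapsed without progress.
struct StreamWatchdog {
    let timeout: TimeInterval
    private(set) var lastProgress: Date

    init(timeout: TimeInterval, start: Date = Date()) {
        self.timeout = timeout
        self.lastProgress = start
    }

    // Call whenever a chunk arrives from the stream.
    mutating func recordProgress(at time: Date = Date()) {
        lastProgress = time
    }

    // True when no progress has been recorded for at least `timeout` seconds.
    func hasStalled(at time: Date = Date()) -> Bool {
        return time.timeIntervalSince(lastProgress) >= timeout
    }
}
```

Passing the clock in as a parameter keeps the check deterministic in tests; a caller would poll `hasStalled()` on a timer and, on a stall, cancel the stream and retry through the fallback path.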

Security & Privacy

  • Uses the minimal entitlements needed (screen capture, microphone, sandbox). No secrets are committed.
  • API keys are read from Keychain/Config/env; do not hardcode.
  • Encrypted reasoning content is stored when the API provides it; reasoning summaries are shown, but raw reasoning text is never surfaced.

Contributing

  • Keep changes focused.
  • Follow Swift style (4 spaces; ~120 cols). Views end with View suffix.
  • Use clear names and minimal comments (self-explanatory code preferred).
  • Add tests for profiles, context assembly, and LLM path changes.

Status Badges

  • CI badge tracks main branch.
  • Codecov badge shows coverage for main; install the Codecov app (and set CODECOV_TOKEN for private repos).

License

Copyright © Marius-Constantin Dinu. All rights reserved.
