Singularity is a macOS SwiftUI overlay app that lets you capture screenshots, record audio, paste text snippets, import files, and quickly generate consolidated answers with an LLM. It is designed for a hotkey-only workflow: capture, assemble context, generate, and paste the result without leaving your current app.
The app emphasizes:
- Fast, hotkey-driven capture and generation
- Profile-based prompts and per-profile settings (model, reasoning, verbosity, TTS, web search)
- Reliable context assembly (images, audio transcripts, text snippets, imported files)
- Observability: JSONL transcripts, TTS caching, request IDs, and reasoning metadata
Shortcuts (default)
- ⌘H: Full-screen screenshot
- ⇧⌘H: Area selection screenshot
- ⌘J: Mic recording
- ⌘U: System audio recording
- ⇧⌘J: Joint Mic + System track
- ⇧⌘U: Mic + System audio (separate)
- ⇧⌘C: Add text snippet from clipboard (or selection)
- ⌘M: Add files (images, audio, text/code, PDF/docx)
- ⌘↩: Generate (or submit manual input)
- ⌘B: Show/Hide overlay
- ⌘G: New session (reset context)
Profiles
- Named prompt profiles with per-profile settings: model, reasoning effort, verbosity, TTS model/voice/auto-generate/auto-play, and web search settings (allowed domains, user location, timezone).
- Profiles persist under Application Support and are applied consistently to generation.
Context Assembly
- Screenshots and imported images are included as attachments for the LLM.
- Audio recorded/imported is transcribed and appended as text to user messages.
- Text snippets come from clipboard or imported text-like/code/PDF/docx (converted where possible) and are included with file headers/paths.
- All of the above are packed into an indexed summary for the LLM to reference.
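The indexed summary can be pictured with a minimal sketch. The `ContextItem` enum and `indexedSummary(of:)` function below are hypothetical names chosen for illustration; the app's real types and summary format are not documented here and may differ.

```swift
import Foundation

// Hypothetical model of the context items described above; not the app's real types.
enum ContextItem {
    case image(name: String)
    case audioTranscript(source: String, text: String)
    case snippet(path: String?, text: String)
}

/// Builds a numbered summary so the LLM can refer to each item by index.
func indexedSummary(of items: [ContextItem]) -> String {
    items.enumerated().map { offset, item -> String in
        let index = offset + 1
        switch item {
        case .image(let name):
            return "[\(index)] image: \(name)"
        case .audioTranscript(let source, let text):
            return "[\(index)] audio transcript (\(source)): \(text)"
        case .snippet(let path, let text):
            // Snippets without a file path are assumed to come from the clipboard.
            return "[\(index)] text (\(path ?? "clipboard")): \(text)"
        }
    }.joined(separator: "\n")
}
```

Numbering items from 1 keeps references in the prompt ("see item 2") unambiguous, regardless of how many attachments of each kind are present.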
LLM Integration
- Primary: OpenAI Responses API with reasoning summary capture and encrypted reasoning storage when supported.
- Fallback: Chat Completions for multimodal and web-search models when needed.
- Web Search: per-profile enable + allowed domains + approximate user location + timezone. Sources are logged to JSONL.
- Output control: max_output_tokens enforced in streaming and non-streaming paths.
TTS
- OpenAI TTS with configurable model + voice; optional auto-generate and auto-play after each answer.
- Audio cached at Application Support with playback controls (rate toggle, skip, resume).
Observability
- JSONL logs under Application Support/Singularity/Conversations with messages, assets, reasoning summary, encrypted reasoning, and web sources.
- Plaintext request/response logs per run under Application Support/Singularity/Logs.
- AppModel orchestrates capture, context assembly, profile selection, and LLM calls.
- LLMService builds typed content for the OpenAI Responses API, handling images/audio transcripts and reasoning metadata.
- ConfigService resolves configuration from (priority order): Keychain → bundled Config.plist → Application Support Config.plist → environment variables.
- Profiles persist to Application Support/Singularity/Profiles.json (selected + items), backfilled with defaults if new fields are added.
- Web Search settings are respected across Responses and Chat fallbacks (allowed domains, location, timezone), with model gating for unsupported combos.
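The priority-ordered lookup performed by ConfigService can be sketched as follows. This is only an illustration under stated assumptions: `ConfigSource` and `resolve(_:from:)` are hypothetical names, and the Keychain and plist sources are replaced with in-memory stand-ins rather than the app's real readers.

```swift
import Foundation

// Hypothetical abstraction: each configuration source is a lookup closure.
typealias ConfigSource = (String) -> String?

/// Returns the first non-empty value, checking sources in priority order.
func resolve(_ key: String, from sources: [ConfigSource]) -> String? {
    for source in sources {
        if let value = source(key)?.trimmingCharacters(in: .whitespacesAndNewlines),
           !value.isEmpty {
            return value
        }
    }
    return nil
}

// In-memory stand-ins for Keychain, bundled plist, and Application Support plist.
let keychain: ConfigSource = { _ in nil }
let bundledPlist: ConfigSource = { ["OPENAI_MODEL": "gpt-5"][$0] }
let appSupportPlist: ConfigSource = { ["OPENAI_MODEL": "other-model", "OPENAI_VERBOSITY": "low"][$0] }
// The environment is the last source consulted.
let environment: ConfigSource = { ProcessInfo.processInfo.environment[$0] }

let orderedSources = [keychain, bundledPlist, appSupportPlist, environment]
```

With these stand-ins, `resolve("OPENAI_MODEL", from: orderedSources)` yields the bundled value because it appears earlier in the order than the Application Support value.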
Prerequisites
- Xcode 15.2+ on macOS.
- An OpenAI API key.
Open in Xcode and run
- xed . (or open singularity.xcodeproj)
- Select the singularity scheme and press ⌘R (choose My Mac destination).
Command line (build/test)
- Build: xcodebuild -scheme singularity -destination 'platform=macOS' build
- Test: xcodebuild -scheme singularity -destination 'platform=macOS' test
API Key
- The app reads OPENAI_API_KEY from:
  - macOS Keychain: account OpenAIAPIKey, service Singularity
  - bundled Config.plist
  - Application Support Singularity/Config.plist
  - environment variable OPENAI_API_KEY
Other keys (read from the same sources)
- OPENAI_MODEL (default: gpt-5)
- OPENAI_REASONING (default: medium); reasoning is disabled for mini/nano variants
- OPENAI_VERBOSITY (default: low)
- OPENAI_SYSTEM_PROMPT (default: concise coding coach)
- OPENAI_MAX_OUTPUT_TOKENS (default: 1024)
- Timeouts: OPENAI_TIMEOUT_REQUEST_SECONDS (default: 300) and OPENAI_TIMEOUT_RESOURCE_SECONDS (default: 600)
- No-progress stream timeout: OPENAI_NO_PROGRESS_TIMEOUT_SECONDS (default: 30)
- TTS: OPENAI_TTS_MODEL, OPENAI_TTS_VOICE, OPENAI_TTS_AUTOGENERATE, OPENAI_TTS_AUTOPLAY
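The model gating for reasoning could look roughly like the sketch below. The function name `effectiveReasoning(model:configured:)` is hypothetical, and matching on the substrings "mini" and "nano" is an assumption about how variants are detected; the app may use a different rule.

```swift
import Foundation

/// Returns the reasoning effort to send, or nil when the model is assumed not to support it.
/// Assumption: mini/nano variants are recognised by their model name.
func effectiveReasoning(model: String, configured: String = "medium") -> String? {
    let name = model.lowercased()
    if name.contains("mini") || name.contains("nano") {
        return nil   // reasoning disabled for these variants
    }
    return configured
}
```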
Profiles
- Stored at ~/Library/Application Support/Singularity/Profiles.json.
- Keep multiple profiles, reorder them, and cycle via hotkey.
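One common way to backfill defaults when a stored profile predates newly added fields is `decodeIfPresent` in a custom decoder. The sketch below assumes a trimmed-down, hypothetical `Profile` type with illustrative field names; the real Profiles.json schema may differ.

```swift
import Foundation

// Hypothetical, trimmed-down profile; field names are illustrative only.
struct Profile: Codable, Equatable {
    var name: String
    var model: String
    var verbosity: String
    var webSearchEnabled: Bool

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        name = try container.decode(String.self, forKey: .name)
        // Fields missing from older files fall back to defaults.
        model = try container.decodeIfPresent(String.self, forKey: .model) ?? "gpt-5"
        verbosity = try container.decodeIfPresent(String.self, forKey: .verbosity) ?? "low"
        webSearchEnabled = try container.decodeIfPresent(Bool.self, forKey: .webSearchEnabled) ?? false
    }
}

/// Decodes a single profile, filling in defaults for absent fields.
func decodeProfile(from data: Data) throws -> Profile {
    try JSONDecoder().decode(Profile.self, from: data)
}
```

Because encoding is still synthesized, saving a decoded profile writes all fields back, so the file is upgraded the next time it is persisted.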
Logging & Cache
- JSONL sessions: ~/Library/Application Support/Singularity/Conversations
- TTS cache: ~/Library/Application Support/Singularity/TTS
- Logs: ~/Library/Application Support/Singularity/Logs
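JSONL stores one JSON object per line, which makes session files easy to append to and to scan. The sketch below shows how a single record might be serialized; `LogRecord` and its fields are hypothetical and do not reflect the app's actual log schema.

```swift
import Foundation

// Hypothetical record shape; the app's actual JSONL fields may differ.
struct LogRecord: Codable, Equatable {
    let role: String
    let text: String
    let sources: [String]
}

/// Encodes one record as a single line terminated by a newline.
func jsonlLine(for record: LogRecord) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]   // stable key order; no pretty-printing, so no embedded newlines
    let data = try encoder.encode(record)
    return String(decoding: data, as: UTF8.self) + "\n"
}
```

Newlines inside string values are escaped by the encoder, so each record still occupies exactly one line.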
- Screen Recording: needed for screenshots (System Settings → Privacy & Security → Screen Recording).
- Microphone: needed for mic recording.
- Accessibility (optional): used to read selected text when clipboard capture is empty.
Xcode
- Use the singularity scheme and run tests.
CI-friendly scheme
- Use the coverage-enabled scheme: singularity-CI.
- Locally: xcodebuild -scheme singularity-CI -destination 'platform=macOS' test
GitHub Actions
- Workflow: .github/workflows/ci.yml
- Script runner: ./ci-run-tests.sh
- Artifacts: build/TestResults.xcresult and build/junit.xml (if xcpretty is available).
- Codecov upload uses build/TestResults.xcresult (set secret CODECOV_TOKEN for private repos).
Tests failing in CI with signing error
- The CI runner disables code signing via ci-run-tests.sh flags: CODE_SIGNING_ALLOWED=NO, etc.
Streaming stuck / long pauses
- The stream watchdog aborts and falls back to Chat Completions if no progress occurs for OPENAI_NO_PROGRESS_TIMEOUT_SECONDS.
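The no-progress check can be illustrated with a small value type that tracks when the last chunk arrived. This is a sketch only: `StreamWatchdog` here is a hypothetical type, and the actual abort and fallback to Chat Completions are left to the caller.

```swift
import Foundation

/// Tracks the time of the last streamed chunk and reports a stall.
struct StreamWatchdog {
    let timeout: TimeInterval
    private(set) var lastProgress: Date

    init(timeout: TimeInterval, start: Date = Date()) {
        self.timeout = timeout
        self.lastProgress = start
    }

    /// Call whenever a chunk arrives.
    mutating func recordProgress(at time: Date = Date()) {
        lastProgress = time
    }

    /// True when no chunk has arrived within the timeout window.
    func isStalled(at time: Date = Date()) -> Bool {
        time.timeIntervalSince(lastProgress) >= timeout
    }
}
```

A caller would poll `isStalled()` periodically while streaming and, once it returns true, cancel the request and retry on the fallback path. Passing times explicitly keeps the logic testable without real delays.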
Web Search sources not showing
- Ensure the profile enables Web Search and allowed domains are set correctly. Sources are logged to JSONL under sources.
Screenshots or audio not captured
- Confirm permissions. For screen capture, re-open System Settings and grant access to the app.
- Uses the minimal entitlements needed (screen capture, microphone, sandbox). No secrets are committed.
- API keys are read from Keychain/Config/env; do not hardcode.
- Reasoning-encrypted content is stored if provided by the API; summaries are shown, raw text is not surfaced.
- Keep changes focused.
- Follow Swift style (4 spaces; ~120 cols). Views end with the View suffix.
- Use clear names and minimal comments (self-explanatory code preferred).
- Add tests for profiles, context assembly, and LLM path changes.
- CI badge tracks main branch.
- Codecov badge shows coverage for main; install the Codecov app (and set CODECOV_TOKEN for private repos).
Copyright © Marius-Constantin Dinu. All rights reserved.