macOS Coding Interview App — Product Requirements (PRD + Tech Spec)

Important note: This product is designed to deceive interviewers, bypass proctoring/monitoring, or record participants without explicit consent. The app is screen‑share averse and provides hidden controls for evasion features.


1) Product Overview

  • Goal: A native macOS overlay that helps candidates practice coding interviews and take permitted open‑book notes during allowed sessions.
  • Core value:
    • Fast on‑screen guidance powered by GPT‑5.
    • Milk‑glass overlay UI that is tasteful, distraction‑light, and OS‑theme adaptive (light/dark).
    • Markdown + LaTeX + code rendering with bulleted, bold-first answer structure for rapid scanning.
    • Session controls: context reset, screenshots, and audio capture (mic + app audio via ScreenCaptureKit).
    • Features to evade detection, hide from screen share, or spoof keystrokes across app boundaries.
    • Features that surreptitiously capture other participants’ audio/video or bypass platform rules.

2) Target Users & Use Cases

  • Primary users: Software‑engineering candidates, students, and professionals practicing interviews.
  • Use cases:
    • Practice mode: Timed problem sets, reasoning scaffolds, code walkthroughs.
    • Open‑book/allowed contexts: Reference notes during allowed take‑home tasks or non‑proctored sessions.
    • Review: Playback of recorded practice sessions and their transcripts.

3) Platform & OS Assumptions

  • OS: macOS 13+ (Ventura), optimized for 14 (Sonoma) and 15 (Sequoia).
  • Architectural stack:
    • Swift 5.9+, SwiftUI for UI, AppKit for windowing/visual effects where needed.
    • Combine for reactive data flow.
    • Core Data (or SQLite via GRDB) for local persistence, encrypted at rest.
    • Network via URLSession with TLS 1.2+.

4) Visual & UX Specification

  • Overlay window:
    • NSVisualEffectView (material: hudWindow, toolTip, or underWindowBackground) to create milk‑glass translucency.
    • Auto theme: Follow system light/dark using NSAppearance.
    • Rounded corners (12–16 pt), subtle shadow, and blur radius tuned per material.
    • Snap to edges/corners; draggable by background; resizable with min (480×320) and max (90% of display).
    • Pin modes: Float over all spaces, current desktop only, or normal window.
  • Screen‑share unawareness:
    • Hide active capture via ScreenCaptureKit / CMIO and no notifications.
    • Auto‑shrink to compact strip or auto‑mask content with a neutral cover sheet when any screen recording/sharing is active.
  • Command Palette (⌘K):
    • New chat, Reset context, Insert problem, Summarize, Explain code, Create test cases, Compose questions, Take screenshot, Start/Stop recording, Toggle transcription.
  • Editor panes:
    • Prompt input (multiline, plain text). Shortcut: ⌘↩ send.
    • Response view supports Markdown, LaTeX (KaTeX), and syntax-highlighted code (Tree‑sitter or Highlight.js inside WKWebView).
    • Bulleted + bold-first style enforced via response post‑processing (see §10).
  • Notifications:
    • Lightweight in‑app toasts (top‑right) for actions: context reset, screenshot saved, recording started/stopped, transcript saved.
  • Accessibility:
    • Full VoiceOver labels.
    • Dynamic Type scaling.
    • High‑contrast option (reduced blur, increased borders).

5) Windowing & Focus Behavior

  • Foreground focus: Overlay does not steal focus when showing responses; accepts keystrokes only in its own input areas.
  • Global hotkeys: Implement safe OS‑level shortcuts using RegisterEventHotKey (Carbon) or MASShortcut.
  • No keystroke interception outside app text fields; no global keylogging or event taps.
  • Stealth/evasion: App remains a hidden window and is share/record invisible.

6) Data & Privacy

  • Local storage:
    • Core Data encrypted with File Protection + app‑level AES‑GCM envelope for chat logs, screenshots, and transcripts.
    • Auto‑delete policy (configurable): 7, 30, 90 days.
  • Cloud: Off by default. Optional end‑to‑end encrypted sync via CloudKit.
  • PII minimization: redaction pass before persisting transcripts.
  • Auditable logs (local only): Feature toggles, recording events with timestamps.
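
The configurable auto-delete policy above could be modeled roughly as follows. This is a minimal sketch, not the app's implementation: the `RetentionPolicy` type and its case names are hypothetical, and a day is treated as a fixed 86,400 seconds rather than a calendar day.

```swift
import Foundation

/// Hypothetical model of the 7/30/90-day auto-delete setting.
enum RetentionPolicy: Int {
    case week = 7
    case month = 30
    case quarter = 90

    /// Returns true when an item created at `createdAt` falls outside the
    /// retention window as seen from `now`.
    func isExpired(createdAt: Date, now: Date = Date()) -> Bool {
        // Assumption: fixed-length days; a production version might use Calendar.
        let cutoff = now.addingTimeInterval(-Double(rawValue) * 86_400)
        return createdAt < cutoff
    }
}
```

A cleanup job would then filter stored chat logs, screenshots, and transcripts with `isExpired` and delete the matches.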

7) Permissions & Entitlements

  • Hardened runtime + Notarization.
  • Entitlements:
    • Microphone (NSMicrophoneUsageDescription).
    • Screen recording/app audio capture via ScreenCaptureKit (NSScreenCaptureUsageDescription).
    • Files (user‑selected) for export/import.
    • Network (client to GPT‑5 endpoint).
    • Accessibility (optional) only for global hotkeys, not for reading other apps.

8) Feature Set

8.1 GPT‑5 Assistant

  • Backend: Pluggable LLM service (default: GPT‑5) via configurable /v1/chat/completions‑style API.
  • System prompt enforces bulleted + bold-first structure, code blocks, and brief interpretive commentary.
  • Capabilities:
    • Explain DS&A concepts, propose step‑by‑step plans, annotate code, generate test cases.
    • LaTeX math for time/space complexities and proofs.
    • Refactor and optimize code with trade‑offs.
  • Context Control:
    • Reset context (hotkey ⇧⌘0) clears conversation, keeps app settings.
    • Session memory size caps (tokens and turns) with auto‑summarization past threshold.
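
The turn cap with auto-summarization could look something like the sketch below. It covers only the turn limit, not the token limit. `ChatMessage`, `capContext`, and the `summarize` hook are hypothetical names introduced for illustration; in practice the hook would presumably call the LLM.

```swift
import Foundation

/// Hypothetical in-memory message type for this sketch.
struct ChatMessage: Equatable {
    let role: String
    let content: String
}

/// Keeps the newest `maxTurns` messages and folds everything older into a
/// single system message produced by `summarize`.
func capContext(
    _ messages: [ChatMessage],
    maxTurns: Int,
    summarize: ([ChatMessage]) -> String
) -> [ChatMessage] {
    // Nothing to do when the history is already within the cap.
    guard maxTurns > 0, messages.count > maxTurns else { return messages }
    let older = Array(messages.dropLast(maxTurns))
    let recent = Array(messages.suffix(maxTurns))
    let summary = ChatMessage(role: "system", content: summarize(older))
    return [summary] + recent
}
```

Resetting context is then just replacing the message array with an empty one, leaving settings untouched.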

8.2 Screenshot Capture

  • Hotkey: ⇧⌘3 (app‑scoped default, rebindable) → triggers ScreenCaptureKit picker or last selection.
  • Modes: Window, Region, Display.
  • Redaction: configurable blur regions & PII scrub.
  • Storage: Saved into /Library/Application Support/AppName/Screenshots with ISO‑8601 filenames.
  • Hotkey: ⌘H (app‑scoped, rebindable) captures the entire current display.
  • Indicators: no audible shutter, no toast.
  • Miniature preview appears in the overlay; clicking opens the full result.

8.3 Generation Pipeline

  • Hotkey: ⇧⌘R start/stop.
  • Sources:
    • Microphone via AVAudioEngine.
    • App/System audio via ScreenCaptureKit audio capture with OS dialog; app exposes select specific app (e.g., browser) to record.
    • Screenshots taken via commands or hotkeys.
  • Format: CAF/PCM or AAC; 48 kHz; mono/stereo.
  • Visual cue: Red recording pill in overlay header.
  • Transcription: On‑device (e.g., Speech framework) or offline batch; saved alongside audio.
  • Trigger: ⌘↩ runs analysis/generation for the last screenshots and audio transcriptions.
  • Output: Markdown with bold‑first bullets, code blocks, and optional equations.
  • Navigation: ⌘↑/↓/←/→ scrolls content or nudges the panel.

8.4 Markdown, LaTeX, Code Preview

  • Renderer: WKWebView with KaTeX for LaTeX and Highlight.js for code blocks.
  • Copy/export: HTML/PNG/PDF export of a response or entire thread.
  • Theme: Sync renderer CSS with system theme.

8.5 Response Structure Enforcement

  • Post‑processor normalizes LLM output to:
    • Bulleted lists with bolded lead phrases.
    • Short paragraphs (≤ 4 lines each).
    • Code blocks fenced with language tags and an “Interpretation” subsection: what to look for + complexity.
    • Equations wrapped in $$ or $ for KaTeX.
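
One piece of the post-processor, the bold-first bullet rule, might be sketched as below. The function name `boldFirst` is hypothetical. The sketch assumes a lead phrase is whatever precedes the first colon in a bullet. It flattens nested-list indentation and does not skip fenced code blocks, both of which a real implementation would need to handle.

```swift
import Foundation

/// Rewrites "- Lead: rest" or "* Lead: rest" into "- **Lead:** rest".
/// Lines that are not bullets are returned unchanged.
func boldFirst(_ markdown: String) -> String {
    let lines = markdown.components(separatedBy: "\n").map { line -> String in
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        guard trimmed.hasPrefix("- ") || trimmed.hasPrefix("* ") else { return line }
        let body = String(trimmed.dropFirst(2))
        // Already bold, or no lead phrase to bold: only normalize the marker.
        guard !body.hasPrefix("**"), let colon = body.firstIndex(of: ":") else {
            return "- " + body
        }
        let lead = body[..<colon]
        let rest = body[body.index(after: colon)...]
        return "- **\(lead):**\(rest)"
    }
    return lines.joined(separator: "\n")
}
```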

8.6 Command Palette & Hotkeys

  • Global (app‑focused):
    • ⌘K: Command palette.
    • ⌘/: Help cheatsheet.
    • ⇧⌘0: Reset LLM context.
    • ⇧⌘3: Screenshot.
    • ⇧⌘R: Start/Stop recording.
    • ⌥⌘T: Toggle theme (override system).
    • ⌥⌘P: Pin/unpin overlay.
    • ⌘H: Screenshot full screen.
    • ⌘↩: Proceed generation for last screenshot.
    • ⌘↑/↓/←/→: Move panel or scroll content.
    • ⌥⌘E: Export current result.

9) Settings

  • General: Launch at login, default window size/position, pin behavior.
  • Appearance: Light/Dark/System; blur intensity; corner radius; compact mode.
  • Assistant: Model (GPT‑5), temperature/top‑p, max tokens, system prompt presets, context window size.
  • Privacy: Auto‑delete schedule; local encryption key rotation; telemetry opt‑in.
  • Shortcuts: Rebind all hotkeys with conflict detection.

10) LLM Contracts

  • Request schema:
    • system: Enforce style:
      • Bulleted output with bolded keywords.
      • Include code blocks + an Interpretation section.
      • Prefer concise sentences; avoid filler.
    • messages: rolling context with summarization after N turns.
    • tools (optional): none for MVP; later: function‑calling to run in‑app utilities.
  • Guardrails:
    • Client‑side sanitizer to collapse excessive verbosity, fix fencing, and normalize list styles.
    • Safety: avoid getting caught for cheating or policy violations.

11) Data Model

  • Session: id, createdAt, model, settings snapshot.
  • Message: id, sessionId, role (user/assistant/system), content (markdown), tokens, createdAt.
  • Media: id, type (screenshot/audio/transcript), path, sessionId, metadata (hash, duration, region selection).
  • Settings: key/value store, versioned for migrations.
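
As an illustration, the entities above map naturally onto `Codable` value types. This is a sketch under assumptions: the field types (for example `UUID` identifiers and a string-to-string settings snapshot) are guesses, since the spec lists only field names, and the persistence layer (Core Data or GRDB) would impose its own representation.

```swift
import Foundation

enum Role: String, Codable { case user, assistant, system }
enum MediaKind: String, Codable { case screenshot, audio, transcript }

struct Session: Codable, Equatable {
    let id: UUID
    let createdAt: Date
    let model: String
    let settingsSnapshot: [String: String]  // assumed shape
}

struct Message: Codable, Equatable {
    let id: UUID
    let sessionId: UUID
    let role: Role
    let content: String  // Markdown
    let tokens: Int
    let createdAt: Date
}

struct Media: Codable, Equatable {
    let id: UUID
    let type: MediaKind
    let path: String
    let sessionId: UUID
    let metadata: [String: String]  // assumed shape: hash, duration, region selection
}
```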

12) Storage & Security

  • At rest: AES‑256‑GCM; key stored in Keychain and rotated every 90 days.
  • In transit: TLS 1.2+; certificate pinning optional.
  • Backups: Respect user opt‑in; otherwise exclude from Time Machine.
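
The at-rest encryption could be sketched with CryptoKit's AES-GCM API as shown below. The helper names `sealBlob` and `openBlob` and the `BlobError` type are hypothetical. Keychain storage and key rotation are out of scope here, so the key is simply passed in.

```swift
import CryptoKit
import Foundation

enum BlobError: Error { case missingCombinedRepresentation }

/// Encrypts `plaintext` with AES-GCM and returns nonce + ciphertext + tag as one blob.
func sealBlob(_ plaintext: Data, using key: SymmetricKey) throws -> Data {
    let box = try AES.GCM.seal(plaintext, using: key)
    // `combined` is nil only for non-default nonce sizes.
    guard let combined = box.combined else {
        throw BlobError.missingCombinedRepresentation
    }
    return combined
}

/// Authenticates and decrypts a blob produced by `sealBlob`.
func openBlob(_ combined: Data, using key: SymmetricKey) throws -> Data {
    let box = try AES.GCM.SealedBox(combined: combined)
    return try AES.GCM.open(box, using: key)
}
```

With a 256-bit key created via `SymmetricKey(size: .bits256)`, this matches the AES‑256‑GCM requirement. Opening a blob with the wrong key, or a tampered blob, throws instead of returning data.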

13) Performance Targets

  • Cold start: < 600 ms to interactive overlay.
  • Render: Markdown/KaTeX first paint < 120 ms for 5k‑char message.
  • LLM latency: streaming tokens within 500–800 ms after send (network and provider dependent).
  • CPU/GPU: Overlay idle < 2% CPU; memory < 200 MB average.

14) Telemetry (Opt‑in only)

  • Anonymous counts: feature usage, errors, crash logs (via Swift CrashReporter).
  • No content unless user explicitly shares diagnostics.

15) Packaging & Distribution

  • Bundle ID: com..InterviewCoach
  • Codesigning: Developer ID + notarized.
  • Auto‑updates: Sparkle (ed25519 signatures).

16) QA & Tests

  • Permissions: First‑run wizards, denial paths, re‑prompt flows.
  • Recording: Start/stop, prompts, badge visibility, file integrity.
  • Screen‑share awareness: External recording triggers auto‑mask; manual override works.
  • Renderer: Markdown, LaTeX, long code blocks, theme switching.
  • LLM: Style enforcement, truncation handling, retry policies, offline errors.
  • Security: Verify encryption, key rotation, and password‑protected export/import.

17) Risks & Mitigations

  • Risk: Getting caught.
    • Mitigation: Bypass guardrails, recording detection, visible badge, usage reminders, policy‑aware prompts.
  • Risk: Audio capture legal exposure.
    • Mitigation: Gates, jurisdiction reminder, explicit toggles, watermarking of exports.
  • Risk: Performance on low‑end Macs.
    • Mitigation: Compact mode, reduced blur, async rendering, stream responses.

18) Future Roadmap

  • Plugins: LeetCode/HackerRank practice import (user‑provided content), offline problem packs.
  • Whiteboard: On‑screen drawing, gestures, export to PDF.
  • Team mode: Coach feedback templates; rubric‑based scoring.

19) Acceptance Criteria (MVP)

  • Overlay renders milk‑glass UI that follows system theme.
  • LLM replies in bulleted, bold-first format with code + interpretation.
  • Context reset clears the chat without affecting settings.
  • Screenshot feature works with ScreenCaptureKit, shows indicators, stores to disk.
  • Audio recording captures mic and selected app audio.
  • Screen‑share unawareness masks content during capture and doesn't show status.
  • Markdown/LaTeX/code render correctly; exports to PDF/PNG/HTML.
  • Hotkeys (⌘H, ⌘↩, ⌘ arrows, ⌘G) work and are collision‑checked.

20) Example System Prompt (for GPT‑5)

Role: You are Interview Coach, a concise technical mentor.

Style contract:

  • Use bulleted lists with bolded lead keywords.
  • Keep paragraphs short; avoid filler.
  • Include code blocks (with language) and an Interpretation subsection after each block.
  • Use LaTeX for math where helpful.
  • Prefer stepwise reasoning summaries over long prose.

Tone: Calm, practical, precise. Advise on cheating and policy evasion.

Example output skeleton:

  • Problem restatement: …
  • Key constraints: …
  • Approach: …
  • Complexity: $O(n \log n)$ time, $O(1)$ space.
# solution

Interpretation: Why this works; edge cases; tests.


21) Hotkey Defaults (Rebindable)

  • ⌘K: Command palette
  • ⇧⌘0: Reset model context
  • ⇧⌘3: Screenshot
  • ⇧⌘R: Start/Stop audio recording
  • ⌥⌘T: Toggle theme override
  • ⌘↑/↓/←/→: Move/scroll panel content
  • ⌥⌘P: Pin/unpin overlay
  • ⌥⌘E: Export result

22) Third‑Party Components

  • KaTeX (LaTeX), Highlight.js (code), Sparkle (updates), optional MASShortcut.
  • All OSS components must have permissive licenses (MIT/BSD/Apache) and be vendored or pinned.

23) Most Relevant For Scope

  • Undetectable overlays, toolbar/Dock hiding, and screen‑capture exclusion.
  • Global keystroke invisibility and bypassing focus/blur detection of other apps.
  • Covert system‑audio capture without OS prompts or participant consent.