jstruzik/Speek

Speek

Real-time, local-only, speech-to-text for macOS. Speak and watch your words appear in any app.

Swift 5.9 · macOS 14.0+ · MIT License · Apple Silicon


Features

  • Real-time transcription — text appears as you speak with live corrections
  • Voice Activity Detection — only transcribes when you're speaking, ignores silence
  • Menu bar app — runs quietly in your menu bar, always one click away
  • Global hotkey — toggle with Cmd+Shift+A from anywhere (configurable)
  • 100% local — all processing on-device via WhisperKit, nothing sent to the cloud
  • Auto-corrections — diff-based typing automatically fixes text as the model refines output

Download

Grab the latest Speek.dmg or Speek.app.zip from the Releases page.

Speek is unsigned, so on first launch you may need to right-click → Open, or allow it in System Settings → Privacy & Security.

Getting Started

  1. Open Speek — it appears as a microphone icon in your menu bar
  2. Grant permissions when prompted:
    • Microphone — for audio capture
    • Accessibility — for typing text into other applications (System Settings → Privacy & Security → Accessibility)
  3. Press Cmd+Shift+A (or click the menu bar icon) to start transcribing
  4. Speak — your words are typed in real-time into the focused app
  5. Press Cmd+Shift+A again to stop

On first launch, Speek downloads the Whisper base.en model (~50MB). This is a one-time download stored in ~/Library/Application Support/Speek/Models/.
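To check whether the model has already been downloaded, or to locate it for removal, the storage path above can be resolved in Swift. This is a minimal sketch: the function name `speekModelsDirectory` is hypothetical and not part of Speek, and the only detail taken from this README is the directory path itself.

```swift
import Foundation

/// Builds the models directory named in this README, relative to a home directory.
/// `speekModelsDirectory` is an illustrative helper, not a Speek API.
func speekModelsDirectory(
    home: URL = FileManager.default.homeDirectoryForCurrentUser
) -> URL {
    home.appendingPathComponent("Library/Application Support/Speek/Models", isDirectory: true)
}

let modelsDirectory = speekModelsDirectory()
if FileManager.default.fileExists(atPath: modelsDirectory.path) {
    print("Models directory found at \(modelsDirectory.path)")
} else {
    print("No models directory yet; it should appear after the first launch.")
}
```

Deleting that directory should simply cause the model to be downloaded again on the next launch, assuming Speek checks for the model at startup as described above.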

Build from Source

git clone https://github.com/jstruzik/Speek.git
cd Speek

Option A — Xcode:

Open Speek.xcodeproj, select your development team under Signing & Capabilities, and hit Cmd+R.

Option B — Command line:

swift build -c release

Requirements

  • macOS: 14.0 (Sonoma) or later
  • Chip: Apple Silicon (M1/M2/M3/M4) recommended
  • Permissions: Microphone + Accessibility

How It Works

Speek uses WhisperKit's AudioStreamTranscriber with Voice Activity Detection to capture and transcribe speech in real time. A diff-based approach handles corrections: if the model revises earlier text, Speek backspaces and retypes the corrected portion, so the typed text always matches the model's latest output.
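The backspace-and-retype idea can be illustrated with a common-prefix diff. The sketch below is not Speek's actual implementation. The names `TypingEdit` and `typingEdit(from:to:)` are hypothetical, and it assumes one backspace removes one `Character`, which may not hold in every target app.

```swift
/// A correction to apply in the focused app:
/// press backspace `backspaces` times, then type `insert`.
struct TypingEdit: Equatable {
    let backspaces: Int
    let insert: String
}

/// Computes the edit that turns the text already typed into the revised transcript.
/// Everything up to the first differing character is left untouched.
func typingEdit(from typed: String, to revised: String) -> TypingEdit {
    let old = Array(typed)
    let new = Array(revised)

    // Length of the shared prefix between the old and new text.
    var shared = 0
    while shared < old.count, shared < new.count, old[shared] == new[shared] {
        shared += 1
    }

    // Delete the stale tail, then type the new tail.
    return TypingEdit(
        backspaces: old.count - shared,
        insert: String(new[shared...])
    )
}

let edit = typingEdit(from: "the whether is", to: "the weather is nice")
print(edit.backspaces, edit.insert)  // prints: 9 eather is nice
```

When the model only appends text, the shared prefix covers everything already typed, so no backspaces are issued and only the new words are typed.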

License

MIT

Acknowledgments

  • WhisperKit — on-device speech recognition for Apple Silicon
  • OpenAI Whisper — the underlying speech recognition model
