speakfree

hold to speak  •  free and open source  •  never leaves your mac

macOS 14+  •  Apple Silicon  •  MIT License

Transcription in less than a second.

Free and open source. Regular automatic updates.
I use it daily and barely type anymore.
Proudly vibe coded. Let's improve it together →

1. 🌐
Hold the Globe key
Replaces the built-in macOS dictation shortcut. You can rebind it to any other key.
2. 🎙️
Speak naturally
speakfree listens. Transcription runs on your GPU — no cloud involved.
3. ⌨️
Release — your words appear
Text lands at your cursor in under a second. In any app.
🔒
Completely private
Transcribed on-device using whisper.cpp. Nothing leaves your Mac. Whisper is built into the app, and once the model has downloaded on first launch no internet connection is required.
Under a second
The model stays loaded between dictations — no warmup, no reload. Most phrases land in under a second on Apple Silicon.
🎙️
Never misses the first word
The audio engine runs continuously. Press the key mid-thought — it already has the beginning. Your first word is always captured.
🌍
99 languages
Any Whisper-supported language, or let it auto-detect. Pick a language in Settings — the matching model downloads automatically.
⌨️
Works in any app
Dictate into Slack, VS Code, Google Docs, or a remote desktop. speakfree picks the right insertion strategy automatically — no apps excluded, no configuration.
🔄
Automatic updates
Signed with a Developer ID and notarized by Apple. You're always on the latest version without thinking about it.

How it compares

                         speakfree    Superwhisper    VoiceInk       Wispr Flow      Apple Dictation
Price                    Free         $249.99         Free / $39.99  $15/mo          Built-in
License                  MIT          Closed          GPL v3         Closed
Stays on your Mac        Always       Yes             Mostly         No, cloud only  M1+ only
Works on remote desktop  ✓ automatic  Paste manually
Latency                  under 1s     not published   not published  1–2s (cloud)    variable
Account required         No           Yes             No             Yes             Apple ID
Works offline            Always       Yes             Yes            Never           Yes

Questions

Is speakfree really free?
Yes. It is completely free and MIT licensed. No subscription, no trial, no premium tier. The source code is public and anyone can inspect it, fork it, or contribute. If you want to support development, there's a tip link on GitHub.

Does my audio ever leave my Mac?
Never. Audio is recorded and transcribed entirely on your Mac. Dictating makes no network requests; speakfree goes online only to download a model and to check for updates. Since v1.6.0, recordings are kept on-device as a private dictation history that you can cap or clear in Settings.

Which Macs are supported?
Apple Silicon Macs (M1 or later) running macOS 14 Sonoma or later. Whisper runs on the GPU via Metal, so Intel Macs are not supported.

Why does it work in apps where other dictation tools fail?
Most dictation tools use a single paste strategy everywhere. That works for simple text fields but breaks silently in Electron apps like Slack, VS Code, and Discord; in web apps like Google Docs; and in remote desktops where the clipboard never reaches the remote machine.

speakfree detects what kind of app your cursor is in and picks the right approach automatically: the Accessibility API for native Mac apps, clipboard paste (with your previous contents restored) for Electron apps, and keyboard emulation via AppleScript for remote desktops like Splashtop, TeamViewer, and Citrix. Other tools either ask you to configure this manually or tell you to "paste manually" in remote sessions.
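
A minimal Python sketch of the selection logic described above. The app lists and the names `InsertionStrategy` and `choose_strategy` are hypothetical, built from the apps named on this page; speakfree's actual detection may work differently (for example by inspecting the app bundle rather than its display name).

```python
from enum import Enum


class InsertionStrategy(Enum):
    ACCESSIBILITY = "accessibility_api"            # native Mac apps
    CLIPBOARD_PASTE = "clipboard_paste"            # Electron apps
    APPLESCRIPT_KEYSTROKES = "applescript_keys"    # remote desktops


# Hypothetical lookup tables, for illustration only.
ELECTRON_APPS = {"Slack", "VS Code", "Discord"}
REMOTE_DESKTOP_APPS = {"Splashtop", "TeamViewer", "Citrix"}


def choose_strategy(app_name: str) -> InsertionStrategy:
    """Pick an insertion strategy from the frontmost app's name."""
    if app_name in REMOTE_DESKTOP_APPS:
        # The clipboard may never reach the remote machine, so type keys.
        return InsertionStrategy.APPLESCRIPT_KEYSTROKES
    if app_name in ELECTRON_APPS:
        return InsertionStrategy.CLIPBOARD_PASTE
    # Default: assume a native app that exposes the Accessibility API.
    return InsertionStrategy.ACCESSIBILITY
```

With these tables, `choose_strategy("Slack")` selects clipboard paste, and any app not listed falls through to the Accessibility API.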

How is it different from Apple Dictation?
Apple Dictation cuts off after 30–60 seconds of silence and struggles with technical terms, code, and anything that isn't plain English prose. Whisper handles jargon, acronyms, and proper nouns more reliably. speakfree also gives you true push-to-talk (hold to record, release to insert), which Apple's built-in dictation doesn't support.

Does it support languages other than English?
Yes. Whisper supports 99 languages. Go to Settings, pick your language, and speakfree will download the right multilingual model automatically. You can also set it to auto-detect the language from your speech.

v1.7.1

  • The Help menu item now opens the Help window. It previously did nothing.

v1.7.0

  • The first-launch model download now shows a real, smooth progress bar and reliably finishes. A timing bug in the setup window could leave it stuck at 0% even after the model had fully downloaded.
  • A partial or corrupted model cache now repairs itself automatically instead of silently failing.
  • Stop reliably cancels a download. Removed a "pause" that could only restart the large download from scratch.
  • Closing the setup window after the model is installed no longer asks you to download one.

v1.6.0

  • Parakeet (English) is now the default engine — NVIDIA's speech model running on the Apple Neural Engine, faster and more accurate than Whisper for English
  • Dictation never inserts "…" or other pause markers when you hesitate mid-sentence
  • Dictating into the middle of a sentence now continues in lowercase instead of force-capitalizing — names, "I", and acronyms keep their case
  • Recordings are kept by default as your private, on-device dictation history — set a cap or clear them anytime in Settings
  • Removed the auto-correction "learner" that was quietly degrading accuracy over time
  • A missing model now prompts a re-download instead of silently falling back to a lower-quality one
  • Faster, more reliable transcription: reworked model downloads with integrity verification, performance regression guards, and multi-monitor overlay fixes
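
Integrity verification of a downloaded model can be sketched as below. This is an illustration only: it assumes SHA-256 (the hash named in the v1.5.0 notes) and a known expected digest, and the function names are hypothetical rather than speakfree's own code.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so a multi-GB model never sits in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path: Path, expected_hex: str) -> bool:
    """Return True only if the file exists and matches the expected digest."""
    if not path.is_file():
        return False
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A caller could treat a `False` result as a signal to discard the cached file and download it again.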

v1.5.0

  • Local transcription API (experimental) — point any OpenAI-compatible audio client at http://localhost:5765
  • Security hardening: the API is loopback-only with host-header and bearer-token guards, and model downloads are verified with SHA-256
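
As a rough sketch, a request to the local API could be built with the Python standard library as shown here. The endpoint path `/v1/audio/transcriptions` and the `model` field value are assumptions carried over from OpenAI's audio API, not confirmed details of speakfree, and the helper name is hypothetical. The bearer token is whatever value speakfree issues for the API.

```python
import urllib.request
import uuid

BASE_URL = "http://localhost:5765"       # from the release note above
ENDPOINT = "/v1/audio/transcriptions"    # assumed: OpenAI's path


def build_transcription_request(
    audio: bytes, filename: str, token: str, model: str = "whisper-1"
) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying one audio file."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # "model" form field (assumed; required by OpenAI's API).
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n".encode(),
        # "file" form field with the raw audio bytes.
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode(),
        audio,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it. If the server mirrors OpenAI's response shape, the transcript would be in the `text` field of the returned JSON, but that too is an assumption.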

v1.4.0

  • New Parakeet transcription engine (NVIDIA, via FluidAudio) — runs on the Apple Neural Engine for higher accuracy, with an optional one-time ~600 MB download
  • The model now warms up at launch so your first dictation isn't slow
  • macOS 14 Sonoma is now the minimum — required by the Parakeet engine
  • Fix: AirPods / Bluetooth handoff no longer crashes the app
  • Adaptive silence threshold improves transcription of quiet or rambling speech

v1.3.0

  • large-v3-turbo is now the default model — 4× faster than large-v3, comparable accuracy, 1.5 GB download vs 3.1 GB
  • Pre-recording buffer ensures the first word isn't cut off when you start speaking
  • Streaming preview now strips hallucination markers ([BLANK_AUDIO], etc.) from the live overlay
  • Fix: use-after-free in the streaming transcription callback that could crash during rapid hold/release
  • Fix: stale OCR screen context from a previous recording appearing in the current one
  • Fix: characters above U+FFFF (emoji, rare CJK) no longer silently dropped during insertion
  • Security: text insertion skipped when Secure Input is enabled (password fields, terminal secure mode); recordings dir 0700, transcripts 0600
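
Characters above U+FFFF take two UTF-16 code units (a surrogate pair), so an insertion path that counts or splits text by code unit can drop them. The sketch below shows one way to chunk text without splitting a pair. The function names and the chunk limit are hypothetical; this is not speakfree's insertion code.

```python
def utf16_units(ch: str) -> int:
    """Number of UTF-16 code units needed for one character."""
    return 2 if ord(ch) > 0xFFFF else 1


def chunk_by_utf16_units(text: str, max_units: int) -> list[str]:
    """Split text into chunks of at most max_units UTF-16 code units,
    never separating the two halves of a surrogate pair."""
    if max_units < 2:
        raise ValueError("max_units must be at least 2 so a pair fits")
    chunks: list[str] = []
    current: list[str] = []
    used = 0
    for ch in text:
        need = utf16_units(ch)
        if current and used + need > max_units:
            # Flush before the character that would overflow the chunk.
            chunks.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += need
    if current:
        chunks.append("".join(current))
    return chunks
```

Because the loop walks whole characters rather than code units, an emoji is always emitted inside a single chunk.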

v1.2.12

  • Fix: AirPods/Bluetooth handoff no longer crashes the app (AVAudioEngine NSException now caught instead of becoming SIGABRT)
  • Fix: spoken-punctuation wrapped in quotes ("Question mark" / "period.") now converts correctly
  • Fix: single-letter acronyms (U.S.A., A.B.C.) stay joined instead of being split into "U. S. A."
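
One way to keep single-letter acronyms together is a post-processing pass that rejoins them, sketched below. This is an assumed approach with a hypothetical function name; the actual fix may instead avoid splitting in the first place. The heuristic can misfire on text such as two consecutive single-letter initials that end separate sentences.

```python
import re

# A single capital letter plus ".", followed by whitespace and another
# single capital letter plus "." (checked with a lookahead, not consumed).
_SPLIT_ACRONYM = re.compile(r"\b([A-Z])\.\s+(?=[A-Z]\.)")


def join_acronyms(text: str) -> str:
    """Turn "U. S. A." back into "U.S.A." and leave other text alone."""
    return _SPLIT_ACRONYM.sub(r"\1.", text)
```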

v1.2.11

  • CRITICAL: Sparkle auto-update is now working — every previously shipped version had a broken signature, so this is the first version that can actually deliver updates
  • Fix: punctuation no longer degrades gradually (cursor context was being fed raw into Whisper's prompt, which made it amplify commas and apostrophes over time)
  • Fix: literal "comma" word no longer wrongly converted to a comma in phrases like "the comma issue"
  • Fix: decimals (4.30) preserved; word after "..." stays lowercase
  • Added: large-v3-turbo model option (faster and more accurate than medium.en) — selectable in Settings
  • Retention: recording-history cap applied by default (was previously unbounded; hidden config preserves the old behavior for power users)

v1.2.10

  • Fix: Google Docs in Chrome now gets text inserted via clipboard (Docs silently drops CGEvent unicode events)
  • Fix: Splashtop no longer pastes stale text — types characters directly via AppleScript keystroke, bypassing clipboard sync entirely
  • Fix: Whisper comma spam is now collapsed in post-processing when three or more consecutive single words are each followed by a comma ("I'll, try, to, make, it" → "I'll try to make it")
  • Fix: bundle script now always includes Sparkle.framework (previous builds sometimes missed it and crashed on launch)
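
The comma-spam rule above can be sketched in a few lines of Python. The threshold of three comes from the release note; the regular expression and function name are assumptions for illustration. As written, the rule would also strip commas from a genuine list of three or more single words, which is a known cost of this kind of heuristic.

```python
import re

# Three or more consecutive "word," tokens, each followed by whitespace.
_COMMA_RUN = re.compile(r"(?:\b[\w']+,\s+){3,}")


def collapse_comma_spam(text: str) -> str:
    """Remove the commas inside any run of 3+ single-word-comma tokens."""
    return _COMMA_RUN.sub(lambda m: m.group(0).replace(",", ""), text)
```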

v1.2.9

  • Fix: comma spam caused by Whisper mimicking a comma-heavy prompt instruction
  • Fix: only Superhuman uses clipboard paste fallback — VS Code, Signal, Slack etc. now use direct text insertion again
  • Screen context moved to Advanced, marked experimental (can cause hallucinations), off by default
  • Microphone + Apple Events entitlements added to build (fixes silent recordings)

v1.2.8

  • Fix: AirPods connect/disconnect no longer deadlocks the audio engine

v1.2.7

  • Fix: app works on first launch after every install — no longer resets accessibility permission on version upgrade (it persists automatically with Developer ID signing)

v1.2.6

  • Fix: keyboard modifier event tap moved off the main thread — eliminates "can't select text" and cursor flicker caused by system-wide event delivery delays
  • Fix: Accessibility polling after dictation moved to background thread — no more hover/cursor lag in the frontmost app
  • Fix: Electron apps (Slack, Discord, VS Code) now use clipboard paste — no more garbled characters
  • Fix: Remote desktop (Splashtop, TeamViewer) now uses AppleScript paste — no more stray "V" keystrokes
  • Fix: Whisper comma-before-capital sentence detection tuned to avoid false positives
  • Fix: Per-app writing style — trailing period stripped for messaging apps (Signal, Messages, Slack)

v1.2.5

  • Deadlock-free audio rebuild — engine teardown on background thread, AirPods handoff reliable
  • Remote desktop support — clipboard/AppleScript paste for Splashtop, TeamViewer, RDP, Parsec
  • Better sentence detection — comma before capital → period

Performance

  • Whisper runs in-process via Metal GPU — model stays loaded between dictations, no reload delay
  • Audio engine runs continuously with 500ms pre-buffer — never lose the first word
  • Smart model management: auto-unloads under memory pressure, reloads on next keypress
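
The pre-buffer idea can be illustrated with a bounded ring buffer: while idle, only the most recent 500 ms of samples are kept, and pressing the key starts the recording from that buffered audio. The class name, the method names, and the 16 kHz sample rate are assumptions for this sketch, not details taken from speakfree.

```python
from collections import deque


class PreRollBuffer:
    def __init__(self, sample_rate: int = 16_000, pre_roll_ms: int = 500):
        # The deque silently discards the oldest samples once it is full.
        self._ring = deque(maxlen=sample_rate * pre_roll_ms // 1000)
        self._recording = None  # a list while the key is held, else None

    def feed(self, samples) -> None:
        """Called continuously by the audio engine with new samples."""
        if self._recording is not None:
            self._recording.extend(samples)
        else:
            self._ring.extend(samples)

    def key_down(self) -> None:
        """Start recording, seeded with the audio captured before the press."""
        self._recording = list(self._ring)
        self._ring.clear()

    def key_up(self) -> list:
        """Stop recording and return pre-roll plus everything since key_down."""
        recording, self._recording = self._recording or [], None
        return recording
```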

Text insertion

  • Text inserted directly via Accessibility API — no clipboard hijacking for native apps
  • Clipboard paste with clobber-safety for Electron apps — your clipboard (including images) is restored
  • AppleScript keystroke for remote desktops — no manual configuration needed

Settings

  • New native macOS settings panel with inline model download
  • Language picker with autocomplete, per-language model selection
  • Hotkey recorder — click and press any key to set your shortcut
  • Usage stats: dictation count, keystrokes saved, time saved

Reliability

  • AirPods and audio device hot-switching — connect/disconnect without restarting
  • Silence trimming (VAD) and hallucination filtering for better accuracy
  • Spoken punctuation: say "period", "comma", "question mark" naturally
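
Hallucination filtering can take many forms. One simple part of it, removing bracketed markers such as the [BLANK_AUDIO] token named in the v1.3.0 notes, might look like the sketch below. The pattern and function name are assumptions, and the real filter is likely broader.

```python
import re

# Bracketed, all-caps markers such as [BLANK_AUDIO].
_MARKER = re.compile(r"\[[A-Z][A-Z_ ]*\]")


def strip_markers(text: str) -> str:
    """Drop marker tokens, then tidy up the whitespace they leave behind."""
    cleaned = _MARKER.sub(" ", text)
    return re.sub(r"\s+", " ", cleaned).strip()
```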

Experimental

  • Live streaming preview while you speak (off by default)

Self-contained app

  • No Homebrew or terminal needed — just drag to Applications
  • Whisper is bundled in the download
  • Signed and notarized by Apple
  • Automatic updates

Reliability

  • Globe key works reliably even after long sessions
  • Keyboard shortcuts like fn+F5 no longer trigger accidental recordings
  • Paste works consistently across all apps including Electron-based ones
  • Recovers gracefully if the configured model is missing

Smarter punctuation

  • Hybrid mode: Whisper auto-punctuates and you can say "period", "comma", etc.
  • No duplicate punctuation when spoken and auto-punctuation overlap
  • Won't accidentally replace words like "karma" or "comma-separated"
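
A small Python sketch of the hybrid behaviour described above: spoken commands become punctuation, punctuation that Whisper already added next to the command is absorbed so it is not duplicated, and hyphenated words such as "comma-separated" are left alone. The command table, pattern, and function name are hypothetical. Telling a command apart from a literal use ("the comma issue") needs context that this sketch does not model.

```python
import re

SPOKEN = {"period": ".", "comma": ",", "question mark": "?"}

_COMMANDS = "|".join(
    re.escape(word) for word in sorted(SPOKEN, key=len, reverse=True)
)
# Absorb whitespace/punctuation before the command and punctuation after
# it; skip commands that are the first half of a hyphenated word.
_PATTERN = re.compile(
    r"[\s,.?]*\b(" + _COMMANDS + r")\b(?!-)[,.?]*", re.IGNORECASE
)


def apply_spoken_punctuation(text: str) -> str:
    """Replace spoken punctuation commands with the marks they name."""
    return _PATTERN.sub(lambda m: SPOKEN[m.group(1).lower()], text)
```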

Better defaults

  • Globe/fn key instead of Right Option
  • Hybrid punctuation on by default
  • Settings accessible from the menu bar — no config files to edit

Quality of life

  • Context-aware: reads text before your cursor so Whisper matches your writing style
  • Help window with plain-English documentation
  • Animated menu bar icon shows recording, transcribing, and downloading state
  • "Copied to clipboard" fallback when the original text field can't be refocused