Voice-First Coding

Boris Cherny does most of his coding by speaking to Claude rather than typing. This is not a novelty — it is a productivity pattern that changes how much you can accomplish in a session.

When the friction of expressing an idea drops close to zero, you explore more ideas. You describe requirements in full sentences instead of terse prompts. You correct Claude’s misunderstandings immediately instead of pausing to rephrase. You stay in flow instead of context-switching to the keyboard.

Why Voice Changes the Productivity Equation

Typing is a bottleneck between your thoughts and Claude. The slower and more effortful it is to express what you want, the more you simplify your instructions — which means Claude gets less context and produces less accurate output.

Voice removes that bottleneck for requirements and direction. You still use the keyboard for exact strings, file paths, and code snippets that must be precise. But the narrative — the “here’s what I’m trying to do, here’s the constraint, here’s why the current approach is wrong” — flows naturally as speech.

The result: Claude receives richer context, produces better first drafts, and needs fewer correction rounds.

Setup Options

Option 1: Claude Code Desktop — Built-In Voice Button

The simplest path. No configuration required.

Open the Claude Code Desktop app. A microphone button appears in the input area. Click it, speak, click again. Your words appear as text in the prompt field. Review, edit if needed, send.

This is the right starting point if you have never tried voice coding before.

Option 2: CLI `/voice` Command

From the Claude Code CLI:

# Enable voice input mode
/voice

# Claude Code activates your system microphone
# Speak your prompt — press Enter or say "send" to submit
# Say "cancel" to discard without sending

Configure the voice language if your system default is not what you want:

/voice --lang vi       # Vietnamese
/voice --lang ja       # Japanese
/voice --lang es       # Spanish

Option 3: System-Level Speech-to-Text

Any tool that converts speech to text in the active input field works with the Claude Code CLI. The text arrives in the terminal as if you typed it.

macOS — built-in dictation:

System Settings → Keyboard → Dictation → On
Shortcut: press the Fn key twice, or whichever shortcut is selected in the Dictation settings
Speak → text appears wherever your cursor is

Windows — built-in voice typing:

Shortcut: Win + H
Click the microphone → speak → text appears in active field
Works in any terminal application including Windows Terminal

Linux — whisper-cli:

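# Install whisper.cpp for offline transcription
brew install whisper-cpp   # via Homebrew on Linux, or build from source

# Record 16 kHz mono audio with arecord (from alsa-utils; Ctrl+C to stop), then transcribe.
# The model path is an assumption: point it at the ggml model file you downloaded.
arecord -f S16_LE -r 16000 -c 1 recording.wav
whisper-cli --model models/ggml-base.en.bin --output-txt --file recording.wav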

Option 4: Wispr Flow (Boris’s Preferred)

Wispr Flow is a cross-application dictation tool that integrates with any text field — terminals, browsers, editors, chat apps. Unlike system dictation, it understands technical vocabulary and can be trained on your personal word patterns.

Boris uses Wispr Flow because it handles technical jargon (package names, framework terms, variable names) better than general-purpose dictation.

Setup:
1. Install Wispr Flow
2. Set a global activation shortcut (e.g., Option+Space)
3. Click into any text field — terminal, browser, IDE
4. Press shortcut → speak → text is inserted

Wispr Flow works across the entire OS, not just Claude Code. Your voice becomes the input method for everything — emails, documentation, commit messages, Slack.

Option 5: Superwhisper (Mac Alternative)

Superwhisper is a macOS-native alternative to Wispr Flow. It uses local Whisper models for transcription, meaning your audio never leaves your machine — relevant if you work with sensitive codebases.

Setup:
1. Install Superwhisper
2. Grant microphone permissions
3. Set global shortcut
4. Speak in any application

The tradeoff vs. Wispr Flow: Superwhisper is fully local (more private, works offline) but requires more CPU and may be slower on older hardware.

How to Dictate Effectively

Use voice for direction, keyboard for precision

Voice is good for:
  "Refactor the authentication middleware to use the new token validation
   service we discussed. The old approach has a race condition when tokens
   expire during a request — the new service handles that atomically."

Keyboard is better for:
  import { TokenService } from '@auth/token-service'
  /path/to/specific/file.ts
  exact error messages to paste

Think of voice as your architecture and intent layer. Type the things that must be exact.

Natural language for complex requirements

Voice enables you to give Claude the context it would otherwise miss:

Typed (terse, low context):
  "fix the user query"

Spoken (natural, high context):
  "The user list query is timing out in production when there are more than
   ten thousand users. I think it's because we're loading all related posts
   for each user in the list view even though we only show the post count.
   Fix this — we should only fetch the count, not the full post objects.
   Make sure the tests still pass after the change."

The spoken version gives Claude enough context to find the right fix the first time.
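To make the spoken request concrete, here is a minimal in-memory sketch of the change it describes: the list view receives a per-user post count instead of full post objects. The `User`, `Post`, and `UserListRow` types and the `listUsersWithPostCount` function are hypothetical names chosen for illustration. A real fix would most likely push the count into the database query, which this sketch only imitates in memory.

```typescript
// Hypothetical data shapes, for illustration only.
interface User {
  id: number;
  name: string;
}

interface Post {
  id: number;
  authorId: number;
  body: string;
}

interface UserListRow {
  id: number;
  name: string;
  postCount: number;
}

// Count posts per author in a single pass instead of attaching the full
// post objects to every user in the list.
function listUsersWithPostCount(users: User[], posts: Post[]): UserListRow[] {
  const counts = new Map<number, number>();
  for (const post of posts) {
    counts.set(post.authorId, (counts.get(post.authorId) ?? 0) + 1);
  }
  return users.map((user) => ({
    id: user.id,
    name: user.name,
    postCount: counts.get(user.id) ?? 0,
  }));
}
```

Each row carries only the fields the list view shows, so the size of the result no longer grows with the number of posts.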

How to correct misinterpretations via voice

When Claude misunderstands, respond naturally:

"No, that's not what I meant. Don't change the schema — the schema is fine.
 What I want is to change the query that reads from that schema.
 The query in user-repository.ts, the findAll method."

Do not re-explain from scratch. Correct the specific misunderstanding and continue.

Punctuation and special characters

Most voice tools handle natural punctuation. Speak it explicitly when you need it:

"open paren, close paren, arrow"   → ()=>
"backtick"                          → `
"dollar sign"                       → $
"at sign"                           → @

For code snippets with many special characters, switch to keyboard. Do not fight the voice tool on things keyboards handle better.
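The spoken-name mapping above can be pictured as a simple lookup. The following is an illustrative sketch only: `SPOKEN_SYMBOLS` and `spokenToSymbols` are hypothetical names, and this is not how any particular dictation tool is implemented.

```typescript
// Hypothetical lookup table built from the examples above.
const SPOKEN_SYMBOLS = new Map<string, string>([
  ["open paren", "("],
  ["close paren", ")"],
  ["arrow", "=>"],
  ["backtick", "`"],
  ["dollar sign", "$"],
  ["at sign", "@"],
]);

// Replace each comma-separated spoken name with its symbol and join the
// results. Names missing from the table pass through unchanged.
function spokenToSymbols(phrase: string): string {
  return phrase
    .split(",")
    .map((part) => part.trim().toLowerCase())
    .map((name) => SPOKEN_SYMBOLS.get(name) ?? name)
    .join("");
}

console.log(spokenToSymbols("open paren, close paren, arrow")); // prints "()=>"
```

Even in this toy form, six tokens of speech produce four characters, which is why typing wins for symbol-heavy snippets.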

Multilingual Support

Claude Code’s voice input supports 20+ languages. You can switch languages mid-session by changing your dictation tool’s language setting.

# CLI voice with explicit language
/voice --lang vi    # Vietnamese
/voice --lang zh    # Chinese (Simplified)
/voice --lang fr    # French
/voice --lang de    # German
/voice --lang ja    # Japanese
/voice --lang ko    # Korean
/voice --lang pt    # Portuguese
/voice --lang es    # Spanish

Wispr Flow and Superwhisper both support multiple languages and can detect the language automatically if you switch mid-sentence.

Example Voice Session

A realistic transcript showing the full voice-first workflow:

Voice: “I need a React hook that fetches user data with loading and error states. It should accept a user ID, call the user service’s get method, and return the user object plus loading and error booleans.”

Claude writes useUser hook with fetch logic, loading state, error handling.

Voice: “Good. Now add a cache so if we fetch the same user ID twice within five minutes, we use the cached value instead of making another network request.”

Claude adds a Map-based cache with TTL logic.

Voice: “The cache should be module-level, not inside the hook — otherwise every component using this hook gets its own cache.”

Claude moves cache declaration outside the function.

Voice: “Run the tests for this hook.”

Claude runs test suite, all pass.

Voice: “Add a test for the cache hit scenario — verify that the service is only called once when the same ID is fetched twice within the TTL.”

Claude writes the test, runs it, confirms it passes.

Voice: “Looks good. Open a PR.”

Six spoken prompts. A complete feature with tests, ready for review. The session took 8 minutes.
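The caching steps in the transcript can be sketched without React. This is a minimal sketch, not the code Claude produced in the session. It assumes a hypothetical `User` shape, a `fetchUser` function standing in for the user service's get method, and an injectable `now` clock that exists only to make the five-minute TTL testable.

```typescript
// Hypothetical shapes: the session never shows the real types.
interface User {
  id: string;
  name: string;
}

type FetchUser = (id: string) => Promise<User>;

const CACHE_TTL_MS = 5 * 60 * 1000; // "within five minutes"

// Module-level: declared once per module, so every caller shares it.
// Declared inside the hook, each component would get its own cache.
const userCache = new Map<string, { user: User; fetchedAt: number }>();

async function getUserCached(
  id: string,
  fetchUser: FetchUser,
  now: () => number = Date.now,
): Promise<User> {
  const hit = userCache.get(id);
  if (hit !== undefined && now() - hit.fetchedAt < CACHE_TTL_MS) {
    return hit.user; // cache hit: no network request
  }
  const user = await fetchUser(id);
  userCache.set(id, { user, fetchedAt: now() });
  return user;
}
```

The cache-hit test requested in the session amounts to calling `getUserCached` twice with the same ID inside the TTL and checking that `fetchUser` ran only once.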

Productivity Workflow

Boris’s standard voice flow for a new feature:

1. Voice brief     → describe the feature fully, including context and constraints
2. Claude plans    → Claude outlines its approach, asks clarifying questions
3. Voice approval  → "yes, do it" or "no, the approach should be X instead"
4. Claude builds   → implements + runs verification
5. Voice review    → describe any changes needed
6. Iterate         → repeat steps 4-5 until verification passes
7. Voice merge     → "open a PR" / "squash merge this"

The keyboard appears at steps where you are reviewing Claude’s output (reading code) or providing exact values. Everything directional is voice.

Common Challenges

Technical jargon

Package names, framework terms, and proper nouns are the hardest for voice recognition.

Hard to dictate accurately:
  "useQueryClient"     → often transcribed as "use query client" (wrong casing)
  "@tanstack/query"    → hit or miss
  "useState"           → usually fine

Workarounds:
  - Say "camel case use query client" if your tool supports formatting commands
  - Type proper nouns directly, dictate everything around them
  - Train Wispr Flow on your personal vocabulary
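To illustrate what a "camel case" formatting command does to dictated words, here is a minimal sketch. The `toCamelCase` function is a hypothetical helper written for this example. It does not describe the behavior of Wispr Flow, Superwhisper, or any other tool.

```typescript
// Join dictated words into a camelCase identifier: the first word stays
// lowercase and every later word gets an uppercase first letter.
function toCamelCase(spoken: string): string {
  const words = spoken
    .trim()
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => word.length > 0);
  return words
    .map((word, index) =>
      index === 0 ? word : word.charAt(0).toUpperCase() + word.slice(1),
    )
    .join("");
}

console.log(toCamelCase("use query client")); // prints "useQueryClient"
```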

Package names and paths

"install tanstack slash react-dash-query"  → doesn't work well

Better approach:
  1. Voice: "add TanStack Query as a dependency"
  2. Claude writes the install command
  3. You press Enter to confirm rather than re-dictating

Code snippets from errors

Pasting an error message is faster than reading it aloud. When Claude needs to see an exact error:

# Copy the error, then tell Claude:
# Voice: "here's the error I'm seeing"
# Then paste it in the terminal

/advanced/voice — full technical setup guide for all voice options
/advanced/btw — voice-friendly background questions while Claude works

Attribution: Boris Cherny described using voice as his primary coding interface in public posts (March 2026). Boris created Claude Code at Anthropic.