Skip to content

Prompting Techniques

The gap between a mediocre Claude session and an exceptional one is usually not the task — it is how the task is framed. These patterns come from Boris Cherny, Thariq, and the broader Anthropic engineering team. Most are non-obvious. None require special configuration.

ultrathink — Maximum Reasoning Effort

ultrathink is a keyword that activates Claude’s extended thinking mode. When you include it in a prompt, Claude allocates significantly more reasoning effort before responding — working through the problem step by step before producing output.

# Standard prompt
"What's causing the race condition in the auth service?"
# ultrathink prompt
"ultrathink: what's the root cause of this race condition in the auth service?"

The difference: without ultrathink, Claude reasons quickly and responds. With it, Claude thinks longer and harder before answering — appropriate for problems where the first answer is often wrong.

When to use ultrathink:

Good uses:
✓ Architecture decisions — "ultrathink: should we use event sourcing or CQRS for this billing system?"
✓ Complex bug root cause — "ultrathink: why does this only fail under concurrent load?"
✓ Security analysis — "ultrathink: what are the attack vectors in this authentication flow?"
✓ Performance diagnosis — "ultrathink: why is this query slow only for users with > 1000 records?"
✓ Tradeoff analysis — "ultrathink: what are the long-term consequences of this schema decision?"
Not worth it for:
✗ Simple code generation — just ask directly
✗ Lookup-style questions — "what's the syntax for X"
✗ Routine refactors — performance cost isn't justified

Why it works: Extended thinking gives Claude the equivalent of working memory for multi-step reasoning. Complex problems often have a non-obvious correct answer that only becomes clear after exploring several paths. ultrathink gives Claude space to explore before committing.

See /advanced/extended-thinking for the technical details.

”Grill Me on These Changes” — Peer Review Pattern

After implementing a feature, instead of asking Claude to review it, ask Claude to challenge it.

# Weak — invites surface-level review
"Review my implementation of the payment service."
# Strong — invites adversarial thinking
"Grill me on these changes. What are the failure modes? What did I miss?
What would break this in production?"

The word “grill” shifts Claude’s posture from collaborative to critical. Claude will:

  • Challenge architectural assumptions
  • Find edge cases you did not consider
  • Identify missing error handling
  • Point out race conditions or security gaps
  • Ask uncomfortable questions about requirements

Example exchange:

You: "I've implemented the webhook handler. Grill me on these changes."
Claude: "A few things concern me:
1. No idempotency key handling — if the webhook fires twice (common with
Stripe), you'll create duplicate records. How are you preventing that?
2. The signature verification happens after you've already begun processing
the payload. An attacker who can send malformed payloads can cause errors
before verification runs. Verify first, process second.
3. You're returning 200 before the database write completes. If the write
fails, Stripe thinks the event was processed and won't retry. The 200
should come after the write.
4. No timeout on the database write — a slow DB will leave the webhook
request hanging until Stripe's 30-second timeout, which counts as a
failure and triggers a retry.
Which of these do you want to address first?"

This catches what you did not think to check. The pattern works because Claude is not trying to be polite — it is explicitly asked to find problems.

”Knowing What You Know Now, Implement the Elegant Version” — Iteration Pattern

After a working but messy first implementation, this prompt reliably produces a clean refactor.

# After getting working but ugly code:
"Knowing what you know now about the full requirements and how this
feature actually works, implement the elegant version."

Why this beats “clean this up”:

When Claude implements a feature, it learns things that were not in the original spec — edge cases that emerged, data shapes that became clear, control flow that simplified in one direction but not another. “Clean this up” asks Claude to improve the surface without rethinking the structure. “Elegant version” asks Claude to use that accumulated understanding to redesign from a position of full knowledge.

The result is typically:

  • Fewer functions doing more focused things
  • Removed abstractions that turned out to be unnecessary
  • Simplified control flow
  • Better names for things that became clear during implementation

When to use it:

Good fit:
✓ After a complex feature that required several correction rounds
✓ After a port or migration where you now understand both old and new
✓ After a performance fix where the correct data structure is now obvious
✓ After implementing tests that revealed the real interface
Not the right tool:
✗ On code that is already clean — you will lose specificity
✗ When the "ugliness" is required by external constraints (API compatibility, etc.)

Interview-Style Requirements Gathering

When you are not sure what you need, do not write a PRD. Let Claude interview you.

# Instead of writing requirements upfront:
"Interview me about the requirements for the notification system.
Ask one question at a time. Do not start implementing until you have
asked at least 8 questions and I have told you we are ready."

Claude will ask the questions you forgot to answer:

Claude: "Who are the recipients of notifications — just end users,
or also internal team members?"
You: "Both."
Claude: "Should users be able to configure which notification types
they receive, or is it all-or-nothing?"
You: "Configurable per type."
Claude: "What happens to notifications sent while a user is offline —
do they accumulate and show when they return, or expire?"
You: "Good question — accumulate, but expire after 30 days."
...

By question 8 you have answered things you would not have thought to specify. The implementation that follows is more accurate than anything produced from a pre-written spec.

The deeper principle: Claude asking questions is cheaper than Claude implementing the wrong thing and you discovering it three rounds later.

Phase-Wise Gated Plans — Test at Every Boundary

Never implement all phases of a feature and then test. Test at each phase boundary.

# Wrong order
"Implement all 5 phases of the billing refactor, then run the tests."
# Right order
"Implement phase 1. Run the unit tests for phase 1. If they pass,
tell me and wait. I will tell you when to proceed to phase 2."

Each phase gate:

  1. Implement the phase
  2. Run unit tests for that phase
  3. Run integration tests that touch it
  4. Confirm no regressions in adjacent code
  5. Get explicit approval before proceeding

This prevents the worst failure mode in large Claude sessions: discovering at step 5 that step 2 has a fundamental flaw that requires rewinding everything.

Second Claude as staff engineer:

Before implementing a multi-phase plan, have a separate Claude instance review it:

"Here is my implementation plan for [feature].
You are a senior staff engineer who is skeptical of this plan.
What is wrong with it? What risks am I underestimating?
What would you change before approving this plan?"

This review happens before any code is written, when changes are free. After implementation, changes are expensive.

”Fix” — Trust Claude with Bugs

When something is broken, paste the error and say “fix.” Nothing else.

# With prescriptive guidance (often misleads)
"The login is broken because I think there's a JWT expiration issue.
Can you check the token validation logic and fix the expiry check?"
# Without prescriptive guidance
[paste error + stack trace]
"fix"

Why less is more for bug fixes:

Your theory about the root cause is often wrong. When you provide it, Claude anchors on your theory and looks for evidence to support it, even when the actual problem is somewhere else entirely.

Without a theory, Claude reads the full error, follows the stack trace, examines the actual failure point, and finds the real cause. It often finds bugs that are one or two levels removed from where you were looking.

Use prescriptive guidance when you are certain of the cause. Use bare “fix” when you are guessing.

/careful Hook — Guarding Destructive Commands

The /careful pattern activates pre-execution approval for commands that cannot be undone.

Terminal window
# Activate careful mode
/careful on
# Now Claude will pause and request confirmation before running:
# - git push --force
# - rm -rf (any path)
# - DROP TABLE (any SQL)
# - Any production deployment command
# - kubectl delete
# - Any command matching your configured destructive patterns

When Claude encounters a dangerous command in careful mode:

Claude: "I'm about to run: git push --force origin main
This will overwrite the remote history.
This cannot be undone without a force push from a backup.
Confirm? [yes/no]"

How to configure what gets intercepted:

.claude/careful-patterns.json
{
"intercept": [
"git push --force",
"git push -f",
"rm -rf",
"DROP TABLE",
"DROP DATABASE",
"kubectl delete",
"vercel --prod"
],
"auto-approve": [
"rm -rf node_modules",
"rm -rf .next",
"rm -rf dist"
]
}

auto-approve lists patterns that look destructive but are safe in your context (build artifacts, caches). Everything else in intercept requires explicit confirmation.

Deactivating:

Terminal window
/careful off # Disable for current session
/careful once # Disable for next command only

The Prompting Escalation Ladder

graph TD A[Simple request] --> B{Working?} B -->|Yes| G[Done] B -->|No - need peer review| C[grill me on these changes] C --> D{Fundamental issue?} D -->|No| G D -->|Yes - need deep reasoning| E[ultrathink] E --> F{Architecture decision?} F -->|No| G F -->|Yes - need plan review| H[Second Claude as staff engineer] H --> G style A fill:#1e293b,color:#7dd3fc,stroke:#334155 style B fill:#1e293b,color:#fcd34d,stroke:#334155 style C fill:#1e293b,color:#7dd3fc,stroke:#334155 style D fill:#1e293b,color:#fcd34d,stroke:#334155 style E fill:#1e293b,color:#7dd3fc,stroke:#334155 style F fill:#1e293b,color:#fcd34d,stroke:#334155 style G fill:#1e293b,color:#86efac,stroke:#334155 style H fill:#1e293b,color:#7dd3fc,stroke:#334155

Start with the simplest prompt that could work. Escalate only when the output is insufficient. ultrathink on every prompt is expensive and unnecessary. “Grill me” on a five-line function is overkill. The ladder helps you calibrate.

Quick Reference

ultrathink → max reasoning effort for hard problems
"grill me" → adversarial review, find failure modes
"elegant version" → refactor with full requirements knowledge
"interview me" → Claude gathers requirements via questions
phase-wise + test gates → implement in stages, verify at each boundary
second Claude as staff → review plans before implementing
"fix" (bare) → trust Claude to find root cause without leading it
/careful on → approval gate for destructive commands

Attribution: ultrathink, “grill me”, and “elegant version” patterns are from Boris Cherny’s public posts (March 2026). Interview-style requirements and phase-wise gating emerged from Anthropic engineering team practices. Boris created Claude Code at Anthropic.