Tool Orchestration
When Claude needs to read 3 files and run a test, does it do them one at a time? No — it’s smarter than that.
The Tool Orchestration layer sits between the Agent Loop and the Permission Pipeline. Its job: take a batch of tool calls from the LLM, classify each one, and schedule them for maximum throughput without risking data corruption.
The Partitioning Problem
When the LLM returns multiple tool calls in a single response, naive execution runs them sequentially. That’s slow and unnecessary. But running all of them in parallel is unsafe: a FileWrite that runs while a FileRead of the same file is mid-execution can corrupt the read result.
The solution is concurrency-safe partitioning: group tools into batches where all tools in a batch can safely run together.
Concurrency-Safe Partitioning
Here is a concrete example with 5 tool calls in one LLM response (two file reads, a grep, a bash command, and a file write):
Result: 3 execution steps instead of 5 sequential calls. The two file reads and the grep complete in parallel, then the bash runs alone, then the write runs alone.
Safe by Default
The classification logic follows a strict rule: every tool defaults to exclusive (serial) unless it explicitly declares itself concurrent-safe.
    FUNCTION classifyTool(tool, input):
        IF tool.isConcurrencySafe(input) == true:
            RETURN "concurrent"   // can run in parallel with others
        RETURN "exclusive"        // default: runs alone, blocks all others

The critical insight: the decision is per-invocation, not per-tool-type. The same Bash tool can be concurrent-safe or exclusive depending on the command it’s running.
    Bash("cat src/config.ts")     → read-only → concurrent-safe
    Bash("npm install")           → mutating  → exclusive
    Bash("git status")            → read-only → concurrent-safe
    Bash("git commit -m 'fix'")   → mutating  → exclusive

The tool itself inspects its input and decides.
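The per-invocation decision can be sketched in TypeScript. This is a minimal sketch under stated assumptions, not the actual implementation: the `Tool` interface, the `bashTool` and `fileWriteTool` objects, and the shortened read-only list are hypothetical, chosen only to reproduce the four Bash examples given here.

```typescript
type Classification = "concurrent" | "exclusive";

// Hypothetical tool shape: the safety check is optional, so a tool that
// does not declare one is treated as exclusive.
interface Tool<Input> {
  name: string;
  isConcurrencySafe?: (input: Input) => boolean;
}

function classifyTool<Input>(tool: Tool<Input>, input: Input): Classification {
  try {
    return tool.isConcurrencySafe?.(input) === true ? "concurrent" : "exclusive";
  } catch {
    return "exclusive"; // a failing check falls back to the safe default
  }
}

// Illustrative subset of read-only command prefixes (assumption).
const READ_ONLY_PREFIXES = ["cat", "ls", "git status", "git log", "git diff"];
const SHELL_OPERATORS = [">", "|", ";", "&&"];

const bashTool: Tool<{ command: string }> = {
  name: "Bash",
  isConcurrencySafe: ({ command }) => {
    const cmd = command.trim();
    // Match whole command words so "cat" does not match "catalog".
    const readOnly = READ_ONLY_PREFIXES.some(
      (prefix) => cmd === prefix || cmd.startsWith(prefix + " "),
    );
    return readOnly && !SHELL_OPERATORS.some((op) => cmd.includes(op));
  },
};

// Declares no safety check at all, so it is always exclusive.
const fileWriteTool: Tool<{ path: string }> = { name: "FileWrite" };

const commands = ["cat src/config.ts", "npm install", "git status", "git commit -m 'fix'"];
const classified = commands.map((command) => classifyTool(bashTool, { command }));
console.log(classified);
```

The same `bashTool` object yields both answers, which is the point: the classification depends on the input and not on the tool type.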
Deep Dive: The Partitioning Algorithm
The partitioning uses a reduce-based approach that builds batches as it processes each tool call:
    FUNCTION partitionIntoBatches(toolCalls, context):
        RETURN toolCalls.reduce((batches, toolCall) => {
            tool = findToolByName(toolCall.name)
            input = tool.inputSchema.safeParse(toolCall.input)

            // Determine concurrency safety for THIS specific invocation
            isSafe = false
            IF input.success:
                TRY:
                    isSafe = tool.isConcurrencySafe(input.data)
                CATCH:
                    isSafe = false   // If check throws → default exclusive

            lastBatch = batches[batches.length - 1]

            // Can we append to the current batch? (concurrent batches are capped at 10 tools)
            IF isSafe AND lastBatch?.isConcurrent AND lastBatch.tools.length < 10:
                lastBatch.tools.push(toolCall)   // Extend concurrent batch
            ELSE:
                // Start a new batch (concurrent or exclusive)
                batches.push({ isConcurrent: isSafe, tools: [toolCall] })

            RETURN batches
        }, [])

Key rules:
- Maximum 10 tools per concurrent batch — if 15 FileReads arrive, they split into batches of 10 + 5
- Consecutive concurrent tools are merged — FileRead + Grep + FileRead = one concurrent batch
- One exclusive tool breaks the chain — any exclusive tool creates a new batch boundary
- Input parsing failure → exclusive — if the tool’s input can’t be validated, it runs alone for safety
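The rules above can be sketched as a small runnable TypeScript function. This is an illustration under assumptions, not the actual implementation: the `safetyChecks` registry is hypothetical, schema validation is omitted, and a plain loop stands in for the reduce-based version.

```typescript
type ToolCall = { name: string; input: unknown };
type Batch = { isConcurrent: boolean; calls: ToolCall[] };
type SafetyCheck = (input: unknown) => boolean;

const MAX_CONCURRENT = 10;

// Hypothetical registry: tool name -> per-invocation safety check.
const safetyChecks: Record<string, SafetyCheck> = {
  FileRead: () => true,
  Grep: () => true,
  Bash: () => false, // simplified: treat every command as mutating
  FileWrite: () => false,
};

function isSafe(call: ToolCall): boolean {
  const check = safetyChecks[call.name];
  if (!check) return false; // unknown tool -> exclusive
  try {
    return check(call.input);
  } catch {
    return false; // a throwing check -> exclusive
  }
}

function partitionIntoBatches(calls: ToolCall[]): Batch[] {
  const batches: Batch[] = [];
  for (const call of calls) {
    const safe = isSafe(call);
    const last = batches[batches.length - 1];
    const canExtend =
      safe && last !== undefined && last.isConcurrent && last.calls.length < MAX_CONCURRENT;
    if (canExtend && last !== undefined) {
      last.calls.push(call); // extend the current concurrent batch
    } else {
      batches.push({ isConcurrent: safe, calls: [call] }); // start a new batch
    }
  }
  return batches;
}

// The 5-call example: two reads and a grep, then a bash command, then a write.
const example: ToolCall[] = [
  { name: "FileRead", input: { path: "a.ts" } },
  { name: "FileRead", input: { path: "b.ts" } },
  { name: "Grep", input: { pattern: "TODO" } },
  { name: "Bash", input: { command: "npm test" } },
  { name: "FileWrite", input: { path: "a.ts" } },
];
const sizes = partitionIntoBatches(example).map((batch) => batch.calls.length);
console.log(sizes); // [3, 1, 1]
```

Because batches are built strictly in arrival order, an exclusive call never moves ahead of or behind its neighbours. The partitioning only decides which adjacent calls may overlap.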
BashTool’s concurrency check inspects the actual command:
    // Inside BashTool.isConcurrencySafe(input):
    FUNCTION isReadOnly(command):
        readOnlyCommands = [cat, head, tail, less, wc, ls, find, stat, file, du, df,
                            git status, git log, git diff, git show, git branch,
                            grep, rg, ag, ack, echo, printf, env, printenv,
                            whoami, hostname, date, uname]

        // Check: does the command start with a read-only command?
        // AND: does it NOT contain mutation operators? (>, >>, |, ;, &&)
        RETURN startsWithAny(command, readOnlyCommands)
           AND NOT containsMutationOperator(command)

Which Tools Are Concurrent-Safe?
| Tool | Concurrent-Safe? | Notes |
|---|---|---|
| FileRead | Yes | Read-only by definition |
| Grep | Yes | Read-only search |
| Glob | Yes | Read-only pattern match |
| LSP (language server queries) | Yes | Read-only semantic queries |
| TaskGet / TaskList | Yes | Read-only task queries |
| WebFetch | Yes | Network read, no local mutation |
| WebSearch | Yes | Network read, no local mutation |
| Bash | Depends | Inspects command; defaults to exclusive |
| FileWrite | No | Always exclusive |
| FileEdit | No | Always exclusive |
| Agent (subagent spawn) | No | Subagent may call any tool |
| MCP tools | Depends | Declared by MCP server |
Streaming Execution
Tools do not wait for the full LLM response to finish before starting. As soon as a tool_use block arrives in the API stream, the streaming executor begins running it immediately.
By the time the LLM finishes generating, the file reads are already complete. The LLM’s “thinking time” overlaps with tool execution, reducing perceived latency significantly.
Deep Dive: Streaming Executor Timing
The streaming executor starts as soon as the API stream begins delivering tool_use blocks. Here’s the exact sequence:
- First byte of tool_use block arrives → executor registers the tool but waits for the full block
- Full tool_use block received (input JSON complete) → executor begins execution immediately
- If tool is concurrent-safe → starts in parallel with any other concurrent tools already running
- If tool is exclusive → queued until all current concurrent tools finish
- LLM stream ends → getRemainingResults() called to collect any tools still running
- All tools complete → results collected, context modifiers applied, loop continues
If a concurrent tool completes before the LLM finishes streaming, its result is held in the executor’s buffer. Results are only yielded to the UI and added to the conversation after the full response is received — preserving message ordering.
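The sequence above can be approximated with a small promise-based sketch. Everything here is an assumption made for illustration: the `StreamingExecutor` class shape, the `onToolUse` method, and the timer-based fake tools are hypothetical, and only the `getRemainingResults()` name comes from the text. The sketch also assumes that a concurrent-safe tool arriving after an exclusive one waits for that exclusive tool to finish.

```typescript
type ToolResult = { id: string; output: string };
type Runner = () => Promise<string>;

class StreamingExecutor {
  private results: Promise<ToolResult>[] = [];
  // Resolves once every tool registered so far has settled.
  private allSoFar: Promise<void> = Promise.resolve();
  // Resolves once the most recent exclusive tool has settled.
  private exclusiveDone: Promise<void> = Promise.resolve();

  // Called when a complete tool_use block has arrived from the stream.
  onToolUse(id: string, concurrentSafe: boolean, run: Runner): void {
    // Concurrent tools only wait for earlier exclusive tools;
    // exclusive tools wait for everything registered before them.
    const gate = concurrentSafe ? this.exclusiveDone : this.allSoFar;
    const result = gate.then(run).then((output) => ({ id, output }));
    this.results.push(result); // buffered, not yet yielded
    const settled = result.then((): void => {}, (): void => {});
    this.allSoFar = Promise.all([this.allSoFar, settled]).then((): void => {});
    if (!concurrentSafe) this.exclusiveDone = this.allSoFar;
  }

  // Called when the LLM stream ends; results come back in arrival order.
  getRemainingResults(): Promise<ToolResult[]> {
    return Promise.all(this.results);
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function demo(): Promise<{ order: string[]; ids: string[] }> {
  const order: string[] = [];
  const fakeTool = (id: string, ms: number): Runner => async () => {
    order.push(`start:${id}`);
    await sleep(ms);
    order.push(`end:${id}`);
    return `${id} ok`;
  };
  const executor = new StreamingExecutor();
  executor.onToolUse("read-slow", true, fakeTool("read-slow", 30));
  executor.onToolUse("read-fast", true, fakeTool("read-fast", 5));
  executor.onToolUse("write", false, fakeTool("write", 5));
  const results = await executor.getRemainingResults();
  return { order, ids: results.map((r) => r.id) };
}

demo().then(({ order, ids }) => console.log(ids, order));
```

In this sketch the fast read finishes before the slow one, yet the collected results still come back in arrival order, which mirrors the buffering behaviour described above.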
Context Modifier Chain
After a tool runs, it can modify shared state that subsequent tools will see. This enables coordination between sequential steps.
    FUNCTION applyContextModifiers(toolResult):
        IF toolResult.changedFiles:
            context.changedFilesList.add(toolResult.changedFiles)

        IF toolResult.memoryPrefetch:
            context.prefetchedMemory.merge(toolResult.memoryPrefetch)

Only serial (exclusive) tools can apply context modifiers. If concurrent tools modified shared context, the order of modifications would be non-deterministic, which is a race condition. By restricting context modification to serial execution, the system guarantees that the context seen by tool N+1 is deterministic.
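A typed version of the modifier step might look like the following sketch. The field types are assumptions: the source names `changedFilesList` and `prefetchedMemory` but does not say what they hold, so a `Set` of paths and a `Map` of strings are used here purely for illustration.

```typescript
interface ToolResult {
  changedFiles?: string[];
  memoryPrefetch?: Record<string, string>;
}

// Hypothetical shape of the shared context.
interface SharedContext {
  changedFilesList: Set<string>;
  prefetchedMemory: Map<string, string>;
}

function applyContextModifiers(context: SharedContext, result: ToolResult): void {
  for (const file of result.changedFiles ?? []) {
    context.changedFilesList.add(file);
  }
  for (const [key, value] of Object.entries(result.memoryPrefetch ?? {})) {
    context.prefetchedMemory.set(key, value);
  }
}

const context: SharedContext = {
  changedFilesList: new Set<string>(),
  prefetchedMemory: new Map<string, string>(),
};

// Two exclusive tools run one after the other, so the second modifier
// always sees the state left behind by the first.
applyContextModifiers(context, { changedFiles: ["src/config.ts"] });
applyContextModifiers(context, {
  changedFiles: ["src/index.ts"],
  memoryPrefetch: { "src/config.ts": "cached summary" },
});

console.log(Array.from(context.changedFilesList));
```

Because the calls are applied strictly one at a time, the final state does not depend on timing, which is the determinism guarantee the paragraph above describes.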
Batch Execution Pseudocode
    FUNCTION executeBatches(toolCalls, context):
        batches = partitionIntoBatches(toolCalls, context)
        allResults = []

        FOR batch IN batches:
            IF batch.isConcurrent:
                // All tools in batch run in parallel (up to 10)
                results = await Promise.all(
                    batch.tools.map(tool => runWithPermission(tool))
                )
                allResults.appendAll(results)
            ELSE:
                // Exclusive: run one tool, then wait, then next
                FOR tool IN batch.tools:
                    result = await runWithPermission(tool)
                    applyContextModifiers(result)
                    allResults.append(result)

        RETURN allResults

The runWithPermission call routes through the Permission Pipeline before the tool executes. No tool runs without clearing all 6 permission layers.
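For readers who prefer runnable code, here is a minimal TypeScript sketch of the batch loop. It is not the real executor: `runWithPermission` is a hypothetical stub that skips the Permission Pipeline entirely, and the batches are written out by hand, not produced by the partitioner.

```typescript
type ToolCall = { name: string };
type Batch = { isConcurrent: boolean; tools: ToolCall[] };
type ToolResult = { name: string; changedFiles: string[] };

const changedFilesList = new Set<string>();

// Hypothetical stand-in for the Permission Pipeline plus the tool itself.
async function runWithPermission(tool: ToolCall): Promise<ToolResult> {
  const changedFiles = tool.name === "FileWrite" ? ["src/config.ts"] : [];
  return { name: tool.name, changedFiles };
}

function applyContextModifiers(result: ToolResult): void {
  result.changedFiles.forEach((file) => changedFilesList.add(file));
}

async function executeBatches(batches: Batch[]): Promise<ToolResult[]> {
  const allResults: ToolResult[] = [];
  for (const batch of batches) {
    if (batch.isConcurrent) {
      // Concurrent batch: start every tool, then wait for all of them.
      const results = await Promise.all(batch.tools.map((tool) => runWithPermission(tool)));
      allResults.push(...results);
    } else {
      // Exclusive batch: one tool at a time, applying context changes in order.
      for (const tool of batch.tools) {
        const result = await runWithPermission(tool);
        applyContextModifiers(result);
        allResults.push(result);
      }
    }
  }
  return allResults;
}

const exampleBatches: Batch[] = [
  { isConcurrent: true, tools: [{ name: "FileRead" }, { name: "FileRead" }, { name: "Grep" }] },
  { isConcurrent: false, tools: [{ name: "Bash" }] },
  { isConcurrent: false, tools: [{ name: "FileWrite" }] },
];

const run = executeBatches(exampleBatches);
run.then((results) => console.log(results.map((r) => r.name)));
```

`Promise.all` returns results in the order of its input array, so the combined result list keeps the original tool-call order even though the tools in the first batch may finish in any order.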
Why This Matters to You
- Why file reads feel fast: They run in parallel (up to 10 concurrent readers). Reading 5 files takes roughly the same time as reading 1.
- Why bash commands feel sequential: They are exclusive by default. The system cannot know if npm run build conflicts with another command without inspecting it deeply. Default safe means default serial.
- Why Claude starts working before finishing its response: Streaming execution overlaps LLM generation with tool I/O. Claude is already reading your files while still writing its explanation.
- Why read-only bash commands can be faster: Commands like cat, git status, and ls are classified as concurrent-safe. If Claude runs several of these in one response, they execute in parallel.
- Why writing to a file always feels like a pause: FileWrite is exclusive. It waits for all concurrent tools in the previous batch to complete before running.