Tool Orchestration

When Claude needs to read 3 files and run a test, does it do them one at a time? No. It runs the three reads in parallel and holds the test until they finish.

The Tool Orchestration layer sits between the Agent Loop and the Permission Pipeline. Its job: take a batch of tool calls from the LLM, classify each one, and schedule them for maximum throughput without risking data corruption.

The Partitioning Problem

When the LLM returns multiple tool calls in a single response, the naive approach runs them sequentially, which is slow when most of the calls are independent reads. Running all of them in parallel is unsafe, though: a FileWrite that runs while a FileRead of the same file is mid-execution can corrupt the read result.

The solution is concurrency-safe partitioning: group tools into batches where all tools in a batch can safely run together.

Concurrency-Safe Partitioning

Here is a concrete example with 5 tool calls in one LLM response:

flowchart LR
  INPUT["LLM Response\n5 tool calls"]
  subgraph B1["Batch 1: Concurrent (parallel)"]
    T1["FileRead\nsrc/query.ts"]
    T2["FileRead\nsrc/tool.ts"]
    T3["Grep\npattern in src/"]
  end
  subgraph B2["Batch 2: Serial (exclusive)"]
    T4["Bash\nnpm test"]
  end
  subgraph B3["Batch 3: Serial (exclusive)"]
    T5["FileWrite\nsrc/fix.ts"]
  end
  INPUT --> B1 --> B2 --> B3
  style B1 fill:#14532d,color:#86efac,stroke:#166534
  style B2 fill:#7c2d12,color:#fda4af,stroke:#9a3412
  style B3 fill:#7c2d12,color:#fda4af,stroke:#9a3412

Result: 3 execution steps instead of 5 sequential calls. The two file reads and the grep complete in parallel, then the Bash command runs alone, then the FileWrite runs alone.

Safe by Default

The classification logic follows a strict rule: every tool defaults to exclusive (serial) unless it explicitly declares itself concurrent-safe.

FUNCTION classifyTool(tool, input):
  // Only an explicit "yes" for THIS input makes the call concurrent
  IF tool.isConcurrencySafe(input) == true:
    RETURN "concurrent"           // can run in parallel with others

  // Anything else (false, no declaration, failed check) falls through to the default
  RETURN "exclusive"              // runs alone, blocks all others

The critical insight: the decision is per-invocation, not per-tool-type. The same Bash tool can be concurrent-safe or exclusive depending on the command it’s running.

Bash("cat src/config.ts")      → read-only  → concurrent-safe
Bash("npm install")            → mutating   → exclusive
Bash("git status")             → read-only  → concurrent-safe
Bash("git commit -m 'fix'")   → mutating   → exclusive

The tool itself inspects its input and decides.
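To make the per-invocation idea concrete, here is a minimal TypeScript sketch. The `Tool` interface, the `classifyTool` helper, and the toy `bashTool` with its two-entry allowlist are hypothetical and exist only for illustration; the only things taken from the text are the `isConcurrencySafe` name and the rule that anything short of an explicit "yes" is exclusive.

```typescript
// Hypothetical shapes for illustration; not the real tool interface.
type Mode = "concurrent" | "exclusive";

interface Tool<I> {
  name: string;
  // Optional on purpose: a tool that declares nothing is treated as exclusive.
  isConcurrencySafe?: (input: I) => boolean;
}

function classifyTool<I>(tool: Tool<I>, input: I): Mode {
  try {
    // Only an explicit `true` for THIS input opts in to parallel execution.
    return tool.isConcurrencySafe?.(input) === true ? "concurrent" : "exclusive";
  } catch {
    // A check that throws is treated like "no".
    return "exclusive";
  }
}

// Toy Bash tool: the decision comes from the command string, not the tool type.
const READ_ONLY_EXAMPLES = new Set(["cat src/config.ts", "git status"]);
const bashTool: Tool<{ command: string }> = {
  name: "Bash",
  isConcurrencySafe: (input) => READ_ONLY_EXAMPLES.has(input.command),
};

// FileWrite declares nothing, so it falls through to the default.
const fileWrite: Tool<{ path: string }> = { name: "FileWrite" };

console.log(classifyTool(bashTool, { command: "git status" }));  // concurrent
console.log(classifyTool(bashTool, { command: "npm install" })); // exclusive
console.log(classifyTool(fileWrite, { path: "src/fix.ts" }));    // exclusive
```

The same `bashTool` object yields both answers, which is the whole point: the classification belongs to the call, not to the tool.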

Deep Dive: The Partitioning Algorithm

The partitioning uses a reduce-based approach that builds batches as it processes each tool call:

FUNCTION partitionIntoBatches(toolCalls, context):
  MAX_CONCURRENT = 10    // cap on tools per concurrent batch

  RETURN toolCalls.reduce((batches, toolCall) => {
    tool = findToolByName(toolCall.name)

    // Determine concurrency safety for THIS specific invocation.
    // Unknown tool, invalid input, or a check that throws → exclusive.
    isSafe = false
    IF tool exists:
      input = tool.inputSchema.safeParse(toolCall.input)
      IF input.success:
        TRY:
          isSafe = tool.isConcurrencySafe(input.data)
        CATCH:
          isSafe = false

    lastBatch = batches[batches.length - 1]

    // Can we append to the current batch?
    IF isSafe AND lastBatch?.isConcurrent AND lastBatch.tools.length < MAX_CONCURRENT:
      lastBatch.tools.push(toolCall)    // Extend concurrent batch
    ELSE:
      // Start a new batch (concurrent or exclusive)
      batches.push({
        isConcurrent: isSafe,
        tools: [toolCall]
      })

    RETURN batches
  }, [])

Key rules:

Maximum 10 tools per concurrent batch — if 15 FileReads arrive, they split into batches of 10 + 5
Consecutive concurrent tools are merged — FileRead + Grep + FileRead = one concurrent batch
One exclusive tool breaks the chain — any exclusive tool creates a new batch boundary
Input parsing failure → exclusive — if the tool’s input can’t be validated, it runs alone for safety
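The rules above can be checked with a small runnable TypeScript sketch. It is a simplification under stated assumptions: the `ToolCall` and `Batch` shapes are hypothetical, the `safe` flag stands in for the result of the per-invocation safety check, and the sketch assumes the 10-tool cap is enforced inside the reducer.

```typescript
// Hypothetical shapes; `safe` stands in for isConcurrencySafe(input) on this call.
interface ToolCall {
  name: string;
  safe: boolean;
}

interface Batch {
  isConcurrent: boolean;
  tools: ToolCall[];
}

const MAX_CONCURRENT = 10; // cap from the key rules

function partitionIntoBatches(calls: ToolCall[]): Batch[] {
  return calls.reduce<Batch[]>((batches, call) => {
    const last = batches[batches.length - 1];
    if (
      call.safe &&
      last !== undefined &&
      last.isConcurrent &&
      last.tools.length < MAX_CONCURRENT
    ) {
      // Consecutive concurrent-safe calls merge into the open batch.
      last.tools.push(call);
    } else {
      // An exclusive call, a full batch, or the first call starts a new batch.
      batches.push({ isConcurrent: call.safe, tools: [call] });
    }
    return batches;
  }, []);
}

// The five calls from the earlier example.
const example: ToolCall[] = [
  { name: "FileRead", safe: true },
  { name: "FileRead", safe: true },
  { name: "Grep", safe: true },
  { name: "Bash", safe: false },
  { name: "FileWrite", safe: false },
];
console.log(partitionIntoBatches(example).map((b) => b.tools.length)); // [ 3, 1, 1 ]

// Fifteen reads split at the cap.
const reads: ToolCall[] = Array.from({ length: 15 }, () => ({ name: "FileRead", safe: true }));
console.log(partitionIntoBatches(reads).map((b) => b.tools.length)); // [ 10, 5 ]
```

Because the reducer only ever looks at the last batch, original call order is preserved: a concurrent-safe call that arrives after an exclusive one never jumps ahead of it.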

BashTool’s concurrency check inspects the actual command:

// Inside BashTool.isConcurrencySafe(input):
FUNCTION isReadOnly(command):
  readOnlyCommands = [cat, head, tail, less, wc,
    ls, find, stat, file, du, df,
    git status, git log, git diff, git show, git branch,
    grep, rg, ag, ack,
    echo, printf, env, printenv, whoami, hostname, date, uname]

  // Check 1: does the command start with a read-only command?
  // Check 2: is it free of redirection and chaining operators (>, >>, |, ;, &&)?
  //          Any of these could hide a write or a second, mutating command.
  RETURN startsWithAny(command, readOnlyCommands)
         AND NOT containsMutationOperator(command)
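A runnable TypeScript version of this check might look like the following. It is a sketch, not the actual BashTool code: the command list and operators are copied from the pseudocode, while the whole-word prefix match and the plain substring scan for operators are assumptions made here.

```typescript
// Read-only command prefixes, copied from the pseudocode list.
const READ_ONLY_COMMANDS = [
  "cat", "head", "tail", "less", "wc",
  "ls", "find", "stat", "file", "du", "df",
  "git status", "git log", "git diff", "git show", "git branch",
  "grep", "rg", "ag", "ack",
  "echo", "printf", "env", "printenv", "whoami", "hostname", "date", "uname",
];

// Redirection and chaining operators. Checking for ">" also covers ">>".
const UNSAFE_OPERATORS = [">", "|", ";", "&&"];

function isReadOnly(command: string): boolean {
  const trimmed = command.trim();

  // Assumption: match whole words, so "catalog" does not pass as "cat".
  const startsReadOnly = READ_ONLY_COMMANDS.some(
    (prefix) => trimmed === prefix || trimmed.startsWith(prefix + " "),
  );

  // Naive substring scan: an operator inside quotes also counts,
  // which errs on the side of "exclusive".
  const hasUnsafeOperator = UNSAFE_OPERATORS.some((op) => trimmed.includes(op));

  return startsReadOnly && !hasUnsafeOperator;
}

console.log(isReadOnly("cat src/config.ts"));    // true
console.log(isReadOnly("git status"));           // true
console.log(isReadOnly("npm install"));          // false
console.log(isReadOnly("git commit -m 'fix'"));  // false
console.log(isReadOnly("cat a.txt > b.txt"));    // false
```

A prefix check like this cannot see mutating flags (for example, `find` accepts `-delete`), so a production check would likely need to be stricter than this sketch.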

Which Tools Are Concurrent-Safe?

Tool	Concurrent-Safe?	Notes
`FileRead`	Yes	Read-only by definition
`Grep`	Yes	Read-only search
`Glob`	Yes	Read-only pattern match
`LSP` (language server queries)	Yes	Read-only semantic queries
`TaskGet` / `TaskList`	Yes	Read-only task queries
`WebFetch`	Yes	Network read, no local mutation
`WebSearch`	Yes	Network read, no local mutation
`Bash`	Depends	Inspects command; defaults to exclusive
`FileWrite`	No	Always exclusive
`FileEdit`	No	Always exclusive
`Agent` (subagent spawn)	No	Subagent may call any tool
`MCP tools`	Depends	Declared by MCP server

Streaming Execution

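Tools do not wait for the full LLM response to finish before starting. As soon as a complete tool_use block arrives in the API stream, the streaming executor starts the tool if it is concurrent-safe; exclusive tools are queued until the concurrent tools already running have finished.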

sequenceDiagram
  participant L as LLM Stream
  participant SE as Streaming Executor
  participant T as Tool Runner
  L->>SE: tool_use: FileRead("query.ts") [t=0ms]
  SE->>T: start FileRead immediately
  L->>SE: tool_use: FileRead("tool.ts") [t=50ms]
  SE->>T: start FileRead immediately
  L->>SE: text: "I'll read these files..." [t=80ms]
  L->>SE: tool_use: Bash("npm test") [t=120ms]
  Note over SE: Bash is exclusive, queued
  T-->>SE: FileRead results done [t=180ms]
  L-->>SE: Stream complete [t=200ms]
  SE->>T: start Bash (stream done, safe)

By the time the LLM finishes generating, the file reads are already complete: in the diagram they finish at t=180ms, before the stream ends at t=200ms. Tool execution overlaps with the LLM’s generation time, so the reads add little or no waiting after the response ends.

Deep Dive: Streaming Executor Timing

The streaming executor starts as soon as the API stream begins delivering tool_use blocks. Here’s the exact sequence:

First byte of tool_use block arrives → executor registers the tool but waits for the full block
Full tool_use block received (input JSON complete) → executor begins execution immediately
If tool is concurrent-safe → starts in parallel with any other concurrent tools already running
If tool is exclusive → queued until all current concurrent tools finish
LLM stream ends → getRemainingResults() called to collect any tools still running
All tools complete → results collected, context modifiers applied, loop continues

If a concurrent tool completes before the LLM finishes streaming, its result is held in the executor’s buffer. Results are only yielded to the UI and added to the conversation after the full response is received — preserving message ordering.
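The timing rules in this section can be sketched in TypeScript as a small promise-based executor. Everything here is an assumption for illustration: the `StreamingExecutor` class, the `addTool` method, the `safe` flag, and the fake tools are hypothetical, and error handling is left out. Only the behaviour follows the text: concurrent-safe tools start on arrival, exclusive tools wait for the tools ahead of them, and `getRemainingResults()` hands results back in arrival order.

```typescript
interface ToolCall {
  id: string;
  safe: boolean; // stand-in for the per-invocation concurrency check
  run: () => Promise<string>;
}

class StreamingExecutor {
  private results: Promise<string>[] = [];   // buffered in arrival order
  private running: Promise<unknown>[] = [];  // concurrent tools since the last exclusive one
  private barrier: Promise<unknown> = Promise.resolve(); // the last exclusive tool

  // Call once a complete tool_use block has been received.
  addTool(call: ToolCall): void {
    let started: Promise<string>;
    if (call.safe) {
      // Concurrent-safe: only waits for the last exclusive tool (if any).
      started = this.barrier.then(() => call.run());
      this.running.push(started);
    } else {
      // Exclusive: waits for the last exclusive tool AND all running concurrent tools.
      started = Promise.all([this.barrier, ...this.running]).then(() => call.run());
      this.barrier = started;
      this.running = [];
    }
    this.results.push(started);
  }

  // Call when the LLM stream ends; resolves in arrival order, not completion order.
  getRemainingResults(): Promise<string[]> {
    return Promise.all(this.results);
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function demo(): Promise<{ results: string[]; log: string[] }> {
  const log: string[] = [];
  const fakeTool = (id: string, safe: boolean, ms: number): ToolCall => ({
    id,
    safe,
    run: async () => {
      log.push(`start ${id}`);
      await sleep(ms);
      log.push(`end ${id}`);
      return `${id} ok`;
    },
  });

  const executor = new StreamingExecutor();
  executor.addTool(fakeTool("read-query", true, 30));
  executor.addTool(fakeTool("read-tool", true, 10));
  executor.addTool(fakeTool("npm-test", false, 5)); // queued behind both reads
  const results = await executor.getRemainingResults();
  return { results, log };
}

demo().then(({ results, log }) => console.log(results, log));
```

In the demo, `read-tool` finishes before `read-query`, yet the results array still lists `read-query` first, which mirrors how buffering preserves message ordering.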

Context Modifier Chain

After a tool runs, it can modify shared state that subsequent tools will see. This enables coordination between sequential steps.

FUNCTION applyContextModifiers(toolResult):
  IF toolResult.changedFiles:
    context.changedFilesList.add(toolResult.changedFiles)

  IF toolResult.memoryPrefetch:
    context.prefetchedMemory.merge(toolResult.memoryPrefetch)

Only serial (exclusive) tools can apply context modifiers. If concurrent tools modified shared context, the order of modifications would be non-deterministic — a race condition. By restricting context modification to serial execution, the system guarantees that the context seen by tool N+1 is deterministic.
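Here is a small TypeScript sketch of why application order matters. The `ToolContext` and `ToolResult` shapes are guesses based on the field names in the pseudocode (`changedFilesList`, `prefetchedMemory`, `changedFiles`, `memoryPrefetch`); the real types are not shown in the text, and the "notes" key is invented for the example.

```typescript
// Hypothetical shapes inferred from the pseudocode's field names.
interface ToolContext {
  changedFilesList: Set<string>;
  prefetchedMemory: Map<string, string>;
}

interface ToolResult {
  changedFiles?: string[];
  memoryPrefetch?: Map<string, string>;
}

function applyContextModifiers(context: ToolContext, result: ToolResult): void {
  // Record every file this tool reported as changed.
  result.changedFiles?.forEach((file) => context.changedFilesList.add(file));
  // Merge prefetched memory; on a key clash the later tool overwrites the earlier one.
  result.memoryPrefetch?.forEach((value, key) => context.prefetchedMemory.set(key, value));
}

const context: ToolContext = {
  changedFilesList: new Set<string>(),
  prefetchedMemory: new Map<string, string>(),
};

// Two exclusive tools that both touch the same memory key.
const serialResults: ToolResult[] = [
  { changedFiles: ["src/fix.ts"], memoryPrefetch: new Map([["notes", "from first tool"]]) },
  { changedFiles: ["src/query.ts"], memoryPrefetch: new Map([["notes", "from second tool"]]) },
];

// Serial execution fixes the order, so the final state is always the same.
serialResults.forEach((result) => applyContextModifiers(context, result));

console.log(Array.from(context.changedFilesList).join(", ")); // src/fix.ts, src/query.ts
console.log(context.prefetchedMemory.get("notes"));           // from second tool
```

If those two results came from tools racing in parallel, the value under "notes" would depend on which one finished last. Restricting modifiers to serial tools removes that ambiguity.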

Batch Execution Pseudocode

FUNCTION executeBatches(toolCalls, context):
  batches = partitionIntoBatches(toolCalls, context)
  allResults = []

  FOR batch IN batches:
    IF batch.isConcurrent:
      // All tools in the batch run in parallel (up to 10)
      results = await Promise.all(
        batch.tools.map(tool => runWithPermission(tool))
      )
      allResults.appendAll(results)
    ELSE:
      // Exclusive: run the tool alone, apply its context changes, then move on
      FOR tool IN batch.tools:
        result = await runWithPermission(tool)
        applyContextModifiers(result)
        allResults.append(result)

  RETURN allResults

The runWithPermission call routes through the Permission Pipeline before the tool executes. No tool runs without clearing all 6 permission layers.
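The batch loop translates to TypeScript fairly directly. In this sketch the `runWithPermission` body is a stub that returns a canned result; the real permission pipeline and tool runner are not shown in the text. The `onSerialResult` callback is a hypothetical stand-in for applying context modifiers, and the batches are written by hand rather than produced by the partitioner.

```typescript
interface ToolCall { name: string; }
interface Batch { isConcurrent: boolean; tools: ToolCall[]; }
interface ToolResult { name: string; output: string; }

// Stub: stands in for the permission pipeline plus the actual tool run.
async function runWithPermission(call: ToolCall): Promise<ToolResult> {
  return { name: call.name, output: `${call.name} done` };
}

async function executeBatches(
  batches: Batch[],
  onSerialResult: (result: ToolResult) => void, // e.g. apply context modifiers
): Promise<ToolResult[]> {
  const allResults: ToolResult[] = [];
  for (const batch of batches) {
    if (batch.isConcurrent) {
      // Promise.all keeps results in input order, whatever the completion order.
      const results = await Promise.all(batch.tools.map((call) => runWithPermission(call)));
      allResults.push(...results);
    } else {
      // Exclusive: one tool at a time, with its context changes applied before the next.
      for (const call of batch.tools) {
        const result = await runWithPermission(call);
        onSerialResult(result);
        allResults.push(result);
      }
    }
  }
  return allResults;
}

async function demo(): Promise<{ order: string[]; serial: string[] }> {
  const batches: Batch[] = [
    { isConcurrent: true, tools: [{ name: "FileRead" }, { name: "FileRead" }, { name: "Grep" }] },
    { isConcurrent: false, tools: [{ name: "Bash" }] },
    { isConcurrent: false, tools: [{ name: "FileWrite" }] },
  ];
  const serial: string[] = [];
  const results = await executeBatches(batches, (r) => serial.push(r.name));
  return { order: results.map((r) => r.name), serial };
}

demo().then(({ order, serial }) => {
  console.log(order.join(", "));  // FileRead, FileRead, Grep, Bash, FileWrite
  console.log(serial.join(", ")); // Bash, FileWrite
});
```

Accumulating into a single `allResults` array keeps the results of every batch, in the same order the LLM issued the calls.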

Why This Matters to You

Why file reads feel fast: They run in parallel (up to 10 concurrent readers). Reading 5 files takes roughly as long as the slowest of the 5 reads, not the sum of all 5.
Why bash commands feel sequential: They are exclusive by default — the system cannot know if npm run build conflicts with another command without inspecting deeply. Default safe means default serial.
Why Claude starts working before finishing its response: Streaming execution overlaps LLM generation with tool I/O. Claude is already reading your files while still writing its explanation.
Why read-only bash commands can be faster: Commands like cat, git status, ls are classified as concurrent-safe. If Claude runs several of these in one response, they execute in parallel.
Why writing to a file always feels like a pause: FileWrite is exclusive. It waits for all concurrent tools in the previous batch to complete before running.