deep dive · claude code · part 1 — the overview

2026-04-11 · 7 min read · #claude-code #source-dive #agents

Claude Code is a CLI. You type a prompt, it does some stuff on your machine, and it replies. What happens between those two moments — the "some stuff" — is almost 1900 TypeScript files. That's a lot of files to sit between a keystroke and an API call.

[diagram: you (keystroke, terminal) → Claude Code (the CLI between you and the model) → Claude API; a prompt goes out as an envelope, a stream comes back as rendered text]

This series is me trying to understand all of it, piece by piece. This first post is the overview: what's in the source tree, what's load-bearing, what's scaffolding, and the single shape that makes sense of the whole thing. The deep dives come after.

agent, not chat client

From the outside it's a pipe: prompt in, API request out, response back, text in the terminal. What makes it an agent rather than a chat client is the middle — between your prompt and the model's final reply, Claude Code will often run things on your machine on the model's behalf (read files, execute shell commands, search the web, call MCP servers) and fold each result back into the conversation before calling the model again. So the CLI is really a three-way mediator, user ↔ Claude Code ↔ (Claude API, local system), and everything in src/ exists to serve that mediation.

opening the box · by file count

Before I looked at the source I assumed it'd be organized around "agent brain" concerns: a big prompt-engineering module, a big conversation manager, maybe a policy engine. It's not — or at least, not visibly. Here's the top-level src/ folder sorted by raw .ts/.tsx file count:

[interactive chart: top-level src/ folders by .ts/.tsx file count; click any folder to see its sub-directories]

That's a shock the first time you see it. utils/ — the grab-bag folder — is the biggest, with 564 files of paths, fs helpers, ANSI conversion, auth plumbing, perf tracing, and dozens of other one-off helpers. components/ is the second biggest, and it's almost entirely React/Ink UI. The next four — commands, tools, services, hooks — are mostly wiring: slash commands, tool implementations, service clients, React hooks for the REPL.

At the root of src/ there are only 18 non-directory files. Most of them are small. But three of them — query.ts (1729 lines), QueryEngine.ts (1295), and Tool.ts (792) — contain the entire agent. I've highlighted them in green above. About 3800 lines run the show; the other ~200,000 are scaffolding around them — UI chrome, slash commands, plugin loading, permissions, MCP wrapping, IDE integration. Real work, but not the work you'd sketch on a whiteboard.

the shape · one envelope, one loop

Here's the useful mental model I eventually landed on:

Claude Code is a function that builds one envelope and runs one loop.

The envelope is an API request. It has three slots: system, tools, messages. Every turn — every single time the model is called — that envelope is rebuilt from scratch by gathering data from all over the source tree. Claude Code doesn't keep state on the API side; there are no server-side threads or assistants. Each call sends the entire envelope, and each call reassembles it.
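As a concrete sketch, here's what that per-turn rebuild could look like. The types and names below are mine, invented for illustration; they model the idea, not Claude Code's actual interfaces:

```typescript
// Hypothetical types: a minimal model of the per-turn envelope.
interface ToolDecl { name: string; description: string; input_schema: object }
interface Message { role: "user" | "assistant"; content: unknown }

interface SessionState {
  systemSections: string[];  // system-prompt pieces gathered from many folders
  tools: ToolDecl[];         // ~160 declarations in the real thing
  messages: Message[];       // full (possibly compacted) history
}

interface Envelope { system: string; tools: ToolDecl[]; messages: Message[] }

// Rebuilt from scratch every turn: the client owns all state,
// so each API call carries everything the model needs.
function prepareEnvelope(state: SessionState): Envelope {
  return {
    system: state.systemSections.join("\n\n"),
    tools: [...state.tools],
    messages: [...state.messages],
  };
}
```

Because the whole history rides along on every call, the envelope only grows as the session runs — which is why compaction exists at all.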

The loop is what happens around the envelope. You send it, the model responds, and the response either contains tool calls or it doesn't. If it doesn't, the loop exits — the model is done talking. If it does, Claude Code executes each tool (reads in parallel, writes serially), collects the results, appends them to messages, and loops back to send a new envelope. This keeps going until the model stops asking for tools, hits a turn limit, or the user aborts.
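The "reads in parallel, writes serially" rule is easy to sketch. The `readOnly` flag and the `ToolCall` shape here are my assumptions, not the real Tool.ts interface: read-only calls fan out with `Promise.all`, mutating calls run strictly one at a time:

```typescript
interface ToolCall { id: string; readOnly: boolean }
interface ToolResult { id: string; output: string }

async function runTools(
  calls: ToolCall[],
  exec: (c: ToolCall) => Promise<ToolResult>,
): Promise<ToolResult[]> {
  // Reads can't interfere with each other, so they run concurrently...
  const reads = calls.filter((c) => c.readOnly);
  const readResults = await Promise.all(reads.map(exec));

  // ...while writes run in order, so side effects stay deterministic.
  const writeResults: ToolResult[] = [];
  for (const c of calls.filter((c) => !c.readOnly)) {
    writeResults.push(await exec(c));
  }
  return [...readResults, ...writeResults];
}
```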

The animation below walks one full turn through that pipe — envelope assembly, the API call, tool execution, a loop-back to rebuild, and the final streamed text.

[animation: one full turn through the loop. The envelope (system: prompts · ctx · mcp · skills; tools: ~160 declarations; messages: history, compacted) is assembled from across src/ (1884 files — constants/, context.ts, tools/, services/), sent to the Claude API, tool calls fan out to the local system (files · shell · git · net), and results loop back into a rebuilt envelope before the final streamed text.]

what goes in the envelope

Each slot is fed by code from multiple folders. Once you know who feeds what, the tree stops feeling arbitrary.

system — the system prompt — is assembled from:

tools — the tool declarations — come from:

messages — the conversation history — comes from:

The fact that these sources are so scattered across the tree is exactly why a holistic picture matters before a deep dive. The prompt isn't written in one place. It's gathered.
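One way to picture "gathered, not written": the system slot as a list of optional contributors, each living in a different corner of the tree, joined only at call time. All contributor names here are invented for illustration:

```typescript
// Invented contributor functions: stand-ins for code scattered across src/.
// Each returns its section, or null when it has nothing to add this turn.
type Contributor = () => string | null;

const contributors: Contributor[] = [
  () => "You are a coding agent.",        // base instructions
  () => "cwd: /home/user/project",        // environment context (value invented)
  () => null,                             // e.g. MCP instructions, no server attached
];

function gatherSystemPrompt(sections: Contributor[]): string {
  return sections
    .map((s) => s())
    .filter((s): s is string => s !== null && s.length > 0)
    .join("\n\n");
}
```

The final prompt string never exists anywhere in the source; it only exists in the envelope, for the duration of one API call.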

Here's what the envelope actually looks like for a realistic prompt, picked up mid-session so you can see prior history flow into the rebuild. Each step applies one diff — new content flashes amber.

[animation: envelope assembly for the prompt "refactor this bash function to use pipefail", over 3 turns. Starts mid-session with 2 prior messages already in the history, then walks through: new user prompt → envelope re-assembly → tool_use → tool_result → next API call → final text. The three slots (system, tools 0/160, messages) start empty and fill as the steps play; new content is highlighted amber.]

the loop

With the envelope built, the loop itself lives in query.ts. It's an async function* — an AsyncGenerator — so it can yield messages as they arrive and let the REPL stream tokens to the terminal in real time. Stripped of hooks, error recovery, and bookkeeping, the shape is:

async function* query(state) {
  while (true) {
    // 1. rebuild the envelope, apply compaction layers
    const envelope = prepareEnvelope(state)

    // 2. stream the API call
    const response = yield* queryModelWithStreaming(envelope)

    // 3. done if the model didn't ask for tools
    if (!response.hasToolCalls) return

    // 4. execute tools (reads parallel, writes serial)
    const results = yield* runTools(response.toolCalls)

    // 5. append results, loop
    state.messages.push(...results)
  }
}

The real query.ts adds layers around each step: pre- and post-tool hooks, post-sampling hooks, stop hooks that can veto continuation, reactive compaction on prompt_too_long errors, max-turn limits, and an abort-signal tree that lets ctrl-c unwind the pipeline cleanly. But the skeleton is what it is: prepare → call → execute → repeat.
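One of those layers is simple enough to show in isolation. An abort-signal tree, sketched here from the standard `AbortController` API (the wiring is my guess at how it could work, not a copy of query.ts), lets a single ctrl-c fan cancellation out to every in-flight API call and tool run:

```typescript
// A child controller that aborts whenever its parent does.
function childSignal(parent: AbortSignal): AbortController {
  const child = new AbortController();
  if (parent.aborted) child.abort();
  else parent.addEventListener("abort", () => child.abort(), { once: true });
  return child;
}

// Usage: one root per session, one child per API call or tool run.
const session = new AbortController();
const apiCall = childSignal(session.signal);
const toolRun = childSignal(session.signal);

session.abort(); // ctrl-c: everything downstream unwinds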

zooming out · harness engineering

There's a name for what we just walked through: harness engineering. Agent = model + harness. The model is the stateless API endpoint. The harness is everything else — the loop, the tool registry, the permission gates, the compaction, the hooks, the session store. It's what turns a raw LLM into something you can trust with a keyboard.

Claude Code is one of the most polished production harnesses in the wild, and it's unusual in that Anthropic ships the same binary they use internally — it's the real thing, not a demo. The 3800-vs-200K line split from earlier now has a proper name: the 3800 lines are the agentic core, the 200K are the harness around them. This series reads that harness one subsystem at a time.

more to come — each of those subsystems is a post of its own.