diff --git a/learn/01-system-overview.md b/learn/01-system-overview.md new file mode 100644 index 0000000..96db2a2 --- /dev/null +++ b/learn/01-system-overview.md @@ -0,0 +1,268 @@ +# 1. System Overview + +> The entire Claude Code architecture in one diagram, then unpacked layer by layer. + +--- + +## The Big Picture + +Claude Code is structured in **8 distinct layers**, each with a clear responsibility. Understanding these layers is the key to navigating the 512K-line codebase. + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'primaryBorderColor': '#4a9eff', 'lineColor': '#4a9eff', 'secondaryColor': '#16213e', 'tertiaryColor': '#0f3460', 'edgeLabelBackground': '#1a1a2e'}}}%% +graph TB + CLI["CLI Entry
main.tsx — 804KB"]:::entry + SDK["SDK Entry<br/>Programmatic API"]:::entry + MCP_S["MCP Server<br/>Expose as MCP"]:::entry + + REPL["REPL.tsx — 896KB<br/>Interactive Terminal Shell"]:::ui + Comps["113 Components<br/>Messages, Diffs, Dialogs"]:::ui + Hooks["83 React Hooks<br/>Permissions, Input, IDE"]:::ui + + QE["QueryEngine.ts<br/>Session Lifecycle Owner"]:::core + QL["query.ts — 1730 lines<br/>Agentic Loop"]:::core + CL["claude.ts — 3420 lines<br/>Anthropic API Client"]:::core + + TD["Tool Interface<br/>Tool.ts"]:::tool + BT["42 Built-in Tools"]:::tool + MT["MCP Tools — dynamic"]:::tool + TO["Tool Orchestration<br/>Parallel Execution"]:::tool + + Compact["Compaction Pipeline<br/>snip / micro / auto /<br/>reactive / collapse"]:::ctx + + Rules["Allow + Deny Rules"]:::perm + HK["PreToolUse Hooks"]:::perm + Classifier["Auto-mode Classifier"]:::perm + + AS["AppState Store<br/>Immutable — 50+ fields"]:::state + SS["Session Storage<br/>Transcripts + Resume"]:::state + Cfg["Config Layer
Global / Project / CLAUDE.md"]:::state + + Skills["Skills"]:::ext + Plugins["Plugins"]:::ext + Agents["Sub-agents + Swarms"]:::ext + + API["Anthropic Messages API"]:::external + MCP_Ext["External MCP Servers"]:::external + GrowthBook["GrowthBook + Statsig"]:::external + + CLI --> REPL + SDK --> QE + MCP_S --> QE + + REPL --> Comps + REPL --> Hooks + REPL --> QE + REPL --> AS + + QE --> QL + QL --> CL + QL --> Compact + QL --> TO + + CL --> API + CL --> GrowthBook + + TO --> TD + TD --> BT + TD --> MT + BT --> Rules + BT --> HK + MT --> Rules + + Rules --> Classifier + + QE --> SS + REPL --> Cfg + + Skills --> TD + Plugins --> TD + Plugins --> MCP_Ext + Agents --> QL + MCP_Ext --> MT + + classDef entry fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef ui fill:#1a1a4e,stroke:#6f42c1,color:#e0e0e0,stroke-width:2px + classDef core fill:#2d1b4e,stroke:#e83e8c,color:#e0e0e0,stroke-width:2px + classDef tool fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef ctx fill:#3d2b00,stroke:#fd7e14,color:#e0e0e0,stroke-width:2px + classDef perm fill:#4a1a1a,stroke:#dc3545,color:#e0e0e0,stroke-width:2px + classDef state fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef ext fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef external fill:#333,stroke:#888,color:#aaa,stroke-width:1px,stroke-dasharray: 5 5 +``` + +--- + +## Layer 1: Entry Points + +There are **three ways** into Claude Code: + +| Entry | File | How It Works | +|-------|------|-------------| +| **CLI** | `src/main.tsx` (804KB) | Commander.js parses args → boots React/Ink → renders `REPL.tsx` | +| **SDK** | `src/entrypoints/sdk/` | Programmatic API → creates `QueryEngine` directly | +| **MCP Server** | `src/entrypoints/mcp.ts` | Exposes Claude Code itself as an MCP server | + +### Key Insight: Parallel Prefetch + +Startup time is critical for a CLI tool. 
Claude Code parallelizes heavy work *before* any module evaluation: + +```typescript +// main.tsx — fired as side-effects before other imports +startMdmRawRead() // MDM settings (enterprise) +startKeychainPrefetch() // API key from macOS Keychain +``` + +Heavy modules like OpenTelemetry (~400KB) and gRPC (~700KB) are loaded lazily via dynamic `import()` only when needed. + +--- + +## Layer 2: UI — React in a Terminal + +This is where it gets wild. Claude Code uses **React** (yes, the web framework) to render a terminal UI via [Ink](https://github.com/vadimdemedes/ink). + +| What | Count | Examples | +|------|-------|---------| +| Components | 113 files | Messages, Diffs, Dialogs, Settings, Spinners | +| React Hooks | 83 files | `useCanUseTool`, `useVoice`, `useReplBridge`, `useTypeahead` | + +The crown jewel is `REPL.tsx` at **896KB** — a single React component that is the entire interactive terminal experience. It handles: + +- Message rendering and virtual scrolling +- Permission dialogs +- Tool progress indicators +- Keyboard shortcuts and vim mode +- Voice input +- IDE bridge integration +- Background task management + +### Why React for a CLI? + +React's component model gives you: +- **Declarative UI** — Describe what to render, not how +- **Hooks** — Share stateful logic across 83 hooks +- **State management** — AppState drives re-renders +- **Composability** — 113 components snap together + +--- + +## Layer 3: Core Engine + +The engine has three key files forming a pipeline: + +``` +QueryEngine.ts → query.ts → claude.ts +(session owner) (loop) (API client) +``` + +1. **`QueryEngine.ts`** — Owns the session lifecycle. Creates a conversation, manages transcripts, tracks usage, and handles resume. + +2. **`query.ts`** — The **agentic loop**. This is the beating heart. It cycles between calling the model and executing tools until the model says `end_turn`. (See [Guide 2: The Agentic Loop](./02-agentic-loop.md)) + +3. **`claude.ts`** — The Anthropic API client. 
Handles streaming SSE responses, retry logic (429/529), prompt caching, and model fallback. (See [Guide 8: API Client](./08-api-client.md)) + +--- + +## Layer 4: Tool System + +Every capability Claude Code has — reading files, running bash, searching the web — is implemented as a **Tool**. There are 42 built-in tools, plus dynamic MCP tools loaded at runtime. + +Tools are self-contained modules with a standard interface defined in `Tool.ts` (793 lines): + +```typescript +type Tool = { + name: string + inputSchema: ZodSchema // Validate inputs + checkPermissions(input, ctx) // Permission check + call(input, ctx) // Execute + prompt(options) // Describe to model + renderToolUseMessage(input) // Terminal rendering + // ... 30+ more methods +} +``` + +Full breakdown in [Guide 3: Tool System](./03-tool-system.md). + +--- + +## Layer 5: Context Management + +LLMs have finite context windows. Claude Code has a sophisticated **5-stage compaction pipeline** to keep conversations within limits: + +1. **Snip Compact** — Sliding window, drop oldest turns +2. **Micro Compact** — Truncate oversized individual tool results +3. **Auto Compact** — Summarize via a separate API call +4. **Context Collapse** — Read-time projection with archived views +5. **Reactive Compact** — Emergency trigger on API 413 errors + +Full breakdown in [Guide 5: Context Management](./05-context-management.md). + +--- + +## Layer 6: Permission System + +Claude Code can run arbitrary bash commands and write to any file. The permission system is a multi-layered defense: + +``` +Deny Rules → Allow Rules → Tool Check → Hooks → Classifier → User Dialog +``` + +Four distinct permission modes: Default, Plan, Auto, and Bypass. + +Full breakdown in [Guide 4: Permission System](./04-permission-system.md). 
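
The layered chain above can be sketched as an ordered list of stages, each of which either returns a decision or defers to the next. This is a minimal illustration, not the actual implementation: the `Decision`, `ToolRequest`, and `Stage` types, the two example rules, and the "first stage with an opinion wins" behavior are all assumptions made for the example.

```typescript
// Hypothetical sketch: none of these names come from the Claude Code source.
type Decision = "allow" | "deny" | "ask";

interface ToolRequest {
  tool: string;
  input: string;
}

// A stage returns a decision, or undefined to defer to the next stage.
type Stage = (req: ToolRequest) => Decision | undefined;

// Example deny rule: block an obviously destructive shell command.
const denyRules: Stage = (req) =>
  req.tool === "Bash" && req.input.includes("rm -rf") ? "deny" : undefined;

// Example allow rule: file reads never need a prompt.
const allowRules: Stage = (req) =>
  req.tool === "FileRead" ? "allow" : undefined;

// Evaluate stages in order; the first stage with an opinion wins.
// If no stage decides, fall through to asking the user.
function evaluate(stages: Stage[], req: ToolRequest): Decision {
  for (const stage of stages) {
    const decision = stage(req);
    if (decision !== undefined) return decision;
  }
  return "ask";
}

// Deny rules are listed first so they cannot be overridden by an allow rule.
const chain: Stage[] = [denyRules, allowRules];

const readDecision = evaluate(chain, { tool: "FileRead", input: "src/foo.ts" });
const dangerousDecision = evaluate(chain, { tool: "Bash", input: "rm -rf /" });
const unknownDecision = evaluate(chain, { tool: "Bash", input: "ls" });
console.log(readDecision, dangerousDecision, unknownDecision);
```

Putting deny rules ahead of allow rules means a broad allow rule can never unblock something a deny rule matched, which is one plausible reason for the ordering shown in the chain.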
+ +--- + +## Layer 7: State Management + +All application state lives in a **single immutable store** (`AppState`) with 50+ fields, following a simplified Redux pattern: + +- **Consumers** read state via `getAppState()` or React's `useAppState(selector)` +- **Mutators** update via `setAppState(prev => newState)` (functional updates) +- **Side effects** fire via `onChangeAppState` listeners +- The `DeepImmutable` type wrapper enforces immutability at the type level + +Full breakdown in [Guide 6: State Management](./06-state-management.md). + +--- + +## Layer 8: Extensions + +Claude Code is extensible through four mechanisms: + +| Mechanism | What | Where | +|-----------|------|-------| +| **Skills** | Markdown instruction files | `.claude/skills/*.md` | +| **Plugins** | Bundles of tools + MCP servers | Managed or user-installed | +| **Hooks** | Pre/PostToolUse scripts | `settings.json` or CLAUDE.md | +| **Agents** | Sub-agents, coordinators, swarms | AgentTool, tmux-based swarms | + +Full breakdown in [Guide 7: Extension Model](./07-extension-model.md). + +--- + +## External Dependencies + +| Dependency | Role | +|-----------|------| +| **Anthropic Messages API** | The LLM backend — all model calls go here | +| **External MCP Servers** | Third-party tools exposed via Model Context Protocol | +| **GrowthBook** | Feature flags — gates like `VOICE_MODE`, `PROACTIVE`, `CONTEXT_COLLAPSE` | +| **Statsig** | Additional analytics and experimentation | + +--- + +## Design Principles + +Looking at the codebase, several design principles emerge: + +1. **Feature flags for dead code elimination** — `feature('X')` from `bun:bundle` strips unused code at build time +2. **Lazy loading** — Heavy modules are `import()`ed only when needed +3. **Immutable state** — `DeepImmutable` enforces no mutation +4. **Generator-based streaming** — `async function*` throughout for backpressure-aware streaming +5. 
**Self-contained tools** — Each tool is a module with schema, permissions, execution, and rendering + +--- + +**Next:** [The Agentic Loop →](./02-agentic-loop.md) diff --git a/learn/02-agentic-loop.md b/learn/02-agentic-loop.md new file mode 100644 index 0000000..baf2f95 --- /dev/null +++ b/learn/02-agentic-loop.md @@ -0,0 +1,413 @@ +# 2. The Agentic Loop + +> How Claude Code cycles between the model and tools — the core of the entire system. + +--- + +## What Is the "Agentic Loop"? + +The agentic loop is the mechanism that makes Claude Code more than a chatbot. Instead of: + +``` +User → Model → Response (done) +``` + +Claude Code does: + +``` +User → Model → Tool Call → Model → Tool Call → ... → Response (done) +``` + +The model can call tools, see results, and decide what to do next — in a **loop** — until it decides it's finished. This loop lives in **`query.ts`** (1,730 lines), and it's the most important file in the entire codebase. + +--- + +## The Full Sequence + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#4a9eff', 'actorTextColor': '#e0e0e0', 'actorBorder': '#4a9eff', 'signalColor': '#4a9eff', 'noteBkgColor': '#16213e', 'noteTextColor': '#e0e0e0', 'activationBkgColor': '#2d1b4e', 'activationBorderColor': '#e83e8c'}}}%% +sequenceDiagram + participant U as User + participant QE as QueryEngine + participant PI as processUserInput + participant Q as query.ts + participant CP as Compaction + participant C as claude.ts + participant API as Anthropic API + participant T as Tool Executor + participant H as Hooks + + U->>QE: submitMessage(prompt) + activate QE + QE->>PI: parse slash commands, @mentions, attachments + PI-->>QE: messages[], shouldQuery + + alt Slash command — no API call needed + QE-->>U: return local result + else Model query required + QE->>QE: persist transcript to disk + QE->>Q: query(messages, systemPrompt, tools) + activate Q + + loop Agentic Loop — until 
end_turn or max_turns + Q->>CP: snip compact + CP->>CP: micro compact + CP->>CP: auto compact + CP->>CP: context collapse + CP-->>Q: compacted messages + + Q->>C: queryModel(messages, tools, thinking) + activate C + C->>C: build request: betas, cache, effort, budget + C->>API: POST /v1/messages — SSE stream + activate API + + API-->>C: message_start — usage + API-->>C: content_block — thinking / text / tool_use + API-->>C: message_delta — stop_reason, final usage + deactivate API + + C-->>Q: yield AssistantMessage + StreamEvents + deactivate C + + alt stop_reason = end_turn + Q->>H: postSamplingHooks + Q->>Q: handleStopHooks + Q-->>QE: return Terminal result + else stop_reason = tool_use + Q->>T: runTools(toolUseBlocks) + activate T + + T->>T: validate input against schema + T->>H: PreToolUse hooks + T->>T: check permissions + T->>T: call(input, context) + T->>H: PostToolUse hooks + T-->>Q: yield tool_result messages + deactivate T + + Q->>Q: inject CLAUDE.md attachments + Note over Q: push tool_results, continue loop + else stop_reason = max_tokens + Q->>Q: truncation retry — up to 3x + Q->>CP: reactive compact — emergency + end + end + + deactivate Q + QE->>QE: accumulate usage, persist transcript + QE-->>U: yield SDKMessage stream + end + deactivate QE +``` + +--- + +## Anatomy of `query.ts` + +The file exports a single async generator function: + +```typescript +export async function* query(params: QueryParams): + AsyncGenerator { + // ... 1,730 lines of agentic loop +} +``` + +### Why an Async Generator? + +This is a crucial design decision. An async generator (`async function*`) lets query.ts: + +1. **Yield messages as they arrive** — The consumer (REPL or SDK) sees each message in real-time +2. **Maintain backpressure** — The loop pauses if the consumer isn't ready +3. **Support cancellation** — `.return()` on the generator cleanly tears down the loop +4. 
**Compose generators** — `yield*` delegates to sub-generators seamlessly + +### The Loop State + +Each loop iteration carries mutable state: + +```typescript +type State = { + messages: Message[] // Conversation history + toolUseContext: ToolUseContext // Tool execution context + autoCompactTracking: AutoCompactTrackingState // Compact progress + maxOutputTokensRecoveryCount: number // Truncation retry counter + hasAttemptedReactiveCompact: boolean // Emergency compact flag + turnCount: number // Loop iteration counter + pendingToolUseSummary: Promise<...> // Async summary generation + transition: Continue | undefined // Why we're in this iteration +} +``` + +--- + +## Phase 1: Pre-Processing — Before the API Call + +Before every API call, query.ts runs a **compaction pipeline** on the message history: + +```typescript +// 1. Apply per-message tool result budgets +messagesForQuery = await applyToolResultBudget(messagesForQuery, ...) + +// 2. Snip compact — sliding window over old turns +if (feature('HISTORY_SNIP')) { + const snipResult = snipModule.snipCompactIfNeeded(messagesForQuery) + messagesForQuery = snipResult.messages +} + +// 3. Micro compact — truncate oversized tool results +const microcompactResult = await deps.microcompact(messagesForQuery, ...) +messagesForQuery = microcompactResult.messages + +// 4. Context collapse — read-time projection +if (feature('CONTEXT_COLLAPSE') && contextCollapse) { + const collapseResult = await contextCollapse.applyCollapsesIfNeeded(...) + messagesForQuery = collapseResult.messages +} + +// 5. Auto compact — full summarization via API +const { compactionResult } = await deps.autocompact(messagesForQuery, ...) +``` + +Each stage is feature-gated and runs **independently**. They compose — snip reduces history, micro truncates individual results, auto summarizes the whole thing, collapse archives old views. 
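
The way the stages compose can be sketched as a list of functions applied in sequence, each receiving the previous stage's output. This is a simplified illustration under stated assumptions: the `Message` and `Stage` types, the `enabled` flag standing in for a feature gate, and the `snip` and `micro` helpers are hypothetical and far cruder than the real stages.

```typescript
// Hypothetical sketch: Message, Stage, snip, and micro are illustrative
// stand-ins, not the real Claude Code types or functions.
interface Message {
  role: "user" | "assistant" | "tool";
  content: string;
}

interface Stage {
  name: string;
  enabled: boolean; // stand-in for a build-time feature gate
  run: (messages: Message[]) => Message[];
}

// Keep only the newest `keep` messages (a crude sliding window).
const snip = (keep: number): Stage => ({
  name: "snip",
  enabled: true,
  run: (messages) => messages.slice(-keep),
});

// Truncate any single message longer than `limit` characters.
const micro = (limit: number): Stage => ({
  name: "micro",
  enabled: true,
  run: (messages) =>
    messages.map((m) =>
      m.content.length > limit
        ? { ...m, content: m.content.slice(0, limit) + "[truncated]" }
        : m,
    ),
});

// Apply each enabled stage to the output of the previous one.
// Stages return new arrays, so the caller's history is left untouched.
function compact(stages: Stage[], messages: Message[]): Message[] {
  return stages
    .filter((stage) => stage.enabled)
    .reduce((current, stage) => stage.run(current), messages);
}

const history: Message[] = [
  { role: "user", content: "first question" },
  { role: "assistant", content: "first answer" },
  { role: "tool", content: "x".repeat(50) },
  { role: "assistant", content: "done" },
];

const compacted = compact([snip(3), micro(20)], history);
console.log(compacted.map((m) => m.content));
```

Because every stage has the same shape, disabling one (setting `enabled` to `false` here) simply removes it from the chain without affecting the others.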
+ +### Blocking Limit Check + +After compaction, the loop checks if we're at the **blocking limit** (>98% context used): + +```typescript +const { isAtBlockingLimit } = calculateTokenWarningState( + tokenCountWithEstimation(messagesForQuery), + toolUseContext.options.mainLoopModel, +) +if (isAtBlockingLimit) { + yield createAssistantAPIErrorMessage({ content: PROMPT_TOO_LONG_ERROR_MESSAGE }) + return { reason: 'blocking_limit' } +} +``` + +This prevents the API call entirely if we know it'll fail. + +--- + +## Phase 2: The API Call + +The actual model call uses `claude.ts`: + +```typescript +for await (const message of deps.callModel({ + messages: prependUserContext(messagesForQuery, userContext), + systemPrompt: fullSystemPrompt, + thinkingConfig: toolUseContext.options.thinkingConfig, + tools: toolUseContext.options.tools, + signal: toolUseContext.abortController.signal, + options: { + model: currentModel, + fallbackModel, + effortValue: appState.effortValue, + taskBudget: params.taskBudget, + // ... 20+ more options + }, +})) { + // Process each streamed message +} +``` + +Responses stream in as SSE events. The loop processes three content block types: + +| Block Type | What Happens | +|-----------|-------------| +| `thinking` | Rendered in UI, not sent back to model | +| `text` | Rendered as markdown in terminal | +| `tool_use` | Triggers tool execution (next phase) | + +--- + +## Phase 3: Tool Execution + +When the model responds with `tool_use` blocks, the loop executes them: + +```typescript +// Parallel tool execution +const toolResults = yield* runTools(toolUseBlocks, canUseTool, toolUseContext) +``` + +### Streaming Tool Execution + +A performance optimization: tools can begin executing **while the model is still streaming**: + +```typescript +const useStreamingToolExecution = config.gates.streamingToolExecution +let streamingToolExecutor = useStreamingToolExecution + ? 
new StreamingToolExecutor(tools, canUseTool, toolUseContext) + : null +``` + +The `StreamingToolExecutor` starts validating and permission-checking tool calls as their blocks arrive, before the full response is complete. + +### Tool Lifecycle + +Each tool goes through: + +1. **Schema Validation** — Zod validates the input against `tool.inputSchema` +2. **PreToolUse Hooks** — User-defined scripts can approve, deny, or modify the input +3. **Permission Check** — Deny rules → Allow rules → Tool-specific check → Classifier → User dialog +4. **Execution** — `tool.call(input, context)` runs the actual operation +5. **PostToolUse Hooks** — Scripts run after execution with the result + +--- + +## Phase 4: Loop Continuation + +After tools execute, the loop decides what to do next based on the `stop_reason`: + +### `end_turn` — Model is done + +```typescript +if (stop_reason === 'end_turn') { + // Run post-sampling hooks + await executePostSamplingHooks(assistantMessage, toolUseContext) + // Check stop hooks (can force continuation) + const stopResult = await handleStopHooks(assistantMessage, messages) + if (stopResult.shouldContinue) { + // Inject hook feedback and continue loop + } else { + return { reason: 'end_turn' } // Terminal — loop exits + } +} +``` + +### `tool_use` — Model wants to use tools + +The tool results are pushed to messages and the loop continues: + +```typescript +messages.push(...toolResults) +// Inject CLAUDE.md attachments for newly-discovered memory files +const attachments = await getAttachmentMessages(messages, toolUseContext) +messages.push(...attachments) +// Continue to next iteration (back to Phase 1) +``` + +### `max_tokens` — Response was truncated + +```typescript +if (maxOutputTokensRecoveryCount < MAX_OUTPUT_TOKENS_RECOVERY_LIMIT) { + // Retry with increased max_tokens + state.maxOutputTokensRecoveryCount++ + continue +} else { + // Trigger reactive compact as last resort + state.hasAttemptedReactiveCompact = true +} +``` + +--- + +## 
QueryEngine: The Session Wrapper + +`QueryEngine.ts` wraps `query()` in a session lifecycle: + +```typescript +class QueryEngine { + private mutableMessages: Message[] + private totalUsage: NonNullableUsage + private readFileState: FileStateCache + + async *submitMessage(prompt): AsyncGenerator { + // 1. Parse user input (slash commands, @mentions) + const { messages, shouldQuery } = await processUserInput({ input: prompt }) + + // 2. Persist transcript to disk + await recordTranscript(messages) + + // 3. Run the agentic loop + if (shouldQuery) { + for await (const message of query({ messages, systemPrompt, tools })) { + // 4. Map internal messages to SDK format + yield normalizedSDKMessage(message) + // 5. Persist each message + await recordTranscript(messages) + } + } + + // 6. Return final result with usage stats + yield { type: 'result', total_cost_usd, usage, duration_ms } + } +} +``` + +--- + +## Key Patterns to Understand + +### 1. Generator Composition + +The codebase uses `yield*` heavily to compose generators: + +```typescript +// query.ts delegates to sub-generators +yield* runTools(toolUseBlocks, canUseTool, toolUseContext) + +// QueryEngine delegates to query +for await (const message of query(params)) { + yield* normalizeMessage(message) +} +``` + +### 2. Feature-Gated Loading + +Code paths are gated by build-time feature flags: + +```typescript +const reactiveCompact = feature('REACTIVE_COMPACT') + ? require('./services/compact/reactiveCompact.js') + : null + +// Later... +if (reactiveCompact?.isReactiveCompactEnabled()) { + // This entire branch is eliminated in builds where REACTIVE_COMPACT is false +} +``` + +### 3. Tombstone Messages + +When a streaming fallback occurs (model switch mid-stream), orphaned messages are tombstoned: + +```typescript +for (const msg of assistantMessages) { + yield { type: 'tombstone', message: msg } // Removed from UI + transcript +} +assistantMessages.length = 0 // Reset +``` + +### 4. 
Task Budget Tracking + +API-level task budgets track spend across compaction boundaries: + +```typescript +if (params.taskBudget) { + const preCompactContext = finalContextTokensFromLastResponse(messagesForQuery) + taskBudgetRemaining = Math.max( + 0, + (taskBudgetRemaining ?? params.taskBudget.total) - preCompactContext, + ) +} +``` + +--- + +## Common Debugging Scenarios + +| Symptom | Where to Look | +|---------|--------------| +| Loop never stops | Check `maxTurns` limit, `handleStopHooks` | +| Tool not executing | Permission system — check deny rules, hooks, classifier | +| Context too large | Compaction pipeline — which stage is failing? | +| Model fallback | `withRetry` in claude.ts — 529 overloaded triggers | +| Truncation errors | `MAX_OUTPUT_TOKENS_RECOVERY_LIMIT` (3 retries) | + +--- + +**Previous:** [← System Overview](./01-system-overview.md) · **Next:** [Tool System →](./03-tool-system.md) diff --git a/learn/03-tool-system.md b/learn/03-tool-system.md new file mode 100644 index 0000000..efa13ee --- /dev/null +++ b/learn/03-tool-system.md @@ -0,0 +1,445 @@ +# 3. Tool System + +> How 42 built-in tools are defined, validated, orchestrated, and rendered. + +--- + +## Overview + +Every capability Claude Code has — reading files, running bash, editing code, searching the web — is a **Tool**. Tools are the bridge between the model's intentions and the real world. + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'primaryBorderColor': '#28a745', 'lineColor': '#28a745', 'secondaryColor': '#16213e', 'tertiaryColor': '#0f3460'}}}%% +graph TB + subgraph Interface["Tool Interface — Tool.ts"] + direction LR + IS["inputSchema
Zod validation"] + CP["checkPermissions"] + CALL["call — execute"] + PROMPT["prompt — model instructions"] + RENDER["render — terminal UI"] + end + + IS --> CP --> CALL --> PROMPT --> RENDER + + subgraph FileOps["File Operations"] + FR["FileRead"] + FW["FileWrite"] + FE["FileEdit"] + GL["Glob"] + GR["Grep"] + NE["NotebookEdit"] + end + + subgraph Exec["Execution"] + BA["Bash"] + PS["PowerShell"] + end + + subgraph Web["Web"] + WF["WebFetch"] + WS["WebSearch"] + end + + subgraph AgentTools["Agent and Task"] + AG["Agent — spawn sub-agent"] + TC["TaskCreate"] + TG["TaskGet"] + TU["TaskUpdate"] + TL["TaskList"] + TS["TaskStop"] + SM["SendMessage"] + end + + subgraph Meta["Meta Tools"] + AQ["AskUserQuestion"] + SK["SkillTool"] + TW["TodoWrite"] + EP["EnterPlanMode"] + XP["ExitPlanMode"] + TSR["ToolSearch"] + end + + subgraph Dynamic["Dynamic — Runtime Loaded"] + MCP_T["MCP Tools
from external servers"] + LSP_T["LSP Tool<br/>language server queries"] + end + + subgraph Orchestration["Orchestration Layer"] + RUN["toolOrchestration.ts<br/>runTools — parallel dispatch"] + STE["StreamingToolExecutor<br/>execute as blocks stream in"] + TEX["toolExecution.ts — 60KB<br/>single tool lifecycle"] + THK["toolHooks.ts
Pre/Post hook dispatch"] + end + + Interface --> FileOps + Interface --> Exec + Interface --> Web + Interface --> AgentTools + Interface --> Meta + Interface --> Dynamic + + FileOps --> Orchestration + Exec --> Orchestration + Web --> Orchestration + AgentTools --> Orchestration + Meta --> Orchestration + Dynamic --> Orchestration + + RUN --> STE + RUN --> TEX + TEX --> THK +``` + +--- + +## The Tool Interface — `Tool.ts` (793 lines) + +Every tool implements the `Tool` type. Here are the key methods: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#28a745', 'primaryBorderColor': '#28a745'}}}%% +flowchart LR + subgraph Definition["Tool Definition"] + NAME["name: string"] + SCHEMA["inputSchema: Zod"] + ALIASES["aliases?: string array"] + HINT["searchHint?: string"] + end + + subgraph Lifecycle["Lifecycle Methods"] + VAL["validateInput
pre-execution check"] + PERM["checkPermissions<br/>allow / deny / prompt"] + CALL["call<br/>execute the tool"] + DESC["description<br/>model-facing summary"] + end + + subgraph Rendering["Rendering Methods"] + RUM["renderToolUseMessage<br/>show input in terminal"] + RRM["renderToolResultMessage<br/>show output in terminal"] + RPM["renderToolUseProgressMessage<br/>spinner / progress bar"] + GRP["renderGroupedToolUse<br/>parallel display"] + end + + subgraph Metadata["Metadata Methods"] + RO["isReadOnly<br/>does it write?"] + CS["isConcurrencySafe<br/>parallel safe?"] + EN["isEnabled<br/>available now?"] + DS["isDestructive<br/>irreversible?"] + AC["toAutoClassifierInput
safety classifier text"] + end + + Definition --> Lifecycle --> Rendering + Definition --> Metadata +``` + +### The `buildTool` Factory + +All tools go through `buildTool()` which provides safe defaults: + +```typescript +const TOOL_DEFAULTS = { + isEnabled: () => true, + isConcurrencySafe: () => false, // Assume not safe + isReadOnly: () => false, // Assume writes + isDestructive: () => false, + checkPermissions: (input) => // Defer to general system + Promise.resolve({ behavior: 'allow', updatedInput: input }), + toAutoClassifierInput: () => '', // Skip classifier + userFacingName: () => '', +} + +export function buildTool(def) { + return { ...TOOL_DEFAULTS, userFacingName: () => def.name, ...def } +} +``` + +This "fail-closed" design means a tool that forgets to implement `isConcurrencySafe` defaults to `false` (not safe for parallel execution). + +--- + +## The 42 Built-in Tools + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#17a2b8', 'primaryBorderColor': '#17a2b8'}}}%% +graph TB + subgraph FileOps["File Operations — Read + Write + Search"] + FR["FileRead
Read files, images,<br/>PDFs, notebooks"] + FW["FileWrite<br/>Create or overwrite<br/>entire files"] + FE["FileEdit<br/>Partial string<br/>replacement edits"] + GL["Glob<br/>File pattern<br/>matching search"] + GR["Grep<br/>ripgrep content<br/>search"] + NE["NotebookEdit<br/>Jupyter notebook<br/>cell editing"] + end + + subgraph Exec["Execution — Run Commands"] + BA["Bash<br/>Shell command<br/>execution"] + PS["PowerShell<br/>Windows shell<br/>execution"] + REPL["REPL<br/>Persistent JS/TS<br/>runtime context"] + end + + subgraph Web["Web — Fetch and Search"] + WF["WebFetch<br/>HTTP GET to URLs<br/>HTML to markdown"] + WS["WebSearch<br/>Web search via<br/>Brave or similar"] + end + + subgraph AgentTask["Agent and Task Management"] + AG["Agent<br/>Spawn sub-agent<br/>with forked context"] + TC["TaskCreate<br/>Background task"] + TG["TaskGet<br/>Check task status"] + TU["TaskUpdate<br/>Update task state"] + TL["TaskList<br/>List all tasks"] + TS["TaskStop<br/>Terminate task"] + SM["SendMessage<br/>Inter-agent<br/>messaging"] + TmC["TeamCreate<br/>Create agent team"] + TmD["TeamDelete<br/>Remove agent team"] + end + + subgraph Meta["Meta Tools — Control Claude's Behavior"] + AQ["AskUserQuestion<br/>Interactive prompt"] + SK["SkillTool<br/>Execute skills"] + TW["TodoWrite<br/>Manage task lists"] + EP["EnterPlanMode<br/>Switch to read-only"] + XP["ExitPlanMode<br/>Resume full access"] + TSR["ToolSearch<br/>Find deferred tools"] + BF["Brief<br/>Toggle concise mode"] + SL["Sleep<br/>Idle wait for<br/>proactive mode"] + SO["SyntheticOutput<br/>Structured JSON<br/>output"] + end + + subgraph Dynamic["Dynamic — Loaded at Runtime"] + MCP["MCP Tools<br/>From external<br/>MCP servers"] + LSP["LSP Tool<br/>Language server<br/>queries"] + end + + subgraph Special["Special Purpose"] + EW["EnterWorktree<br/>Git worktree<br/>isolation"] + XW["ExitWorktree<br/>Leave worktree"] + RT["RemoteTrigger<br/>Remote execution"] + SC["ScheduleCron<br/>Timed triggers"] + CF["Config
Settings management"] + end +``` + +--- + +## Tool Orchestration — Parallel Execution + +When the model returns multiple `tool_use` blocks, Claude Code can execute them **in parallel**: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#4a9eff', 'primaryBorderColor': '#4a9eff'}}}%% +sequenceDiagram + participant Q as query.ts + participant O as toolOrchestration.ts + participant STE as StreamingToolExecutor + participant T1 as Tool 1 — FileRead + participant T2 as Tool 2 — Grep + participant T3 as Tool 3 — Bash + participant P as Permission System + + Q->>O: runTools(3 tool_use blocks) + activate O + + Note over O: Check concurrency safety + + O->>STE: FileRead — isConcurrencySafe = true + O->>STE: Grep — isConcurrencySafe = true + O->>STE: Bash — isConcurrencySafe = false + + par Parallel Execution + STE->>P: checkPermissions(FileRead) + P-->>STE: allow + STE->>T1: call(input) + T1-->>STE: result + + STE->>P: checkPermissions(Grep) + P-->>STE: allow + STE->>T2: call(input) + T2-->>STE: result + end + + Note over STE: Wait for parallel tools + + STE->>P: checkPermissions(Bash) + P-->>STE: prompt user + STE->>T3: call(input) + T3-->>STE: result + + O-->>Q: yield all tool_result messages + deactivate O +``` + +Key files in the orchestration layer: + +- **`toolOrchestration.ts`** — `runTools()`: dispatches tools, handles parallel vs. sequential +- **`StreamingToolExecutor`** — Starts permission checks while model is still streaming +- **`toolExecution.ts`** (60KB) — Single tool lifecycle: validate → permissions → execute → hooks +- **`toolHooks.ts`** — Dispatches PreToolUse and PostToolUse hooks + +--- + +## Single Tool Lifecycle + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#28a745', 'primaryBorderColor': '#28a745'}}}%% +flowchart TD + BLOCK["tool_use block arrives
from model stream"]:::start + + PARSE["Parse + validate input<br/>via Zod inputSchema"]:::step + VAL{"validateInput?"}:::check + + DENY_VAL["Return error to model<br/>with validation message"]:::deny + + PRE_HOOK["Run PreToolUse hooks<br/>user-defined scripts"]:::hook + HOOK_R{"Hook result?"}:::check + + PERM["Check permissions<br/>deny → allow → tool → hooks → classifier → dialog"]:::step + PERM_R{"Permission?"}:::check + + EXEC["tool.call(input, context)<br/>execute the operation"]:::step + RESULT["Map output to tool_result<br/>via mapToolResultToToolResultBlockParam"]:::step + + SIZE{"Result exceeds<br/>maxResultSizeChars?"}:::check + PERSIST["Persist to disk<br/>return file path + preview"]:::step + + POST_HOOK["Run PostToolUse hooks"]:::hook + RENDER["Render in terminal<br/>renderToolResultMessage"]:::step + + YIELD["Yield tool_result<br/>to query loop"]:::done + + DENY_PERM["Return permission_denied
error to model"]:::deny + + BLOCK --> PARSE --> VAL + VAL -->|"pass"| PRE_HOOK + VAL -->|"fail"| DENY_VAL + + PRE_HOOK --> HOOK_R + HOOK_R -->|"approve"| PERM + HOOK_R -->|"deny"| DENY_PERM + HOOK_R -->|"modify input"| PERM + + PERM --> PERM_R + PERM_R -->|"allow"| EXEC + PERM_R -->|"deny"| DENY_PERM + + EXEC --> RESULT --> SIZE + SIZE -->|"within limit"| POST_HOOK + SIZE -->|"exceeds limit"| PERSIST --> POST_HOOK + + POST_HOOK --> RENDER --> YIELD + + classDef start fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef step fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef check fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef hook fill:#3d2b00,stroke:#fd7e14,color:#e0e0e0,stroke-width:2px + classDef deny fill:#4a1a1a,stroke:#dc3545,color:#e0e0e0,stroke-width:2px + classDef done fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px +``` + +--- + +## ToolSearch — Deferred Tool Loading + +With 42+ tools, sending all schemas to the model wastes tokens. **ToolSearch** defers tools that aren't immediately needed: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#ffc107', 'primaryBorderColor': '#ffc107'}}}%% +flowchart LR + ALL["42+ Tools"]:::input + + SPLIT{"shouldDefer?"}:::check + + EAGER["~15 Eager Tools
Always in prompt
FileRead, Bash, Grep..."]:::eager + DEFER["~27 Deferred Tools
Schema not sent initially
TaskCreate, WebSearch..."]:::defer + ALWAYS["alwaysLoad Tools
Forced eager by MCP meta"]:::eager + + SEARCH["ToolSearch Tool
Model searches by keyword
using searchHint"]:::tool + + FOUND["Tool schema injected
into next request"]:::result + + ALL --> SPLIT + SPLIT -->|"no"| EAGER + SPLIT -->|"yes"| DEFER + SPLIT -->|"alwaysLoad"| ALWAYS + + DEFER --> SEARCH + SEARCH --> FOUND + + classDef input fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef check fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef eager fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef defer fill:#3d2b00,stroke:#fd7e14,color:#e0e0e0,stroke-width:2px + classDef tool fill:#2d1b4e,stroke:#6f42c1,color:#e0e0e0,stroke-width:2px + classDef result fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px +``` + +--- + +## Dynamic Tools — MCP and LSP + +Beyond built-in tools, Claude Code loads tools dynamically at runtime: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#17a2b8', 'primaryBorderColor': '#17a2b8'}}}%% +flowchart TD + subgraph MCP["MCP Tools — Model Context Protocol"] + SRV["External MCP Servers
configured in settings"] + CONN["MCPConnectionManager
stdio / SSE transport"] + DISC["Discover tools
via tools/list"] + WRAP["Wrap as Tool objects
name: mcp__server__tool"] + end + + subgraph LSP["LSP Tool — Language Server Protocol"] + LS["Language Server
runtime type info"] + QUERY_LSP["Query definitions,
references, diagnostics"] + end + + SRV --> CONN --> DISC --> WRAP + LS --> QUERY_LSP + + MERGE["Merged into tool pool
via useMergedTools hook"]:::merge + + WRAP --> MERGE + QUERY_LSP --> MERGE + + classDef merge fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px +``` + +MCP tools are prefixed with `mcp__<server>__<tool>` unless running in SDK no-prefix mode. They go through the same permission system as built-in tools. + +--- + +## Key Design Decisions + +### 1. Self-Contained Modules + +Each tool directory (`src/tools/<ToolName>/`) contains everything: +- `index.ts` — Tool definition via `buildTool()` +- `prompt.ts` — Model-facing instructions +- `*.test.ts` — Tests +- Additional helpers as needed + +### 2. Fail-Closed Defaults + +`buildTool()` defaults are conservative: +- `isConcurrencySafe = false` — Won't run in parallel unless explicitly safe +- `isReadOnly = false` — Assumed to write unless stated otherwise +- `checkPermissions` defaults to `allow` — But the general permission system still applies + +### 3. Result Size Budgets + +Each tool has `maxResultSizeChars`. Oversized results are persisted to disk and the model gets a truncated preview + file path. This prevents single tool results from consuming the entire context window. + +### 4. Observable Input Backfilling + +`backfillObservableInput()` adds derived fields to tool inputs for SDK consumers and transcripts, without mutating the API-bound input (which would break prompt caching): + +```typescript +// The API sees: { file_path: "src/foo.ts" } +// SDK/transcript sees: { file_path: "src/foo.ts", resolved_path: "/abs/path/src/foo.ts" } +``` + +--- + +**Previous:** [← The Agentic Loop](./02-agentic-loop.md) · **Next:** [Permission System →](./04-permission-system.md) diff --git a/learn/04-permission-system.md b/learn/04-permission-system.md new file mode 100644 index 0000000..23caa35 --- /dev/null +++ b/learn/04-permission-system.md @@ -0,0 +1,339 @@ +# 4. Permission System + +> How Claude Code prevents an AI from doing dangerous things — a multi-layered defense. 
+ +--- + +## Why Permissions Matter + +Claude Code can run **arbitrary bash commands**, **write to any file**, and **make network requests**. Without a permission system, a single misguided model response could `rm -rf /` your entire system. + +The permission system is a chain of checks — if any link denies, the tool doesn't run. + +--- + +## The Permission Flow + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#dc3545', 'primaryBorderColor': '#dc3545'}}}%% +flowchart TD + ENTRY["Tool call arrives"]:::start + + DR{"Deny rules?
blanket deny, pattern match"} + AR{"Allow rules?
always-allow from settings"} + TSP{"tool.checkPermissions?
tool-specific logic"} + HOOK{"PreToolUse hooks?
user-defined scripts"} + CLASS{"Auto-mode classifier?
transcript safety analysis"} + DIALOG{"User permission dialog
Y / n / always-allow"} + + ALLOW["ALLOW
execute tool"]:::allow + DENY["DENY
return error to model"]:::deny + + ENTRY --> DR + + DR -->|"matched deny rule"| DENY + DR -->|"no match"| AR + + AR -->|"matched allow rule"| ALLOW + AR -->|"no match"| TSP + + TSP -->|"tool says allow"| HOOK + TSP -->|"tool says deny"| DENY + + HOOK -->|"hook approves"| ALLOW + HOOK -->|"hook denies"| DENY + HOOK -->|"no decision"| CLASS + + CLASS -->|"classified safe"| ALLOW + CLASS -->|"classified unsafe"| DIALOG + CLASS -->|"not in auto-mode"| DIALOG + + DIALOG -->|"user accepts"| ALLOW + DIALOG -->|"user rejects"| DENY + DIALOG -->|"always allow"| AR + + classDef start fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef allow fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef deny fill:#4a1a1a,stroke:#dc3545,color:#e0e0e0,stroke-width:2px +``` + +--- + +## Layer 1: Deny Rules + +**First check. Highest priority. Cannot be overridden.** + +Deny rules are pattern-matched against tool name and input. If a deny rule matches, the tool is **immediately rejected** — no further checks run. + +Sources of deny rules: +- `settings.json` — User-configured +- CLAUDE.md — Project-level rules +- Organization policy — Enterprise MDM settings + +Example deny rules: +```json +{ + "alwaysDenyRules": { + "settings": [ + { "tool": "Bash", "pattern": "rm -rf" }, + { "tool": "FileWrite", "pattern": "/etc/*" } + ] + } +} +``` + +### Permission Matching + +Tools can implement `preparePermissionMatcher()` for custom pattern matching: + +```typescript +// Bash tool: "git *" matches any git command +preparePermissionMatcher(input) { + return async (pattern) => minimatch(input.command, pattern) +} +``` + +--- + +## Layer 2: Allow Rules + +If no deny rule matched, check if an **allow rule** grants automatic approval. 
+ +Allow rules come from: +- The user choosing "always allow" in the permission dialog +- `settings.json` configuration +- CLAUDE.md project rules +- Slash command grants (e.g., `/plan` exit grants specific operations) + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#28a745', 'primaryBorderColor': '#28a745'}}}%% +flowchart LR + subgraph Sources["Allow Rule Sources"] + S1["settings.json
user config"] + S2["CLAUDE.md
project rules"] + S3["User dialog
always-allow choice"] + S4["Command grants
plan mode exit"] + end + + MERGE["ToolPermissionContext
alwaysAllowRules"]:::merge + + CHECK{"Pattern match
against tool + input"}:::check + + ALLOW["Auto-approved"]:::allow + NEXT["Continue to
next layer"]:::next + + S1 --> MERGE + S2 --> MERGE + S3 --> MERGE + S4 --> MERGE + + MERGE --> CHECK + CHECK -->|"match"| ALLOW + CHECK -->|"no match"| NEXT + + classDef merge fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef check fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef allow fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef next fill:#333,stroke:#888,color:#e0e0e0,stroke-width:1px +``` + +--- + +## Layer 3: Tool-Specific Permissions + +Each tool implements `checkPermissions(input, context)`: + +```typescript +// Example: FileRead defaults to allow (it's read-only) +checkPermissions: () => Promise.resolve({ behavior: 'allow' }) + +// Example: Bash checks if the command is read-only +checkPermissions: (input) => { + if (isReadOnlyCommand(input.command)) { + return { behavior: 'allow' } + } + return { behavior: 'askUser', message: `Run: ${input.command}` } +} +``` + +The result can be: +- `{ behavior: 'allow' }` — Approved +- `{ behavior: 'deny', message }` — Rejected with reason +- `{ behavior: 'askUser', message }` — Escalate to user prompt + +--- + +## Layer 4: PreToolUse Hooks + +User-defined scripts that run before tool execution. Configured in `settings.json` or CLAUDE.md: + +```json +{ + "hooks": { + "PreToolUse": [ + { + "matcher": "Bash", + "command": "/path/to/safety-check.sh" + } + ] + } +} +``` + +Hook scripts receive the tool name and input as JSON on stdin. 
They can: +- **Approve** (exit 0, no output) +- **Deny** (exit non-zero, stderr has reason) +- **Modify input** (exit 0, stdout has modified JSON) + +--- + +## Layer 5: Auto-Mode Classifier + +In `--auto` mode, a **classifier** examines the conversation transcript to determine if a tool call is safe: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#17a2b8', 'primaryBorderColor': '#17a2b8'}}}%% +flowchart TD + TC["Tool call in auto-mode"]:::start + + BUILD["Build classifier input
tool.toAutoClassifierInput(input)"]:::step + TRANSCRIPT["Append recent transcript
for context"]:::step + CLASSIFY["Run safety classifier
is this operation safe?"]:::step + + SAFE{"Classified as?"}:::check + + ALLOW["Auto-approved
no user prompt"]:::allow + PROMPT["Escalate to
user dialog"]:::deny + + TC --> BUILD --> TRANSCRIPT --> CLASSIFY --> SAFE + SAFE -->|"safe"| ALLOW + SAFE -->|"unsafe"| PROMPT + + classDef start fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef step fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef check fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef allow fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef deny fill:#4a1a1a,stroke:#dc3545,color:#e0e0e0,stroke-width:2px +``` + +Each tool provides `toAutoClassifierInput()` which returns a compact representation for the classifier. Security-irrelevant tools return `''` to skip classification. + +--- + +## Layer 6: User Permission Dialog + +The last resort — ask the human: + +``` +╭────────────────────────────────────────╮ +│ Claude wants to run: │ +│ │ +│ $ npm install lodash │ +│ │ +│ (Y)es · (n)o · (a)lways allow │ +╰────────────────────────────────────────╯ +``` + +Choosing "always allow" adds a permanent allow rule. + +--- + +## Permission Modes + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#fd7e14', 'primaryBorderColor': '#fd7e14'}}}%% +flowchart TB + START(["Session Start"]):::neutral --> DEFAULT + + DEFAULT["DEFAULT MODE
Prompt user on every write tool"]:::mode1 + PLAN["PLAN MODE
Read tools auto-approved
Write tools require approval"]:::mode2 + AUTO["AUTO MODE
Classifier decides safety
Safe = allow, Unsafe = prompt"]:::mode3 + BYPASS["BYPASS MODE
Everything auto-approved
No permission checks"]:::mode4 + + DEFAULT -->|"/plan command
or model enters plan"| PLAN + PLAN -->|"model exits
plan mode"| DEFAULT + DEFAULT -->|"--auto flag
user opts in"| AUTO + AUTO -->|"denial limit
exceeded"| DEFAULT + DEFAULT -->|"--dangerously-
skip-permissions"| BYPASS + + classDef neutral fill:#333,stroke:#888,color:#e0e0e0,stroke-width:1px + classDef mode1 fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef mode2 fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef mode3 fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef mode4 fill:#4a1a1a,stroke:#dc3545,color:#e0e0e0,stroke-width:2px +``` + +### Default Mode +- Every write operation prompts the user +- Read operations (FileRead, Glob, Grep) auto-approved +- Most secure, most friction + +### Plan Mode +- Entered via `/plan` command or model's `EnterPlanMode` tool +- All read tools auto-approved +- All write tools require explicit user approval +- Model can plan freely, execute cautiously + +### Auto Mode +- Enabled via `--auto` flag +- Safety classifier decides per-tool +- Falls back to prompting if classifier says "unsafe" +- Has a **denial limit** — too many denials drops back to Default + +### Bypass Mode +- Enabled via `--dangerously-skip-permissions` +- **Everything auto-approved** — no checks at all +- Named to be scary because it IS scary +- No permission system protection whatsoever + +--- + +## The `ToolPermissionContext` Type + +All permission state lives in `AppState.toolPermissionContext`: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#dc3545', 'primaryBorderColor': '#dc3545'}}}%% +graph TB + subgraph TPC["ToolPermissionContext — Immutable"] + MODE["mode
default / plan / auto / bypass"] + AWD["additionalWorkingDirectories
extra safe paths"] + ALLOW["alwaysAllowRules
by source: settings, command, etc."] + DENY["alwaysDenyRules
by source"] + ASK["alwaysAskRules
force prompt even if allowed"] + BPM["isBypassPermissionsModeAvailable
can user enable bypass?"] + AUTO_A["isAutoModeAvailable
can user enable auto?"] + AVOID["shouldAvoidPermissionPrompts
background agents that cannot show UI"] + AWAIT["awaitAutomatedChecksBeforeDialog
coordinator workers"] + PRE["prePlanMode
mode to restore after plan exits"] + end +``` + +This is wrapped in `DeepImmutable` — TypeScript enforces that nobody mutates this in place. Updates go through `setAppState(prev => ({ ...prev, toolPermissionContext: { ... } }))`. + +--- + +## Denial Tracking + +Auto mode tracks denials to prevent runaway unsafe operations: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#dc3545', 'primaryBorderColor': '#dc3545'}}}%% +flowchart LR + START["Auto mode active"]:::start + D1["Denial 1"]:::deny + D2["Denial 2"]:::deny + DN["Denial N
limit exceeded"]:::deny + FALLBACK["Fall back to
Default mode"]:::result + + START --> D1 --> D2 -->|"..."| DN --> FALLBACK + + classDef start fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef deny fill:#3d2b00,stroke:#fd7e14,color:#e0e0e0,stroke-width:2px + classDef result fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px +``` + +The denial count is stored in `DenialTrackingState`. Async subagents that can't show UI keep a local tracking copy, since their `setAppState` is a no-op. + +--- + +**Previous:** [← Tool System](./03-tool-system.md) · **Next:** [Context Management →](./05-context-management.md) diff --git a/learn/05-context-management.md b/learn/05-context-management.md new file mode 100644 index 0000000..84e5583 --- /dev/null +++ b/learn/05-context-management.md @@ -0,0 +1,106 @@ +# 5. Context Management — The Compaction Pipeline + +> How Claude Code keeps conversations within the model's context window. + +--- + +## The Pipeline + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#fd7e14', 'primaryBorderColor': '#fd7e14'}}}%% +flowchart LR + RAW["Raw message
history"]:::input + + S1["SNIP COMPACT
Sliding window
Drop oldest turns
Preserve recent N"]:::stage1 + S2["MICRO COMPACT
Truncate individual
tool results exceeding
size thresholds"]:::stage2 + S3["AUTO COMPACT
Summarize full conversation
via separate API call
Circuit breaker on failure"]:::stage3 + S4["CONTEXT COLLAPSE
Read-time projection
Archived collapsed views
Granular preservation"]:::stage4 + + FINAL["Messages ready
for API call"]:::output + + S5["REACTIVE COMPACT
Emergency trigger
on API 413 error
Last resort"]:::emergency + + RAW ==> S1 ==> S2 ==> S3 ==> S4 ==> FINAL + FINAL -. "API returns prompt_too_long" .-> S5 + S5 ==> FINAL + + classDef input fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef stage1 fill:#0d3d0d,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef stage2 fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef stage3 fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef stage4 fill:#2d1b4e,stroke:#6f42c1,color:#e0e0e0,stroke-width:2px + classDef output fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef emergency fill:#4a1a1a,stroke:#dc3545,color:#e0e0e0,stroke-width:2px +``` + +--- + +## Stage Details + +### Stage 1: Snip Compact +Sliding window that drops the oldest turns. The REPL keeps full history for UI scrollback — snip is a *read-time projection* only affecting what's sent to the API. Feature-gated via `HISTORY_SNIP`. + +### Stage 2: Micro Compact +Truncates individual tool results exceeding size thresholds. Results are cached by `tool_use_id` so subsequent iterations reuse cached truncations. +**Key file:** `src/services/compact/microCompact.ts` (19KB) + +### Stage 3: Auto Compact +Summarizes the full conversation via a **separate API call**. Has a circuit breaker — too many consecutive failures stops retrying. +**Key files:** `autoCompact.ts` (13KB), `compact.ts` (60KB), `prompt.ts` (16KB) + +### Stage 4: Context Collapse +Read-time projection that archives older segments with granular preservation. Exists in a separate store — the REPL's message array is never modified. + +### Stage 5: Reactive Compact +Emergency trigger when the API returns `prompt_too_long` (413). Last resort — only runs after a real API failure. Feature-gated via `REACTIVE_COMPACT`. 
+ +--- + +## Token Budget State Machine + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#ffc107', 'primaryBorderColor': '#ffc107'}}}%% +flowchart LR + N["NORMAL
within limits"]:::green + W["WARNING
context > 80%"]:::yellow + C["CRITICAL
context > 95%"]:::orange + B["BLOCKING
context > 98%
auto-compact OFF"]:::red + AC["AUTO COMPACT
fires automatically"]:::blue + RC["REACTIVE
emergency on 413"]:::darkred + M["MANUAL
user runs /compact"]:::gray + + N -->|"grows"| W + W -->|"grows"| C + C -->|"auto-compact ON"| AC + C -->|"auto-compact OFF"| B + AC -->|"success"| N + AC -->|"fails + API 413"| RC + RC -->|"success"| N + B -->|"user: /compact"| M + M -->|"success"| N + + classDef green fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef yellow fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef orange fill:#3d2b00,stroke:#fd7e14,color:#e0e0e0,stroke-width:2px + classDef red fill:#4a1a1a,stroke:#dc3545,color:#e0e0e0,stroke-width:2px + classDef darkred fill:#3a0a0a,stroke:#a30000,color:#e0e0e0,stroke-width:2px + classDef blue fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef gray fill:#2a2a2a,stroke:#888,color:#e0e0e0,stroke-width:1px +``` + +### Transitions +- **NORMAL → WARNING** at 80% — UI shows warning indicator +- **WARNING → CRITICAL** at 95% — compaction should fire +- **CRITICAL → AUTO COMPACT** — if enabled, fires summarization API call +- **CRITICAL → BLOCKING** — if auto-compact is OFF, new API calls are blocked once context exceeds 98% +- **BLOCKING → MANUAL** — user runs `/compact` to recover + +--- + +## Tool Result Budget + +The tool result budget is separate from conversation compaction: it caps the aggregate size of tool results per message. It runs **before** the pipeline on every iteration. Oversized results are persisted to disk and replaced with a file path plus a truncated preview. Tools with `maxResultSizeChars = Infinity` (e.g., FileRead) are exempt. + +--- + +**Previous:** [← Permission System](./04-permission-system.md) · **Next:** [State Management →](./06-state-management.md) diff --git a/learn/06-state-management.md b/learn/06-state-management.md new file mode 100644 index 0000000..70ba4e3 --- /dev/null +++ b/learn/06-state-management.md @@ -0,0 +1,148 @@ +# 6. State Management + +> A single immutable store with 50+ fields — how Claude Code manages application state. 
+ +--- + +## Architecture + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#4a9eff', 'primaryBorderColor': '#4a9eff'}}}%% +graph TB + subgraph Store["AppState — Single Immutable Store"] + direction TB + subgraph Core_S["Core Session State"] + Model["mainLoopModel"] + Think["thinkingEnabled"] + Fast["fastMode"] + Effort["effortValue"] + Settings["settings: SettingsJson"] + end + + subgraph Perm_S["Permission State"] + TPC["toolPermissionContext"] + Mode["mode: default / plan / auto / bypass"] + Allow["alwaysAllowRules"] + Deny["alwaysDenyRules"] + end + + subgraph MCP_S["MCP State"] + MCPCli["clients: MCPServerConnection array"] + MCPTool["tools: Tool array"] + MCPCmd["commands: Command array"] + end + + subgraph Task_S["Background Tasks"] + Tasks["tasks: taskId to TaskState map"] + AgentReg["agentNameRegistry"] + FG["foregroundedTaskId"] + end + + subgraph UI_S["UI State"] + Spec["speculation: predictive execution"] + Suggest["promptSuggestion: autocomplete"] + Notif["notifications queue"] + Bridge["replBridge: remote control state"] + end + + subgraph History_S["History and Tracking"] + FH["fileHistory: snapshots for rewind"] + Attr["attribution: commit metadata"] + Todos["todos: per-agent lists"] + end + end + + REPL_C["REPL.tsx
reads + subscribes"]:::consumer + QE_C["QueryEngine
reads via getAppState"]:::consumer + Tools_C["Tools
reads via ToolUseContext"]:::consumer + + SET["setAppState
functional update"]:::mutator + ONCHANGE["onChangeAppState
side effect reactions"]:::mutator + + Store --> REPL_C + Store --> QE_C + Store --> Tools_C + + SET --> Store + ONCHANGE --> Store + + classDef consumer fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef mutator fill:#2d1b4e,stroke:#e83e8c,color:#e0e0e0,stroke-width:2px +``` + +--- + +## Core Concepts + +### Immutability via `DeepImmutable` + +The `AppState` type is wrapped in `DeepImmutable` — TypeScript enforces that no consumer can mutate state in place: + +```typescript +export type AppState = DeepImmutable<{ + settings: SettingsJson + mainLoopModel: ModelSetting + toolPermissionContext: ToolPermissionContext + // ... 50+ fields +}> +``` + +### Functional Updates + +State is updated via `setAppState(prev => newState)`: + +```typescript +setAppState(prev => ({ + ...prev, + toolPermissionContext: { + ...prev.toolPermissionContext, + mode: 'plan', + }, +})) +``` + +### Side Effects via `onChangeAppState` + +After state changes, `onChangeAppState.ts` fires reactive side effects — persisting settings, updating UI, notifying remote sessions, etc. + +--- + +## Key State Groups + +### Session State +`mainLoopModel`, `thinkingEnabled`, `fastMode`, `effortValue` — control how the model behaves each turn. + +### Permission State +`toolPermissionContext` — contains mode, allow/deny rules, and bypass availability. See [Guide 4](./04-permission-system.md). + +### MCP State +`mcp.clients`, `mcp.tools`, `mcp.commands` — dynamically connected MCP servers and their exposed tools/commands. + +### Background Tasks +`tasks` — a map of `taskId → TaskState` for background agent tasks. `foregroundedTaskId` controls which task's messages appear in the main view. + +### UI State +`speculation` — predictive execution state for pre-computing responses. `promptSuggestion` — autocomplete suggestions. `notifications` — queued UI notifications. + +### History +`fileHistory` — snapshots for `/rewind`. `attribution` — commit metadata for git attribution. 
`todos` — per-agent task lists. + +--- + +## Key Files + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#4a9eff', 'primaryBorderColor': '#4a9eff'}}}%% +graph LR + subgraph StateDir["src/state/"] + AS["AppState.tsx
React context + hooks"] + ASS["AppStateStore.ts
Type definition + defaults"] + OC["onChangeAppState.ts
Side effect reactions"] + SEL["selectors.ts
Derived state"] + ST["store.ts
Store type"] + end +``` + +--- + +**Previous:** [← Context Management](./05-context-management.md) · **Next:** [Extension Model →](./07-extension-model.md) diff --git a/learn/07-extension-model.md b/learn/07-extension-model.md new file mode 100644 index 0000000..a35e7d1 --- /dev/null +++ b/learn/07-extension-model.md @@ -0,0 +1,248 @@ +# 7. Extension Model + +> Skills, plugins, hooks, sub-agents, and swarms — how Claude Code is extended. + +--- + +## Overview + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#ffc107', 'primaryBorderColor': '#ffc107'}}}%% +graph TB + subgraph Skills["Skills"] + BS["Bundled Skills
shipped with CLI"]:::skill + US["User Skills
.claude/skills/*.md"]:::skill + PS["Project Skills
.claude/skills/ in repo"]:::skill + SL["loadSkillsDir.ts
discover + parse frontmatter"]:::loader + end + + subgraph Plugins["Plugins"] + MP["Managed Plugins
org-level policy"]:::plugin + IP["Installed Plugins
user choice"]:::plugin + BP["Built-in Plugins
shipped with CLI"]:::plugin + PL["pluginLoader.ts
cache-only, versioned"]:::loader + end + + subgraph Agents["Agents"] + SA["Sub-agents via AgentTool
forked context, own query loop"]:::agent + CO["Coordinator Mode
leader dispatches tasks,
workers get limited tools"]:::agent + SW["Swarms
multi-process via tmux,
mailbox message passing"]:::agent + FA["Forked Agents
share parent prompt cache,
overlay filesystem"]:::agent + end + + subgraph HookSys["Hooks"] + PRE["PreToolUse
before tool execution"]:::hook + POST["PostToolUse
after tool execution"]:::hook + SESS["Session Hooks
lifecycle events"]:::hook + HC["Configured in
settings.json or CLAUDE.md"]:::hook + end + + CMD["commands.ts — Command Registry
merges all sources"]:::registry + TOOL["Tool.ts — Tool Interface"]:::registry + QUERY["query.ts — Agentic Loop"]:::registry + + BS --> SL + US --> SL + PS --> SL + SL --> CMD + + MP --> PL + IP --> PL + BP --> PL + PL --> CMD + PL -->|"plugin MCP servers"| TOOL + + CMD --> TOOL + + SA --> QUERY + CO --> QUERY + SW --> QUERY + FA --> QUERY + + PRE --> TOOL + POST --> TOOL + HC --> PRE + HC --> POST + + classDef skill fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef plugin fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef agent fill:#2d1b4e,stroke:#6f42c1,color:#e0e0e0,stroke-width:2px + classDef hook fill:#3d2b00,stroke:#fd7e14,color:#e0e0e0,stroke-width:2px + classDef loader fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef registry fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:3px +``` + +--- + +## Skills + +Skills are **markdown instruction files** with YAML frontmatter. They teach Claude Code how to do specific tasks. + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#17a2b8', 'primaryBorderColor': '#17a2b8'}}}%% +flowchart TD + subgraph Sources["Skill Sources"] + BUNDLED["Bundled
shipped with CLI"]:::bundled + USER[".claude/skills/
user-defined"]:::user + PROJECT["repo .claude/skills/
project-specific"]:::project + end + + LOADER["loadSkillsDir.ts
Discover + parse
YAML frontmatter"]:::loader + + TOOL["SkillTool
Model invokes via tool call"]:::tool + CMD["Slash commands
/skills to manage"]:::cmd + + BUNDLED --> LOADER + USER --> LOADER + PROJECT --> LOADER + LOADER --> TOOL + LOADER --> CMD + + classDef bundled fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef user fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef project fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef loader fill:#2d1b4e,stroke:#6f42c1,color:#e0e0e0,stroke-width:2px + classDef tool fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef cmd fill:#3d2b00,stroke:#fd7e14,color:#e0e0e0,stroke-width:2px +``` + +**Key file:** `src/skills/loadSkillsDir.ts` (34KB) + +--- + +## Plugins + +Plugins are bundles of tools, MCP servers, and commands. They extend Claude Code at a deeper level than skills. + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#ffc107', 'primaryBorderColor': '#ffc107'}}}%% +flowchart TD + subgraph Types["Plugin Types"] + MANAGED["Managed Plugins
org-level policy
enterprise MDM"]:::managed + INSTALLED["User Plugins
installed via marketplace
or manual"]:::installed + BUILTIN["Built-in Plugins
shipped with CLI"]:::builtin + end + + CACHE["pluginLoader.ts
cache-only loading
versioned artifacts"]:::loader + + subgraph Provides["Plugin Provides"] + TOOLS["Tools
via MCP servers"]:::tool + CMDS["Slash Commands"]:::cmd + SKILLS_P["Skills"]:::skill + end + + MANAGED --> CACHE + INSTALLED --> CACHE + BUILTIN --> CACHE + + CACHE --> TOOLS + CACHE --> CMDS + CACHE --> SKILLS_P + + classDef managed fill:#4a1a1a,stroke:#dc3545,color:#e0e0e0,stroke-width:2px + classDef installed fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef builtin fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef loader fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef tool fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef cmd fill:#3d2b00,stroke:#fd7e14,color:#e0e0e0,stroke-width:2px + classDef skill fill:#2d1b4e,stroke:#6f42c1,color:#e0e0e0,stroke-width:2px +``` + +**Key files:** `src/plugins/builtinPlugins.ts`, `src/utils/plugins/pluginLoader.ts` + +--- + +## Agent System + +Claude Code can spawn **sub-agents** — each gets its own query loop, forked context, and limited tool set. + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#6f42c1', 'primaryBorderColor': '#6f42c1'}}}%% +flowchart TD + subgraph AgentTypes["Agent Types"] + SUB["Sub-agent
AgentTool spawns in-process
forked context, own query loop"]:::sub + COORD["Coordinator
Leader dispatches tasks
workers get limited tools"]:::coord + SWARM["Swarm
Multi-process via tmux
mailbox message passing"]:::swarm + FORK["Forked Agent
Share parent prompt cache
overlay filesystem"]:::fork + end + + subgraph SubAgent["Sub-agent Details"] + CONTEXT["Forked ToolUseContext
cloned file cache
separate abort controller"] + LOOP["Own query loop
independent agentic cycle"] + RESULTS["Results flow back
to parent as tool_result"] + end + + subgraph SwarmDetails["Swarm Details"] + TMUX["tmux sessions
separate processes"] + MAILBOX["Mailbox system
JSON message passing"] + LEADER["Leader process
dispatches and coordinates"] + WORKER["Worker processes
limited tool access"] + end + + SUB --> SubAgent + SWARM --> SwarmDetails + + classDef sub fill:#2d1b4e,stroke:#6f42c1,color:#e0e0e0,stroke-width:2px + classDef coord fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef swarm fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef fork fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px +``` + +**Key files:** `src/tools/AgentTool/`, `src/coordinator/coordinatorMode.ts` + +--- + +## Hooks + +User-defined scripts that run at specific points in the tool execution lifecycle: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#fd7e14', 'primaryBorderColor': '#fd7e14'}}}%% +sequenceDiagram + participant M as Model + participant Q as query.ts + participant PRE as PreToolUse Hook + participant T as Tool + participant POST as PostToolUse Hook + + M->>Q: tool_use block + Q->>PRE: Run hook script + + alt Hook approves + PRE-->>Q: exit 0 + Q->>T: Execute tool + T-->>Q: result + Q->>POST: Run hook script + POST-->>Q: done + Q->>M: tool_result + else Hook denies + PRE-->>Q: exit non-zero + Q->>M: error: denied by hook + else Hook modifies input + PRE-->>Q: exit 0 + modified JSON + Q->>T: Execute with modified input + T-->>Q: result + Q->>POST: Run hook script + POST-->>Q: done + Q->>M: tool_result + end +``` + +Hooks are configured in `settings.json` with matchers: + +```json +{ + "hooks": { + "PreToolUse": [ + { "matcher": "Bash", "command": "./check-safety.sh" } + ], + "PostToolUse": [ + { "matcher": "FileWrite", "command": "./format-on-save.sh" } + ] + } +} +``` + +--- + +**Previous:** [← State Management](./06-state-management.md) · **Next:** [API Client →](./08-api-client.md) diff --git a/learn/08-api-client.md b/learn/08-api-client.md new file mode 100644 index 0000000..f40248c --- /dev/null +++ b/learn/08-api-client.md @@ -0,0 +1,220 @@ +# 8. 
API Client — `claude.ts` + +> Streaming, retries, caching, and model fallback — how Claude Code talks to the Anthropic API. + +--- + +## Request Lifecycle + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#28a745', 'actorTextColor': '#e0e0e0', 'actorBorder': '#28a745', 'signalColor': '#28a745', 'noteBkgColor': '#16213e', 'noteTextColor': '#e0e0e0', 'activationBkgColor': '#1b3a1b', 'activationBorderColor': '#28a745'}}}%% +sequenceDiagram + participant Q as query.ts + participant C as claude.ts + participant R as withRetry + participant K as AnthropicClient + participant A as Anthropic API + + Q->>C: queryModel(messages, tools, options) + activate C + + C->>C: resolve model — runtime override, plan-mode swap + C->>C: normalize messages — strip internal fields + C->>C: build tool schemas — filter by deny, defer via ToolSearch + C->>C: configure betas, cache_control, effort, task_budget + C->>C: add prompt cache breakpoints + C->>C: compute metadata — user_id, session_id, device_id + + C->>R: withRetry(clientFactory, requestFn) + activate R + + loop Retry on 429, 529, timeouts + R->>K: getAnthropicClient(apiKey, model) + K->>A: beta.messages.stream(params) + activate A + + alt 200 OK + A-->>R: SSE event stream + else 429 Rate Limited + R->>R: exponential backoff + else 529 Overloaded + R->>R: backoff + optional model fallback + else 401 Auth Error + R-->>C: CannotRetryError — abort + end + deactivate A + end + + deactivate R + + C->>C: parse stream into AssistantMessage + C->>C: update usage tracking and cost + C->>C: detect prompt cache breaks + C-->>Q: yield AssistantMessage + StreamEvents + deactivate C +``` + +--- + +## Request Building + +Before each API call, `claude.ts` builds the request through several steps: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#28a745', 'primaryBorderColor': 
'#28a745'}}}%% +flowchart TD + subgraph ModelRes["1. Model Resolution"] + RUNTIME["Runtime override
from AppState"] + PLAN_SWAP["Plan-mode model
swap for 200K+ contexts"] + FALLBACK_M["Fallback model
on 529 overload"] + end + + subgraph MsgNorm["2. Message Normalization"] + STRIP["Strip internal fields
uuid, timestamp, etc."] + THINKING["Preserve thinking blocks
within trajectory boundaries"] + SIGNS["Strip signature blocks"] + end + + subgraph ToolBuild["3. Tool Schema Building"] + FILTER_DENY["Filter denied tools"] + DEFER["Defer tools via
ToolSearch deferred loading"] + EAGER["Eager tools always
in prompt"] + end + + subgraph Config["4. Request Configuration"] + BETAS["Beta features
prompt caching, token counting"] + CACHE_CTL["cache_control breakpoints
system prompt caching"] + EFFORT_V["effort parameter
controls thinking depth"] + TASK_BUD["task_budget
agentic turn spend limit"] + METADATA["metadata
user_id, session_id"] + end + + ModelRes --> MsgNorm --> ToolBuild --> Config + + API_REQ["POST /v1/messages
SSE stream"]:::api + Config --> API_REQ + + classDef api fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px +``` + +--- + +## Retry Logic — `withRetry` + +The retry wrapper handles transient API failures: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#dc3545', 'primaryBorderColor': '#dc3545'}}}%% +flowchart TD + REQUEST["API Request"]:::start + + RESPONSE{"Response
status?"}:::check + + OK["200 OK
Stream response"]:::success + RATE["429 Rate Limited"]:::error + OVER["529 Overloaded"]:::error + AUTH["401 Auth Error"]:::fatal + TIMEOUT["Timeout"]:::error + + BACKOFF["Exponential backoff
wait and retry"]:::retry + FALLBACK_SWITCH["Switch to fallback model
if configured"]:::retry + ABORT["CannotRetryError
abort immediately"]:::fatal + + REQUEST --> RESPONSE + RESPONSE -->|"200"| OK + RESPONSE -->|"429"| RATE --> BACKOFF --> REQUEST + RESPONSE -->|"529"| OVER --> FALLBACK_SWITCH --> REQUEST + RESPONSE -->|"401"| AUTH --> ABORT + RESPONSE -->|"timeout"| TIMEOUT --> BACKOFF + + classDef start fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef check fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef success fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef error fill:#3d2b00,stroke:#fd7e14,color:#e0e0e0,stroke-width:2px + classDef fatal fill:#4a1a1a,stroke:#dc3545,color:#e0e0e0,stroke-width:2px + classDef retry fill:#2d1b4e,stroke:#6f42c1,color:#e0e0e0,stroke-width:2px +``` + +### Streaming Fallback + +If the model becomes overloaded mid-stream (529), Claude Code can recover without surfacing the failure to the user: +1. **Tombstone** the partial assistant messages +2. Switch to a fallback model +3. Restart the stream from scratch +4. Remove the orphaned messages from the UI, so the user sees no interruption + +--- + +## Prompt Caching + +Claude Code uses Anthropic's prompt cache to avoid re-processing unchanged context: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#4a9eff', 'primaryBorderColor': '#4a9eff'}}}%% +flowchart LR + SYS["System prompt
cache_control breakpoint"]:::cached + TOOLS["Tool schemas
cache_control breakpoint"]:::cached + HISTORY["Conversation history
bytes must match exactly"]:::uncached + + API["API Request"]:::api + + HIT["Cache HIT
cached input ~90% cheaper
lower latency"]:::hit + MISS["Cache MISS
full processing
new cache created"]:::miss + + SYS --> API + TOOLS --> API + HISTORY --> API + + API --> HIT + API --> MISS + + classDef cached fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef uncached fill:#333,stroke:#888,color:#e0e0e0,stroke-width:1px + classDef api fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef hit fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef miss fill:#3d2b00,stroke:#fd7e14,color:#e0e0e0,stroke-width:2px +``` + +Cache breaks are detected and logged. The `backfillObservableInput()` pattern exists specifically to avoid breaking the cache — the original API-bound input is never mutated. + +--- + +## SSE Stream Events + +The API returns Server-Sent Events in this order: + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#4a9eff', 'primaryBorderColor': '#4a9eff'}}}%% +sequenceDiagram + participant API as Anthropic API + participant C as claude.ts + + API->>C: message_start — model, usage, id + + loop For each content block + API->>C: content_block_start — type, index + loop Delta events + API->>C: content_block_delta — text / thinking / tool_use JSON + end + API->>C: content_block_stop + end + + API->>C: message_delta — stop_reason, final usage + API->>C: message_stop + + Note over C: Parse into AssistantMessage
Track usage + cost
Yield to query.ts +``` + +--- + +## Cost Tracking + +Every API call's usage is tracked in `cost-tracker.ts`: +- Input tokens (including cache reads/writes) +- Output tokens +- Per-model pricing +- Session totals exposed via `/cost` command + +--- + +**Previous:** [← Extension Model](./07-extension-model.md) · **Next:** [UI Architecture →](./09-ui-architecture.md) diff --git a/learn/09-ui-architecture.md b/learn/09-ui-architecture.md new file mode 100644 index 0000000..0ecd055 --- /dev/null +++ b/learn/09-ui-architecture.md @@ -0,0 +1,109 @@ +# 9. UI Architecture + +> Building a complex, interactive terminal UI with React and Ink. + +--- + +## Overview + +Claude Code uses React (via [Ink](https://github.com/vadimdemedes/ink)) to render its terminal interface. This allows it to use React's declarative component model, hooks, and state management in a CLI context. + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#4a9eff', 'primaryBorderColor': '#4a9eff'}}}%% +graph TB + CLI["main.tsx
Parses args, boots React"]:::entry + INK["Ink rendered
Terminal output"]:::external + + subgraph ReactApp["React Application"] + APP["App.tsx
Providers and Routing"] + REPL["REPL.tsx (896KB)
Main Interface"] + + subgraph Components["113 UI Components"] + MESSAGES["Message Rendering
User, Assistant, Tool"] + DIALOGS["Permission Dialogs
Y/n Prompts"] + INPUT["Input Area
Text, Voice, Paste"] + INDICATORS["Spinners & Progress Bars"] + end + + subgraph Hooks["83 Custom Hooks"] + USE_STATE["useAppState"] + USE_TOOL["useCanUseTool"] + USE_INPUT["useInput"] + USE_VOICE["useVoice"] + end + end + + CLI --> APP + APP --> REPL + REPL --> Components + REPL --> Hooks + Components --> INK + + classDef entry fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef external fill:#333,stroke:#888,color:#aaa,stroke-width:1px,stroke-dasharray: 5 5 +``` + +--- + +## The REPL + +The core of the interface is `REPL.tsx` (896KB). It manages the main interaction loop from the UI perspective, orchestrating input, rendering the conversation history, and handling background tasks. + +### Key Responsibilities + +1. **Virtual Scrolling**: The terminal can't hold infinite text. The REPL implements virtual scrolling, only rendering the messages that fit in the viewport plus a small buffer, while maintaining the illusion of a continuous scrollback. +2. **Input Handling**: Manages the text input area, handling multiline input, pasting files/images, keyboard shortcuts (including vim mode), and voice input. +3. **Task Foregrounding**: Only one agent task can write to the terminal at a time. The REPL manages which task is "foregrounded" and visible. +4. **State Synchronization**: Subscribes to the single immutable `AppState` store to trigger re-renders when data changes. + +--- + +## Component Architecture + +Claude Code breaks down the UI into 113 specialized components. + +### Message Rendering + +Each message type has a dedicated component: + +- `UserMessage`: Renders user input. +- `AssistantMessage`: Renders markdown text from the model, handling streaming updates gracefully. +- `ToolUseMessage`: Displays the execution in progress (e.g., a spinner and the command being run). +- `ToolResultMessage`: Shows the outcome, often truncating long outputs and providing "Show More" functionality. 
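
The per-type rendering described above can be sketched as a discriminated union plus a dispatch function. This is a minimal illustration, not the actual components: the `Message` shape, `renderMessage`, `truncateOutput`, and the three-line threshold are all hypothetical names and values chosen for the example, and plain strings stand in for Ink elements.

```typescript
// Hypothetical message shapes; the real types are not shown in this guide.
type Message =
  | { kind: "user"; text: string }
  | { kind: "assistant"; markdown: string; streaming: boolean }
  | { kind: "tool_use"; tool: string; input: string }
  | { kind: "tool_result"; tool: string; output: string };

// Assumed truncation threshold for long tool output.
const MAX_RESULT_LINES = 3;

// Keep the first maxLines lines and summarize how many were hidden.
function truncateOutput(output: string, maxLines: number): string {
  const lines = output.split("\n");
  if (lines.length <= maxLines) return output;
  const hidden = lines.length - maxLines;
  return [...lines.slice(0, maxLines), `... (${hidden} more lines)`].join("\n");
}

// One branch per message kind, mirroring one component per message type.
function renderMessage(msg: Message): string {
  switch (msg.kind) {
    case "user":
      return `> ${msg.text}`;
    case "assistant":
      // A trailing cursor marks a message that is still streaming.
      return msg.streaming ? `${msg.markdown}_` : msg.markdown;
    case "tool_use":
      return `[running] ${msg.tool}: ${msg.input}`;
    case "tool_result":
      return truncateOutput(msg.output, MAX_RESULT_LINES);
  }
}
```

Because the union is exhaustive, adding a new message kind without a matching branch becomes a compile-time error under strict TypeScript, which is one reason to prefer this shape over string comparisons scattered across components.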
+ +### Layout and Styling + +Ink uses Yoga (a Flexbox engine) for layout. Components are composed using Flexbox principles, allowing for responsive terminal designs that adapt to the window size. + +--- + +## State and Hooks + +The UI is deeply integrated with the `AppState` store. + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#e0e0e0', 'lineColor': '#28a745', 'primaryBorderColor': '#28a745'}}}%% +flowchart LR + STORE["AppState
Immutable Store"]:::state + + REPL["REPL Component"]:::ui + COMP["Child Components"]:::ui + + HOOKS["Selectors
useAppState(selector)"]:::hook + + STORE --> HOOKS + HOOKS --> REPL + HOOKS --> COMP + + classDef state fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef hook fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef ui fill:#2d1b4e,stroke:#6f42c1,color:#e0e0e0,stroke-width:2px +``` + +Custom hooks encapsulate complex logic: +- `useInput`: Handles raw stdin data, necessary for keyboard shortcuts that bypass normal text input. +- `useCanUseTool`: Connects a UI prompt to the permission system, pausing the engine until the user responds. + +--- + +**Previous:** [← API Client](./08-api-client.md) diff --git a/learn/README.md b/learn/README.md new file mode 100644 index 0000000..4f081fe --- /dev/null +++ b/learn/README.md @@ -0,0 +1,157 @@ +# Learn Claude Code Architecture + +> A deep-dive educational guide into the internal architecture of Anthropic's Claude Code CLI — the agentic coding assistant that leaked via a `.map` file in March 2026. + +**Internal Codename:** Tengu +**Runtime:** Bun +**UI Framework:** React + Ink (React for CLI) +**Language:** TypeScript (strict) +**Scale:** ~1,900 files, 512,000+ lines of code + +--- + +## Why This Guide Exists + +Claude Code is one of the most sophisticated agentic AI systems ever built. Its architecture contains lessons in: + +- **Agentic loop design** — How to build a multi-turn, tool-calling AI agent +- **Terminal UI at scale** — 113 React components running in a terminal +- **Permission systems** — Layered security for autonomous code execution +- **Context management** — Keeping conversations within token limits +- **Extension architectures** — Plugins, skills, hooks, and sub-agents + +Whether you're building your own AI agent, contributing to open-source AI tools, or just curious about how the sausage is made — this guide is for you. 
+ +--- + +## Guide Structure + +Start from the top and work down, or jump to whatever interests you: + +| # | Guide | What You'll Learn | +|---|-------|-------------------| +| 1 | [System Overview](./01-system-overview.md) | Bird's-eye view of every layer — entries, UI, core engine, tools, state | +| 2 | [The Agentic Loop](./02-agentic-loop.md) | How `query.ts` drives the model → tool → model cycle | +| 3 | [Tool System](./03-tool-system.md) | How 42 built-in tools are defined, validated, and executed | +| 4 | [Permission System](./04-permission-system.md) | Deny rules, allow rules, hooks, classifier, user prompts | +| 5 | [Context Management](./05-context-management.md) | Snip, micro, auto, reactive compact — the full pipeline | +| 6 | [State Management](./06-state-management.md) | AppState store, immutability, and reactive side effects | +| 7 | [Extension Model](./07-extension-model.md) | Skills, plugins, hooks, sub-agents, and swarms | +| 8 | [API Client](./08-api-client.md) | `claude.ts` — streaming, retries, caching, and model fallback | +| 9 | [UI Architecture](./09-ui-architecture.md) | React Ink, 113 components, and the 896KB REPL | + +--- + +## Key Files Quick Reference + +| File | Size | Role | +|------|------|------| +| `src/main.tsx` | 804KB | CLI entrypoint — Commander.js parser + React/Ink bootstrap | +| `src/screens/REPL.tsx` | 896KB | Interactive terminal shell — the heart of the UI | +| `src/query.ts` | 1,730 lines | The agentic loop — model ↔ tool cycling | +| `src/QueryEngine.ts` | 1,296 lines | Session lifecycle owner — wraps `query()` | +| `src/Tool.ts` | 793 lines | Tool interface definition — every tool implements this | +| `src/commands.ts` | ~25K | Slash command registry | +| `src/state/AppStateStore.ts` | 570 lines | Immutable store with 50+ fields | +| `src/services/compact/` | 11 files | The entire compaction pipeline | + +--- + +## Architecture At A Glance + +```mermaid +%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': 
'#1a1a2e', 'primaryTextColor': '#e0e0e0', 'primaryBorderColor': '#4a9eff', 'lineColor': '#4a9eff', 'secondaryColor': '#16213e', 'tertiaryColor': '#0f3460', 'edgeLabelBackground': '#1a1a2e'}}}%% +graph TB + CLI["CLI Entry
main.tsx — 804KB"]:::entry + SDK["SDK Entry
Programmatic API"]:::entry + MCP_S["MCP Server
Expose as MCP"]:::entry + + REPL["REPL.tsx — 896KB
Interactive Terminal Shell"]:::ui + Comps["113 Components
Messages, Diffs, Dialogs"]:::ui + Hooks["83 React Hooks
Permissions, Input, IDE"]:::ui + + QE["QueryEngine.ts
Session Lifecycle Owner"]:::core + QL["query.ts — 1730 lines
Agentic Loop"]:::core + CL["claude.ts — 3420 lines
Anthropic API Client"]:::core + + TD["Tool Interface
Tool.ts"]:::tool + BT["42 Built-in Tools"]:::tool + MT["MCP Tools — dynamic"]:::tool + TO["Tool Orchestration
Parallel Execution"]:::tool + + Compact["Compaction Pipeline
snip / micro / auto /
reactive / collapse"]:::ctx + + Rules["Allow + Deny Rules"]:::perm + HK["PreToolUse Hooks"]:::perm + Classifier["Auto-mode Classifier"]:::perm + + AS["AppState Store
Immutable — 50+ fields"]:::state + SS["Session Storage
Transcripts + Resume"]:::state + Cfg["Config Layer
Global / Project / CLAUDE.md"]:::state + + Skills["Skills"]:::ext + Plugins["Plugins"]:::ext + Agents["Sub-agents + Swarms"]:::ext + + API["Anthropic Messages API"]:::external + MCP_Ext["External MCP Servers"]:::external + GrowthBook["GrowthBook + Statsig"]:::external + + CLI --> REPL + SDK --> QE + MCP_S --> QE + + REPL --> Comps + REPL --> Hooks + REPL --> QE + REPL --> AS + + QE --> QL + QL --> CL + QL --> Compact + QL --> TO + + CL --> API + CL --> GrowthBook + + TO --> TD + TD --> BT + TD --> MT + BT --> Rules + BT --> HK + MT --> Rules + + Rules --> Classifier + + QE --> SS + REPL --> Cfg + + Skills --> TD + Plugins --> TD + Plugins --> MCP_Ext + Agents --> QL + MCP_Ext --> MT + + classDef entry fill:#0d4f4f,stroke:#17a2b8,color:#e0e0e0,stroke-width:2px + classDef ui fill:#1a1a4e,stroke:#6f42c1,color:#e0e0e0,stroke-width:2px + classDef core fill:#2d1b4e,stroke:#e83e8c,color:#e0e0e0,stroke-width:2px + classDef tool fill:#1b3a1b,stroke:#28a745,color:#e0e0e0,stroke-width:2px + classDef ctx fill:#3d2b00,stroke:#fd7e14,color:#e0e0e0,stroke-width:2px + classDef perm fill:#4a1a1a,stroke:#dc3545,color:#e0e0e0,stroke-width:2px + classDef state fill:#1a2d4a,stroke:#4a9eff,color:#e0e0e0,stroke-width:2px + classDef ext fill:#2d2d0d,stroke:#ffc107,color:#e0e0e0,stroke-width:2px + classDef external fill:#333,stroke:#888,color:#aaa,stroke-width:1px,stroke-dasharray: 5 5 +``` + +--- + +## Prerequisites + +To get the most out of these guides, you should be comfortable with: + +- **TypeScript** — The entire codebase is strict TypeScript +- **React** — The UI uses React (via Ink for the terminal) +- **Async generators** — The agentic loop and streaming are built on `async function*` +- **LLM APIs** — Familiarity with the Anthropic Messages API helps + +---
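
For readers new to the async generator prerequisite, here is a small self-contained sketch of the pattern. It is not taken from the codebase: `StreamEvent`, `fakeStream`, `accumulate`, and `collect` are hypothetical names, and the event shape is only loosely modeled on streaming delta events.

```typescript
// Hypothetical event shape for illustration only.
type StreamEvent =
  | { type: "content_block_delta"; text: string }
  | { type: "message_stop" };

// Stand-in for a network stream: yields one event per chunk, then a stop event.
async function* fakeStream(chunks: string[]): AsyncGenerator<StreamEvent> {
  for (const text of chunks) {
    // Give control back to the event loop, as a real stream would between chunks.
    await Promise.resolve();
    yield { type: "content_block_delta", text };
  }
  yield { type: "message_stop" };
}

// Consumes events and yields the accumulated text after every delta.
async function* accumulate(
  events: AsyncIterable<StreamEvent>,
): AsyncGenerator<string, string> {
  let full = "";
  for await (const event of events) {
    if (event.type === "content_block_delta") {
      full += event.text;
      yield full; // partial result, visible to the caller immediately
    }
  }
  return full;
}

// Drains the pipeline and records every partial result that was yielded.
async function collect(chunks: string[]): Promise<string[]> {
  const seen: string[] = [];
  for await (const partial of accumulate(fakeStream(chunks))) {
    seen.push(partial);
  }
  return seen;
}
```

Calling `collect(["Hel", "lo"])` resolves to the partial strings in the order they were produced. The useful property is that each stage can start work before the previous stage has finished, which is what makes the pattern a natural fit for streaming output.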