Architecture

Claude Code's architecture is organized around a multi-phase startup sequence, a dual-layer state management system, and an async-generator-based query pipeline that drives the agentic loop.

Startup Lifecycle

The application boot sequence is split into five phases, each designed to minimize time-to-interactive by running expensive operations in parallel where possible.

Phase 1: Pre-Profiling Side Effects

Before any module evaluation, main.tsx fires three parallel side effects at the top level:

  • profileCheckpoint('main_tsx_entry'): marks entry for the startup profiler
  • startMdmRawRead(): spawns MDM subprocesses (plutil on macOS, reg query on Windows) to read managed device settings in the background
  • startKeychainPrefetch(): fires macOS keychain reads for OAuth tokens and legacy API keys in parallel, saving ~65ms that would otherwise be spent in sequential synchronous spawns

These run concurrently with the remaining ~135ms of import evaluation.
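
The fire-early/await-late pattern behind startKeychainPrefetch() and ensureKeychainPrefetchCompleted() can be sketched as follows. This is a minimal simulation: the names mirror the ones above, but the bodies are stand-ins (a timer instead of real keychain subprocesses).

```typescript
// Hypothetical sketch of the Phase 1 prefetch pattern: start the slow read
// at module load, and await the shared promise only when the value is needed.
let keychainPromise: Promise<string> | undefined;

function startKeychainPrefetch(): void {
  // Kick off the slow read immediately; nothing awaits it yet.
  keychainPromise = new Promise((resolve) =>
    setTimeout(() => resolve("oauth-token"), 50),
  );
}

async function ensureKeychainPrefetchCompleted(): Promise<string> {
  // If the prefetch never fired (e.g. on another platform), start it late.
  if (!keychainPromise) startKeychainPrefetch();
  return keychainPromise!;
}

// Fired at module top level, concurrently with the rest of import evaluation.
startKeychainPrefetch();
```

By the time the boot sequence reaches Phase 3, the read has usually finished, so awaiting the promise costs nothing.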

Phase 2: Configuration Setup

Handled by init() in entrypoints/init.ts:

  • enableConfigs(): validates and activates the configuration system
  • applySafeConfigEnvironmentVariables(): applies non-sensitive env vars before the trust dialog
  • applyExtraCACertsFromConfig(): injects custom CA certificates before any TLS connections (Bun caches the cert store at boot via BoringSSL)
  • setupGracefulShutdown(): registers process signal handlers for clean exit
  • Telemetry initialization: deferred via dynamic import() to avoid loading ~400KB of OpenTelemetry + protobuf modules at startup
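
The deferred-telemetry idea can be sketched with a lazy dynamic import(). Here node:crypto stands in for the heavy OpenTelemetry + protobuf modules, and the Telemetry shape is invented for illustration:

```typescript
// Hedged sketch of deferring a heavy dependency: the import() is only
// evaluated the first time telemetry is actually used, not at startup.
type Telemetry = { record: (event: string) => string };

let telemetry: Telemetry | undefined;

async function getTelemetry(): Promise<Telemetry> {
  if (!telemetry) {
    // Stand-in for e.g. `await import('./telemetry/otel')`.
    const { randomUUID } = await import("node:crypto");
    const sessionId = randomUUID();
    telemetry = { record: (event) => `[${sessionId}] ${event}` };
  }
  return telemetry;
}
```

Because the module cache makes repeated import() calls cheap, the memoized singleton is mostly there to keep a stable session identity.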

Phase 3: File System & Project Setup

Back in main.tsx, after init() completes:

  • Resolves the working directory and git root
  • Sets up the project root for session identity
  • Ensures MDM settings are loaded (ensureMdmSettingsLoaded())
  • Completes the keychain prefetch (ensureKeychainPrefetchCompleted())
  • Applies full config environment variables (post-trust)
  • Runs settings migrations (model renames, permission migrations, etc.)

Phase 4: Prefetch

Non-blocking prefetches are fired in parallel to reduce latency once the user starts typing:

  • Bootstrap data from the API (fetchBootstrapData)
  • GrowthBook feature flag initialization (initializeGrowthBook)
  • MCP official registry URLs (prefetchOfficialMcpUrls)
  • Referral pass eligibility (prefetchPassesEligibility)
  • AWS/GCP credentials for Bedrock/Vertex providers
  • Fast mode status (prefetchFastModeStatus)
  • Policy limits and remote managed settings loading
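
Firing independent prefetches like these in parallel, while tolerating individual failures, is naturally expressed with Promise.allSettled. The prefetch names above are real; the runner below is an illustrative reduction:

```typescript
// Sketch of a parallel, failure-tolerant prefetch runner. One offline
// service must not block the others or delay time-to-interactive.
type Prefetch = { name: string; run: () => Promise<unknown> };

async function firePrefetches(
  prefetches: Prefetch[],
): Promise<{ name: string; ok: boolean }[]> {
  // allSettled never rejects: every prefetch resolves to a status record.
  const results = await Promise.allSettled(prefetches.map((p) => p.run()));
  return results.map((result, i) => ({
    name: prefetches[i].name,
    ok: result.status === "fulfilled",
  }));
}
```
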

Phase 5: Interactive vs Headless Branching

The final phase diverges based on whether the session is interactive or headless:

  • Interactive: renders the Ink/React UI via renderAndRun(), shows setup screens (trust dialog, onboarding), then enters the REPL loop
  • Headless (-p flag or SDK): constructs a QueryEngine directly, calls submitMessage(), and streams results to stdout or the SDK consumer

State Management

Claude Code uses two complementary state layers: a mutable bootstrap store for session-global values and an immutable AppState store for UI-driven state.

AppState is defined in state/AppStateStore.ts and wrapped with DeepImmutable<> to prevent accidental mutation. It is managed through a store that provides getAppState() and setAppState() accessors.

    export type AppState = DeepImmutable<{
      settings: SettingsJson
      verbose: boolean
      mainLoopModel: ModelSetting
      mainLoopModelForSession: ModelSetting
      statusLineText: string | undefined
      isBriefOnly: boolean
      toolPermissionContext: ToolPermissionContext
      agent: string | undefined
      kairosEnabled: boolean
      mcp: {
        clients: MCPServerConnection[]
        tools: Tool[]
        commands: Command[]
        resources: Record<string, ServerResource[]>
        pluginReconnectKey: number
      }
      plugins: {
        enabled: LoadedPlugin[]
        disabled: LoadedPlugin[]
        commands: Command[]
        errors: PluginError[]
      }
      // ... additional UI state fields
    }>

Key characteristics:

  • Immutable updates: state changes go through setAppState(prev => ({ ...prev, field: newValue })), similar to React's useState
  • UI-driven: the Ink/React rendering layer subscribes to AppState changes and re-renders accordingly
  • Per-session: a fresh AppState is created for each session via getDefaultAppState()

The comment at line 31 of bootstrap/state.ts reads: "DO NOT ADD MORE STATE HERE - BE JUDICIOUS WITH GLOBAL STATE". New state should prefer AppState unless it is truly session-global and needed before the UI initializes.
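
A minimal sketch of the getAppState()/setAppState() store described above, assuming a truncated state shape (the real store also drives Ink re-renders through its subscribers):

```typescript
// Reduced AppState store: updates replace the state object immutably and
// notify subscribers, mirroring the useState-style updater pattern.
type MiniAppState = { verbose: boolean; statusLineText: string | undefined };

let state: MiniAppState = { verbose: false, statusLineText: undefined };
const listeners = new Set<(s: MiniAppState) => void>();

function getAppState(): MiniAppState {
  return state;
}

function setAppState(update: (prev: MiniAppState) => MiniAppState): void {
  state = update(state); // replace the object, never mutate it in place
  for (const listener of listeners) listener(state);
}
```

Replacing rather than mutating the object is what makes cheap reference-equality checks in the rendering layer possible.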

Query Processing Pipeline

The query pipeline is the agentic core of Claude Code. It transforms a user message into a fully resolved assistant response, including any tool calls the model decides to make.

    User Input
      |
      v
    processUserInput()          : Parse slash commands, attachments, context files
      |
      v
    Permission Check            : canUseTool() validates tool access per permission mode
      |
      v
    Message Normalization       : normalizeMessagesForAPI() prepares conversation history
      |
      v
    Context Building            : System prompt + user context + system context assembled
      |                            prependUserContext() / appendSystemContext()
      v
    API Call                    : Streaming request to Claude model
      |                            Yields StreamEvent objects as tokens arrive
      v
    Tool Orchestration          : StreamingToolExecutor processes tool_use blocks
      |                            runTools() executes tools in parallel where safe
      |                            Tool results fed back as tool_result messages
      v
    Stop Hooks                  : handleStopHooks() runs post-sampling hooks
      |                            Can trigger additional turns or modifications
      v
    Response Streaming          : AsyncGenerator yields final Message objects
      |                            Auto-compact check (token warning state)
      |                            Max output tokens recovery (up to 3 retries)
      v
    Terminal | Continue         : Loop terminates or continues for tool results

The Query Loop

The query() function in query.ts is an async generator that implements the agentic loop:

    export async function* query(
      params: QueryParams,
    ): AsyncGenerator<
      StreamEvent | RequestStartEvent | Message | TombstoneMessage | ToolUseSummaryMessage,
      Terminal
    > {
      // ...delegates to queryLoop()
    }

The queryLoop maintains mutable cross-iteration state:

    type State = {
      messages: Message[]
      toolUseContext: ToolUseContext
      autoCompactTracking: AutoCompactTrackingState | undefined
      maxOutputTokensRecoveryCount: number
      hasAttemptedReactiveCompact: boolean
      maxOutputTokensOverride: number | undefined
      pendingToolUseSummary: Promise<ToolUseSummaryMessage | null> | undefined
      stopHookActive: boolean | undefined
      turnCount: number
      transition: Continue | undefined
    }

Each iteration of the while (true) loop:

1. Builds the query configuration via buildQueryConfig()
2. Normalizes messages for the API
3. Prepends user context and appends system context
4. Fires the streaming API call
5. Processes tool use blocks through StreamingToolExecutor
6. Handles auto-compaction if the context window is getting large
7. Runs stop hooks to determine if the turn should end
8. Decides whether to continue (tool results pending) or terminate
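
The iteration steps above can be compressed into a toy version of the loop. All types and callbacks here are illustrative stand-ins for the real QueryParams/StreamingToolExecutor machinery:

```typescript
// Toy agentic loop: stream a model turn, run any requested tool, feed the
// result back as a tool message, and repeat until no tool call is pending.
type ChatMessage = { role: "user" | "assistant" | "tool"; content: string };
type ModelTurn = { text: string; toolCall?: { name: string; input: string } };

async function* queryLoop(
  messages: ChatMessage[],
  callModel: (history: ChatMessage[]) => Promise<ModelTurn>,
  runTool: (name: string, input: string) => Promise<string>,
): AsyncGenerator<ChatMessage> {
  while (true) {
    const turn = await callModel(messages);
    const assistant: ChatMessage = { role: "assistant", content: turn.text };
    messages.push(assistant);
    yield assistant;
    if (!turn.toolCall) return; // terminal: no pending tool results
    // Execute the requested tool and append its result to the history.
    const toolResult: ChatMessage = {
      role: "tool",
      content: await runTool(turn.toolCall.name, turn.toolCall.input),
    };
    messages.push(toolResult);
    yield toolResult;
  }
}
```

The real loop layers permission checks, context building, compaction, and stop hooks around the same continue-or-terminate skeleton.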

QueryEngine

QueryEngine in QueryEngine.ts wraps the query pipeline for the SDK/headless path. It owns the full conversation lifecycle:

    export class QueryEngine {
      private config: QueryEngineConfig
      private mutableMessages: Message[]
      private abortController: AbortController
      private permissionDenials: SDKPermissionDenial[]
      private totalUsage: NonNullableUsage
    
      async *submitMessage(
        prompt: string | ContentBlockParam[],
        options?: { uuid?: string; isMeta?: boolean },
      ): AsyncGenerator<SDKMessage, void, unknown>
    }

One QueryEngine instance is created per conversation. Each submitMessage() call starts a new turn within the same conversation, preserving message history, file state cache, and cumulative usage across turns.
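
The turn-accumulation behavior can be illustrated with a stripped-down engine; the class below is a toy, not the real QueryEngine API:

```typescript
// Toy per-conversation engine: every submitMessage() call appends to the
// same history array, so later turns see earlier context.
class MiniQueryEngine {
  private messages: string[] = [];

  async *submitMessage(prompt: string): AsyncGenerator<string> {
    this.messages.push(`user: ${prompt}`);
    const turn = this.messages.filter((m) => m.startsWith("user:")).length;
    const reply = `assistant: reply ${turn}`; // stand-in for the model call
    this.messages.push(reply);
    yield reply;
  }

  get history(): readonly string[] {
    return this.messages; // preserved across submitMessage() calls
  }
}
```
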

Streaming Architecture

Claude Code uses async generators throughout its streaming pipeline. This allows each layer to yield events as they arrive without buffering the entire response.

The event types that flow through the system:

    // Low-level API events
    type StreamEvent = {
      type: 'stream_event'
      event: /* SSE event from the API */
    }
    
    // Conversation-level messages
    type Message =
      | UserMessage
      | AssistantMessage
      | SystemMessage
      | AttachmentMessage
    
    // Query-level signals
    type RequestStartEvent = {
      type: 'request_start'
      requestId: string
    }

The REPL UI consumes these events to render streaming text, tool invocations, and progress indicators in real time. The SDK path maps them to SDKMessage types for programmatic consumers.

The query loop includes a max-output-tokens recovery mechanism that automatically retries up to 3 times when the model hits the output token limit mid-response. It also supports reactive compaction (feature-gated), which compresses conversation history when the context window approaches capacity.
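
The recovery mechanism can be sketched as follows. This is a simplification of the real retry semantics; the stop-reason names follow the Anthropic API's end_turn/max_tokens convention, and the sampler is a placeholder:

```typescript
// Hedged sketch: re-sample up to 3 extra times while the model keeps
// stopping at the output-token limit, concatenating the partial text.
type SampleResult = { stopReason: "end_turn" | "max_tokens"; text: string };

const MAX_OUTPUT_TOKENS_RECOVERIES = 3;

async function sampleWithRecovery(
  sample: () => Promise<SampleResult>,
): Promise<string> {
  let output = "";
  for (let attempt = 0; attempt <= MAX_OUTPUT_TOKENS_RECOVERIES; attempt++) {
    const result = await sample();
    output += result.text;
    if (result.stopReason !== "max_tokens") return output; // clean finish
  }
  return output; // retry budget exhausted; return what we have
}
```
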