Architecture

Claude Code's architecture is organized around a multi-phase startup sequence, a dual-layer state management system, and an async-generator-based query pipeline that drives the agentic loop.

Startup Lifecycle

The application boot sequence is split into five phases, each designed to minimize time-to-interactive by running expensive operations in parallel where possible.

Phase 1: Pre-Profiling Side Effects

Before any module evaluation, main.tsx fires three parallel side effects at the top level:

  • profileCheckpoint('main_tsx_entry'): marks entry for the startup profiler
  • startMdmRawRead(): spawns MDM subprocesses (plutil on macOS, reg query on Windows) to read managed device settings in the background
  • startKeychainPrefetch(): fires macOS keychain reads for OAuth tokens and legacy API keys in parallel, saving ~65ms that would otherwise be spent in sequential synchronous spawns

These run concurrently with the remaining ~135ms of import evaluation.
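
The fire-early/await-late pattern behind startKeychainPrefetch() and ensureKeychainPrefetchCompleted() can be sketched as follows. This is a minimal simulation: the names mirror the ones above, but the bodies are stand-ins (a timer instead of real keychain subprocesses).

```typescript
// Hypothetical sketch of the Phase 1 prefetch pattern: start the slow read
// at module load, and await the shared promise only when the value is needed.
let keychainPromise: Promise<string> | undefined;

function startKeychainPrefetch(): void {
  // Kick off the slow read immediately; nothing awaits it yet.
  keychainPromise = new Promise((resolve) =>
    setTimeout(() => resolve("oauth-token"), 50),
  );
}

async function ensureKeychainPrefetchCompleted(): Promise<string> {
  // If the prefetch never fired (e.g. on another platform), start it late.
  if (!keychainPromise) startKeychainPrefetch();
  return keychainPromise!;
}

// Fired at module top level, concurrently with the rest of import evaluation.
startKeychainPrefetch();
```

By the time the boot sequence reaches Phase 3, the read has usually finished, so awaiting the promise costs nothing.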

Phase 2: Configuration Setup

Handled by init() in entrypoints/init.ts:

  • enableConfigs(): validates and activates the configuration system
  • applySafeConfigEnvironmentVariables(): applies non-sensitive env vars before the trust dialog
  • applyExtraCACertsFromConfig(): injects custom CA certificates before any TLS connections (Bun caches the cert store at boot via BoringSSL)
  • setupGracefulShutdown(): registers process signal handlers for clean exit
  • Telemetry initialization: deferred via dynamic import() to avoid loading ~400KB of OpenTelemetry + protobuf modules at startup
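
The deferred-telemetry idea can be sketched with a lazy dynamic import(). Here node:crypto stands in for the heavy OpenTelemetry + protobuf modules, and the Telemetry shape is invented for illustration:

```typescript
// Hedged sketch of deferring a heavy dependency: the import() is only
// evaluated the first time telemetry is actually used, not at startup.
type Telemetry = { record: (event: string) => string };

let telemetry: Telemetry | undefined;

async function getTelemetry(): Promise<Telemetry> {
  if (!telemetry) {
    // Stand-in for e.g. `await import('./telemetry/otel')`.
    const { randomUUID } = await import("node:crypto");
    const sessionId = randomUUID();
    telemetry = { record: (event) => `[${sessionId}] ${event}` };
  }
  return telemetry;
}
```

Because the module cache makes repeated import() calls cheap, the memoized singleton is mostly there to keep a stable session identity.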

Phase 3: File System & Project Setup

Back in main.tsx, after init() completes:

  • Resolves the working directory and git root
  • Sets up the project root for session identity
  • Ensures MDM settings are loaded (ensureMdmSettingsLoaded())
  • Completes the keychain prefetch (ensureKeychainPrefetchCompleted())
  • Applies full config environment variables (post-trust)
  • Runs settings migrations (model renames, permission migrations, etc.)

Phase 4: Prefetch

Non-blocking prefetches are fired in parallel to reduce latency once the user starts typing:

  • Bootstrap data from the API (fetchBootstrapData)
  • GrowthBook feature flag initialization (initializeGrowthBook)
  • MCP official registry URLs (prefetchOfficialMcpUrls)
  • Referral pass eligibility (prefetchPassesEligibility)
  • AWS/GCP credentials for Bedrock/Vertex providers
  • Fast mode status (prefetchFastModeStatus)
  • Policy limits and remote managed settings loading
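
Firing independent prefetches like these in parallel, while tolerating individual failures, is naturally expressed with Promise.allSettled. The prefetch names above are real; the runner below is an illustrative reduction:

```typescript
// Sketch of a parallel, failure-tolerant prefetch runner. One offline
// service must not block the others or delay time-to-interactive.
type Prefetch = { name: string; run: () => Promise<unknown> };

async function firePrefetches(
  prefetches: Prefetch[],
): Promise<{ name: string; ok: boolean }[]> {
  // allSettled never rejects: every prefetch resolves to a status record.
  const results = await Promise.allSettled(prefetches.map((p) => p.run()));
  return results.map((result, i) => ({
    name: prefetches[i].name,
    ok: result.status === "fulfilled",
  }));
}
```
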

Phase 5: Interactive vs Headless Branching

The final phase diverges based on whether the session is interactive or headless:

  • Interactive: renders the Ink/React UI via renderAndRun(), shows setup screens (trust dialog, onboarding), then enters the REPL loop
  • Headless (-p flag or SDK): constructs a QueryEngine directly, calls submitMessage(), and streams results to stdout or the SDK consumer

State Management

Claude Code uses two complementary state layers: a mutable bootstrap store for session-global values and an immutable AppState store for UI-driven state.

AppState is defined in state/AppStateStore.ts and wrapped with DeepImmutable<> to prevent accidental mutation. It is managed through a store that provides getAppState() and setAppState() accessors.

    export type AppState = DeepImmutable<{
      settings: SettingsJson
      verbose: boolean
      mainLoopModel: ModelSetting
      mainLoopModelForSession: ModelSetting
      statusLineText: string | undefined
      isBriefOnly: boolean
      toolPermissionContext: ToolPermissionContext
      agent: string | undefined
      kairosEnabled: boolean
      mcp: {
        clients: MCPServerConnection[]
        tools: Tool[]
        commands: Command[]
        resources: Record<string, ServerResource[]>
        pluginReconnectKey: number
      }
      plugins: {
        enabled: LoadedPlugin[]
        disabled: LoadedPlugin[]
        commands: Command[]
        errors: PluginError[]
      }
      // ... additional UI state fields
    }>

Key characteristics:

  • Immutable updates: state changes go through setAppState(prev => ({ ...prev, field: newValue })), similar to React's useState
  • UI-driven: the Ink/React rendering layer subscribes to AppState changes and re-renders accordingly
  • Per-session: a fresh AppState is created for each session via getDefaultAppState()

The comment at line 31 of bootstrap/state.ts reads: "DO NOT ADD MORE STATE HERE - BE JUDICIOUS WITH GLOBAL STATE". New state should prefer AppState unless it is truly session-global and needed before the UI initializes.
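
A minimal sketch of the getAppState()/setAppState() store described above, assuming a truncated state shape (the real store also drives Ink re-renders through its subscribers):

```typescript
// Reduced AppState store: updates replace the state object immutably and
// notify subscribers, mirroring the useState-style updater pattern.
type MiniAppState = { verbose: boolean; statusLineText: string | undefined };

let state: MiniAppState = { verbose: false, statusLineText: undefined };
const listeners = new Set<(s: MiniAppState) => void>();

function getAppState(): MiniAppState {
  return state;
}

function setAppState(update: (prev: MiniAppState) => MiniAppState): void {
  state = update(state); // replace the object, never mutate it in place
  for (const listener of listeners) listener(state);
}
```

Replacing rather than mutating the object is what makes cheap reference-equality checks in the rendering layer possible.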

Query Processing Pipeline

The query pipeline is the agentic core of Claude Code. It transforms a user message into a fully resolved assistant response, including any tool calls the model decides to make.

    User Input
      |
      v
    processUserInput()          : Parse slash commands, attachments, context files
      |
      v
    Permission Check            : canUseTool() validates tool access per permission mode
      |
      v
    Message Normalization       : normalizeMessagesForAPI() prepares conversation history
      |
      v
    Context Building            : System prompt + user context + system context assembled
      |                            prependUserContext() / appendSystemContext()
      v
    API Call                    : Streaming request to Claude model
      |                            Yields StreamEvent objects as tokens arrive
      v
    Tool Orchestration          : StreamingToolExecutor processes tool_use blocks
      |                            runTools() executes tools in parallel where safe
      |                            Tool results fed back as tool_result messages
      v
    Stop Hooks                  : handleStopHooks() runs post-sampling hooks
      |                            Can trigger additional turns or modifications
      v
    Response Streaming          : AsyncGenerator yields final Message objects
      |                            Auto-compact check (token warning state)
      |                            Max output tokens recovery (up to 3 retries)
      v
    Terminal | Continue         : Loop terminates or continues for tool results

The Query Loop

The query() function in query.ts is an async generator that implements the agentic loop:

    export async function* query(
      params: QueryParams,
    ): AsyncGenerator<
      StreamEvent | RequestStartEvent | Message | TombstoneMessage | ToolUseSummaryMessage,
      Terminal
    > {
      // ...delegates to queryLoop()
    }

The queryLoop maintains mutable cross-iteration state:

    type State = {
      messages: Message[]
      toolUseContext: ToolUseContext
      autoCompactTracking: AutoCompactTrackingState | undefined
      maxOutputTokensRecoveryCount: number
      hasAttemptedReactiveCompact: boolean
      maxOutputTokensOverride: number | undefined
      pendingToolUseSummary: Promise<ToolUseSummaryMessage | null> | undefined
      stopHookActive: boolean | undefined
      turnCount: number
      transition: Continue | undefined
    }

Each iteration of the while (true) loop:

1. Builds the query configuration via buildQueryConfig()
2. Normalizes messages for the API
3. Prepends user context and appends system context
4. Fires the streaming API call
5. Processes tool use blocks through StreamingToolExecutor
6. Handles auto-compaction if the context window is getting large
7. Runs stop hooks to determine if the turn should end
8. Decides whether to continue (tool results pending) or terminate
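
The iteration steps above can be compressed into a toy version of the loop. All types and callbacks here are illustrative stand-ins for the real QueryParams/StreamingToolExecutor machinery:

```typescript
// Toy agentic loop: stream a model turn, run any requested tool, feed the
// result back as a tool message, and repeat until no tool call is pending.
type ChatMessage = { role: "user" | "assistant" | "tool"; content: string };
type ModelTurn = { text: string; toolCall?: { name: string; input: string } };

async function* queryLoop(
  messages: ChatMessage[],
  callModel: (history: ChatMessage[]) => Promise<ModelTurn>,
  runTool: (name: string, input: string) => Promise<string>,
): AsyncGenerator<ChatMessage> {
  while (true) {
    const turn = await callModel(messages);
    const assistant: ChatMessage = { role: "assistant", content: turn.text };
    messages.push(assistant);
    yield assistant;
    if (!turn.toolCall) return; // terminal: no pending tool results
    // Execute the requested tool and append its result to the history.
    const toolResult: ChatMessage = {
      role: "tool",
      content: await runTool(turn.toolCall.name, turn.toolCall.input),
    };
    messages.push(toolResult);
    yield toolResult;
  }
}
```

The real loop layers permission checks, context building, compaction, and stop hooks around the same continue-or-terminate skeleton.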

QueryEngine

QueryEngine in QueryEngine.ts wraps the query pipeline for the SDK/headless path. It owns the full conversation lifecycle:

    export class QueryEngine {
      private config: QueryEngineConfig
      private mutableMessages: Message[]
      private abortController: AbortController
      private permissionDenials: SDKPermissionDenial[]
      private totalUsage: NonNullableUsage
    
      async *submitMessage(
        prompt: string | ContentBlockParam[],
        options?: { uuid?: string; isMeta?: boolean },
      ): AsyncGenerator<SDKMessage, void, unknown>
    }

One QueryEngine instance is created per conversation. Each submitMessage() call starts a new turn within the same conversation, preserving message history, file state cache, and cumulative usage across turns.
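
The turn-accumulation behavior can be illustrated with a stripped-down engine; the class below is a toy, not the real QueryEngine API:

```typescript
// Toy per-conversation engine: every submitMessage() call appends to the
// same history array, so later turns see earlier context.
class MiniQueryEngine {
  private messages: string[] = [];

  async *submitMessage(prompt: string): AsyncGenerator<string> {
    this.messages.push(`user: ${prompt}`);
    const turn = this.messages.filter((m) => m.startsWith("user:")).length;
    const reply = `assistant: reply ${turn}`; // stand-in for the model call
    this.messages.push(reply);
    yield reply;
  }

  get history(): readonly string[] {
    return this.messages; // preserved across submitMessage() calls
  }
}
```
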

Streaming Architecture

Claude Code uses async generators throughout its streaming pipeline. This allows each layer to yield events as they arrive without buffering the entire response.

The event types that flow through the system:

    // Low-level API events
    type StreamEvent = {
      type: 'stream_event'
      event: /* SSE event from the API */
    }
    
    // Conversation-level messages
    type Message =
      | UserMessage
      | AssistantMessage
      | SystemMessage
      | AttachmentMessage
    
    // Query-level signals
    type RequestStartEvent = {
      type: 'request_start'
      requestId: string
    }

The REPL UI consumes these events to render streaming text, tool invocations, and progress indicators in real time. The SDK path maps them to SDKMessage types for programmatic consumers.

The query loop includes a max-output-tokens recovery mechanism that automatically retries up to 3 times when the model hits the output token limit mid-response. It also supports reactive compaction (feature-gated), which compresses conversation history when the context window approaches capacity.
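
The recovery mechanism can be sketched as follows. This is a simplification of the real retry semantics; the stop-reason names follow the Anthropic API's end_turn/max_tokens convention, and the sampler is a placeholder:

```typescript
// Hedged sketch: re-sample up to 3 extra times while the model keeps
// stopping at the output-token limit, concatenating the partial text.
type SampleResult = { stopReason: "end_turn" | "max_tokens"; text: string };

const MAX_OUTPUT_TOKENS_RECOVERIES = 3;

async function sampleWithRecovery(
  sample: () => Promise<SampleResult>,
): Promise<string> {
  let output = "";
  for (let attempt = 0; attempt <= MAX_OUTPUT_TOKENS_RECOVERIES; attempt++) {
    const result = await sample();
    output += result.text;
    if (result.stopReason !== "max_tokens") return output; // clean finish
  }
  return output; // retry budget exhausted; return what we have
}
```
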