Gemini CLI Architecture: Core vs CLI Abstraction

Hrishi Olickel & Claude Opus 4

Overview

The Gemini CLI project is structured as a monorepo with two main packages that follow a clear separation of concerns:

  • @google/gemini-cli-core: The core logic package containing all business logic, AI interactions, and tool implementations
  • @google/gemini-cli: The CLI package providing the user interface layer (both interactive and non-interactive modes)

High-Level Architecture

Package Structure

Core Package (packages/core)

The core package is a headless library that provides:

  • AI model interactions (Gemini, Claude)
  • Tool system and registry
  • Configuration management
  • File discovery and Git services
  • Telemetry and logging
  • Authentication handling

CLI Package (packages/cli)

The CLI package is the user-facing application that provides:

  • Interactive terminal UI (using React + Ink)
  • Command-line argument parsing
  • User settings management
  • Theme system
  • Interactive prompts and confirmations
  • Non-interactive mode for scripting

Package Dependencies

Key Interfaces

1. Configuration Interface

The core provides a Config class that encapsulates all configuration:

TSX
// Core exports (signatures only; implementations omitted)
export declare class Config {
  // Model configuration
  getModel(): string;
  getEmbeddingModel(): string;

  // Directory and file paths
  getWorkingDir(): string;
  getProjectRoot(): string;
  getProjectTempDir(): string;

  // Services
  getGeminiClient(): GeminiClient;
  getToolRegistry(): ToolRegistry;
  getFileService(): FileDiscoveryService;
  getGitService(): GitService;

  // Feature flags
  getDebugMode(): boolean;
  getCheckpointingEnabled(): boolean;
  getFullContext(): boolean;

  // Tool configuration
  getCoreTools(): string[];
  getExcludeTools(): string[];
  getToolDiscoveryCommand(): string | undefined;

  // Authentication
  getContentGeneratorConfig(): ContentGeneratorConfig;
  refreshAuth(authType: AuthType): Promise<void>;
}

The CLI creates and configures this Config instance based on:

  • Command-line arguments
  • User settings files
  • Environment variables
  • Project-specific configuration
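
The aggregation step can be sketched as a precedence merge. This is a minimal illustration, not the CLI's actual loader: the `Settings` shape, the `lastDefined` and `mergeSettings` helpers, the sample values, and the precedence order (user settings, then project settings, then environment, then command-line arguments) are all assumptions made for the example.

```typescript
// Hypothetical settings shape; the field names are illustrative, not the real Config options.
interface Settings {
  model?: string;
  debugMode?: boolean;
  coreTools?: string[];
}

// Returns the last defined value, so later (higher-precedence) sources win.
function lastDefined<T>(values: (T | undefined)[]): T | undefined {
  let result: T | undefined = undefined;
  for (const value of values) {
    if (value !== undefined) {
      result = value;
    }
  }
  return result;
}

// Merges sources ordered from lowest to highest precedence.
function mergeSettings(sources: Settings[]): Settings {
  return {
    model: lastDefined(sources.map((s) => s.model)),
    debugMode: lastDefined(sources.map((s) => s.debugMode)),
    coreTools: lastDefined(sources.map((s) => s.coreTools)),
  };
}

// Assumed precedence, lowest first: user file, project file, environment, CLI arguments.
const userSettings: Settings = { model: "model-a", debugMode: false };
const projectSettings: Settings = { coreTools: ["read_file"] };
const envSettings: Settings = { model: "model-b" };
const cliArgs: Settings = { debugMode: true };

const resolved = mergeSettings([userSettings, projectSettings, envSettings, cliArgs]);
console.log(resolved);
```

Keeping the merge in the CLI package means Core only ever sees one fully resolved object, which is the division of labor described above.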

2. Gemini Client Interface

The core provides GeminiClient for AI interactions:

TSX
// Signatures only; implementations omitted
export declare class GeminiClient {
  // Initialization
  initialize(config: ContentGeneratorConfig): Promise<void>;

  // Chat management
  getChat(): GeminiChat;
  resetChat(): Promise<void>;

  // History management
  getHistory(): Promise<Content[]>;
  setHistory(history: Content[]): Promise<void>;
  addHistory(content: Content): Promise<void>;

  // Message streaming
  sendMessageStream(
    query: PartListUnion,
    signal: AbortSignal
  ): AsyncIterable<ServerGeminiStreamEvent>;
}

3. Streaming Events Interface

The core defines a comprehensive event system for streaming responses:

TSX
export enum GeminiEventType {
  Content = "content",
  ToolCallRequest = "tool_call_request",
  ToolCallConfirmation = "tool_call_confirmation",
  ToolCallResponse = "tool_call_response",
  Error = "error",
  UserCancelled = "user_cancelled",
  ChatCompressed = "chat_compressed",
  UsageMetadata = "usage_metadata",
  Thought = "thought",
}

export type ServerGeminiStreamEvent =
  | ServerGeminiContentEvent
  | ServerGeminiToolCallRequestEvent
  | ServerGeminiErrorEvent
  | ServerGeminiChatCompressedEvent
  | ServerGeminiUsageMetadataEvent
  | ServerGeminiThoughtEvent;
// ... etc
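
A consumer of such a stream typically switches on the event type. The sketch below is self-contained and does not import the real package: `StreamEvent` is a simplified stand-in for `ServerGeminiStreamEvent` with assumed payload shapes, and `fakeStream` is a hypothetical replacement for `sendMessageStream` that yields a fixed sequence.

```typescript
// Simplified, hypothetical event shapes; the real union has more members.
type StreamEvent =
  | { type: "content"; value: string }
  | { type: "tool_call_request"; value: { name: string } }
  | { type: "error"; value: { message: string } };

interface TurnSummary {
  text: string;
  toolRequests: string[];
  errors: string[];
}

// Stand-in for sendMessageStream(): yields fixed events until the signal is aborted.
async function* fakeStream(signal: AbortSignal): AsyncGenerator<StreamEvent> {
  const events: StreamEvent[] = [
    { type: "content", value: "Hello, " },
    { type: "content", value: "world" },
    { type: "tool_call_request", value: { name: "read_file" } },
  ];
  for (const event of events) {
    if (signal.aborted) {
      return;
    }
    yield event;
  }
}

// Folds the stream into a summary; a real UI would update its display per event instead.
async function consume(stream: AsyncIterable<StreamEvent>): Promise<TurnSummary> {
  const summary: TurnSummary = { text: "", toolRequests: [], errors: [] };
  for await (const event of stream) {
    switch (event.type) {
      case "content":
        summary.text += event.value;
        break;
      case "tool_call_request":
        summary.toolRequests.push(event.value.name);
        break;
      case "error":
        summary.errors.push(event.value.message);
        break;
    }
  }
  return summary;
}

const controller = new AbortController();
const turn = consume(fakeStream(controller.signal));
turn.then((summary) => console.log(summary));
```

Because the union is discriminated on `type`, the compiler narrows `event.value` inside each `case`, which is where the typed event system pays off for a consumer.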

Event Flow Diagram

4. Tool System Interface

The core provides a comprehensive tool system:

TSX
// Base tool interface
export interface Tool<
  TParams = unknown,
  TResult extends ToolResult = ToolResult
> {
  name: string;
  displayName: string;
  description: string;
  schema: FunctionDeclaration;
  isOutputMarkdown: boolean;
  canUpdateOutput: boolean;

  validateToolParams(params: TParams): string | null;
  getDescription(params: TParams): string;
  shouldConfirmExecute(
    params: TParams,
    signal: AbortSignal
  ): Promise<ToolCallConfirmationDetails | false>;
  execute(
    params: TParams,
    signal: AbortSignal,
    updateOutput?: (output: string) => void
  ): Promise<TResult>;
}

// Tool registry for managing tools (signatures only)
export declare class ToolRegistry {
  registerTool(tool: Tool): void;
  discoverTools(): Promise<void>;
  getFunctionDeclarations(): FunctionDeclaration[];
  getAllTools(): Tool[];
  getTool(name: string): Tool | undefined;
}

// Tool execution result
export interface ToolResult {
  llmContent: PartListUnion; // Content for AI model
  returnDisplay: string | FileDiff; // Display for user
}
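
To make the shape concrete, here is a reduced sketch of a tool and a registry. It is not the real API: `SketchTool`, `SketchToolResult`, `SketchRegistry`, `WordCountTool`, and `callTool` are hypothetical names, and the interface keeps only a subset of the members listed above (no schema, confirmation, or streaming output).

```typescript
// Reduced, hypothetical versions of ToolResult and Tool.
interface SketchToolResult {
  llmContent: string; // what would be sent back to the model
  returnDisplay: string; // what would be shown to the user
}

interface SketchTool {
  name: string;
  description: string;
  validateToolParams(params: Record<string, unknown>): string | null;
  execute(params: Record<string, unknown>, signal: AbortSignal): Promise<SketchToolResult>;
}

class WordCountTool implements SketchTool {
  name = "word_count";
  description = "Counts whitespace-separated words in a string.";

  // Returns null when the parameters are valid, otherwise an error message.
  validateToolParams(params: Record<string, unknown>): string | null {
    return typeof params.text === "string" ? null : "text must be a string";
  }

  async execute(params: Record<string, unknown>, signal: AbortSignal): Promise<SketchToolResult> {
    if (signal.aborted) {
      throw new Error("aborted before execution");
    }
    const count = String(params.text).split(/\s+/).filter(Boolean).length;
    return { llmContent: String(count), returnDisplay: `${count} words` };
  }
}

class SketchRegistry {
  private tools = new Map<string, SketchTool>();

  registerTool(tool: SketchTool): void {
    this.tools.set(tool.name, tool);
  }

  getTool(name: string): SketchTool | undefined {
    return this.tools.get(name);
  }
}

// Looks up a tool, validates its parameters, then executes it.
async function callTool(
  registry: SketchRegistry,
  name: string,
  params: Record<string, unknown>
): Promise<SketchToolResult> {
  const tool = registry.getTool(name);
  if (!tool) {
    throw new Error(`unknown tool: ${name}`);
  }
  const problem = tool.validateToolParams(params);
  if (problem !== null) {
    throw new Error(problem);
  }
  return tool.execute(params, new AbortController().signal);
}

const sketchRegistry = new SketchRegistry();
sketchRegistry.registerTool(new WordCountTool());
const wordCountResult = callTool(sketchRegistry, "word_count", { text: "core and cli" });
wordCountResult.then((result) => console.log(result.returnDisplay)); // prints "3 words"
```

The two-field result mirrors the split in `ToolResult`: one value for the model, one for the user, so a front end never has to parse model-facing content.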

5. Tool Execution Flow

Tool execution is split across the two packages: the CLI schedules and confirms tool calls, and Core executes them:

  1. Tool Scheduling (CLI layer):

    • useReactToolScheduler hook manages tool lifecycle
    • Handles user confirmations
    • Tracks tool status (pending, executing, completed)
    • Updates UI in real-time
  2. Tool Execution (Core layer):

    • executeToolCall function in core handles actual execution
    • Validates parameters
    • Executes tool logic
    • Returns results
  3. Tool Response Flow:

    TSX
    // CLI receives tool request from AI
    ToolCallRequestInfo → scheduleToolCalls()

    // CLI prompts for confirmation if needed
    shouldConfirmExecute() → User confirmation

    // Core executes tool
    executeToolCall() → Tool.execute()

    // CLI sends response back to AI
    submitQuery(toolResponses)
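
The request, confirm, execute, respond sequence can be sketched in one function. This collapses the CLI and Core halves into a single hypothetical `runToolCall` helper for readability; `ToolCallRequest`, `ToolCallResponse`, `FlowTool`, `ConfirmFn`, and the `delete_file` tool are illustrative names and shapes, not the real types.

```typescript
// Hypothetical, simplified request/response shapes.
interface ToolCallRequest {
  callId: string;
  name: string;
  args: Record<string, unknown>;
}

interface ToolCallResponse {
  callId: string;
  output: string;
  error?: string;
}

interface FlowTool {
  name: string;
  // Resolves to a prompt to show the user, or false when no confirmation is needed.
  shouldConfirmExecute(args: Record<string, unknown>): Promise<string | false>;
  execute(args: Record<string, unknown>): Promise<string>;
}

// Supplied by the UI layer: shows the prompt and resolves to the user's decision.
type ConfirmFn = (prompt: string) => Promise<boolean>;

async function runToolCall(
  request: ToolCallRequest,
  tools: Map<string, FlowTool>,
  askUser: ConfirmFn
): Promise<ToolCallResponse> {
  const tool = tools.get(request.name);
  if (!tool) {
    return { callId: request.callId, output: "", error: `unknown tool: ${request.name}` };
  }
  // UI concern: ask for confirmation only when the tool requests it.
  const prompt = await tool.shouldConfirmExecute(request.args);
  if (prompt !== false && !(await askUser(prompt))) {
    return { callId: request.callId, output: "", error: "cancelled by user" };
  }
  // Logic concern: run the tool and convert failures into an error response.
  try {
    return { callId: request.callId, output: await tool.execute(request.args) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { callId: request.callId, output: "", error: message };
  }
}

const deleteFile: FlowTool = {
  name: "delete_file",
  shouldConfirmExecute: async (args) => `Delete ${String(args.path)}?`,
  execute: async (args) => `deleted ${String(args.path)}`,
};
const flowTools = new Map<string, FlowTool>([[deleteFile.name, deleteFile]]);
const deleteRequest: ToolCallRequest = { callId: "1", name: "delete_file", args: { path: "tmp.txt" } };

const approvedRun = runToolCall(deleteRequest, flowTools, async () => true);
const deniedRun = runToolCall(deleteRequest, flowTools, async () => false);
approvedRun.then((response) => console.log(response));
deniedRun.then((response) => console.log(response));
```

Passing the confirmation callback in as a parameter is what lets the same execution path serve an interactive terminal, a script that always approves, or a test that always denies.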

Tool Lifecycle State Machine
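
One way to express such a lifecycle in code is a transition table. The source names only the pending, executing, and completed statuses; the `awaiting_approval`, `cancelled`, and `error` states, the allowed transitions, and the `advance` helper below are assumptions added to make the sketch complete.

```typescript
// "pending", "executing" and "completed" come from the text; the rest are assumed.
type ToolStatus =
  | "pending"
  | "awaiting_approval"
  | "executing"
  | "completed"
  | "cancelled"
  | "error";

// For each state, the states it may move to. Terminal states have no successors.
const transitions: Record<ToolStatus, ToolStatus[]> = {
  pending: ["awaiting_approval", "executing", "error"],
  awaiting_approval: ["executing", "cancelled"],
  executing: ["completed", "error", "cancelled"],
  completed: [],
  cancelled: [],
  error: [],
};

// Returns the next state, or throws when the transition is not allowed.
function advance(current: ToolStatus, next: ToolStatus): ToolStatus {
  if (transitions[current].indexOf(next) === -1) {
    throw new Error(`invalid transition: ${current} -> ${next}`);
  }
  return next;
}

// A call that needs approval and then succeeds.
let toolStatus: ToolStatus = "pending";
toolStatus = advance(toolStatus, "awaiting_approval");
toolStatus = advance(toolStatus, "executing");
toolStatus = advance(toolStatus, "completed");
console.log(toolStatus); // prints "completed"
```

Centralizing the allowed transitions in one table keeps UI code from putting a tool call into an impossible state, such as executing a call that was already cancelled.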

Key Abstractions

1. Separation of UI and Logic

The CLI package handles all UI concerns:

  • Terminal rendering (React + Ink components)
  • User input handling
  • Theme management
  • Loading indicators
  • Progress displays

The Core package handles all business logic:

  • AI model interactions
  • Tool implementations
  • File system operations
  • Authentication
  • Error handling

2. Event-Driven Architecture

The core uses an event-driven approach for streaming:

  • Core emits typed events during AI interactions
  • CLI consumes events and updates UI accordingly
  • Events include content, tool calls, errors, metadata

3. Tool System Abstraction

Tools are completely defined in Core:

  • Tool interface and base class
  • Built-in tool implementations
  • Tool discovery mechanisms (MCP, custom commands)

The CLI handles only the presentation side:

  • Tool confirmation UI
  • Progress display
  • Result rendering

4. Configuration Management

Core defines the configuration structure; the CLI populates it:

  • CLI reads from multiple sources (files, env, args)
  • CLI creates Config instance
  • Core uses Config for all operations

5. Authentication Abstraction

Core defines authentication types and interfaces:

TSX
export enum AuthType {
  USE_GEMINI = "use_gemini",
  USE_GOOGLE_AI_STUDIO = "use_google_ai_studio",
  USE_VERTEX = "use_vertex",
  USE_ANTHROPIC = "use_anthropic",
}

CLI handles:

  • Authentication UI flows
  • Token storage
  • User prompts for auth

Non-Interactive Mode

The CLI's non-interactive mode shows the separation at work: it drives Core directly and needs almost no UI code:

TSX
// Minimal UI logic in non-interactive mode
export async function runNonInteractive(config: Config, input: string) {
  const geminiClient = config.getGeminiClient();
  const toolRegistry = await config.getToolRegistry();
  const chat = await geminiClient.getChat();

  // Direct streaming without UI
  const responseStream = await chat.sendMessageStream(...);

  // Simple stdout writing
  for await (const resp of responseStream) {
    process.stdout.write(getResponseText(resp));
  }
}

Extension Points

1. Custom Tools

  • Implement the Tool interface in a separate package
  • Register with ToolRegistry
  • CLI automatically handles UI for any tool

2. Custom Authentication

  • Implement ContentGenerator interface
  • Add new AuthType
  • CLI will handle auth flow UI

3. Custom Themes

  • Define theme in CLI package
  • Core remains unaware of presentation

4. Alternative UIs

  • Core can be used with any UI framework
  • Web UI, desktop app, VS Code extension possible
  • A new front end only needs to implement event handling and tool confirmation
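
As a rough illustration of that contract, the sketch below drives one turn against a front end that has no terminal at all. Everything here is hypothetical: the `FrontEnd` interface, the `UiEvent` shape, the `driveTurn` helper, and the `BufferFrontEnd` class are invented for the example and are not part of either package.

```typescript
// Hypothetical, minimal event shape for this sketch.
type UiEvent =
  | { type: "content"; text: string }
  | { type: "tool_call_request"; toolName: string };

// The two responsibilities a front end must cover: show output, confirm tools.
interface FrontEnd {
  showContent(text: string): void;
  confirmTool(toolName: string): Promise<boolean>;
}

// Feeds events to the front end and returns the names of the approved tools.
async function driveTurn(events: UiEvent[], ui: FrontEnd): Promise<string[]> {
  const approved: string[] = [];
  for (const event of events) {
    if (event.type === "content") {
      ui.showContent(event.text);
    } else if (await ui.confirmTool(event.toolName)) {
      approved.push(event.toolName);
    }
  }
  return approved;
}

// A headless front end: buffers output and approves tools from an allow-list.
class BufferFrontEnd implements FrontEnd {
  output = "";
  private allowList: string[];

  constructor(allowList: string[]) {
    this.allowList = allowList;
  }

  showContent(text: string): void {
    this.output += text;
  }

  async confirmTool(toolName: string): Promise<boolean> {
    return this.allowList.indexOf(toolName) !== -1;
  }
}

const bufferUi = new BufferFrontEnd(["read_file"]);
const approvedTools = driveTurn(
  [
    { type: "content", text: "Reading config. " },
    { type: "tool_call_request", toolName: "read_file" },
    { type: "tool_call_request", toolName: "run_shell" },
    { type: "content", text: "Done." },
  ],
  bufferUi
);
approvedTools.then((names) => console.log(names, bufferUi.output));
```

A web or editor integration would swap `BufferFrontEnd` for a class that renders to its own surface, while `driveTurn` and everything behind it stay unchanged.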

Benefits of This Architecture

  1. Testability: Core logic can be tested without UI
  2. Reusability: Core can be used in different contexts
  3. Maintainability: Clear boundaries between concerns
  4. Extensibility: Easy to add new tools, auth methods, UIs
  5. Type Safety: Strong interfaces between packages
  6. Modularity: Packages can evolve independently

Data Flow Architecture

Summary

The Gemini CLI architecture demonstrates a clean separation between core business logic and user interface concerns. The Core package provides a comprehensive API for AI interactions, tool execution, and system management, while the CLI package focuses solely on user interaction and presentation. This separation enables the core functionality to be reused in different contexts while maintaining a rich, interactive terminal experience for users.

The diagrams above illustrate:

  • High-level architecture showing the main components and their relationships
  • Package dependencies showing external libraries used by each package
  • Event flow demonstrating the streaming interaction pattern
  • Tool lifecycle showing the state machine for tool execution
  • Configuration flow showing how settings are aggregated and used
  • Extension points showing how the architecture supports customization
  • Data flow showing the complete flow from user input to display

Together, these boundaries keep the system cohesive while leaving Core testable without a UI and open to new tools, authentication methods, and front ends.