Gemini CLI Architecture: Core vs CLI Abstraction
Overview
The Gemini CLI project is structured as a monorepo with two main packages that follow a clear separation of concerns:
- @google/gemini-cli-core: The core logic package containing all business logic, AI interactions, and tool implementations
- @google/gemini-cli: The CLI package providing the user interface layer (both interactive and non-interactive modes)
High-Level Architecture
Package Structure
Core Package (packages/core)
The core package is a headless library that provides:
- AI model interactions (Gemini, Claude)
- Tool system and registry
- Configuration management
- File discovery and Git services
- Telemetry and logging
- Authentication handling
CLI Package (packages/cli)
The CLI package is the user-facing application that provides:
- Interactive terminal UI (using React + Ink)
- Command-line argument parsing
- User settings management
- Theme system
- Interactive prompts and confirmations
- Non-interactive mode for scripting
Package Dependencies
Key Interfaces
1. Configuration Interface
The core provides a Config class that encapsulates all configuration:
// Core exports
export class Config {
// Model configuration
getModel(): string;
getEmbeddingModel(): string;
// Directory and file paths
getWorkingDir(): string;
getProjectRoot(): string;
getProjectTempDir(): string;
// Services
getGeminiClient(): GeminiClient;
getToolRegistry(): ToolRegistry;
getFileService(): FileDiscoveryService;
getGitService(): GitService;
// Feature flags
getDebugMode(): boolean;
getCheckpointingEnabled(): boolean;
getFullContext(): boolean;
// Tool configuration
getCoreTools(): string[];
getExcludeTools(): string[];
getToolDiscoveryCommand(): string | undefined;
// Authentication
getContentGeneratorConfig(): ContentGeneratorConfig;
refreshAuth(authType: AuthType): Promise<void>;
}
The CLI creates and configures this Config instance based on:
- Command-line arguments
- User settings files
- Environment variables
- Project-specific configuration
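The layering of these sources can be sketched as a simple merge. This is a minimal illustration, not the CLI's actual loader: the `ResolvedSettings` fields, the `resolveSettings` helper, the default values, and in particular the precedence order (defaults, then user settings, then project settings, then environment variables, then command-line arguments) are all assumptions made for the example.

```typescript
// Hypothetical sketch: field names and precedence order are assumptions
// for illustration, not the real Gemini CLI settings schema.
interface ResolvedSettings {
  model: string;
  debugMode: boolean;
}

type PartialSettings = Partial<ResolvedSettings>;

interface ConfigSources {
  userSettings: PartialSettings; // e.g. parsed from a user settings file
  projectSettings: PartialSettings; // e.g. parsed from a project settings file
  env: PartialSettings; // values derived from environment variables
  args: PartialSettings; // values derived from command-line arguments
}

const DEFAULTS: ResolvedSettings = { model: "default-model", debugMode: false };

// Drop undefined values so that an unset option never overwrites a set one.
function defined(settings: PartialSettings): PartialSettings {
  const out: PartialSettings = {};
  if (settings.model !== undefined) out.model = settings.model;
  if (settings.debugMode !== undefined) out.debugMode = settings.debugMode;
  return out;
}

// Later spreads win: defaults < user < project < env < args (assumed order).
function resolveSettings(sources: ConfigSources): ResolvedSettings {
  return {
    ...DEFAULTS,
    ...defined(sources.userSettings),
    ...defined(sources.projectSettings),
    ...defined(sources.env),
    ...defined(sources.args),
  };
}

const resolved = resolveSettings({
  userSettings: { model: "user-model", debugMode: true },
  projectSettings: { model: "project-model" },
  env: {},
  args: { debugMode: false },
});
console.log(resolved);
```

In this sketch the project file overrides the user's model choice, and the command-line flag overrides the user's debug setting. The resolved values would then be handed to the Config constructor.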
2. Gemini Client Interface
The core provides GeminiClient for AI interactions:
export class GeminiClient {
// Initialization
initialize(config: ContentGeneratorConfig): Promise<void>;
// Chat management
getChat(): GeminiChat;
resetChat(): Promise<void>;
// History management
getHistory(): Promise<Content[]>;
setHistory(history: Content[]): Promise<void>;
addHistory(content: Content): Promise<void>;
// Message streaming
sendMessageStream(
query: PartListUnion,
signal: AbortSignal
): AsyncIterable<ServerGeminiStreamEvent>;
}
3. Streaming Events Interface
The core defines a comprehensive event system for streaming responses:
export enum GeminiEventType {
Content = "content",
ToolCallRequest = "tool_call_request",
ToolCallConfirmation = "tool_call_confirmation",
ToolCallResponse = "tool_call_response",
Error = "error",
UserCancelled = "user_cancelled",
ChatCompressed = "chat_compressed",
UsageMetadata = "usage_metadata",
Thought = "thought",
}
export type ServerGeminiStreamEvent =
| ServerGeminiContentEvent
| ServerGeminiToolCallRequestEvent
| ServerGeminiErrorEvent
| ServerGeminiChatCompressedEvent
| ServerGeminiUsageMetadataEvent
| ServerGeminiThoughtEvent;
// ... etc
Event Flow Diagram
4. Tool System Interface
The core provides a comprehensive tool system:
// Base tool interface
export interface Tool<
TParams = unknown,
TResult extends ToolResult = ToolResult
> {
name: string;
displayName: string;
description: string;
schema: FunctionDeclaration;
isOutputMarkdown: boolean;
canUpdateOutput: boolean;
validateToolParams(params: TParams): string | null;
getDescription(params: TParams): string;
shouldConfirmExecute(
params: TParams,
signal: AbortSignal
): Promise<ToolCallConfirmationDetails | false>;
execute(
params: TParams,
signal: AbortSignal,
updateOutput?: (output: string) => void
): Promise<TResult>;
}
// Tool registry for managing tools
export class ToolRegistry {
registerTool(tool: Tool): void;
discoverTools(): Promise<void>;
getFunctionDeclarations(): FunctionDeclaration[];
getAllTools(): Tool[];
getTool(name: string): Tool | undefined;
}
// Tool execution result
export interface ToolResult {
llmContent: PartListUnion; // Content for AI model
returnDisplay: string | FileDiff; // Display for user
}
5. Tool Execution Flow
The CLI manages tool execution through several layers:
- Tool Scheduling (CLI layer):
  - useReactToolScheduler hook manages tool lifecycle
  - Handles user confirmations
  - Tracks tool status (pending, executing, completed)
  - Updates UI in real-time
- Tool Execution (Core layer):
  - executeToolCall function in core handles actual execution
  - Validates parameters
  - Executes tool logic
  - Returns results
- Tool Response Flow:
// CLI receives tool request from AI
ToolCallRequestInfo → scheduleToolCalls()
// CLI prompts for confirmation if needed
→ shouldConfirmExecute() → User confirmation
// Core executes tool
→ executeToolCall() → Tool.execute()
// CLI sends response back to AI
→ submitQuery(toolResponses)
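The flow above can be sketched end to end in a few lines. This is a simplified illustration under stated assumptions, not the real scheduler: `FlowTool`, `SchedulerUi`, and `runToolCall` are hypothetical names, the confirmation details are reduced to a plain prompt string, cancellation via AbortSignal is omitted for brevity, and the `cancelled` and `error` statuses are assumed additions to the pending, executing, and completed statuses listed above.

```typescript
// Simplified local stand-ins; the real interfaces carry more fields.
type ToolStatus = "pending" | "executing" | "completed" | "cancelled" | "error";

interface FlowTool<P> {
  name: string;
  validateToolParams(params: P): string | null;
  // Resolves to a prompt string when confirmation is needed, otherwise false.
  shouldConfirmExecute(params: P): Promise<string | false>;
  execute(params: P): Promise<string>;
}

// UI-side callbacks; in the CLI these would be backed by terminal components.
interface SchedulerUi {
  confirm(prompt: string): Promise<boolean>;
  onStatus(toolName: string, status: ToolStatus): void;
}

async function runToolCall<P>(
  tool: FlowTool<P>,
  params: P,
  ui: SchedulerUi
): Promise<string> {
  ui.onStatus(tool.name, "pending");

  // Validate before involving the user.
  const validationError = tool.validateToolParams(params);
  if (validationError !== null) {
    ui.onStatus(tool.name, "error");
    return `Invalid parameters: ${validationError}`;
  }

  // Ask the user only when the tool requests confirmation.
  const prompt = await tool.shouldConfirmExecute(params);
  if (prompt !== false && !(await ui.confirm(prompt))) {
    ui.onStatus(tool.name, "cancelled");
    return "Tool call was cancelled by the user.";
  }

  ui.onStatus(tool.name, "executing");
  try {
    const result = await tool.execute(params);
    ui.onStatus(tool.name, "completed");
    return result;
  } catch (err) {
    ui.onStatus(tool.name, "error");
    return `Tool failed: ${err instanceof Error ? err.message : String(err)}`;
  }
}

// Illustrative tool and an auto-approving UI.
const echoTool: FlowTool<{ text: string }> = {
  name: "echo",
  validateToolParams: (p) => (p.text.length > 0 ? null : "text must not be empty"),
  shouldConfirmExecute: async (p) => `Echo "${p.text}"?`,
  execute: async (p) => p.text.toUpperCase(),
};

const autoApproveUi: SchedulerUi = {
  confirm: async () => true,
  onStatus: (name, status) => console.log(`${name}: ${status}`),
};

runToolCall(echoTool, { text: "hello" }, autoApproveUi).then((result) =>
  console.log(result)
);
```

Every outcome, including validation failure and cancellation, is returned as a string here so that something can always be sent back to the model. Whether the real implementation reports failures this way is not stated in the source.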
Tool Lifecycle State Machine
Key Abstractions
1. Separation of UI and Logic
The CLI package handles all UI concerns:
- Terminal rendering (React + Ink components)
- User input handling
- Theme management
- Loading indicators
- Progress displays
The Core package handles all business logic:
- AI model interactions
- Tool implementations
- File system operations
- Authentication
- Error handling
2. Event-Driven Architecture
The core uses an event-driven approach for streaming:
- Core emits typed events during AI interactions
- CLI consumes events and updates UI accordingly
- Events include content, tool calls, errors, metadata
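A minimal sketch of this producer and consumer split follows. It uses local stand-ins rather than the real exports: the `StreamEvent` union covers only three of the event types, the payload shapes are assumptions, `fakeCoreStream` replaces a real model call with a fixed sequence, and the `read_file` tool name and its argument are purely illustrative.

```typescript
// Local stand-ins for a subset of the event types; payload shapes are assumptions.
type StreamEvent =
  | { type: "content"; value: string }
  | { type: "tool_call_request"; value: { name: string; args: Record<string, unknown> } }
  | { type: "error"; value: { message: string } };

// Stand-in for the core: yields a fixed sequence instead of calling a model.
async function* fakeCoreStream(): AsyncGenerator<StreamEvent> {
  yield { type: "content", value: "Reading the file" };
  yield {
    type: "tool_call_request",
    value: { name: "read_file", args: { path: "README.md" } },
  };
  yield { type: "content", value: " now." };
}

interface UiState {
  text: string;
  pendingTools: string[];
  errors: string[];
}

// Stand-in for the CLI: folds typed events into UI state.
async function consumeEvents(events: AsyncIterable<StreamEvent>): Promise<UiState> {
  const state: UiState = { text: "", pendingTools: [], errors: [] };
  for await (const event of events) {
    // The discriminant `type` narrows `event.value` in each branch.
    switch (event.type) {
      case "content":
        state.text += event.value;
        break;
      case "tool_call_request":
        state.pendingTools.push(event.value.name);
        break;
      case "error":
        state.errors.push(event.value.message);
        break;
    }
  }
  return state;
}

consumeEvents(fakeCoreStream()).then((state) => console.log(state));
```

Because the consumer depends only on the event union, the same loop could drive a terminal UI, a test harness, or plain stdout output.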
3. Tool System Abstraction
Tools are completely defined in Core:
- Tool interface and base class
- Built-in tool implementations
- Tool discovery mechanisms (MCP, custom commands)
CLI only handles:
- Tool confirmation UI
- Progress display
- Result rendering
4. Configuration Management
Core defines the configuration structure; the CLI populates it:
- CLI reads from multiple sources (files, env, args)
- CLI creates Config instance
- Core uses Config for all operations
5. Authentication Abstraction
Core defines authentication types and interfaces:
export enum AuthType {
USE_GEMINI = "use_gemini",
USE_GOOGLE_AI_STUDIO = "use_google_ai_studio",
USE_VERTEX = "use_vertex",
USE_ANTHROPIC = "use_anthropic",
}
CLI handles:
- Authentication UI flows
- Token storage
- User prompts for auth
Non-Interactive Mode
The CLI provides a non-interactive mode that demonstrates the clean separation:
// Minimal UI logic in non-interactive mode
export async function runNonInteractive(config: Config, input: string) {
const geminiClient = config.getGeminiClient()
const toolRegistry = await config.getToolRegistry()
const chat = await geminiClient.getChat()
// Direct streaming without UI
const responseStream = await chat.sendMessageStream(...)
// Simple stdout writing
for await (const resp of responseStream) {
process.stdout.write(getResponseText(resp))
}
}
Extension Points
1. Custom Tools
- Implement the Tool interface in a separate package
- Register with ToolRegistry
- CLI automatically handles UI for any tool
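As a rough illustration of these steps, the sketch below defines and registers a small tool. To stay self-contained it does not import from the core package: `SketchTool`, `SketchToolResult`, and `SketchRegistry` are reduced, hypothetical copies of the Tool, ToolResult, and ToolRegistry shapes shown earlier (no schema, confirmation, or abort handling), and `WordCountTool` is an invented example.

```typescript
// Reduced local copies of the interfaces shown earlier, so the sketch runs on its own.
interface SketchToolResult {
  llmContent: string; // content for the AI model
  returnDisplay: string; // display for the user
}

interface SketchTool<TParams> {
  name: string;
  displayName: string;
  description: string;
  validateToolParams(params: TParams): string | null;
  getDescription(params: TParams): string;
  execute(params: TParams): Promise<SketchToolResult>;
}

// Minimal stand-in for ToolRegistry: a name-keyed map.
class SketchRegistry {
  private readonly tools = new Map<string, SketchTool<any>>();

  registerTool(tool: SketchTool<any>): void {
    this.tools.set(tool.name, tool);
  }

  getTool(name: string): SketchTool<any> | undefined {
    return this.tools.get(name);
  }

  getAllTools(): SketchTool<any>[] {
    return Array.from(this.tools.values());
  }
}

interface WordCountParams {
  text: string;
}

// Hypothetical custom tool: counts whitespace-separated words.
class WordCountTool implements SketchTool<WordCountParams> {
  name = "word_count";
  displayName = "Word Count";
  description = "Counts whitespace-separated words in a string.";

  validateToolParams(params: WordCountParams): string | null {
    return typeof params.text === "string" ? null : "text must be a string";
  }

  getDescription(params: WordCountParams): string {
    return `Count words in ${params.text.length} characters of text`;
  }

  async execute(params: WordCountParams): Promise<SketchToolResult> {
    const count = params.text.split(/\s+/).filter(Boolean).length;
    return { llmContent: String(count), returnDisplay: `${count} words` };
  }
}

const registry = new SketchRegistry();
registry.registerTool(new WordCountTool());
```

The tool returns separate values for the model and for the user, mirroring the llmContent and returnDisplay split in ToolResult, so the UI layer needs no knowledge of what the tool does.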
2. Custom Authentication
- Implement ContentGenerator interface
- Add new AuthType
- CLI will handle auth flow UI
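One way to picture this extension point is a factory keyed by auth type. The sketch is speculative: the source names the ContentGenerator interface but does not show its members, so the single `generate` method, the `SketchAuthType` enum with its `USE_CUSTOM` entry, and the `createContentGenerator` helper are all assumptions for illustration.

```typescript
// Hypothetical sketch: the generator shape below is an assumption, since the
// source names ContentGenerator without listing its members.
enum SketchAuthType {
  USE_GEMINI = "use_gemini",
  USE_VERTEX = "use_vertex",
  USE_CUSTOM = "use_custom", // new, illustrative auth type
}

interface SketchContentGenerator {
  generate(prompt: string): Promise<string>;
}

type GeneratorFactory = () => SketchContentGenerator;

// Each stub tags its output so the selected backend is visible.
function stubGenerator(label: string): SketchContentGenerator {
  return { generate: async (prompt) => `[${label}] ${prompt}` };
}

// One factory per auth type; the Record type makes the compiler flag any
// enum member that lacks an entry.
const generatorFactories: Record<SketchAuthType, GeneratorFactory> = {
  [SketchAuthType.USE_GEMINI]: () => stubGenerator("gemini"),
  [SketchAuthType.USE_VERTEX]: () => stubGenerator("vertex"),
  [SketchAuthType.USE_CUSTOM]: () => stubGenerator("custom"),
};

function createContentGenerator(authType: SketchAuthType): SketchContentGenerator {
  return generatorFactories[authType]();
}

createContentGenerator(SketchAuthType.USE_CUSTOM)
  .generate("hello")
  .then((reply) => console.log(reply));
```

With this shape, adding a backend touches only the enum and the factory table; callers that already work against the generator interface stay unchanged.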
3. Custom Themes
- Define theme in CLI package
- Core remains unaware of presentation
4. Alternative UIs
- Core can be used with any UI framework
- Web UI, desktop app, VS Code extension possible
- Just implement event handling and tool confirmation
Benefits of This Architecture
- Testability: Core logic can be tested without UI
- Reusability: Core can be used in different contexts
- Maintainability: Clear boundaries between concerns
- Extensibility: Easy to add new tools, auth methods, UIs
- Type Safety: Strong interfaces between packages
- Modularity: Packages can evolve independently
Data Flow Architecture
Summary
The Gemini CLI architecture demonstrates a clean separation between core business logic and user interface concerns. The Core package provides a comprehensive API for AI interactions, tool execution, and system management, while the CLI package focuses solely on user interaction and presentation. This separation enables the core functionality to be reused in different contexts while maintaining a rich, interactive terminal experience for users.
The diagrams above illustrate:
- High-level architecture showing the main components and their relationships
- Package dependencies showing external libraries used by each package
- Event flow demonstrating the streaming interaction pattern
- Tool lifecycle showing the state machine for tool execution
- Configuration flow showing how settings are aggregated and used
- Extension points showing how the architecture supports customization
- Data flow showing the complete flow from user input to display
This architecture provides excellent separation of concerns, testability, and extensibility while maintaining a cohesive system.