Architecture
This page describes the technical architecture of Chrome ACP in detail.
Overview
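Chrome ACP consists of four main components, each described under Components below: the Chrome extension, the web client, the proxy server, and a shared package.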
Why a Proxy Server?
Chrome extensions run in a browser sandbox and cannot spawn subprocesses. The proxy server acts as a local bridge that:
- Spawns the ACP agent subprocess
- Communicates with the agent via stdin/stdout (NDJSON)
- Relays messages to/from browser clients via WebSocket
- Exposes browser tools to agents via MCP
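The NDJSON framing used on the agent's stdin/stdout can be sketched as follows. This is a minimal illustration, not the proxy's actual code: `NdjsonDecoder` and `encodeNdjson` are hypothetical names, and the sketch only assumes that each message is one JSON value terminated by a newline.

```typescript
// Hypothetical helper: incremental NDJSON decoder for an agent's stdout stream.
// Chunks from a pipe can end mid-line, so incomplete data is buffered.
class NdjsonDecoder {
  private buffer = "";

  // Feed a raw chunk; returns every complete JSON value received so far.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an unfinished line (or "" if the chunk ended on "\n").
    this.buffer = lines.pop() ?? "";
    return lines
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
      .map((line) => JSON.parse(line) as unknown);
  }
}

// Encode one message as a single NDJSON line for the agent's stdin.
// JSON.stringify never emits raw newlines, so the output is always one line.
function encodeNdjson(message: unknown): string {
  return JSON.stringify(message) + "\n";
}
```

In a proxy like the one described here, `push` would be called from the subprocess's stdout data handler, and each decoded value relayed to the connected WebSocket clients.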
Components
Chrome Extension
Location: packages/chrome-extension
A Chrome Manifest V3 extension with:
- Sidepanel UI - Chat interface in Chrome's side panel
- Background service worker - Maintains WebSocket connection to proxy
- Browser tools - browser_tabs, browser_read, browser_execute
The extension uses chrome.scripting.executeScript() with world: "MAIN" to run scripts in the page's JavaScript context.
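A call of this shape might look like the sketch below. `readPageTitle` and the local `ScriptingApi` interface are hypothetical: the interface only mirrors the part of `chrome.scripting` used here so the example compiles without `@types/chrome`. In an extension, `chrome.scripting` would be passed as the first argument.

```typescript
// Minimal local typing for the single Chrome API call used in this sketch.
interface ScriptingApi {
  executeScript(injection: {
    target: { tabId: number };
    world: "MAIN" | "ISOLATED";
    func: () => unknown;
  }): Promise<Array<{ result?: unknown }>>;
}

// Hypothetical helper: read a value from the page's own JavaScript context.
async function readPageTitle(
  scripting: ScriptingApi,
  tabId: number,
): Promise<string | undefined> {
  const [frame] = await scripting.executeScript({
    target: { tabId },
    // "MAIN" runs the function alongside the page's scripts instead of in
    // the extension's isolated world, so page-defined globals are visible.
    world: "MAIN",
    // This function is serialized and executed inside the tab.
    func: () => (globalThis as any).document?.title,
  });
  const result = frame?.result;
  return typeof result === "string" ? result : undefined;
}
```

Because the injected function is serialized before it runs in the tab, it cannot close over variables from the extension's scope.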
Web Client
Location: packages/web-client
A Progressive Web App (PWA) served by the proxy server. Features:
- Same chat UI as the extension (shared components)
- No browser tool access (runs in regular web context)
- Mobile-friendly for QR code connections
Proxy Server
Location: packages/proxy-server
A Node.js server built with Hono:
- Serves web client at /app
- WebSocket endpoint at /ws
- MCP endpoint at /mcp (Streamable HTTP)
- File explorer API for workspace browsing
Shared Package
Location: packages/shared
Shared code used by both chrome-extension and web-client:
- ACPClient - WebSocket client for proxy communication
- UI components (shadcn/ui + Vercel AI Elements)
- TypeScript type definitions
WebSocket Protocol
Communication between browser clients and proxy server uses JSON messages.
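As a rough sketch of how a client might decode incoming frames, the function below parses a JSON string and accepts only the two server message names mentioned on this page, session_update and permission_request. The `type` discriminator field and the `parseServerMessage` name are assumptions made for illustration. The real protocol may use a different envelope and defines more message kinds.

```typescript
// Message names come from this page; the `type` field name is an assumption.
type KnownServerMessageType = "session_update" | "permission_request";

interface ServerMessage {
  type: KnownServerMessageType;
  [key: string]: unknown;
}

// Returns the parsed message, or null for malformed or unrecognized frames.
function parseServerMessage(raw: string): ServerMessage | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof value !== "object" || value === null) return null;
  const type = (value as { type?: unknown }).type;
  if (type === "session_update" || type === "permission_request") {
    return value as ServerMessage;
  }
  return null; // unknown message kinds are ignored in this sketch
}
```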
Client → Server Messages
Server → Client Messages
Session Update Types
The session_update message has different subtypes:
Content Types
Messages support multiple content types:
Image Support
Agents can declare image support via promptCapabilities:
The client checks promptCapabilities.image to enable/disable image attachments.
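The check described above could be written as a small guard. This is a sketch: `PromptCapabilities` models only the `image` flag named on this page, and `canAttachImages` is a hypothetical helper name.

```typescript
// Only the `image` flag is taken from this page; other fields are omitted.
interface PromptCapabilities {
  image?: boolean;
}

// Image attachments are enabled only when the agent explicitly declares support.
function canAttachImages(caps: PromptCapabilities | undefined): boolean {
  return caps?.image === true;
}
```

Treating a missing flag as "not supported" keeps the attachment button disabled for agents that declare no capabilities at all.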
Permission System
Some operations require user confirmation:
Flow:
- Agent requests operation → Proxy sends permission_request
- UI shows permission buttons → User clicks one
- Client sends permission_response with selected optionId
- Proxy forwards decision → Agent continues or aborts
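The client side of this flow might be sketched as below. Only the message names and `optionId` come from this page. The `type`, `requestId`, `options`, and `label` fields, along with `buildPermissionResponse`, are hypothetical and shown purely to illustrate the round trip.

```typescript
// Field names other than `optionId` are assumptions for illustration.
interface PermissionOption {
  optionId: string;
  label: string; // text shown on the button
}

interface PermissionRequest {
  type: "permission_request";
  requestId: string;
  options: PermissionOption[];
}

interface PermissionResponse {
  type: "permission_response";
  requestId: string;
  optionId: string;
}

// Build the reply for the button the user clicked, rejecting any option
// that was not offered in the original request.
function buildPermissionResponse(
  request: PermissionRequest,
  optionId: string,
): PermissionResponse {
  if (!request.options.some((option) => option.optionId === optionId)) {
    throw new Error(`Unknown optionId: ${optionId}`);
  }
  return { type: "permission_response", requestId: request.requestId, optionId };
}
```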
Model Selection
Agents can expose multiple models:
The UI shows a model selector popover. When the user switches models:
- Client sends set_session_model with new modelId
- Proxy forwards to agent
- Agent updates session model
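The first step of a model switch could look like this sketch. The message name and `modelId` come from this page. The `type` field, the `ModelInfo` shape, and the `selectModel` helper are assumptions.

```typescript
// Hypothetical shape for one entry in the agent's advertised model list.
interface ModelInfo {
  modelId: string;
  name: string;
}

interface SetSessionModelMessage {
  type: "set_session_model";
  modelId: string;
}

// Returns the message to send, or null when no switch is needed or the
// requested model is not one the agent advertised.
function selectModel(
  available: ModelInfo[],
  currentModelId: string,
  nextModelId: string,
): SetSessionModelMessage | null {
  if (nextModelId === currentModelId) return null;
  if (!available.some((model) => model.modelId === nextModelId)) return null;
  return { type: "set_session_model", modelId: nextModelId };
}
```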
File Explorer
The proxy server provides file system access for workspace browsing:
List Directory
Read File
MCP Integration
Browser tools are exposed to agents via MCP (Model Context Protocol):
- Transport: Streamable HTTP at /mcp
- Protocol Version: 2024-11-05
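An MCP session starts with a JSON-RPC initialize request that carries the protocol version. The sketch below builds such a request body with the version listed above. `buildInitializeRequest` and its parameters are hypothetical, and the empty `capabilities` object is a placeholder, not a statement of what any particular agent declares.

```typescript
// Protocol version as listed on this page.
const MCP_PROTOCOL_VERSION = "2024-11-05";

// Build the JSON-RPC body an MCP client would send to begin a session.
function buildInitializeRequest(id: number, clientName: string, clientVersion: string) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "initialize" as const,
    params: {
      protocolVersion: MCP_PROTOCOL_VERSION,
      capabilities: {}, // placeholder: a real client lists what it supports
      clientInfo: { name: clientName, version: clientVersion },
    },
  };
}
```

An agent acting as the MCP client would send this body as an HTTP POST to the /mcp endpoint and would typically list the available browser tools only after the server replies.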
