# PicoBot

PicoBot is a Rust-based personal AI assistant runtime. It runs a local gateway, connects chat channels such as the terminal TUI and Feishu/Lark, persists sessions in SQLite, and gives the agent a tool system for files, shell commands, web access, memory, scheduling, skills, MCP tools, and delegated sub-agents.

## What It Does

- Runs as a gateway server on `127.0.0.1:19876` by default.
- Provides a Ratatui terminal client over WebSocket.
- Supports Feishu/Lark messages, reactions, file upload/download, and media references.
- Calls OpenAI-compatible providers and Anthropic Messages API providers.
- Persists conversations, messages, memories, scheduled jobs, LLM call metadata, and background sub-agent tasks in SQLite.
- Loads skills from workspace, user, and shared skill directories, with built-in skills installed on first use.
- Compresses long contexts and stores timeline summaries for later recall.
- Can register tools discovered from configured MCP servers.

## Architecture

```text
Channel -> MessageBus -> SessionManager -> AgentLoop -> LLM Provider
                               |               |
                               |               v
                               |             Tools
                               v
                            SQLite

Control messages -> SessionManager -> MessageBus -> OutboundDispatcher -> Channel
```

The main runtime boundary is:

- `channels` only receive and send external messages.
- `bus` is an async queue, not a router.
- `session` owns dialog lifecycle, persistence, memory recall, prompt assembly, compression, and task cancellation.
- `agent` runs the stateless LLM/tool loop.
- `providers` are HTTP clients for model APIs.
- `tools` execute agent actions and return string results.
- `storage` owns SQLite schema and CRUD.
- `scheduler` polls due jobs and feeds prompts back into sessions.

## Features

### Channels

- `cli_chat`: terminal TUI client connected through `/ws`.
- `feishu`: Feishu/Lark channel with configurable allow list, media directory, and reaction emoji.

### LLM Providers

- OpenAI-compatible chat completions, including DashScope, Volcengine, and similar APIs.
- Anthropic Messages API.
- Model-specific `input_type` metadata for text/image capability checks.
- JSON Schema cleanup for cross-provider tool compatibility.

### Sessions And Memory

- Session IDs use `::`.
- Each channel/chat can have multiple dialogs.
- Dialog operations include create, list, switch, rename, delete, compact, dump, info, and stop.
- Session history is persisted to SQLite and can be incrementally restored after compression.
- Knowledge memories are recalled into the system prompt each turn.
- Timeline memories are produced by context compression and can be searched later.

### Tools

Base tools registered for the agent:

| Tool | Purpose |
|------|---------|
| `calculator` | Math expressions and statistics |
| `file_read` / `file_write` / `file_edit` | Workspace file operations |
| `file_search` / `content_search` | File and content search |
| `bash` | Run shell commands in the workspace |
| `http_request` | HTTP API requests |
| `web_fetch` | Fetch and extract web page text |
| `get_skill` | List or load local skills |
| `memory_store` / `memory_recall` / `timeline_recall` / `memory_forget` | Long-term memory operations |
| `delegate` | Run inline, background, or parallel sub-agents |
| `send_message` | Send outbound messages to configured channels |
| `chat_manager` | Inspect sessions, channels, and stored messages |
| `cron_add/list/remove/enable/disable/update` | Manage scheduled jobs when scheduler is enabled |
| `browser` | Optional WebDriver browser automation when enabled |
| MCP tools | Dynamically registered from configured MCP servers |

### Skills

Skills are directories containing `SKILL.md`. Load priority is:

1. `{workspace}/skills`
2. `~/.picobot/skills`
3. `~/.agents/skills`

Same-name skills in higher-priority locations override lower-priority ones. Built-in skills from `resources/skills` are embedded into the binary and installed into `~/.picobot/skills` if missing.

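The override rule above can be sketched as a first-match lookup over the three roots in priority order. This is a minimal illustration, not PicoBot's actual loader: the function names `skill_roots` and `resolve_skill` are hypothetical, and the home directory is passed in explicitly rather than expanded from `~`.

```rust
use std::path::{Path, PathBuf};

/// Builds the three skill roots from highest to lowest priority.
/// (Hypothetical helper; `home` stands in for the expanded `~`.)
fn skill_roots(workspace: &Path, home: &Path) -> Vec<PathBuf> {
    vec![
        workspace.join("skills"),
        home.join(".picobot").join("skills"),
        home.join(".agents").join("skills"),
    ]
}

/// Returns the directory of the first skill called `name`.
/// Because `roots` is ordered by priority, a same-name skill in an
/// earlier root shadows any copy in a later root.
fn resolve_skill(roots: &[PathBuf], name: &str) -> Option<PathBuf> {
    roots
        .iter()
        .map(|root| root.join(name))
        // A directory only counts as a skill if it contains SKILL.md.
        .find(|dir| dir.join("SKILL.md").is_file())
}
```

With a `pdf` skill present in both the workspace and `~/.picobot/skills`, `resolve_skill` returns the workspace copy; a skill that exists only in `~/.agents/skills` is still found, because the search falls through the earlier roots.
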
## Quick Start

### Prerequisites

- Rust toolchain with edition 2024 support.
- A configured LLM provider API key.

### Build

```bash
cargo build
```

### Configure

PicoBot loads `~/.picobot/config.json` first, then falls back to `./config.json`. On gateway startup, a template is written to `~/.picobot/config.example.json` if it does not exist.

The source template is [resources/templates/config.example.json](resources/templates/config.example.json).

Minimal example:

```json
{
  "providers": {
    "openai": {
      "type": "openai",
      "base_url": "https://api.openai.com/v1",
      "api_key": "",
      "extra_headers": {}
    }
  },
  "models": {
    "gpt-4o": {
      "model_id": "gpt-4o",
      "temperature": 0.7,
      "max_tokens": 4096,
      "input_type": ["text", "image"]
    }
  },
  "agents": {
    "default": {
      "provider": "openai",
      "model": "gpt-4o",
      "max_tool_iterations": 99,
      "token_limit": 128000
    }
  },
  "workspace_dir": "~/.picobot/workspace"
}
```

The `.env` file in the current directory is parsed by PicoBot itself. Values like `` in JSON are replaced from the process environment after `.env` is loaded.

### Run

```bash
cargo run -- gateway
```

The gateway switches the process working directory to `workspace_dir` and stores `picobot.db` there by default.

In another terminal:

```bash
cargo run -- chat
```

The client connects to `ws://127.0.0.1:19876/ws` by default. Override with `--gateway-url`.

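The config lookup order described under Configure can be sketched as follows. This is an assumption-laden illustration rather than the project's real loader: `config_candidates` and `find_config` are hypothetical names, and the home and current directories are passed in so the logic stays independent of the environment.

```rust
use std::path::{Path, PathBuf};

/// Candidate config files in lookup order: the user-level file first,
/// then the file in the current directory. (Hypothetical helper.)
fn config_candidates(home: &Path, cwd: &Path) -> Vec<PathBuf> {
    vec![
        home.join(".picobot").join("config.json"),
        cwd.join("config.json"),
    ]
}

/// Returns the first candidate that exists as a regular file,
/// or `None` when neither location has a config.
fn find_config(home: &Path, cwd: &Path) -> Option<PathBuf> {
    config_candidates(home, cwd)
        .into_iter()
        .find(|path| path.is_file())
}
```

When both files exist, the one under `~/.picobot` wins, which matches the "first, then falls back" wording above.
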
## Configuration

Top-level config fields:

| Field | Purpose |
|-------|---------|
| `providers` | Named LLM provider configs |
| `models` | Named model configs |
| `agents` | Agent-to-provider/model binding |
| `gateway` | Bind address, session DB path, cleanup, scheduler, background task limits |
| `client` | Default WebSocket URL for the TUI client |
| `channels` | Channel configs, currently Feishu/Lark |
| `memory` | Recall and consolidation settings |
| `mcp` | MCP server configs |
| `browser` | Optional WebDriver browser tool config |
| `workspace_dir` | Workspace used for file tools, shell commands, DB default, and workspace skills |

Important defaults:

| Key | Default |
|-----|---------|
| `gateway.host` | `127.0.0.1` |
| `gateway.port` | `19876` |
| `gateway.max_concurrent_background_tasks` | `10` |
| `gateway.scheduler.enabled` | `true` if `scheduler` is omitted and defaulted |
| `client.gateway_url` | `ws://127.0.0.1:19876/ws` |
| `memory.recall_limit` | `5` |
| `memory.timeline_retention_days` | `90` |
| `mcp.tool_timeout_secs` | `180` |
| `browser.enabled` | `false` |

MCP servers support `stdio`, `sse`, and `streamable-http` transports. Browser automation requires a compatible Chrome/Chromium and chromedriver/WebDriver endpoint.

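As a rough illustration of how the gateway defaults in the table relate to the client URL, the sketch below models three of them with a `Default` implementation. The struct `GatewayDefaults` and its `ws_url` method are hypothetical and are not taken from PicoBot's `config` module; only the default values themselves come from the table.

```rust
/// Hypothetical struct mirroring a few of the documented gateway defaults.
#[derive(Debug, Clone, PartialEq)]
struct GatewayDefaults {
    host: String,
    port: u16,
    max_concurrent_background_tasks: usize,
}

impl Default for GatewayDefaults {
    fn default() -> Self {
        GatewayDefaults {
            host: "127.0.0.1".to_string(),
            port: 19876,
            max_concurrent_background_tasks: 10,
        }
    }
}

impl GatewayDefaults {
    /// WebSocket URL a local client would use to reach this gateway.
    fn ws_url(&self) -> String {
        format!("ws://{}:{}/ws", self.host, self.port)
    }
}
```

With the default host and port, `ws_url()` yields the same address as the documented `client.gateway_url` default. If `gateway.port` is changed, the client URL presumably needs to be updated to match, either in `client.gateway_url` or with `--gateway-url`.
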
## Slash Commands

Available from CLI chat and channel text messages:

| Command | Description |
|---------|-------------|
| `/new` | Create a new dialog |
| `/sessions` | List recent dialogs |
| `/switch ` | Switch dialog |
| `/rename ` | Rename current dialog |
| `/delete` | Delete current dialog |
| `/compact` | Manually trigger context compression |
| `/info` | Show current dialog information |
| `/dump` | Save current dialog as Markdown |
| `/?`, `/help` | Show help |
| `/mcp` | Show MCP server and tool status |
| `/stop` | Stop active tasks and clear queued messages |

## WebSocket API

The gateway exposes:

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/health` | Returns service health and version |
| `GET` | `/ws` | WebSocket upgrade for chat clients |

Inbound WebSocket message types:

| Type | Main fields |
|------|-------------|
| `user_input` | `content`, optional `channel`, `chat_id`, `sender_id` |
| `clear_history` | optional `chat_id`, `session_id` |
| `create_session` | optional `title` |
| `list_sessions` | `include_archived` |
| `load_session` | `session_id` |
| `rename_session` | optional `session_id`, `title` |
| `archive_session` | optional `session_id` |
| `delete_session` | optional `session_id` |
| `get_slash_commands` | none |
| `ping` | none |

Outbound WebSocket message types include `assistant_response`, `error`, `session_established`, `session_created`, `session_list`, `session_loaded`, `session_renamed`, `session_archived`, `session_deleted`, `history_cleared`, `slash_commands_list`, `pong`, `command_executed`, and `system_notification`.

## Testing

```bash
# Unit tests
cargo test --lib

# Integration tests require real API keys in tests/test.env
cp tests/test.env.example tests/test.env
cargo test --test test_integration -- --ignored
cargo test --test test_tool_calling -- --ignored
cargo test --test test_request_format -- --ignored
```

Integration tests are ignored by default because they make real provider calls.

## Project Layout

```text
src/
  agent/          LLM loop, context compression, system prompts, media handling, sub-agents
  bus/            Inbound, outbound, and control message queues
  channels/       CLI chat and Feishu/Lark integrations
  client/         Ratatui terminal UI
  config/         Config loading, env substitution, path expansion
  gateway/        Axum HTTP/WebSocket server and GatewayState wiring
  mcp/            MCP client connections and tool wrappers
  memory/         Memory manager and memory types
  observability/  Agent/tool telemetry observer interfaces
  providers/      OpenAI-compatible and Anthropic clients
  scheduler/      Scheduled job runtime
  session/        Session lifecycle, dialog commands, persistence integration
  skills/         Skill loading and embedded built-in skill installation
  storage/        SQLite schema and CRUD
  tools/          Agent tool implementations
resources/
  skills/         Built-in skills embedded at build time
  templates/      Config, AGENTS.md, and USER.md templates released on first run
tests/            Unit and ignored integration tests
reference/        Third-party reference code; do not modify as project source
```

## Key Dependencies

| Crate | Purpose |
|-------|---------|
| `axum`, `tokio`, `tokio-tungstenite` | Gateway and WebSocket runtime |
| `sqlx` | SQLite persistence |
| `reqwest` | LLM and HTTP clients |
| `ratatui`, `crossterm`, `termimad` | Terminal UI |
| `rmcp` | MCP client support |
| `fantoccini` | Optional browser automation |
| `cron`, `chrono-tz` | Scheduling |
| `jieba-rs` | Chinese tokenization for memory search |
| `zstd`, `tar` | Embedded built-in skill packaging |