From 48c8a51d9a47743fb027d1e123a0f013536f7fa1 Mon Sep 17 00:00:00 2001
From: xiaoski
Date: Mon, 15 Jun 2026 23:57:08 +0800
Subject: [PATCH] Update README.md and add runtime architecture and message flow diagrams

- Translated and updated the README.md to provide a clearer overview of PicoBot's functionality and architecture in Chinese.
- Added a new SVG diagram for message flow to illustrate the process from user input through various components back to the channel.
- Created a new SVG diagram for runtime architecture to depict the high-level structure of PicoBot, including channels, gateway, message bus, session manager, agent loop, tools, providers, storage, scheduler, skills, and MCP.
---
 README.md                            | 397 ++++++++++++++-------------
 docs/assets/message-flow.svg         | 101 +++++++
 docs/assets/runtime-architecture.svg |  88 ++++++
 3 files changed, 398 insertions(+), 188 deletions(-)
 create mode 100644 docs/assets/message-flow.svg
 create mode 100644 docs/assets/runtime-architecture.svg

diff --git a/README.md b/README.md
index a642fdb..d76c4f0 100644
--- a/README.md
+++ b/README.md
@@ -1,114 +1,45 @@
 # PicoBot
 
-PicoBot is a Rust-based personal AI assistant runtime. It runs a local gateway, connects chat channels such as the terminal TUI and Feishu/Lark, persists sessions in SQLite, and gives the agent a tool system for files, shell commands, web access, memory, scheduling, skills, MCP tools, and delegated sub-agents.
+PicoBot 是一个用 Rust 编写的个人 AI 助手运行时。它在本地启动 Gateway,接入 CLI TUI、飞书/Lark 等聊天渠道,把会话、消息、记忆和定时任务持久化到 SQLite,并为 Agent 提供文件、Shell、HTTP、Web、MCP、Skill、记忆、浏览器和子 Agent 等工具能力。
 
-## What It Does
+它更像一个可扩展的“个人助手操作系统”:渠道负责收发消息,SessionManager 负责会话和上下文,AgentLoop 负责模型与工具循环,Storage 负责可靠落盘。
 
-- Runs as a gateway server on `127.0.0.1:19876` by default.
-- Provides a Ratatui terminal client over WebSocket.
-- Supports Feishu/Lark messages, reactions, file upload/download, and media references.
-- Calls OpenAI-compatible providers and Anthropic Messages API providers.
-- Persists conversations, messages, memories, scheduled jobs, LLM call metadata, and background sub-agent tasks in SQLite.
-- Loads skills from workspace, user, and shared skill directories, with built-in skills installed on first use.
-- Compresses long contexts and stores timeline summaries for later recall.
-- Can register tools discovered from configured MCP servers.
+![PicoBot runtime architecture](docs/assets/runtime-architecture.svg)
 
-## Architecture
+## 适合做什么
 
-```text
-Channel -> MessageBus -> SessionManager -> AgentLoop -> LLM Provider
-                               |               |
-                               |               v
-                               |             Tools
-                               v
-                             SQLite
+- 在终端里和本地 AI 助手持续对话。
+- 将同一套 Agent 能力接入飞书/Lark。
+- 让 Agent 使用本地文件、Shell、搜索、HTTP、浏览器、MCP 工具完成任务。
+- 把长期偏好、事实和历史摘要存成可检索记忆。
+- 用 Cron 定时执行任务,并把结果发回目标渠道。
+- 通过 Skills 为 Agent 注入项目知识和专用操作指南。
 
-Control messages -> SessionManager -> MessageBus -> OutboundDispatcher -> Channel
-```
+## 快速开始
 
-The main runtime boundary is:
+### 1. 准备环境
 
-- `channels` only receive and send external messages.
-- `bus` is an async queue, not a router.
-- `session` owns dialog lifecycle, persistence, memory recall, prompt assembly, compression, and task cancellation.
-- `agent` runs the stateless LLM/tool loop.
-- `providers` are HTTP clients for model APIs.
-- `tools` execute agent actions and return string results.
-- `storage` owns SQLite schema and CRUD.
-- `scheduler` polls due jobs and feeds prompts back into sessions.
+需要:
 
-## Features
+- Rust toolchain,项目使用 edition 2024。
+- 一个可用的 LLM Provider API Key。
 
-### Channels
-
-- `cli_chat`: terminal TUI client connected through `/ws`.
-- `feishu`: Feishu/Lark channel with configurable allow list, media directory, and reaction emoji.
-
-### LLM Providers
-
-- OpenAI-compatible chat completions, including DashScope, Volcengine, and similar APIs.
-- Anthropic Messages API.
-- Model-specific `input_type` metadata for text/image capability checks.
-- JSON Schema cleanup for cross-provider tool compatibility.
-
-### Sessions And Memory
-
-- Session IDs use `::`.
-- Each channel/chat can have multiple dialogs.
-- Dialog operations include create, list, switch, rename, delete, compact, dump, info, and stop.
-- Session history is persisted to SQLite and can be incrementally restored after compression.
-- Knowledge memories are recalled into the system prompt each turn.
-- Timeline memories are produced by context compression and can be searched later.
-
-### Tools
-
-Base tools registered for the agent:
-
-| Tool | Purpose |
-|------|---------|
-| `calculator` | Math expressions and statistics |
-| `file_read` / `file_write` / `file_edit` | Workspace file operations |
-| `file_search` / `content_search` | File and content search |
-| `bash` | Run shell commands in the workspace |
-| `http_request` | HTTP API requests |
-| `web_fetch` | Fetch and extract web page text |
-| `get_skill` | List or load local skills |
-| `memory_store` / `memory_recall` / `timeline_recall` / `memory_forget` | Long-term memory operations |
-| `delegate` | Run inline, background, or parallel sub-agents |
-| `send_message` | Send outbound messages to configured channels |
-| `chat_manager` | Inspect sessions, channels, and stored messages |
-| `cron_add/list/remove/enable/disable/update` | Manage scheduled jobs when scheduler is enabled |
-| `browser` | Optional WebDriver browser automation when enabled |
-| MCP tools | Dynamically registered from configured MCP servers |
-
-### Skills
-
-Skills are directories containing `SKILL.md`. Load priority is:
-
-1. `{workspace}/skills`
-2. `~/.picobot/skills`
-3. `~/.agents/skills`
-
-Same-name skills in higher-priority locations override lower-priority ones. Built-in skills from `resources/skills` are embedded into the binary and installed into `~/.picobot/skills` if missing.
-
-## Quick Start
-
-### Prerequisites
-
-- Rust toolchain with edition 2024 support.
-- A configured LLM provider API key.
-
-### Build
+### 2. 构建项目
 
 ```bash
 cargo build
 ```
 
-### Configure
+### 3. 准备配置
 
-PicoBot loads `~/.picobot/config.json` first, then falls back to `./config.json`. On gateway startup, a template is released to `~/.picobot/config.example.json` if it does not exist. The source template is [resources/templates/config.example.json](/home/xiaoxixi/code/PicoBot/resources/templates/config.example.json).
+PicoBot 按以下顺序加载配置:
 
-Minimal example:
+1. `~/.picobot/config.json`
+2. 当前目录 `./config.json`
+
+Gateway 首次启动时会把模板释放到 `~/.picobot/config.example.json`。模板源文件在 [resources/templates/config.example.json](resources/templates/config.example.json)。
+
+最小配置示例:
 
 ```json
 {
@@ -140,152 +71,242 @@ Minimal example:
 }
 ```
 
-The `.env` file in the current directory is parsed by PicoBot itself. Values like `` in JSON are replaced from the process environment after `.env` is loaded.
+`.env` 会由 PicoBot 自己解析。配置里的 `` 这类占位符会在 `.env` 和系统环境变量加载后替换。
 
-### Run
+### 4. 启动 Gateway
 
 ```bash
 cargo run -- gateway
 ```
 
-The gateway switches the process working directory to `workspace_dir` and stores `picobot.db` there by default.
+默认监听 `127.0.0.1:19876`。Gateway 启动后会把进程工作目录切到 `workspace_dir`,默认 SQLite 数据库也会写到该 workspace 下的 `picobot.db`。
 
-In another terminal:
+### 5. 启动 CLI 客户端
+
+另开一个终端:
 
 ```bash
 cargo run -- chat
 ```
 
-The client connects to `ws://127.0.0.1:19876/ws` by default. Override with `--gateway-url`.
+CLI 默认连接 `ws://127.0.0.1:19876/ws`。如需指定地址,可使用 `--gateway-url`。
 
-## Configuration
+## 运行时数据流
 
-Top-level config fields:
+用户消息进入 PicoBot 后,会被转换为统一的 inbound message,经由 MessageBus 交给 SessionManager。SessionManager 选择当前 dialog、组装上下文、调用 AgentLoop;AgentLoop 调用模型和工具,最终响应通过 outbound bus 回到原渠道。
 
-| Field | Purpose |
-|-------|---------|
-| `providers` | Named LLM provider configs |
-| `models` | Named model configs |
-| `agents` | Agent-to-provider/model binding |
-| `gateway` | Bind address, session DB path, cleanup, scheduler, background task limits |
-| `client` | Default WebSocket URL for the TUI client |
-| `channels` | Channel configs, currently Feishu/Lark |
-| `memory` | Recall and consolidation settings |
-| `mcp` | MCP server configs |
-| `browser` | Optional WebDriver browser tool config |
-| `workspace_dir` | Workspace used for file tools, shell commands, DB default, and workspace skills |
+![Message flow](docs/assets/message-flow.svg)
 
-Important defaults:
+核心边界:
 
-| Key | Default |
-|-----|---------|
+| 模块 | 职责 |
+|------|------|
+| `channels` | 接入外部渠道,只做收发,不直接处理会话或 LLM |
+| `bus` | 异步消息队列,承载 inbound、outbound、control 三类消息 |
+| `session` | 管理会话生命周期、dialog 操作、上下文、记忆召回、压缩和持久化 |
+| `agent` | 执行无状态 LLM/tool 循环,处理模型响应和工具调用 |
+| `providers` | OpenAI 兼容接口和 Anthropic Messages API 客户端 |
+| `tools` | Agent 可调用工具集合 |
+| `storage` | SQLite schema、CRUD、消息和任务持久化 |
+| `scheduler` | 轮询 Cron 任务并把任务 prompt 送入目标会话 |
+| `skills` | 加载 Skill,并把 Skill 指南注入系统提示 |
+| `mcp` | 连接 MCP Server,将远端工具包装成普通 Tool |
+
+## 核心能力
+
+### 渠道
+
+| 渠道 | 说明 |
+|------|------|
+| `cli_chat` | Ratatui 终端客户端,通过 WebSocket 连接 Gateway |
+| `feishu` | 飞书/Lark 消息、反应、文件上传下载和媒体引用 |
+
+### 会话
+
+Session ID 使用三段式:
+
+```text
+::
+```
+
+同一个 `channel:chat_id` 下可以有多个 dialog。当前支持的 dialog 操作包括创建、列表、切换、重命名、归档、删除、清空历史、压缩、导出、查看信息和停止任务。
+
+常用 slash commands:
+
+| 命令 | 说明 |
+|------|------|
+| `/new` | 创建新 dialog |
+| `/sessions` | 列出最近 dialog |
+| `/switch ` | 切换 dialog |
+| `/rename ` | 重命名当前 dialog |
+| `/delete` | 删除当前 dialog 并创建新 dialog |
+| `/compact` | 手动压缩上下文 |
+| `/info` | 查看当前 dialog 信息 |
+| `/dump` | 导出当前 dialog 为 Markdown |
+| `/mcp` | 查看 MCP 服务器和工具状态 |
+| `/stop` | 停止当前任务并清空队列 |
+| `/?`, `/help` | 查看帮助 |
+
+### 记忆
+
+PicoBot 有两类记忆:
+
+| 类型 | 用途 | 生命周期 |
+|------|------|----------|
+| Knowledge | 偏好、事实、项目规则、长期可复用信息 | 长期保留,手动删除 |
+| Timeline | 长对话压缩后的历史摘要 | 默认保留 90 天 |
+
+每轮处理用户消息时,MemoryManager 会按用户输入召回最多 `memory.recall_limit` 条 Knowledge,并注入系统提示。上下文压缩产生的摘要会保存为 Timeline,后续可通过 `timeline_recall` 工具检索。
+
+### 工具
+
+基础工具集:
+
+| 工具 | 说明 |
+|------|------|
+| `calculator` | 数学表达式和统计计算 |
+| `file_read` / `file_write` / `file_edit` | 文件读写和编辑 |
+| `file_search` / `content_search` | 文件名和内容搜索 |
+| `bash` | 在 workspace 中执行 Shell 命令 |
+| `http_request` / `web_fetch` | HTTP 请求和网页文本抽取 |
+| `get_skill` | 列出或读取本地 Skill |
+| `memory_store` / `memory_recall` / `timeline_recall` / `memory_forget` | 长期记忆操作 |
+| `delegate` | 启动 inline、background 或 parallel 子 Agent |
+| `send_message` | 向指定渠道发送消息 |
+| `chat_manager` | 查看渠道、会话和历史消息 |
+| `cron_add/list/remove/enable/disable/update` | 管理定时任务 |
+| `browser` | 可选 WebDriver 浏览器自动化 |
+| MCP tools | 从配置的 MCP Server 动态发现并注册 |
+
+### Skills
+
+Skill 是包含 `SKILL.md` 的目录。加载优先级从高到低:
+
+1. `{workspace}/skills`
+2. `~/.picobot/skills`
+3. `~/.agents/skills`
+
+同名 Skill 会按高优先级覆盖低优先级。内置 Skill 位于 [resources/skills](resources/skills),首次运行时会安装到 `~/.picobot/skills`。
+
+## 配置速查
+
+顶层配置字段:
+
+| 字段 | 说明 |
+|------|------|
+| `providers` | LLM Provider 配置 |
+| `models` | 模型参数与输入能力 |
+| `agents` | Agent 使用哪个 provider/model |
+| `gateway` | HTTP/WebSocket、数据库、调度器、后台任务限制 |
+| `client` | CLI 客户端默认 Gateway URL |
+| `channels` | 渠道配置,目前主要是飞书/Lark |
+| `memory` | 记忆召回、归并和 Timeline 保留策略 |
+| `mcp` | MCP Server 配置 |
+| `browser` | 可选浏览器自动化配置 |
+| `workspace_dir` | 文件工具、Shell、数据库和 workspace skills 的工作目录 |
+
+重要默认值:
+
+| 配置 | 默认值 |
+|------|--------|
 | `gateway.host` | `127.0.0.1` |
 | `gateway.port` | `19876` |
 | `gateway.max_concurrent_background_tasks` | `10` |
-| `gateway.scheduler.enabled` | `true` if `scheduler` is omitted and defaulted |
+| `gateway.scheduler.enabled` | `true` |
 | `client.gateway_url` | `ws://127.0.0.1:19876/ws` |
 | `memory.recall_limit` | `5` |
 | `memory.timeline_retention_days` | `90` |
 | `mcp.tool_timeout_secs` | `180` |
 | `browser.enabled` | `false` |
 
-MCP servers support `stdio`, `sse`, and `streamable-http` transports. Browser automation requires a compatible Chrome/Chromium and chromedriver/WebDriver endpoint.
-
-## Slash Commands
-
-Available from CLI chat and channel text messages:
-
-| Command | Description |
-|---------|-------------|
-| `/new` | Create a new dialog |
-| `/sessions` | List recent dialogs |
-| `/switch <dialog_id>` | Switch dialog |
-| `/rename <title>` | Rename current dialog |
-| `/delete` | Delete current dialog |
-| `/compact` | Manually trigger context compression |
-| `/info` | Show current dialog information |
-| `/dump` | Save current dialog as Markdown |
-| `/?`, `/help` | Show help |
-| `/mcp` | Show MCP server and tool status |
-| `/stop` | Stop active tasks and clear queued messages |
+更完整的配置字段说明见 [resources/skills/about-picobot/references/config.md](resources/skills/about-picobot/references/config.md)。
 
 ## WebSocket API
 
-The gateway exposes:
+Gateway 暴露:
 
-| Method | Path | Description |
-|--------|------|-------------|
-| `GET` | `/health` | Returns service health and version |
-| `GET` | `/ws` | WebSocket upgrade for chat clients |
+| Method | Path | 说明 |
+|--------|------|------|
+| `GET` | `/health` | 健康检查和版本信息 |
+| `GET` | `/ws` | WebSocket 聊天协议 |
 
-Inbound WebSocket message types:
+Inbound 消息类型:
 
-| Type | Main fields |
-|------|-------------|
-| `user_input` | `content`, optional `channel`, `chat_id`, `sender_id` |
-| `clear_history` | optional `chat_id`, `session_id` |
-| `create_session` | optional `title` |
+| Type | 主要字段 |
+|------|----------|
+| `user_input` | `content`,可选 `channel`、`chat_id`、`sender_id` |
+| `clear_history` | 可选 `chat_id`、`session_id` |
+| `create_session` | 可选 `title` |
 | `list_sessions` | `include_archived` |
 | `load_session` | `session_id` |
-| `rename_session` | optional `session_id`, `title` |
-| `archive_session` | optional `session_id` |
-| `delete_session` | optional `session_id` |
-| `get_slash_commands` | none |
-| `ping` | none |
+| `rename_session` | 可选 `session_id`,`title` |
+| `archive_session` | 可选 `session_id` |
+| `delete_session` | 可选 `session_id` |
+| `get_slash_commands` | 无 |
+| `ping` | 无 |
 
-Outbound WebSocket message types include `assistant_response`, `error`, `session_established`, `session_created`, `session_list`, `session_loaded`, `session_renamed`, `session_archived`, `session_deleted`, `history_cleared`, `slash_commands_list`, `pong`, `command_executed`, and `system_notification`.
+Outbound 消息类型包括 `assistant_response`、`error`、`session_established`、`session_created`、`session_list`、`session_loaded`、`session_renamed`、`session_archived`、`session_deleted`、`history_cleared`、`slash_commands_list`、`pong`、`command_executed` 和 `system_notification`。
 
-## Testing
+## 测试
 
 ```bash
-# Unit tests
+# 单元测试
 cargo test --lib
 
-# Integration tests require real API keys in tests/test.env
+# 集成测试需要 tests/test.env 中有真实 API key
 cp tests/test.env.example tests/test.env
 cargo test --test test_integration -- --ignored
 cargo test --test test_tool_calling -- --ignored
 cargo test --test test_request_format -- --ignored
 ```
 
-Integration tests are ignored by default because they make real provider calls.
+集成测试默认 `#[ignore]`,因为它们会真实调用模型 API。
 
-## Project Layout
+## 项目结构
 
 ```text
 src/
-  agent/          LLM loop, context compression, system prompts, media handling, sub-agents
-  bus/            Inbound, outbound, and control message queues
-  channels/       CLI chat and Feishu/Lark integrations
-  client/         Ratatui terminal UI
-  config/         Config loading, env substitution, path expansion
-  gateway/        Axum HTTP/WebSocket server and GatewayState wiring
-  mcp/            MCP client connections and tool wrappers
-  memory/         Memory manager and memory types
-  observability/  Agent/tool telemetry observer interfaces
-  providers/      OpenAI-compatible and Anthropic clients
-  scheduler/      Scheduled job runtime
-  session/        Session lifecycle, dialog commands, persistence integration
-  skills/         Skill loading and embedded built-in skill installation
-  storage/        SQLite schema and CRUD
-  tools/          Agent tool implementations
+  agent/          LLM loop、上下文压缩、系统提示、媒体处理、子 Agent
+  bus/            inbound、outbound、control 消息队列
+  channels/       CLI chat 和飞书/Lark 集成
+  client/         Ratatui 终端 UI
+  config/         配置加载、环境变量替换、路径展开
+  gateway/        Axum HTTP/WebSocket server 和 GatewayState 装配
+  mcp/            MCP 客户端连接和工具包装
+  memory/         记忆管理和记忆类型
+  observability/  Agent/tool telemetry observer
+  providers/      OpenAI 兼容和 Anthropic provider
+  scheduler/      定时任务运行时
+  session/        会话生命周期、dialog 命令、持久化集成
+  skills/         Skill 加载和内置 Skill 安装
+  storage/        SQLite schema 和 CRUD
+  tools/          Agent 工具实现
 resources/
-  skills/         Built-in skills embedded at build time
-  templates/      Config, AGENTS.md, and USER.md templates released on first run
-tests/            Unit and ignored integration tests
-reference/        Third-party reference code; do not modify as project source
+  skills/         构建时嵌入的内置 Skills
+  templates/      首次运行释放的配置和用户模板
+tests/            单元测试和 ignored 集成测试
+docs/             分析报告、文档插图和补充资料
 ```
 
-## Key Dependencies
+## 关键依赖
 
-| Crate | Purpose |
-|-------|---------|
-| `axum`, `tokio`, `tokio-tungstenite` | Gateway and WebSocket runtime |
-| `sqlx` | SQLite persistence |
-| `reqwest` | LLM and HTTP clients |
-| `ratatui`, `crossterm`, `termimad` | Terminal UI |
-| `rmcp` | MCP client support |
-| `fantoccini` | Optional browser automation |
-| `cron`, `chrono-tz` | Scheduling |
-| `jieba-rs` | Chinese tokenization for memory search |
-| `zstd`, `tar` | Embedded built-in skill packaging |
+| Crate | 用途 |
+|-------|------|
+| `axum`, `tokio`, `tokio-tungstenite` | Gateway 和 WebSocket runtime |
+| `sqlx` | SQLite 持久化 |
+| `reqwest` | LLM 和 HTTP 客户端 |
+| `ratatui`, `crossterm`, `termimad` | 终端 UI |
+| `rmcp` | MCP 客户端 |
+| `fantoccini` | 可选浏览器自动化 |
+| `cron`, `chrono-tz` | 定时任务 |
+| `jieba-rs` | 中文记忆检索分词 |
+| `zstd`, `tar` | 内置 Skill 打包和释放 |
+
+## 进一步阅读
+
+- [架构机制](resources/skills/about-picobot/references/architecture.md)
+- [配置说明](resources/skills/about-picobot/references/config.md)
+- [命令说明](resources/skills/about-picobot/references/commands.md)
+- [工具说明](resources/skills/about-picobot/references/tools.md)
+- [数据库结构](resources/skills/about-picobot/references/db-schema.md)
+- [代码质量分析](docs/CODE_QUALITY_ANALYSIS.md)
diff --git a/docs/assets/message-flow.svg b/docs/assets/message-flow.svg
new file mode 100644
index 0000000..5400292
--- /dev/null
+++ b/docs/assets/message-flow.svg
@@ -0,0 +1,101 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="1120" height="460" viewBox="0 0 1120 460" role="img" aria-labelledby="title desc">
+  <title id="title">PicoBot message flow</title>
+  <desc id="desc">Message flow from channel input through bus, session manager, agent loop, tools, provider, storage, outbound dispatcher, and back to the channel.</desc>
[SVG drawing markup not preserved; surviving text labels follow. Heading: "Message Flow". Subheading: "A user message becomes a session-scoped agent run, then returns to the original channel." Nodes: 1 Channel (CLI / Feishu); 2 MessageBus (inbound queue); 3 SessionManager (dialog + context); 4 AgentLoop (LLM/tool loop); 5 Tools (side effects); 6 Response (outbound bus). Annotations: "recall memory + build prompt"; "persist messages and metadata"; "provider calls". Footer: "OutboundDispatcher routes the reply back to the same channel/chat scope."]
+</svg>
diff --git a/docs/assets/runtime-architecture.svg b/docs/assets/runtime-architecture.svg
new file mode 100644
index 0000000..8efb5dc
--- /dev/null
+++ b/docs/assets/runtime-architecture.svg
@@ -0,0 +1,88 @@
[SVG drawing markup not preserved; surviving text labels follow. Title: "PicoBot runtime architecture". Description: "High level architecture diagram showing channels, gateway, message bus, session manager, agent loop, tools, providers, storage, scheduler, skills, and MCP." Heading: "PicoBot Runtime Architecture". Subheading: "Channels stay thin, SessionManager owns conversation state, AgentLoop remains stateless." Group "Channels": CLI TUI, Feishu/Lark, WebSocket; note "Only receive and send". Group "Gateway Core": MessageBus, SessionManager, AgentLoop, Outbound Dispatcher, Control Channel. Group "Capabilities": Tools, Providers, SQLite, Skills; note "MCP tools join ToolRegistry". Other boxes: Scheduler, Memory, MCP.]