# PicoBot

A multi-channel AI agent framework with a WebSocket gateway and TUI client, supporting OpenAI-compatible and Anthropic LLM providers, tool calling, session persistence, and cron-based scheduling.

## System Architecture

```mermaid
graph TB
    subgraph Clients
        TUI["🖥️ CLI Chat (TUI)"]
        FS["📱 Feishu/Lark"]
    end

    subgraph Gateway["Gateway Server (127.0.0.1:19876)"]
        HTTP["HTTP Endpoints<br/>GET /health<br/>GET /ws (WebSocket upgrade)"]
        WS["WebSocket Handler"]
        CD["ChannelManager"]
        SP["SessionManager"]
        AL["AgentLoop"]
    end

    subgraph Bus["MessageBus"]
        IB["Inbound Channel"]
        OB["Outbound Channel"]
        CC["Control Channel"]
    end

    subgraph Storage
        SQLite[("SQLite<br/>.picobot_sessions.db")]
    end

    subgraph AI["AI Providers"]
        OpenAI["OpenAI / DashScope"]
        Anthropic["Anthropic Claude"]
    end

    TUI <-->|WebSocket| WS
    FS <-->|Webhook| HTTP

    CD -->|InboundMessage| IB
    IB -->|DialogEvent| SP
    CC -->|ControlMessage| SP
    SP <--> AL
    AL -->|API Call| OpenAI
    AL -->|API Call| Anthropic
    AL -->|Tool Call| Tools
    SP -->|OutboundMessage| OB
    OB --> CD
    SP --> SQLite
    Tools --> SQLite

    subgraph Tools
        Bash["Bash"]
        FileIO["File Read/Write/Edit"]
        Web["HTTP Request / Web Fetch"]
        Calc["Calculator"]
        Skill["Get Skill"]
        Msg["Send Message"]
        Cron["Cron Jobs"]
    end
```

### Core Data Flow

```mermaid
sequenceDiagram
    participant Channel as Channel<br/>(CLI/Feishu)
    participant Bus as MessageBus
    participant SM as SessionManager
    participant AL as AgentLoop
    participant LLM as LLM Provider
    participant Tool as Tools

    Channel->>Bus: InboundMessage (user input)
    Bus->>SM: DialogEvent
    SM->>SM: Load/Resolve Session
    SM->>AL: Process (session state)
    AL->>LLM: ChatCompletionRequest
    LLM-->>AL: response / tool_calls
    alt Tool Calls
        AL->>Tool: execute tool
        Tool-->>AL: result
        AL->>LLM: continue with tool result
    end
    AL-->>SM: AgentProcessResult (text + token count)
    SM->>SM: Persist to SQLite
    SM->>Bus: OutboundMessage
    Bus->>Channel: response to user
```
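
The tool-call branch of the sequence above can be read as a loop: ask the LLM, run any requested tool, feed the result back, and stop when the model answers in plain text or a cap such as `max_tool_iterations` is hit. The sketch below illustrates that control flow only. `LlmReply`, `run_turn`, and the closure parameters are hypothetical names, the transcript is a plain list of strings, and the real `AgentLoop` is presumably asynchronous and richer than this.

```rust
/// What a (hypothetical) provider returns for one request.
#[derive(Debug, Clone, PartialEq)]
enum LlmReply {
    /// Final assistant text; the turn is over.
    Text(String),
    /// The model asks for a tool to be executed.
    ToolCall { name: String, args: String },
}

/// Runs one user turn. `llm` sees the transcript so far and `run_tool`
/// executes a single tool call. Both are stand-ins for real components.
fn run_turn(
    user_input: &str,
    max_tool_iterations: usize,
    mut llm: impl FnMut(&[String]) -> LlmReply,
    mut run_tool: impl FnMut(&str, &str) -> String,
) -> Result<String, String> {
    let mut transcript = vec![format!("user: {user_input}")];
    let mut tool_calls = 0;
    loop {
        match llm(&transcript) {
            LlmReply::Text(text) => return Ok(text),
            LlmReply::ToolCall { name, args } => {
                // Refuse to run more tools than the configured cap.
                if tool_calls == max_tool_iterations {
                    return Err(format!("stopped after {max_tool_iterations} tool calls"));
                }
                tool_calls += 1;
                let result = run_tool(&name, &args);
                // Feed the tool output back so the next LLM call can use it.
                transcript.push(format!("tool {name}({args}) => {result}"));
            }
        }
    }
}
```

Passing the LLM and the tool runner as closures keeps the loop itself free of state, which matches the "stateless" note on `agent_loop.rs` in the project structure, although how the real code achieves that is not shown in this README.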

## Features

### Multi-Channel Support
- **CLI Chat Client**: Full TUI with session management, Markdown rendering, slash commands
- **Feishu (Lark)**: Webhook-based integration with typing indicators and media support

### Multi-Provider LLM
- OpenAI-compatible API (GPT-4, DashScope, Volcengine, etc.)
- Anthropic Messages API (Claude)
- Cross-provider JSON Schema normalization for tool calling compatibility

### Session Management
- Multi-session conversations per channel/chat
- Create, switch, rename, archive, delete dialogs via slash commands or WebSocket
- SQLite-persisted session history with automatic TTL-based cleanup
- Context compression for long conversations approaching token limits
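
As a rough illustration of the context-compression bullet, the sketch below estimates tokens and folds older messages into a summary once the history nears the limit. Everything here is an assumption made for illustration: the four-characters-per-token heuristic, the 80% threshold, and the names `estimate_tokens` and `compress` are not taken from PicoBot's `context_compressor.rs`.

```rust
/// Rough token estimate: about four characters per token.
/// This is a common heuristic, not an exact tokenizer.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Replaces all but the `keep_recent` newest messages with one summary
/// when the estimated size passes 80% of `token_limit` (assumed threshold).
/// Returns `true` if the history was compressed.
fn compress(
    messages: &mut Vec<String>,
    token_limit: usize,
    keep_recent: usize,
    summarize: impl Fn(&[String]) -> String,
) -> bool {
    let total: usize = messages.iter().map(|m| estimate_tokens(m)).sum();
    if total * 10 < token_limit * 8 || messages.len() <= keep_recent {
        return false;
    }
    let split = messages.len() - keep_recent;
    // Keep the newest messages verbatim; summarize everything before them.
    let recent = messages.split_off(split);
    let summary = summarize(messages);
    *messages = vec![summary];
    messages.extend(recent);
    true
}
```

In a real agent the `summarize` callback would typically be another LLM call; here it is a plain function so the trigger logic can be tested on its own.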

### Tool System

| Tool | Description |
|------|-------------|
| `bash` | Execute shell commands in workspace |
| `file_read` | Read file contents |
| `file_write` | Create/overwrite files |
| `file_edit` | Precise string substitution in files |
| `http_request` | Make HTTP API requests |
| `web_fetch` | Fetch and parse web pages |
| `calculator` | Evaluate mathematical expressions |
| `get_skill` | Load agent skills from local skill files |
| `send_message` | Send messages to other channels |
| `cron_add/list/remove/enable/disable/update` | Manage scheduled jobs |
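
A name-to-tool table like the one above is commonly backed by a trait plus a registry keyed by tool name. The following is a simplified, synchronous sketch under that assumption. The method set of `Tool`, the `ToolRegistry` API, and the toy `WordCount` tool are hypothetical; PicoBot's actual `Tool` trait and `ToolRegistry` may take JSON arguments and run asynchronously.

```rust
use std::collections::HashMap;

/// Minimal tool interface (hypothetical shape).
trait Tool {
    fn name(&self) -> &str;
    fn description(&self) -> &str;
    fn execute(&self, args: &str) -> Result<String, String>;
}

/// Toy tool used only for this example.
struct WordCount;

impl Tool for WordCount {
    fn name(&self) -> &str {
        "word_count"
    }
    fn description(&self) -> &str {
        "Count whitespace-separated words"
    }
    fn execute(&self, args: &str) -> Result<String, String> {
        Ok(args.split_whitespace().count().to_string())
    }
}

/// Looks tools up by the name the LLM puts in its tool call.
#[derive(Default)]
struct ToolRegistry {
    tools: HashMap<String, Box<dyn Tool>>,
}

impl ToolRegistry {
    fn register(&mut self, tool: Box<dyn Tool>) {
        self.tools.insert(tool.name().to_string(), tool);
    }

    fn call(&self, name: &str, args: &str) -> Result<String, String> {
        self.tools
            .get(name)
            .ok_or_else(|| format!("unknown tool: {name}"))?
            .execute(args)
    }
}
```

Returning an error string for an unknown tool, rather than panicking, lets the agent report the failure back to the model as a tool result.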

### Scheduling
- Cron-based recurring jobs with optional timezone support
- One-shot (`at`) and interval (`every`) schedules
- Jobs trigger agent processing via specified channel/chat
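
The three schedule kinds listed above differ mainly in how the next run time is computed. The sketch below works through the `at` and `every` cases with plain Unix timestamps in seconds. The variant payloads, the `next_run` signature, and the catch-up behaviour for missed intervals are assumptions for illustration; cron expressions are left unevaluated because they need a cron parser.

```rust
/// Schedule kinds mirroring the list above (payload types are assumed).
#[derive(Debug, Clone, PartialEq)]
enum Schedule {
    /// One-shot run at a Unix timestamp (seconds).
    At(u64),
    /// Recurring run every N seconds.
    Every(u64),
    /// Cron expression; evaluating it requires a cron parser.
    Cron(String),
}

/// Returns the next run time, or `None` if the job should not run again
/// (or, for `Cron`, if this sketch cannot compute it).
fn next_run(schedule: &Schedule, last_run: Option<u64>, now: u64) -> Option<u64> {
    match schedule {
        // A one-shot job fires once and is then finished.
        Schedule::At(ts) => match last_run {
            None => Some(*ts),
            Some(_) => None,
        },
        // A zero-length interval would fire endlessly, so reject it.
        Schedule::Every(0) => None,
        // Count from the previous run, but never schedule in the past.
        Schedule::Every(secs) => Some((last_run.unwrap_or(now) + secs).max(now)),
        Schedule::Cron(_) => None,
    }
}
```

Clamping to `now` means a job that missed several intervals (for example while the gateway was down) runs once on restart instead of replaying every missed slot. That is one possible policy, not necessarily the one PicoBot uses.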

### Skills System
- Load Markdown skill files from `~/.picobot/skills` and `~/.agent/skills`
- Skills inject specialized system prompts for specific tasks
- Automatic hot-reload on file changes

### Observability
- Observer pattern for agent and tool telemetry
- Events: `AgentStart`, `AgentEnd`, `ToolCallStart`, `ToolCall`
- Structured JSON logging with file rotation
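
The observer pattern mentioned above can be sketched as one trait, one event enum, and a fan-out observer that forwards each event to several sinks. The event names come from the list above, but their fields, the `record` method, and the `Collector` sink are hypothetical and exist only to make the example testable.

```rust
use std::sync::{Arc, Mutex};

/// Telemetry events; the field sets are assumptions.
#[derive(Debug, Clone, PartialEq)]
enum ObserverEvent {
    AgentStart { session_id: String },
    AgentEnd { session_id: String, tokens_used: u64 },
    ToolCallStart { tool: String },
    ToolCall { tool: String, success: bool },
}

/// Anything that wants to receive telemetry implements this.
trait Observer: Send + Sync {
    fn record(&self, event: &ObserverEvent);
}

/// Forwards every event to all registered observers.
#[derive(Default)]
struct MultiObserver {
    observers: Vec<Arc<dyn Observer>>,
}

impl MultiObserver {
    fn add(&mut self, observer: Arc<dyn Observer>) {
        self.observers.push(observer);
    }
}

impl Observer for MultiObserver {
    fn record(&self, event: &ObserverEvent) {
        for observer in &self.observers {
            observer.record(event);
        }
    }
}

/// Example sink that stores events in memory (useful in tests).
#[derive(Default)]
struct Collector {
    events: Mutex<Vec<ObserverEvent>>,
}

impl Observer for Collector {
    fn record(&self, event: &ObserverEvent) {
        self.events.lock().unwrap().push(event.clone());
    }
}
```

Because `MultiObserver` itself implements `Observer`, callers hold a single observer and do not need to know how many sinks (log file, metrics, test collector) sit behind it.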

## Quick Start

### Prerequisites
- Rust nightly (edition 2024); use `rustup` to install

### Build

```bash
cargo build
```

### Configure

1. Create `config.json` (or `~/.picobot/config.json`):

```json
{
  "providers": {
    "openai": {
      "type": "openai",
      "base_url": "https://api.openai.com/v1",
      "api_key": "<OPENAI_API_KEY>"
    }
  },
  "models": {
    "gpt-4o": {
      "model_id": "gpt-4o",
      "temperature": 0.7,
      "max_tokens": 4096
    }
  },
  "agents": {
    "default": {
      "provider": "openai",
      "model": "gpt-4o",
      "max_tool_iterations": 20,
      "token_limit": 128000
    }
  }
}
```

2. Set API keys via `.env` file (one `KEY=VALUE` per line):

```env
OPENAI_API_KEY=sk-xxxxx
```

### Run

**Start gateway server:**

```bash
cargo run -- gateway
```

Binds `127.0.0.1:19876` by default. Override with `--host` and `--port`.

**Connect CLI client:**

```bash
cargo run -- chat
```

Connects to `ws://127.0.0.1:19876/ws`. Override with `--gateway-url`.

## Configuration Reference

Config load order: `~/.picobot/config.json` first, then `./config.json` as a fallback.

### Full Config Structure

```mermaid
graph LR
    Config["config.json"]
    Config --> Providers["providers<br/>ProviderConfig{}"]
    Config --> Models["models<br/>ModelConfig{}"]
    Config --> Agents["agents<br/>AgentConfig{}"]
    Config --> Gateway["gateway<br/>GatewayConfig"]
    Config --> Client["client<br/>ClientConfig"]
    Config --> Channels["channels<br/>ChannelConfig{}"]
    Config --> Workspace["workspace_dir"]

    Providers --> PT["type (openai / anthropic)<br/>base_url<br/>api_key<br/>extra_headers"]
    Models --> MT["model_id<br/>temperature<br/>max_tokens"]
    Agents --> AT["provider (ref)<br/>model (ref)<br/>max_tool_iterations<br/>token_limit"]
    Gateway --> GT["host / port<br/>session_ttl_hours<br/>cleanup_interval_minutes<br/>session_db_path<br/>scheduler"]
    Channels --> CT["feishu: app_id, app_secret<br/>allow_from, agent, media_dir"]
```

### Environment Variables

The `.env` file in the working directory is parsed by PicoBot itself (not via the `dotenv` crate). Placeholders in `config.json` written as `<VAR_NAME>` are substituted at load time.
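
To make the substitution step concrete, here is a small sketch of parsing `KEY=VALUE` lines and replacing `<VAR_NAME>` placeholders in the raw config text. It is not PicoBot's `load_env_file()`: the function names are hypothetical, and it assumes that `#` starts a comment line, that placeholder names use only uppercase letters, digits, and underscores, and that unknown placeholders are left untouched.

```rust
use std::collections::HashMap;

/// Parses `KEY=VALUE` lines, skipping blank lines and `#` comments (assumed).
fn parse_env(contents: &str) -> HashMap<String, String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}

/// Placeholder names are assumed to look like `OPENAI_API_KEY`.
fn is_var_name(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_uppercase() || c.is_ascii_digit() || c == '_')
}

/// Replaces every known `<VAR_NAME>` in `config` with its value.
/// Values are inserted verbatim, without JSON escaping.
fn substitute(config: &str, vars: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(config.len());
    let mut rest = config;
    while let Some(start) = rest.find('<') {
        out.push_str(&rest[..start]);
        let after = &rest[start + 1..];
        match after.find('>') {
            Some(end) if is_var_name(&after[..end]) => {
                let name = &after[..end];
                match vars.get(name) {
                    Some(value) => out.push_str(value),
                    // Unknown variable: keep the placeholder as written.
                    None => {
                        out.push('<');
                        out.push_str(name);
                        out.push('>');
                    }
                }
                rest = &after[end + 1..];
            }
            // Not a placeholder; emit the '<' and keep scanning.
            _ => {
                out.push('<');
                rest = after;
            }
        }
    }
    out.push_str(rest);
    out
}
```

Substituting on the raw text before JSON parsing keeps the config schema free of special cases, at the cost that a value containing a double quote or backslash would produce invalid JSON in this sketch.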

### Gateway Config

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `host` | string | `127.0.0.1` | Bind address |
| `port` | u16 | `19876` | Listen port |
| `session_ttl_hours` | number | `4` | Inactive session expiration (hours) |
| `cleanup_interval_minutes` | number | `60` | Session cleanup interval (minutes) |
| `session_db_path` | string | workspace `.picobot_sessions.db` | SQLite database path |
| `scheduler.enabled` | bool | `false` | Enable cron scheduler |
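
The interaction between `session_ttl_hours` and `cleanup_interval_minutes` can be pictured as a periodic sweep that drops sessions whose last activity is older than the TTL. The sketch below is in-memory only, and `SessionTable` with its methods is a hypothetical name. PicoBot persists sessions in SQLite, so its real cleanup presumably runs as a database query.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Tracks the last activity time of each session (illustrative only).
struct SessionTable {
    last_active: HashMap<String, Instant>,
    ttl: Duration,
}

impl SessionTable {
    fn new(session_ttl_hours: u64) -> Self {
        Self {
            last_active: HashMap::new(),
            ttl: Duration::from_secs(session_ttl_hours * 3600),
        }
    }

    /// Marks a session as active at `now`.
    fn touch(&mut self, session_id: &str, now: Instant) {
        self.last_active.insert(session_id.to_string(), now);
    }

    /// Removes sessions idle for at least the TTL; returns how many were removed.
    /// A background task would call this every `cleanup_interval_minutes`.
    fn cleanup(&mut self, now: Instant) -> usize {
        let ttl = self.ttl;
        let before = self.last_active.len();
        self.last_active
            .retain(|_, last| now.saturating_duration_since(*last) < ttl);
        before - self.last_active.len()
    }
}
```

Taking `now` as a parameter instead of calling `Instant::now()` inside the methods makes the expiry rule easy to test without sleeping.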

### Agent Config

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `provider` | string | - | Provider name (key in `providers`) |
| `model` | string | - | Model name (key in `models`) |
| `max_tool_iterations` | number | `20` | Max tool call iterations per turn |
| `token_limit` | number | `128000` | Context window token limit |

## Slash Commands

Available in CLI chat and Feishu:

| Command | Alias | Description |
|---------|-------|-------------|
| `/new` | `/刷新` | Create a new dialog |
| `/list` | `/对话列表` | List all dialogs |
| `/switch <id>` | - | Switch to a dialog |
| `/rename <title>` | - | Rename current dialog |
| `/archive` | - | Archive current dialog |
| `/delete` | - | Delete current dialog |
| `/clear` | `/清空` | Clear current dialog history |
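
Parsing the commands in this table comes down to splitting the input into a command word and an optional argument, then matching aliases. The sketch below shows one way to do that. The `SlashCommand` enum and `parse_slash_command` function are hypothetical, and the choice to ignore extra text after argument-less commands is an assumption rather than documented behaviour.

```rust
/// Parsed form of the commands in the table above (hypothetical type).
#[derive(Debug, PartialEq)]
enum SlashCommand {
    New,
    List,
    Switch(String),
    Rename(String),
    Archive,
    Delete,
    Clear,
}

/// Returns `None` for ordinary chat input, unknown commands, or a
/// missing required argument.
fn parse_slash_command(input: &str) -> Option<SlashCommand> {
    let input = input.trim();
    if !input.starts_with('/') {
        return None;
    }
    // Split "/rename My title" into ("/rename", "My title").
    let (name, arg) = match input.split_once(char::is_whitespace) {
        Some((name, arg)) => (name, arg.trim()),
        None => (input, ""),
    };
    match (name, arg) {
        ("/new" | "/刷新", _) => Some(SlashCommand::New),
        ("/list" | "/对话列表", _) => Some(SlashCommand::List),
        ("/switch", id) if !id.is_empty() => Some(SlashCommand::Switch(id.to_string())),
        ("/rename", title) if !title.is_empty() => {
            Some(SlashCommand::Rename(title.to_string()))
        }
        ("/archive", _) => Some(SlashCommand::Archive),
        ("/delete", _) => Some(SlashCommand::Delete),
        ("/clear" | "/清空", _) => Some(SlashCommand::Clear),
        _ => None,
    }
}
```

Returning `None` for anything that is not a recognized command lets the caller fall through and treat the text as a normal user message.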

## WebSocket Protocol

The gateway exposes a WebSocket endpoint at `/ws`. Messages use typed JSON with a `type` discriminator field.

### Client → Server (WsInbound)

| Type | Fields |
|------|--------|
| `user_input` | `content`, `channel?`, `chat_id?`, `sender_id?` |
| `create_session` | `title?` |
| `list_sessions` | `include_archived` |
| `load_session` | `session_id` |
| `rename_session` | `session_id?`, `title` |
| `archive_session` | `session_id?` |
| `delete_session` | `session_id?` |
| `clear_history` | `chat_id?`, `session_id?` |
| `get_slash_commands` | - |
| `ping` | - |

### Server → Client (WsOutbound)

| Type | Fields |
|------|--------|
| `assistant_response` | `session_id`, `response`, `tokens_used?`, `tool_calls?` |
| `session_list` | `sessions[]` |
| `session_loaded` | `session_id`, `messages[]` |
| `session_created` | `session_id`, `title` |
| `session_renamed` | `session_id`, `title` |
| `session_archived` | `session_id` |
| `session_deleted` | `session_id` |
| `slash_commands` | `commands[]` |
| `error` | `message` |
| `pong` | - |
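
To show what "typed JSON with a `type` discriminator" looks like on the wire, the sketch below encodes a few client messages by hand using only the standard library. The message and field names come from the tables above, but this subset enum, `to_json`, and `json_string` are illustrative stand-ins. The real `WsInbound` in `protocol.rs` is more likely derived with a serialization library, and the exact shape of optional fields is an assumption.

```rust
/// Encodes a Rust string as a JSON string literal.
fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// A subset of the client-to-server messages (illustrative only).
enum WsInbound {
    UserInput { content: String, chat_id: Option<String> },
    ListSessions { include_archived: bool },
    LoadSession { session_id: String },
    Ping,
}

impl WsInbound {
    /// Serializes the message with `type` as the discriminator field.
    fn to_json(&self) -> String {
        match self {
            WsInbound::UserInput { content, chat_id } => {
                let mut json =
                    format!(r#"{{"type":"user_input","content":{}"#, json_string(content));
                // Optional fields are simply omitted when absent (assumed).
                if let Some(id) = chat_id {
                    json.push_str(&format!(r#","chat_id":{}"#, json_string(id)));
                }
                json.push('}');
                json
            }
            WsInbound::ListSessions { include_archived } => format!(
                r#"{{"type":"list_sessions","include_archived":{include_archived}}}"#
            ),
            WsInbound::LoadSession { session_id } => format!(
                r#"{{"type":"load_session","session_id":{}}}"#,
                json_string(session_id)
            ),
            WsInbound::Ping => r#"{"type":"ping"}"#.to_string(),
        }
    }
}
```

For example, `WsInbound::Ping.to_json()` yields `{"type":"ping"}`, to which the server is expected to reply with a `pong` message.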

## HTTP Endpoints

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/health` | Health check; returns `{"status":"ok","version":"x.y.z"}` |
| `GET` | `/ws` | WebSocket upgrade for chat clients |

## Testing

```bash
# Unit tests (no external dependencies)
cargo test --lib

# Integration tests (require API keys)
cp tests/test.env.example tests/test.env
# Fill in your API keys in tests/test.env
cargo test --test test_integration -- --ignored
cargo test --test test_tool_calling -- --ignored
cargo test --test test_request_format -- --ignored

# Run all tests, including the ignored integration tests
cargo test -- --include-ignored
```

Integration tests are marked `#[ignore]` because they make real API calls, so a plain `cargo test` skips them.

## Project Structure

```
├── src/
│   ├── main.rs                     # CLI entrypoint (clap-based subcommands)
│   ├── lib.rs                      # Module declarations
│   ├── gateway/                    # HTTP/WS server, GatewayState initialization
│   │   ├── mod.rs
│   │   ├── http.rs                 # Health endpoint
│   │   └── ws.rs                   # WebSocket handler
│   ├── client/                     # TUI chat client
│   │   ├── mod.rs
│   │   └── tui/                    # Ratatui-based terminal UI
│   ├── channels/                   # Channel integrations
│   │   ├── base.rs                 # Channel trait
│   │   ├── cli_chat.rs             # CLI WebSocket channel
│   │   ├── feishu.rs               # Feishu/Lark webhook channel
│   │   ├── manager.rs              # ChannelManager
│   │   └── slash_command.rs        # Slash command parser
│   ├── bus/                        # Async message bus
│   │   ├── mod.rs                  # MessageBus (tokio mpsc channels)
│   │   ├── message.rs              # Message types
│   │   └── dispatcher.rs           # OutboundDispatcher
│   ├── session/                    # Session & dialog management
│   │   ├── mod.rs
│   │   ├── session.rs              # Session, SessionManager
│   │   ├── session_id.rs           # UnifiedSessionId
│   │   ├── commands.rs             # SessionCommand enum
│   │   └── events.rs               # SessionEvent, DialogInfo
│   ├── agent/                      # LLM interaction loop
│   │   ├── mod.rs
│   │   ├── agent_loop.rs           # AgentLoop (stateless)
│   │   ├── context_compressor.rs   # Token estimation & summarization
│   │   └── system_prompt.rs        # System prompt builder
│   ├── providers/                  # LLM API clients
│   │   ├── mod.rs                  # Factory: create_provider()
│   │   ├── traits.rs               # LLMProvider trait
│   │   ├── openai.rs               # OpenAI-compatible client
│   │   └── anthropic.rs            # Anthropic Messages API client
│   ├── tools/                      # Agent tools
│   │   ├── mod.rs                  # create_default_tools()
│   │   ├── registry.rs             # ToolRegistry
│   │   ├── traits.rs               # Tool trait, ToolResult
│   │   ├── schema.rs               # Cross-provider JSON Schema cleaner
│   │   ├── bash.rs                 # Shell command execution
│   │   ├── calculator.rs           # Math expression evaluator
│   │   ├── chat_manager.rs         # Session management tool
│   │   ├── cron.rs                 # Cron job management tools
│   │   ├── file_read.rs            # File reader
│   │   ├── file_write.rs           # File writer
│   │   ├── file_edit.rs            # File editor (string substitution)
│   │   ├── get_skill.rs            # Skill loader tool
│   │   ├── http_request.rs         # HTTP request tool
│   │   ├── send_message.rs         # Cross-channel messaging
│   │   └── web_fetch.rs            # Web page fetcher
│   ├── skills/                     # Skills loading from markdown files
│   │   └── mod.rs                  # SkillsLoader, Skill
│   ├── storage/                    # SQLite persistence
│   │   ├── mod.rs                  # Storage, schema init
│   │   ├── session.rs              # Session CRUD operations
│   │   ├── message.rs              # Message persistence
│   │   ├── scheduler.rs            # ScheduledJob, JobRun storage
│   │   └── error.rs                # StorageError
│   ├── scheduler/                  # Cron scheduler runtime
│   │   ├── mod.rs                  # Scheduler, next_run_for_schedule()
│   │   └── types.rs                # Schedule enum (At/Every/Cron)
│   ├── observability/              # Telemetry observer pattern
│   │   └── mod.rs                  # Observer trait, ObserverEvent, MultiObserver
│   ├── protocol.rs                 # WebSocket message types (WsInbound/WsOutbound)
│   ├── config/                     # Config loading & env substitution
│   │   └── mod.rs                  # Config, LLMProviderConfig, load_env_file()
│   └── logging.rs                  # Tracing subscriber init with file rotation
├── tests/
│   ├── test_integration.rs         # LLM provider integration tests
│   ├── test_tool_calling.rs        # Tool calling integration tests
│   ├── test_request_format.rs      # Request format tests
│   ├── test_scheduler.rs           # Scheduler unit tests
│   ├── test.env.example            # Test environment template
│   └── test.env                    # Actual test keys (gitignored)
├── reference/                      # Third-party reference code (do not modify)
├── config.example.json             # Full config example
└── Cargo.toml
```

## Key Dependencies

| Crate | Purpose |
|-------|---------|
| `axum` + `tokio-tungstenite` | HTTP server & WebSocket |
| `sqlx` (SQLite) | Session/Message/Job persistence |
| `reqwest` (rustls) | LLM API & external HTTP calls |
| `ratatui` + `crossterm` | Terminal UI |
| `clap` | CLI argument parsing |
| `tracing` + `tracing-subscriber` | Structured logging |
| `cron` + `chrono-tz` | Cron schedule parsing |
| `meval` | Mathematical expression evaluation |
| `uuid` | Session/Dialog ID generation |
| `dirs` | Platform config directory resolution |