# PicoBot

A multi-channel AI agent framework with a WebSocket gateway and TUI client, supporting OpenAI-compatible and Anthropic LLM providers, tool calling, session persistence, and cron-based scheduling.

## System Architecture

```mermaid
graph TB
    subgraph Clients
        TUI["🖥️ CLI Chat (TUI)"]
        FS["📱 Feishu/Lark"]
    end

    subgraph Gateway["Gateway Server (127.0.0.1:19876)"]
        HTTP["HTTP Endpoints<br/>GET /health<br/>GET /ws (WebSocket upgrade)"]
        WS["WebSocket Handler"]
        CD["ChannelManager"]
        SP["SessionManager"]
        AL["AgentLoop"]
    end

    subgraph Bus["MessageBus"]
        IB["Inbound Channel"]
        OB["Outbound Channel"]
        CC["Control Channel"]
    end

    subgraph Storage
        SQLite[("SQLite<br/>picobot.db")]
    end

    subgraph AI["AI Providers"]
        OpenAI["OpenAI / DashScope"]
        Anthropic["Anthropic Claude"]
    end

    TUI <-->|WebSocket| WS
    FS <-->|Webhook| HTTP
    CD -->|InboundMessage| IB
    IB -->|DialogEvent| SP
    CC -->|ControlMessage| SP
    SP <--> AL
    AL -->|API Call| OpenAI
    AL -->|API Call| Anthropic
    AL -->|Tool Call| Tools
    SP -->|OutboundMessage| OB
    OB --> CD
    SP --> SQLite
    Tools --> SQLite

    subgraph Tools
        Bash["Bash"]
        FileIO["File Read/Write/Edit"]
        Web["HTTP Request / Web Fetch"]
        Calc["Calculator"]
        Skill["Get Skill"]
        Msg["Send Message"]
        Cron["Cron Jobs"]
    end
```

### Core Data Flow

```mermaid
sequenceDiagram
    participant Channel as Channel<br/>(CLI/Feishu)
    participant Bus as MessageBus
    participant SM as SessionManager
    participant AL as AgentLoop
    participant LLM as LLM Provider
    participant Tool as Tools

    Channel->>Bus: InboundMessage (user input)
    Bus->>SM: DialogEvent
    SM->>SM: Load/Resolve Session
    SM->>AL: Process (session state)
    AL->>LLM: ChatCompletionRequest
    LLM-->>AL: response / tool_calls
    alt Tool Calls
        AL->>Tool: execute tool
        Tool-->>AL: result
        AL->>LLM: continue with tool result
    end
    AL-->>SM: AgentProcessResult (text + token count)
    SM->>SM: Persist to SQLite
    SM->>Bus: OutboundMessage
    Bus->>Channel: response to user
```

## Features

### Multi-Channel Support

- **CLI Chat Client**: Full TUI with session management, Markdown rendering, slash commands
- **Feishu (Lark)**: Webhook-based integration with typing indicators and media support

### Multi-Provider LLM

- OpenAI-compatible API (GPT-4, DashScope, Volcengine, etc.)
- Anthropic Messages API (Claude)
- Cross-provider JSON Schema normalization for tool calling compatibility

### Session Management

- Multi-session conversations per channel/chat
- Create, switch, rename, archive, delete dialogs via slash commands or WebSocket
- SQLite-persisted session history with automatic TTL-based cleanup
- Context compression for long conversations approaching token limits

### Tool System

| Tool | Description |
|------|-------------|
| `bash` | Execute shell commands in workspace |
| `file_read` | Read file contents |
| `file_write` | Create/overwrite files |
| `file_edit` | Precise string substitution in files |
| `http_request` | Make HTTP API requests |
| `web_fetch` | Fetch and parse web pages |
| `calculator` | Evaluate mathematical expressions |
| `get_skill` | Load agent skills from local skill files |
| `send_message` | Send messages to other channels |
| `cron_add/list/remove/enable/disable/update` | Manage scheduled jobs |

### Scheduling

- Cron-based recurring jobs with optional timezone support
- One-shot (`at`) and interval (`every`) schedules
- Jobs trigger agent processing via specified channel/chat

### Skills System

- Load Markdown skill files from `~/.picobot/skills` and `~/.agent/skills`
- Skills inject specialized system prompts for specific tasks
- Automatic hot-reload on file changes

### Observability

- Observer pattern for agent and tool telemetry
- Events: `AgentStart`, `AgentEnd`, `ToolCallStart`, `ToolCall`
- Structured JSON logging with file rotation

## Quick Start

### Prerequisites

- Rust nightly (edition 2024); use `rustup` to install

### Build

```bash
cargo build
```

### Configure

1. Create `config.json` (or `~/.picobot/config.json`):

   ```json
   {
     "providers": {
       "openai": {
         "type": "openai",
         "base_url": "https://api.openai.com/v1",
         "api_key": ""
       }
     },
     "models": {
       "gpt-4o": {
         "model_id": "gpt-4o",
         "temperature": 0.7,
         "max_tokens": 4096
       }
     },
     "agents": {
       "default": {
         "provider": "openai",
         "model": "gpt-4o",
         "max_tool_iterations": 20,
         "token_limit": 128000
       }
     }
   }
   ```

2. Set API keys via `.env` file (one `KEY=VALUE` per line):

   ```env
   OPENAI_API_KEY=sk-xxxxx
   ```

### Run

**Start gateway server:**

```bash
cargo run -- gateway
```

Binds `127.0.0.1:19876` by default. Override with `--host` and `--port`.

**Connect CLI client:**

```bash
cargo run -- chat
```

Connects to `ws://127.0.0.1:19876/ws`. Override with `--gateway-url`.

## Configuration Reference

Config load order: `~/.picobot/config.json` → `./config.json` (fallback).

### Full Config Structure

```mermaid
graph LR
    Config["config.json"]
    Config --> Providers["providers<br/>ProviderConfig{}"]
    Config --> Models["models<br/>ModelConfig{}"]
    Config --> Agents["agents<br/>AgentConfig{}"]
    Config --> Gateway["gateway<br/>GatewayConfig"]
    Config --> Client["client<br/>ClientConfig"]
    Config --> Channels["channels<br/>ChannelConfig{}"]
    Config --> Workspace["workspace_dir"]

    Providers --> PT["type (openai / anthropic)<br/>base_url<br/>api_key<br/>extra_headers"]
    Models --> MT["model_id<br/>temperature<br/>max_tokens"]
    Agents --> AT["provider (ref)<br/>model (ref)<br/>max_tool_iterations<br/>token_limit"]
    Gateway --> GT["host / port<br/>session_ttl_hours<br/>cleanup_interval_minutes<br/>session_db_path<br/>scheduler"]
    Channels --> CT["feishu: app_id, app_secret<br/>allow_from, agent, media_dir"]
```

### Environment Variables

The `.env` file in the working directory is loaded manually (not via dotenv crate). Placeholders in `config.json` written as `` are substituted at load time.

### Gateway Config

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `host` | string | `127.0.0.1` | Bind address |
| `port` | u16 | `19876` | Listen port |
| `session_ttl_hours` | number | `4` | Inactive session expiration (hours) |
| `cleanup_interval_minutes` | number | `60` | Session cleanup interval |
| `session_db_path` | string | workspace `picobot.db` | SQLite database path |
| `scheduler.enabled` | bool | `false` | Enable cron scheduler |

### Agent Config

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `provider` | string | - | Provider name (key in `providers`) |
| `model` | string | - | Model name (key in `models`) |
| `max_tool_iterations` | number | `20` | Max tool call iterations per turn |
| `token_limit` | number | `128000` | Context window token limit |

## Slash Commands

Available in CLI chat and Feishu:

| Command | Alias | Description |
|---------|-------|-------------|
| `/new` | `/刷新` | Create a new dialog |
| `/list` | `/对话列表` | List all dialogs |
| `/switch ` | - | Switch to a dialog |
| `/rename ` | - | Rename current dialog |
| `/archive` | - | Archive current dialog |
| `/delete` | - | Delete current dialog |
| `/clear` | `/清空` | Clear current dialog history |

## WebSocket Protocol

The gateway exposes a WebSocket endpoint at `/ws`. Messages use typed JSON with a `type` discriminator field.

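
To illustrate the `type` discriminator, here is a dependency-free sketch that models a few client messages as a Rust enum and renders them as JSON by hand. The enum shape, the `type_tag`, `to_json`, `string_field`, and `escape_json` helpers, and the selection of fields are assumptions made for illustration; the actual types in `protocol.rs` may differ and would more plausibly rely on a serialization crate than on manual formatting.

```rust
/// Hypothetical subset of the client -> server messages listed in this README.
#[derive(Debug, Clone, PartialEq)]
pub enum WsInbound {
    UserInput { content: String, chat_id: Option<String> },
    CreateSession { title: Option<String> },
    LoadSession { session_id: String },
    Ping,
}

impl WsInbound {
    /// Value of the `type` discriminator for this variant.
    pub fn type_tag(&self) -> &'static str {
        match self {
            WsInbound::UserInput { .. } => "user_input",
            WsInbound::CreateSession { .. } => "create_session",
            WsInbound::LoadSession { .. } => "load_session",
            WsInbound::Ping => "ping",
        }
    }

    /// Render the message as a JSON object; optional fields are omitted when `None`.
    pub fn to_json(&self) -> String {
        let mut fields = vec![string_field("type", self.type_tag())];
        match self {
            WsInbound::UserInput { content, chat_id } => {
                fields.push(string_field("content", content));
                if let Some(id) = chat_id {
                    fields.push(string_field("chat_id", id));
                }
            }
            WsInbound::CreateSession { title } => {
                if let Some(title) = title {
                    fields.push(string_field("title", title));
                }
            }
            WsInbound::LoadSession { session_id } => {
                fields.push(string_field("session_id", session_id));
            }
            WsInbound::Ping => {}
        }
        format!("{{{}}}", fields.join(","))
    }
}

/// Format one `"key":"value"` pair with the value escaped.
fn string_field(key: &str, value: &str) -> String {
    format!("\"{}\":\"{}\"", key, escape_json(value))
}

/// Escape the characters that JSON string literals do not allow unescaped.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}
```

With this sketch, `WsInbound::Ping.to_json()` yields `{"type":"ping"}`, and a `UserInput` without a `chat_id` serializes only its `type` and `content` fields.
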
### Client → Server (WsInbound)

| Type | Fields |
|------|--------|
| `user_input` | `content`, `channel?`, `chat_id?`, `sender_id?` |
| `create_session` | `title?` |
| `list_sessions` | `include_archived` |
| `load_session` | `session_id` |
| `rename_session` | `session_id?`, `title` |
| `archive_session` | `session_id?` |
| `delete_session` | `session_id?` |
| `clear_history` | `chat_id?`, `session_id?` |
| `get_slash_commands` | - |
| `ping` | - |

### Server → Client (WsOutbound)

| Type | Fields |
|------|--------|
| `assistant_response` | `session_id`, `response`, `tokens_used?`, `tool_calls?` |
| `session_list` | `sessions[]` |
| `session_loaded` | `session_id`, `messages[]` |
| `session_created` | `session_id`, `title` |
| `session_renamed` | `session_id`, `title` |
| `session_archived` | `session_id` |
| `session_deleted` | `session_id` |
| `slash_commands` | `commands[]` |
| `error` | `message` |
| `pong` | - |

## HTTP Endpoints

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/health` | Health check; returns `{"status":"ok","version":"x.y.z"}` |
| `GET` | `/ws` | WebSocket upgrade for chat clients |

## Testing

```bash
# Unit tests (no external dependencies)
cargo test --lib

# Integration tests (require API keys)
cp tests/test.env.example tests/test.env
# Fill in your API keys in tests/test.env
cargo test --test test_integration -- --ignored
cargo test --test test_tool_calling -- --ignored
cargo test --test test_request_format -- --ignored

# Run all tests, including the ignored integration tests
cargo test -- --include-ignored
```

Integration tests are `#[ignore]` by default because they make real API calls.

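
A minimal sketch of what such an ignored test might look like, assuming `tests/test.env` uses the same one-`KEY=VALUE`-per-line format as `.env`. The helper `parse_env` and the test name `openai_key_is_configured` are hypothetical and not taken from the repository; skipping blank lines and `#` comments is likewise an assumption.

```rust
use std::collections::HashMap;

/// Parse `KEY=VALUE` lines into a map.
/// Assumption: blank lines and lines starting with `#` are skipped,
/// and lines without `=` or with an empty key are ignored.
pub fn parse_env(contents: &str) -> HashMap<String, String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .filter(|(key, _)| !key.is_empty())
        .collect()
}

// Skipped by a plain `cargo test`; runs only when ignored tests are requested.
#[test]
#[ignore = "needs real API keys; run with `cargo test -- --ignored`"]
fn openai_key_is_configured() {
    let contents = std::fs::read_to_string("tests/test.env")
        .expect("copy tests/test.env.example to tests/test.env first");
    let env = parse_env(&contents);
    let key = env.get("OPENAI_API_KEY").expect("OPENAI_API_KEY missing");
    assert!(!key.is_empty());
    // A real integration test would now send a request to the provider.
}
```

Marking the test with `#[ignore]` keeps the default test run free of network access, while the reason string tells contributors how to opt in.
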
## Project Structure

```
├── src/
│   ├── main.rs  # CLI entrypoint (clap-based subcommands)
│   ├── lib.rs  # Module declarations
│   ├── gateway/  # HTTP/WS server, GatewayState initialization
│   │   ├── mod.rs
│   │   ├── http.rs  # Health endpoint
│   │   └── ws.rs  # WebSocket handler
│   ├── client/  # TUI chat client
│   │   ├── mod.rs
│   │   └── tui/  # Ratatui-based terminal UI
│   ├── channels/  # Channel integrations
│   │   ├── base.rs  # Channel trait
│   │   ├── cli_chat.rs  # CLI WebSocket channel
│   │   ├── feishu.rs  # Feishu/Lark webhook channel
│   │   ├── manager.rs  # ChannelManager
│   │   └── slash_command.rs  # Slash command parser
│   ├── bus/  # Async message bus
│   │   ├── mod.rs  # MessageBus (tokio mpsc channels)
│   │   ├── message.rs  # Message types
│   │   └── dispatcher.rs  # OutboundDispatcher
│   ├── session/  # Session & dialog management
│   │   ├── mod.rs
│   │   ├── session.rs  # Session, SessionManager
│   │   ├── session_id.rs  # UnifiedSessionId
│   │   ├── commands.rs  # SessionCommand enum
│   │   └── events.rs  # SessionEvent, DialogInfo
│   ├── agent/  # LLM interaction loop
│   │   ├── mod.rs
│   │   ├── agent_loop.rs  # AgentLoop (stateless)
│   │   ├── context_compressor.rs  # Token estimation & summarization
│   │   └── system_prompt.rs  # System prompt builder
│   ├── providers/  # LLM API clients
│   │   ├── mod.rs  # Factory: create_provider()
│   │   ├── traits.rs  # LLMProvider trait
│   │   ├── openai.rs  # OpenAI-compatible client
│   │   └── anthropic.rs  # Anthropic Messages API client
│   ├── tools/  # Agent tools
│   │   ├── mod.rs  # create_default_tools()
│   │   ├── registry.rs  # ToolRegistry
│   │   ├── traits.rs  # Tool trait, ToolResult
│   │   ├── schema.rs  # Cross-provider JSON Schema cleaner
│   │   ├── bash.rs  # Shell command execution
│   │   ├── calculator.rs  # Math expression evaluator
│   │   ├── chat_manager.rs  # Session management tool
│   │   ├── cron.rs  # Cron job management tools
│   │   ├── file_read.rs  # File reader
│   │   ├── file_write.rs  # File writer
│   │   ├── file_edit.rs  # File editor (string substitution)
│   │   ├── get_skill.rs  # Skill loader tool
│   │   ├── http_request.rs  # HTTP request tool
│   │   ├── send_message.rs  # Cross-channel messaging
│   │   └── web_fetch.rs  # Web page fetcher
│   ├── skills/  # Skills loading from markdown files
│   │   └── mod.rs  # SkillsLoader, Skill
│   ├── storage/  # SQLite persistence
│   │   ├── mod.rs  # Storage, schema init
│   │   ├── session.rs  # Session CRUD operations
│   │   ├── message.rs  # Message persistence
│   │   ├── scheduler.rs  # ScheduledJob, JobRun storage
│   │   └── error.rs  # StorageError
│   ├── scheduler/  # Cron scheduler runtime
│   │   ├── mod.rs  # Scheduler, next_run_for_schedule()
│   │   └── types.rs  # Schedule enum (At/Every/Cron)
│   ├── observability/  # Telemetry observer pattern
│   │   └── mod.rs  # Observer trait, ObserverEvent, MultiObserver
│   ├── protocol.rs  # WebSocket message types (WsInbound/WsOutbound)
│   ├── config/  # Config loading & env substitution
│   │   └── mod.rs  # Config, LLMProviderConfig, load_env_file()
│   └── logging.rs  # Tracing subscriber init with file rotation
├── tests/
│   ├── test_integration.rs  # LLM provider integration tests
│   ├── test_tool_calling.rs  # Tool calling integration tests
│   ├── test_request_format.rs  # Request format tests
│   ├── test_scheduler.rs  # Scheduler unit tests
│   ├── test.env.example  # Test environment template
│   └── test.env  # Actual test keys (gitignored)
├── reference/  # Third-party reference code (do not modify)
├── config.example.json  # Full config example
└── Cargo.toml
```

## Key Dependencies

| Crate | Purpose |
|-------|---------|
| `axum` + `tokio-tungstenite` | HTTP server & WebSocket |
| `sqlx` (SQLite) | Session/Message/Job persistence |
| `reqwest` (rustls) | LLM API & external HTTP calls |
| `ratatui` + `crossterm` | Terminal UI |
| `clap` | CLI argument parsing |
| `tracing` + `tracing-subscriber` | Structured logging |
| `cron` + `chrono-tz` | Cron schedule parsing |
| `meval` | Mathematical expression evaluation |
| `uuid` | Session/Dialog ID generation |
| `dirs` | Platform config directory resolution |