# PicoBot
A multi-channel AI agent framework with a WebSocket gateway and TUI client, supporting OpenAI-compatible and Anthropic LLM providers, tool calling, session persistence, and cron-based scheduling.
## System Architecture
```mermaid
graph TB
subgraph Clients
TUI["🖥️ CLI Chat (TUI)"]
FS["📱 Feishu/Lark"]
end
subgraph Gateway["Gateway Server (127.0.0.1:19876)"]
HTTP["HTTP Endpoints<br/>GET /health<br/>GET /ws (WebSocket upgrade)"]
WS["WebSocket Handler"]
CD["ChannelManager"]
SP["SessionManager"]
AL["AgentLoop"]
end
subgraph Bus["MessageBus"]
IB["Inbound Channel"]
OB["Outbound Channel"]
CC["Control Channel"]
end
subgraph Storage
SQLite[("SQLite<br/>picobot.db")]
end
subgraph AI["AI Providers"]
OpenAI["OpenAI / DashScope"]
Anthropic["Anthropic Claude"]
end
TUI <-->|WebSocket| WS
FS <-->|Webhook| HTTP
CD -->|InboundMessage| IB
IB -->|DialogEvent| SP
CC -->|ControlMessage| SP
SP <--> AL
AL -->|API Call| OpenAI
AL -->|API Call| Anthropic
AL -->|Tool Call| Tools
SP -->|OutboundMessage| OB
OB --> CD
SP --> SQLite
Tools --> SQLite
subgraph Tools
Bash["Bash"]
FileIO["File Read/Write/Edit"]
Web["HTTP Request / Web Fetch"]
Calc["Calculator"]
Skill["Get Skill"]
Msg["Send Message"]
Cron["Cron Jobs"]
end
```
### Core Data Flow
```mermaid
sequenceDiagram
participant Channel as Channel<br/>(CLI/Feishu)
participant Bus as MessageBus
participant SM as SessionManager
participant AL as AgentLoop
participant LLM as LLM Provider
participant Tool as Tools
Channel->>Bus: InboundMessage (user input)
Bus->>SM: DialogEvent
SM->>SM: Load/Resolve Session
SM->>AL: Process (session state)
AL->>LLM: ChatCompletionRequest
LLM-->>AL: response / tool_calls
alt Tool Calls
AL->>Tool: execute tool
Tool-->>AL: result
AL->>LLM: continue with tool result
end
AL-->>SM: AgentProcessResult (text + token count)
SM->>SM: Persist to SQLite
SM->>Bus: OutboundMessage
Bus->>Channel: response to user
```
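The round trip in the diagram can be sketched with standard-library channels. This is a minimal, synchronous illustration rather than the project's code: the struct fields and the `handle_turns` function are assumptions based on the diagram labels, `std::sync::mpsc` stands in for the async channels the bus uses, and the agent step is a stub that echoes its input.

```rust
use std::sync::mpsc;

/// Message entering the bus from a channel (field names are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct InboundMessage {
    pub channel: String,
    pub chat_id: String,
    pub content: String,
}

/// Message leaving the bus toward a channel.
#[derive(Debug, Clone, PartialEq)]
pub struct OutboundMessage {
    pub channel: String,
    pub chat_id: String,
    pub content: String,
}

/// Stand-in for SessionManager + AgentLoop: drains the inbound queue and
/// publishes one reply per message, addressed to the originating chat.
/// Returns how many messages were handled.
pub fn handle_turns(
    inbound: &mpsc::Receiver<InboundMessage>,
    outbound: &mpsc::Sender<OutboundMessage>,
) -> usize {
    let mut handled = 0;
    // try_iter() yields only messages that are already queued, so the
    // function returns instead of blocking once the queue is empty.
    for msg in inbound.try_iter() {
        let reply = OutboundMessage {
            channel: msg.channel,
            chat_id: msg.chat_id,
            content: format!("echo: {}", msg.content),
        };
        if outbound.send(reply).is_err() {
            break; // receiver dropped, nothing left to deliver to
        }
        handled += 1;
    }
    handled
}
```

Keeping the channel and chat identifiers on both message types is what lets a reply find its way back to the client that sent the input.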
## Features
### Multi-Channel Support
- **CLI Chat Client**: Full TUI with session management, Markdown rendering, slash commands
- **Feishu (Lark)**: Webhook-based integration with typing indicators and media support
### Multi-Provider LLM
- OpenAI-compatible API (GPT-4, DashScope, Volcengine, etc.)
- Anthropic Messages API (Claude)
- Cross-provider JSON Schema normalization for tool calling compatibility
### Session Management
- Multi-session conversations per channel/chat
- Create, switch, rename, archive, delete dialogs via slash commands or WebSocket
- SQLite-persisted session history with automatic TTL-based cleanup
- Context compression for long conversations approaching token limits
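As a rough illustration of the context-compression idea, the sketch below decides when a history is close to the token limit and then trims it. Everything here is an assumption for illustration: the four-characters-per-token estimate, the percentage threshold, and the function names are hypothetical, and a real compressor would ask the LLM to write the summary rather than insert a placeholder.

```rust
/// Rough token estimate: about one token per four characters, rounded up.
/// This heuristic is an assumption, not the project's estimator.
pub fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Returns true once the history uses at least `threshold_pct` percent
/// of the agent's `token_limit`.
pub fn needs_compression(history: &[String], token_limit: usize, threshold_pct: usize) -> bool {
    let used: usize = history.iter().map(|m| estimate_tokens(m)).sum();
    used * 100 >= token_limit * threshold_pct
}

/// Keeps the most recent `keep_last` messages and replaces everything older
/// with a single placeholder summary line.
pub fn compress(history: &[String], keep_last: usize) -> Vec<String> {
    if history.len() <= keep_last {
        return history.to_vec();
    }
    let split = history.len() - keep_last;
    let mut out = vec![format!("[summary of {} earlier messages]", split)];
    out.extend_from_slice(&history[split..]);
    out
}
```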
### Tool System
| Tool | Description |
|------|-------------|
| `bash` | Execute shell commands in workspace |
| `file_read` | Read file contents |
| `file_write` | Create/overwrite files |
| `file_edit` | Precise string substitution in files |
| `http_request` | Make HTTP API requests |
| `web_fetch` | Fetch and parse web pages |
| `calculator` | Evaluate mathematical expressions |
| `get_skill` | Load agent skills from local skill files |
| `send_message` | Send messages to other channels |
| `cron_add/list/remove/enable/disable/update` | Manage scheduled jobs |
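To show how tools like these can sit behind one lookup, here is a minimal sketch of a tool trait and a name-keyed registry. The signatures are assumptions: the sketch is synchronous and passes arguments as a plain string, and `Upper` is a toy tool invented for the example.

```rust
use std::collections::HashMap;

/// Outcome of a tool invocation (shape is illustrative).
#[derive(Debug, PartialEq)]
pub struct ToolResult {
    pub success: bool,
    pub output: String,
}

/// Minimal synchronous tool interface, simplified for this sketch.
pub trait Tool {
    fn name(&self) -> &str;
    fn execute(&self, args: &str) -> ToolResult;
}

/// Looks tools up by the name the LLM puts in its tool call.
#[derive(Default)]
pub struct ToolRegistry {
    tools: HashMap<String, Box<dyn Tool>>,
}

impl ToolRegistry {
    pub fn register(&mut self, tool: Box<dyn Tool>) {
        self.tools.insert(tool.name().to_string(), tool);
    }

    /// Unknown tool names become a failed result rather than a panic, so the
    /// error can be reported back to the model.
    pub fn call(&self, name: &str, args: &str) -> ToolResult {
        match self.tools.get(name) {
            Some(tool) => tool.execute(args),
            None => ToolResult {
                success: false,
                output: format!("unknown tool: {name}"),
            },
        }
    }
}

/// Toy tool used to exercise the registry: upper-cases its argument.
pub struct Upper;

impl Tool for Upper {
    fn name(&self) -> &str {
        "upper"
    }

    fn execute(&self, args: &str) -> ToolResult {
        ToolResult {
            success: true,
            output: args.to_uppercase(),
        }
    }
}
```

Registering `Box::new(Upper)` and calling `registry.call("upper", "hi")` yields a successful result with output `HI`, while an unregistered name yields a failed result.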
### Scheduling
- Cron-based recurring jobs with optional timezone support
- One-shot (`at`) and interval (`every`) schedules
- Jobs trigger agent processing via specified channel/chat
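The three schedule kinds can be modeled as an enum with a function that computes the next fire time. This is a sketch under assumptions: times are Unix seconds, the `anchor` field and the `next_run` name are hypothetical, and cron expressions are carried but not evaluated because that requires a cron parser.

```rust
/// The three schedule kinds listed above. Times are Unix seconds.
#[derive(Debug, Clone, PartialEq)]
pub enum Schedule {
    /// One-shot: fire once at the given instant.
    At(u64),
    /// Interval: fire every `period_secs`, counted from `anchor`.
    Every { anchor: u64, period_secs: u64 },
    /// Cron expression; this sketch only stores it.
    Cron(String),
}

/// Next fire time strictly after `now`, or None if there is none
/// (or it cannot be computed in this sketch).
pub fn next_run(schedule: &Schedule, now: u64) -> Option<u64> {
    match schedule {
        Schedule::At(t) => (*t > now).then_some(*t),
        Schedule::Every { anchor, period_secs } => {
            let (anchor, period) = (*anchor, *period_secs);
            if period == 0 {
                return None; // a zero period would never advance
            }
            if now < anchor {
                return Some(anchor);
            }
            // Whole periods already elapsed, plus one for the next tick.
            let elapsed = (now - anchor) / period;
            Some(anchor + (elapsed + 1) * period)
        }
        Schedule::Cron(_) => None,
    }
}
```

Returning `None` for an `At` schedule whose time has passed gives the scheduler a simple signal that a one-shot job is finished.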
### Skills System
- Load Markdown skill files from `~/.picobot/skills` and `~/.agent/skills`
- Skills inject specialized system prompts for specific tasks
- Automatic hot-reload on file changes
### Observability
- Observer pattern for agent and tool telemetry
- Events: `AgentStart`, `AgentEnd`, `ToolCallStart`, `ToolCall`
- Structured JSON logging with file rotation
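The observer pattern mentioned above can be sketched as a trait plus a fan-out wrapper. The event names come from the list; the payload fields, the method name `on_event`, and the `Recorder` helper are assumptions made for this example.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Events named in the list above; payloads are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub enum ObserverEvent {
    AgentStart,
    AgentEnd,
    ToolCallStart { tool: String },
    ToolCall { tool: String, success: bool },
}

pub trait Observer {
    fn on_event(&self, event: &ObserverEvent);
}

/// Fans each event out to every registered observer, in registration order.
#[derive(Default)]
pub struct MultiObserver {
    observers: Vec<Box<dyn Observer>>,
}

impl MultiObserver {
    pub fn add(&mut self, observer: Box<dyn Observer>) {
        self.observers.push(observer);
    }
}

impl Observer for MultiObserver {
    fn on_event(&self, event: &ObserverEvent) {
        for observer in &self.observers {
            observer.on_event(event);
        }
    }
}

/// Example observer that records every event in a shared buffer.
pub struct Recorder(pub Rc<RefCell<Vec<ObserverEvent>>>);

impl Observer for Recorder {
    fn on_event(&self, event: &ObserverEvent) {
        self.0.borrow_mut().push(event.clone());
    }
}
```

Because `MultiObserver` itself implements `Observer`, the agent loop only needs to hold a single observer and does not care how many sinks sit behind it.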
## Quick Start
### Prerequisites
- Rust nightly (edition 2024); install with `rustup`
### Build
```bash
cargo build
```
### Configure
1. Create `config.json` (or `~/.picobot/config.json`):
```json
{
"providers": {
"openai": {
"type": "openai",
"base_url": "https://api.openai.com/v1",
"api_key": ""
}
},
"models": {
"gpt-4o": {
"model_id": "gpt-4o",
"temperature": 0.7,
"max_tokens": 4096
}
},
"agents": {
"default": {
"provider": "openai",
"model": "gpt-4o",
"max_tool_iterations": 20,
"token_limit": 128000
}
}
}
```
2. Set API keys via `.env` file (one `KEY=VALUE` per line):
```env
OPENAI_API_KEY=sk-xxxxx
```
### Run
**Start gateway server:**
```bash
cargo run -- gateway
```
Binds `127.0.0.1:19876` by default. Override with `--host` and `--port`.
**Connect CLI client:**
```bash
cargo run -- chat
```
Connects to `ws://127.0.0.1:19876/ws`. Override with `--gateway-url`.
## Configuration Reference
Config load order: `~/.picobot/config.json` → `./config.json` (fallback).
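A lookup in this order might look like the following sketch. The function names are hypothetical, and the existence check is passed in as a closure (an assumption made so the logic can be exercised without touching the filesystem); pass `|p| p.is_file()` for real use.

```rust
use std::path::{Path, PathBuf};

/// Candidate config paths in priority order: the per-user file first,
/// then the working-directory fallback.
pub fn config_candidates(home: &Path, cwd: &Path) -> Vec<PathBuf> {
    vec![
        home.join(".picobot").join("config.json"),
        cwd.join("config.json"),
    ]
}

/// Returns the first candidate for which `exists` reports true,
/// or None when no config file is found.
pub fn resolve_config(
    candidates: &[PathBuf],
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    candidates.iter().find(|p| exists(p.as_path())).cloned()
}
```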
### Full Config Structure
```mermaid
graph LR
Config["config.json"]
Config --> Providers["providers<br/>ProviderConfig{}"]
Config --> Models["models<br/>ModelConfig{}"]
Config --> Agents["agents<br/>AgentConfig{}"]
Config --> Gateway["gateway<br/>GatewayConfig"]
Config --> Client["client<br/>ClientConfig"]
Config --> Channels["channels<br/>ChannelConfig{}"]
Config --> Workspace["workspace_dir"]
Providers --> PT["type (openai / anthropic)<br/>base_url<br/>api_key<br/>extra_headers"]
Models --> MT["model_id<br/>temperature<br/>max_tokens"]
Agents --> AT["provider (ref)<br/>model (ref)<br/>max_tool_iterations<br/>token_limit"]
Gateway --> GT["host / port<br/>session_ttl_hours<br/>cleanup_interval_minutes<br/>session_db_path<br/>scheduler"]
Channels --> CT["feishu: app_id, app_secret<br/>allow_from, agent, media_dir"]
```
### Environment Variables
The `.env` file in the working directory is loaded manually (not via the `dotenv` crate). Environment-variable placeholders in `config.json` are substituted at load time.
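A hand-rolled loader for the one-`KEY=VALUE`-per-line format can be quite small. The sketch below only covers parsing; skipping blank lines and `#` comments, trimming whitespace, and the `parse_env` name are assumptions, and placeholder substitution is left out.

```rust
use std::collections::HashMap;

/// Parses `.env`-style text: one KEY=VALUE pair per line.
/// Lines without '=' are ignored.
pub fn parse_env(text: &str) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split on the first '=' only, so values may themselves contain '='.
        if let Some((key, value)) = line.split_once('=') {
            let key = key.trim();
            if !key.is_empty() {
                vars.insert(key.to_string(), value.trim().to_string());
            }
        }
    }
    vars
}
```

In practice the input would come from something like `std::fs::read_to_string(".env")`, with a missing file treated as an empty set of variables.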
### Gateway Config
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `host` | string | `127.0.0.1` | Bind address |
| `port` | u16 | `19876` | Listen port |
| `session_ttl_hours` | number | `4` | Inactive session expiration (hours) |
| `cleanup_interval_minutes` | number | `60` | Session cleanup interval |
| `session_db_path` | string | workspace `picobot.db` | SQLite database path |
| `scheduler.enabled` | bool | `false` | Enable cron scheduler |
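The interaction between `session_ttl_hours` and the periodic cleanup can be illustrated with a small sketch. The function names, the use of Unix-second timestamps, and the choice to treat a session as expired once it has been idle for exactly the TTL are assumptions.

```rust
/// True when a session has been inactive for at least `ttl_hours`.
/// Timestamps are Unix seconds.
pub fn is_expired(last_active: u64, now: u64, ttl_hours: u64) -> bool {
    now.saturating_sub(last_active) >= ttl_hours * 3600
}

/// Drops expired sessions from `(session_id, last_active)` pairs and returns
/// how many were removed. A background task would call this once per
/// cleanup interval.
pub fn cleanup(sessions: &mut Vec<(String, u64)>, now: u64, ttl_hours: u64) -> usize {
    let before = sessions.len();
    sessions.retain(|(_, last_active)| !is_expired(*last_active, now, ttl_hours));
    before - sessions.len()
}
```

With the defaults in the table, a session idle for four hours is removed at the next hourly sweep, so it may live for up to about five hours in total.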
### Agent Config
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `provider` | string | - | Provider name (key in `providers`) |
| `model` | string | - | Model name (key in `models`) |
| `max_tool_iterations` | number | `20` | Max tool call iterations per turn |
| `token_limit` | number | `128000` | Context window token limit |
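To make the role of `max_tool_iterations` concrete, the sketch below runs one turn with the model and the tool executor passed in as closures. The `LlmReply` type, the `run_turn` signature, and the string-based transcript are assumptions for illustration, not the provider wire format.

```rust
/// What the model asks for next (illustrative).
pub enum LlmReply {
    Text(String),
    ToolCall { name: String, args: String },
}

/// Runs one user turn: call the model, execute any requested tool, feed the
/// result back, and stop at the first plain-text reply. Fails once the model
/// asks for more than `max_tool_iterations` tool calls.
pub fn run_turn(
    mut llm: impl FnMut(&[String]) -> LlmReply,
    mut run_tool: impl FnMut(&str, &str) -> String,
    user_input: &str,
    max_tool_iterations: usize,
) -> Result<String, String> {
    let mut transcript = vec![format!("user: {user_input}")];
    let mut tool_rounds = 0;
    loop {
        match llm(&transcript) {
            LlmReply::Text(text) => return Ok(text),
            LlmReply::ToolCall { name, args } => {
                if tool_rounds == max_tool_iterations {
                    return Err(format!(
                        "gave up after {max_tool_iterations} tool iterations"
                    ));
                }
                tool_rounds += 1;
                let result = run_tool(&name, &args);
                transcript.push(format!("tool {name}: {result}"));
            }
        }
    }
}
```

The cap guards against a model that keeps requesting tools without ever producing a final answer.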
## Slash Commands
Available in CLI chat and Feishu:
| Command | Alias | Description |
|---------|-------|-------------|
| `/new` | `/刷新` | Create a new dialog |
| `/list` | `/对话列表` | List all dialogs |
| `/switch ` | - | Switch to a dialog |
| `/rename ` | - | Rename current dialog |
| `/archive` | - | Archive current dialog |
| `/delete` | - | Delete current dialog |
| `/clear` | `/清空` | Clear current dialog history |
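A parser for commands like these can be sketched as follows. It handles only the primary command names and leaves aliases out. Treating everything after `/switch` or `/rename` as one free-form argument is an assumption, as are the `SlashCommand` enum and the `parse_slash` name.

```rust
/// Parsed form of the commands in the table above.
#[derive(Debug, PartialEq)]
pub enum SlashCommand {
    New,
    List,
    Switch(String),
    Rename(String),
    Archive,
    Delete,
    Clear,
}

/// Parses one line of chat input. Returns None for ordinary messages and for
/// unknown or incomplete commands.
pub fn parse_slash(input: &str) -> Option<SlashCommand> {
    let rest = input.trim().strip_prefix('/')?;
    // Split the command word from its optional argument.
    let (cmd, arg) = match rest.split_once(char::is_whitespace) {
        Some((cmd, arg)) => (cmd, arg.trim()),
        None => (rest, ""),
    };
    match cmd {
        "new" => Some(SlashCommand::New),
        "list" => Some(SlashCommand::List),
        "switch" if !arg.is_empty() => Some(SlashCommand::Switch(arg.to_string())),
        "rename" if !arg.is_empty() => Some(SlashCommand::Rename(arg.to_string())),
        "archive" => Some(SlashCommand::Archive),
        "delete" => Some(SlashCommand::Delete),
        "clear" => Some(SlashCommand::Clear),
        _ => None,
    }
}
```

Returning `None` for anything that is not a recognized command lets the caller fall through and treat the line as normal user input.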
## WebSocket Protocol
The gateway exposes a WebSocket endpoint at `/ws`. Messages use typed JSON with a `type` discriminator field.
### Client → Server (WsInbound)
| Type | Fields |
|------|--------|
| `user_input` | `content`, `channel?`, `chat_id?`, `sender_id?` |
| `create_session` | `title?` |
| `list_sessions` | `include_archived` |
| `load_session` | `session_id` |
| `rename_session` | `session_id?`, `title` |
| `archive_session` | `session_id?` |
| `delete_session` | `session_id?` |
| `clear_history` | `chat_id?`, `session_id?` |
| `get_slash_commands` | - |
| `ping` | - |
### Server → Client (WsOutbound)
| Type | Fields |
|------|--------|
| `assistant_response` | `session_id`, `response`, `tokens_used?`, `tool_calls?` |
| `session_list` | `sessions[]` |
| `session_loaded` | `session_id`, `messages[]` |
| `session_created` | `session_id`, `title` |
| `session_renamed` | `session_id`, `title` |
| `session_archived` | `session_id` |
| `session_deleted` | `session_id` |
| `slash_commands` | `commands[]` |
| `error` | `message` |
| `pong` | - |
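For a feel of what a `type`-discriminated message looks like on the wire, the sketch below hand-serializes three client messages using only the standard library. It assumes a flat object with `type` next to the other fields, which the tables do not confirm; optional fields such as `channel` and `chat_id` are omitted, and `to_json` and `json_escape` are hypothetical helpers.

```rust
/// A few of the client-to-server messages from the table above.
pub enum WsInbound {
    UserInput { content: String },
    LoadSession { session_id: String },
    Ping,
}

/// Escapes the characters that must not appear raw inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl WsInbound {
    /// Value of the `type` discriminator field.
    pub fn type_tag(&self) -> &'static str {
        match self {
            WsInbound::UserInput { .. } => "user_input",
            WsInbound::LoadSession { .. } => "load_session",
            WsInbound::Ping => "ping",
        }
    }

    /// Serializes to a flat JSON object with `type` as the first field.
    pub fn to_json(&self) -> String {
        let tag = self.type_tag();
        match self {
            WsInbound::UserInput { content } => format!(
                r#"{{"type":"{tag}","content":"{}"}}"#,
                json_escape(content)
            ),
            WsInbound::LoadSession { session_id } => format!(
                r#"{{"type":"{tag}","session_id":"{}"}}"#,
                json_escape(session_id)
            ),
            WsInbound::Ping => format!(r#"{{"type":"{tag}"}}"#),
        }
    }
}
```

A real client would normally derive this with a serialization library; the manual version just makes the message shape explicit.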
## HTTP Endpoints
| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/health` | Health check; returns `{"status":"ok","version":"x.y.z"}` |
| `GET` | `/ws` | WebSocket upgrade for chat clients |
## Testing
```bash
# Unit tests (no external dependencies)
cargo test --lib
# Integration tests (require API keys)
cp tests/test.env.example tests/test.env
# Fill in your API keys in tests/test.env
cargo test --test test_integration -- --ignored
cargo test --test test_tool_calling -- --ignored
cargo test --test test_request_format -- --ignored
# Run all tests, including the ignored integration tests
cargo test -- --include-ignored
```
Integration tests are `#[ignore]` by default because they make real API calls.
## Project Structure
```
├── src/
│   ├── main.rs                     # CLI entrypoint (clap-based subcommands)
│   ├── lib.rs                      # Module declarations
│   ├── gateway/                    # HTTP/WS server, GatewayState initialization
│   │   ├── mod.rs
│   │   ├── http.rs                 # Health endpoint
│   │   └── ws.rs                   # WebSocket handler
│   ├── client/                     # TUI chat client
│   │   ├── mod.rs
│   │   └── tui/                    # Ratatui-based terminal UI
│   ├── channels/                   # Channel integrations
│   │   ├── base.rs                 # Channel trait
│   │   ├── cli_chat.rs             # CLI WebSocket channel
│   │   ├── feishu.rs               # Feishu/Lark webhook channel
│   │   ├── manager.rs              # ChannelManager
│   │   └── slash_command.rs        # Slash command parser
│   ├── bus/                        # Async message bus
│   │   ├── mod.rs                  # MessageBus (tokio mpsc channels)
│   │   ├── message.rs              # Message types
│   │   └── dispatcher.rs           # OutboundDispatcher
│   ├── session/                    # Session & dialog management
│   │   ├── mod.rs
│   │   ├── session.rs              # Session, SessionManager
│   │   ├── session_id.rs           # UnifiedSessionId
│   │   ├── commands.rs             # SessionCommand enum
│   │   └── events.rs               # SessionEvent, DialogInfo
│   ├── agent/                      # LLM interaction loop
│   │   ├── mod.rs
│   │   ├── agent_loop.rs           # AgentLoop (stateless)
│   │   ├── context_compressor.rs   # Token estimation & summarization
│   │   └── system_prompt.rs        # System prompt builder
│   ├── providers/                  # LLM API clients
│   │   ├── mod.rs                  # Factory: create_provider()
│   │   ├── traits.rs               # LLMProvider trait
│   │   ├── openai.rs               # OpenAI-compatible client
│   │   └── anthropic.rs            # Anthropic Messages API client
│   ├── tools/                      # Agent tools
│   │   ├── mod.rs                  # create_default_tools()
│   │   ├── registry.rs             # ToolRegistry
│   │   ├── traits.rs               # Tool trait, ToolResult
│   │   ├── schema.rs               # Cross-provider JSON Schema cleaner
│   │   ├── bash.rs                 # Shell command execution
│   │   ├── calculator.rs           # Math expression evaluator
│   │   ├── chat_manager.rs         # Session management tool
│   │   ├── cron.rs                 # Cron job management tools
│   │   ├── file_read.rs            # File reader
│   │   ├── file_write.rs           # File writer
│   │   ├── file_edit.rs            # File editor (string substitution)
│   │   ├── get_skill.rs            # Skill loader tool
│   │   ├── http_request.rs         # HTTP request tool
│   │   ├── send_message.rs         # Cross-channel messaging
│   │   └── web_fetch.rs            # Web page fetcher
│   ├── skills/                     # Skills loading from markdown files
│   │   └── mod.rs                  # SkillsLoader, Skill
│   ├── storage/                    # SQLite persistence
│   │   ├── mod.rs                  # Storage, schema init
│   │   ├── session.rs              # Session CRUD operations
│   │   ├── message.rs              # Message persistence
│   │   ├── scheduler.rs            # ScheduledJob, JobRun storage
│   │   └── error.rs                # StorageError
│   ├── scheduler/                  # Cron scheduler runtime
│   │   ├── mod.rs                  # Scheduler, next_run_for_schedule()
│   │   └── types.rs                # Schedule enum (At/Every/Cron)
│   ├── observability/              # Telemetry observer pattern
│   │   └── mod.rs                  # Observer trait, ObserverEvent, MultiObserver
│   ├── protocol.rs                 # WebSocket message types (WsInbound/WsOutbound)
│   ├── config/                     # Config loading & env substitution
│   │   └── mod.rs                  # Config, LLMProviderConfig, load_env_file()
│   └── logging.rs                  # Tracing subscriber init with file rotation
├── tests/
│   ├── test_integration.rs         # LLM provider integration tests
│   ├── test_tool_calling.rs        # Tool calling integration tests
│   ├── test_request_format.rs      # Request format tests
│   ├── test_scheduler.rs           # Scheduler unit tests
│   ├── test.env.example            # Test environment template
│   └── test.env                    # Actual test keys (gitignored)
├── reference/                      # Third-party reference code (do not modify)
├── config.example.json             # Full config example
└── Cargo.toml
```
## Key Dependencies
| Crate | Purpose |
|-------|---------|
| `axum` + `tokio-tungstenite` | HTTP server & WebSocket |
| `sqlx` (SQLite) | Session/Message/Job persistence |
| `reqwest` (rustls) | LLM API & external HTTP calls |
| `ratatui` + `crossterm` | Terminal UI |
| `clap` | CLI argument parsing |
| `tracing` + `tracing-subscriber` | Structured logging |
| `cron` + `chrono-tz` | Cron schedule parsing |
| `meval` | Mathematical expression evaluation |
| `uuid` | Session/Dialog ID generation |
| `dirs` | Platform config directory resolution |