PicoBot

A multi-channel AI agent framework with a WebSocket gateway and TUI client, supporting OpenAI-compatible and Anthropic LLM providers, tool calling, session persistence, and cron-based scheduling.

System Architecture

graph TB
    subgraph Clients
        TUI["🖥️ CLI Chat (TUI)"]
        FS["📱 Feishu/Lark"]
    end

    subgraph Gateway["Gateway Server (127.0.0.1:19876)"]
        HTTP["HTTP Endpoints<br/>GET /health<br/>GET /ws (WebSocket upgrade)"]
        WS["WebSocket Handler"]
        CD["ChannelManager"]
        SP["SessionManager"]
        AL["AgentLoop"]
    end

    subgraph Bus["MessageBus"]
        IB["Inbound Channel"]
        OB["Outbound Channel"]
        CC["Control Channel"]
    end

    subgraph Storage
        SQLite[("SQLite<br/>picobot.db")]
    end

    subgraph AI["AI Providers"]
        OpenAI["OpenAI / DashScope"]
        Anthropic["Anthropic Claude"]
    end

    TUI <-->|WebSocket| WS
    FS <-->|Webhook| HTTP

    CD -->|InboundMessage| IB
    IB -->|DialogEvent| SP
    CC -->|ControlMessage| SP
    SP <--> AL
    AL -->|API Call| OpenAI
    AL -->|API Call| Anthropic
    AL -->|Tool Call| Tools
    SP -->|OutboundMessage| OB
    OB --> CD
    SP --> SQLite
    Tools --> SQLite

    subgraph Tools
        Bash["Bash"]
        FileIO["File Read/Write/Edit"]
        Web["HTTP Request / Web Fetch"]
        Calc["Calculator"]
        Skill["Get Skill"]
        Msg["Send Message"]
        Cron["Cron Jobs"]
    end

Core Data Flow

sequenceDiagram
    participant Channel as Channel<br/>(CLI/Feishu)
    participant Bus as MessageBus
    participant SM as SessionManager
    participant AL as AgentLoop
    participant LLM as LLM Provider
    participant Tool as Tools

    Channel->>Bus: InboundMessage (user input)
    Bus->>SM: DialogEvent
    SM->>SM: Load/Resolve Session
    SM->>AL: Process (session state)
    AL->>LLM: ChatCompletionRequest
    LLM-->>AL: response / tool_calls
    alt Tool Calls
        AL->>Tool: execute tool
        Tool-->>AL: result
        AL->>LLM: continue with tool result
    end
    AL-->>SM: AgentProcessResult (text + token count)
    SM->>SM: Persist to SQLite
    SM->>Bus: OutboundMessage
    Bus->>Channel: response to user
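
The tool-call branch of the diagram above can be sketched as a bounded loop. This is a minimal illustration and not the actual AgentLoop code: `LlmReply`, `ToolFn`, `run_turn`, `mock_llm`, and `add_tool` are hypothetical names, the model is a mock, and the transcript is a plain list of strings. Only the `max_tool_iterations` cap comes from the configuration described in this README.

```rust
use std::collections::HashMap;

/// What the (mock) model returns on each call. Hypothetical type.
enum LlmReply {
    Text(String),
    ToolCall { name: String, input: String },
}

/// Hypothetical tool signature: string input, string result.
type ToolFn = fn(&str) -> String;

/// Runs one user turn: call the model, execute any requested tool, feed the
/// result back, and stop on a text reply or once the iteration cap is hit.
fn run_turn(
    mut llm: impl FnMut(&[String]) -> LlmReply,
    tools: &HashMap<String, ToolFn>,
    user_input: &str,
    max_tool_iterations: usize,
) -> Result<String, String> {
    let mut transcript = vec![format!("user: {user_input}")];
    let mut iterations = 0;
    loop {
        match llm(&transcript[..]) {
            LlmReply::Text(text) => return Ok(text),
            LlmReply::ToolCall { name, input } => {
                if iterations == max_tool_iterations {
                    return Err(format!("exceeded {max_tool_iterations} tool iterations"));
                }
                iterations += 1;
                let result = match tools.get(&name) {
                    Some(tool) => tool(input.as_str()),
                    None => format!("error: unknown tool {name}"),
                };
                // The tool result becomes part of the next model request.
                transcript.push(format!("tool {name}: {result}"));
            }
        }
    }
}

/// Toy calculator: sums integers separated by '+'.
fn add_tool(input: &str) -> String {
    let sum: i64 = input
        .split('+')
        .filter_map(|part| part.trim().parse::<i64>().ok())
        .sum();
    sum.to_string()
}

/// Mock model: requests the calculator once, then answers with its result.
fn mock_llm(transcript: &[String]) -> LlmReply {
    let prefix = "tool calculator: ";
    match transcript.last() {
        Some(last) if last.starts_with(prefix) => {
            LlmReply::Text(format!("The answer is {}", &last[prefix.len()..]))
        }
        _ => LlmReply::ToolCall {
            name: "calculator".to_string(),
            input: "2+3".to_string(),
        },
    }
}

fn main() {
    let mut tools: HashMap<String, ToolFn> = HashMap::new();
    tools.insert("calculator".to_string(), add_tool as ToolFn);
    let reply = run_turn(mock_llm, &tools, "what is 2+3?", 99);
    println!("{reply:?}");
}
```

Returning an error once the cap is reached is one possible policy; the real loop may instead return a partial answer.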

Features

Multi-Channel Support

  • CLI Chat Client — Full TUI with session management, Markdown rendering, slash commands
  • Feishu (Lark) — Webhook-based integration with typing indicators and media support

Multi-Provider LLM

  • OpenAI-compatible API (GPT-4, DashScope, Volcengine, etc.)
  • Anthropic Messages API (Claude)
  • Cross-provider JSON Schema normalization for tool calling compatibility

Session Management

  • Multi-session conversations per channel/chat
  • Create, switch, rename, archive, delete dialogs via slash commands or WebSocket
  • SQLite-persisted session history with automatic TTL-based cleanup
  • Context compression for long conversations approaching token limits
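
As a rough illustration of the compression bullet above, the sketch below estimates tokens with a four-characters-per-token heuristic and, once the history reaches 80% of the limit, replaces older messages with a placeholder summary. The heuristic, the 80% threshold, the `keep_recent` parameter, and the function names are all assumptions; the real compressor presumably asks the LLM for an actual summary.

```rust
/// Rough token estimate: about four characters per token (an assumption).
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Keeps the `keep_recent` newest messages and collapses the rest into a
/// single placeholder line once the estimate reaches 80% of `token_limit`.
fn compress(messages: Vec<String>, token_limit: usize, keep_recent: usize) -> Vec<String> {
    let total: usize = messages.iter().map(|m| estimate_tokens(m)).sum();
    let near_limit = total * 10 >= token_limit * 8;
    if !near_limit || messages.len() <= keep_recent {
        return messages;
    }
    let dropped = messages.len() - keep_recent;
    // A real implementation would store an LLM-written summary here.
    let mut out = vec![format!("[summary of {dropped} earlier messages]")];
    out.extend(messages.into_iter().skip(dropped));
    out
}

fn main() {
    let history: Vec<String> = (0..10).map(|_| "x".repeat(40)).collect();
    let compressed = compress(history, 100, 2);
    println!("{} messages after compression", compressed.len());
}
```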

Tool System

| Tool | Description |
| --- | --- |
| bash | Execute shell commands in workspace |
| file_read | Read file contents |
| file_write | Create/overwrite files |
| file_edit | Precise string substitution in files |
| http_request | Make HTTP API requests |
| web_fetch | Fetch and parse web pages |
| calculator | Evaluate mathematical expressions |
| get_skill | Load agent skills from local skill files |
| send_message | Send messages to other channels |
| cron_add/list/remove/enable/disable/update | Manage scheduled jobs |

Scheduling

  • Cron-based recurring jobs with optional timezone support
  • One-shot (at) and interval (every) schedules
  • Jobs trigger agent processing via specified channel/chat
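
The one-shot and interval schedule kinds can be illustrated with a small next-run calculation. This is a sketch under stated assumptions, not the scheduler's own next_run_for_schedule(): times are plain Unix seconds, the `anchor` field and the `next_run` function are hypothetical, and cron expressions are left out because evaluating them needs a parser.

```rust
/// Two of the schedule kinds listed above; times are Unix seconds.
enum Schedule {
    /// One-shot: fire once at a fixed instant.
    At(u64),
    /// Interval: fire every `interval_secs`, counted from `anchor`.
    Every { interval_secs: u64, anchor: u64 },
}

/// Returns the next firing time strictly after `now`, or None if the
/// schedule will never fire again.
fn next_run(schedule: &Schedule, now: u64) -> Option<u64> {
    match schedule {
        Schedule::At(at) => {
            if *at > now {
                Some(*at)
            } else {
                None
            }
        }
        Schedule::Every { interval_secs, anchor } => {
            let (interval, anchor) = (*interval_secs, *anchor);
            if interval == 0 {
                return None;
            }
            if now < anchor {
                return Some(anchor);
            }
            // Number of whole intervals already elapsed, plus one.
            let ticks = (now - anchor) / interval + 1;
            Some(anchor + ticks * interval)
        }
    }
}

fn main() {
    let every_minute = Schedule::Every { interval_secs: 60, anchor: 0 };
    println!("{:?}", next_run(&every_minute, 125));
}
```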

Skills System

  • Load Markdown skill files from ~/.picobot/skills and ~/.agents/skills
  • Skills inject specialized system prompts for specific tasks
  • Automatic hot-reload on file changes

Observability

  • Observer pattern for agent and tool telemetry
  • Events: AgentStart, AgentEnd, ToolCallStart, ToolCall
  • Structured JSON logging with file rotation
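
The observer pattern named above could look roughly like the following. The event names are taken from the list; their payload fields, the `on_event` method, and the `Recorder` type are assumptions made for illustration, and the crate's real trait may differ.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Event names follow the list above; payload fields are assumed.
#[derive(Debug, Clone, PartialEq)]
enum ObserverEvent {
    AgentStart,
    AgentEnd,
    ToolCallStart { tool: String },
    ToolCall { tool: String, success: bool },
}

/// Hypothetical observer interface.
trait Observer {
    fn on_event(&self, event: &ObserverEvent);
}

/// Fans each event out to every registered observer.
struct MultiObserver {
    observers: Vec<Box<dyn Observer>>,
}

impl MultiObserver {
    fn new() -> Self {
        MultiObserver { observers: Vec::new() }
    }

    fn add(&mut self, observer: Box<dyn Observer>) {
        self.observers.push(observer);
    }
}

impl Observer for MultiObserver {
    fn on_event(&self, event: &ObserverEvent) {
        for observer in &self.observers {
            observer.on_event(event);
        }
    }
}

/// Test observer that stores every event it sees in a shared log.
struct Recorder {
    events: Rc<RefCell<Vec<ObserverEvent>>>,
}

impl Observer for Recorder {
    fn on_event(&self, event: &ObserverEvent) {
        self.events.borrow_mut().push(event.clone());
    }
}

fn main() {
    let log: Rc<RefCell<Vec<ObserverEvent>>> = Rc::new(RefCell::new(Vec::new()));
    let mut multi = MultiObserver::new();
    multi.add(Box::new(Recorder { events: Rc::clone(&log) }));

    multi.on_event(&ObserverEvent::AgentStart);
    multi.on_event(&ObserverEvent::ToolCallStart { tool: "bash".to_string() });
    multi.on_event(&ObserverEvent::ToolCall { tool: "bash".to_string(), success: true });
    multi.on_event(&ObserverEvent::AgentEnd);
    println!("{:?}", log.borrow());
}
```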

Quick Start

Prerequisites

  • Rust nightly (edition 2024) — use rustup to install

Build

cargo build

Configure

  1. Create config.json (or ~/.picobot/config.json):
{
    "providers": {
        "openai": {
            "type": "openai",
            "base_url": "https://api.openai.com/v1",
            "api_key": "<OPENAI_API_KEY>"
        }
    },
    "models": {
        "gpt-4o": {
            "model_id": "gpt-4o",
            "temperature": 0.7,
            "max_tokens": 4096
        }
    },
    "agents": {
        "default": {
            "provider": "openai",
            "model": "gpt-4o",
            "max_tool_iterations": 99,
            "token_limit": 128000
        }
    }
}
  2. Set API keys via .env file (one KEY=VALUE per line):
OPENAI_API_KEY=sk-xxxxx

Run

Start gateway server:

cargo run -- gateway

Binds 127.0.0.1:19876 by default. Override with --host and --port.

Connect CLI client:

cargo run -- chat

Connects to ws://127.0.0.1:19876/ws. Override with --gateway-url.

Configuration Reference

Config load order: ~/.picobot/config.json → ./config.json (fallback).
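
A minimal sketch of that lookup, assuming the home-directory file wins and the working-directory file is only a fallback. `resolve_config` is a hypothetical helper, and the existence check is passed in as a closure so the logic can be exercised without touching the file system. The project lists the dirs crate for directory resolution; the HOME variable is used here only to keep the example dependency-free.

```rust
use std::path::{Path, PathBuf};

/// Returns the first candidate config path for which `exists` is true:
/// `<home>/.picobot/config.json`, then `<cwd>/config.json`.
fn resolve_config(home: &Path, cwd: &Path, exists: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    let candidates = vec![
        home.join(".picobot").join("config.json"),
        cwd.join("config.json"),
    ];
    candidates.into_iter().find(|path| exists(path.as_path()))
}

fn main() {
    // HOME is a stand-in for a proper platform lookup.
    let home = std::env::var_os("HOME").map(PathBuf::from).unwrap_or_default();
    let cwd = std::env::current_dir().unwrap_or_default();
    match resolve_config(&home, &cwd, |path| path.is_file()) {
        Some(path) => println!("using {}", path.display()),
        None => println!("no config.json found"),
    }
}
```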

Full Config Structure

graph LR
    Config["config.json"]
    Config --> Providers["providers<br/>ProviderConfig{}"]
    Config --> Models["models<br/>ModelConfig{}"]
    Config --> Agents["agents<br/>AgentConfig{}"]
    Config --> Gateway["gateway<br/>GatewayConfig"]
    Config --> Client["client<br/>ClientConfig"]
    Config --> Channels["channels<br/>ChannelConfig{}"]
    Config --> Workspace["workspace_dir"]

    Providers --> PT["type (openai / anthropic)<br/>base_url<br/>api_key<br/>extra_headers"]
    Models --> MT["model_id<br/>temperature<br/>max_tokens"]
    Agents --> AT["provider (ref)<br/>model (ref)<br/>max_tool_iterations<br/>token_limit"]
    Gateway --> GT["host / port<br/>session_db_path<br/>scheduler"]
    Channels --> CT["feishu: app_id, app_secret<br/>allow_from, agent, media_dir"]

Environment Variables

The .env file in the working directory is loaded manually (not via the dotenv crate). Placeholders in config.json written as <VAR_NAME> are substituted at load time.
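
The two steps can be sketched as follows. This is an illustration rather than the project's load_env_file(): `parse_env` and `substitute` are hypothetical names, skipping `#` comment lines is an assumption, and leaving unknown placeholders untouched is one possible policy.

```rust
use std::collections::HashMap;

/// Parses `.env`-style text: one KEY=VALUE per line. Blank lines and lines
/// starting with '#' are skipped (the comment rule is an assumption).
fn parse_env(text: &str) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            vars.insert(key.trim().to_string(), value.trim().to_string());
        }
    }
    vars
}

/// Replaces every `<VAR_NAME>` whose name is a known variable.
/// Unknown placeholders are copied through unchanged.
fn substitute(config: &str, vars: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(config.len());
    let mut rest = config;
    while let Some(start) = rest.find('<') {
        out.push_str(&rest[..start]);
        let after = &rest[start + 1..];
        // Look for the closing '>' and check whether the name is defined.
        let replaced = after
            .find('>')
            .and_then(|end| vars.get(&after[..end]).map(|value| (value, end)));
        match replaced {
            Some((value, end)) => {
                out.push_str(value);
                rest = &after[end + 1..];
            }
            None => {
                out.push('<');
                rest = after;
            }
        }
    }
    out.push_str(rest);
    out
}

fn main() {
    let vars = parse_env("OPENAI_API_KEY=sk-xxxxx\n");
    let config = r#"{"api_key": "<OPENAI_API_KEY>"}"#;
    println!("{}", substitute(config, &vars));
}
```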

Gateway Config

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| host | string | 127.0.0.1 | Bind address |
| port | u16 | 19876 | Listen port |
| session_db_path | string | workspace picobot.db | SQLite database path |
| scheduler.enabled | bool | false | Enable cron scheduler |

Agent Config

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| provider | string | | Provider name (key in providers) |
| model | string | | Model name (key in models) |
| max_tool_iterations | number | 99 | Max tool call iterations per turn |
| token_limit | number | 128000 | Context window token limit |

Slash Commands

Available in CLI chat and Feishu:

| Command | Alias | Description |
| --- | --- | --- |
| /new | /刷新 | Create a new dialog |
| /list | /对话列表 | List all dialogs |
| /switch <id> | | Switch to a dialog |
| /rename <title> | | Rename current dialog |
| /archive | | Archive current dialog |
| /delete | | Delete current dialog |
| /clear | /清空 | Clear current dialog history |
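
A parser for these commands might look like the sketch below. The `SlashCommand` enum and `parse_slash_command` function are hypothetical and not the contents of slash_command.rs; treating a missing `<id>` or `<title>` argument as "not a command" is an assumption.

```rust
#[derive(Debug, PartialEq)]
enum SlashCommand {
    New,
    List,
    Switch(String),
    Rename(String),
    Archive,
    Delete,
    Clear,
}

/// Parses one line of user input. Returns None for ordinary chat text,
/// unknown commands, and commands missing a required argument.
fn parse_slash_command(input: &str) -> Option<SlashCommand> {
    let input = input.trim();
    if !input.starts_with('/') {
        return None;
    }
    // Split the command name from the rest of the line.
    let (name, arg) = match input.split_once(char::is_whitespace) {
        Some((name, rest)) => (name, rest.trim()),
        None => (input, ""),
    };
    match name {
        "/new" | "/刷新" => Some(SlashCommand::New),
        "/list" | "/对话列表" => Some(SlashCommand::List),
        "/switch" if !arg.is_empty() => Some(SlashCommand::Switch(arg.to_string())),
        "/rename" if !arg.is_empty() => Some(SlashCommand::Rename(arg.to_string())),
        "/archive" => Some(SlashCommand::Archive),
        "/delete" => Some(SlashCommand::Delete),
        "/clear" | "/清空" => Some(SlashCommand::Clear),
        _ => None,
    }
}

fn main() {
    for line in ["/new", "/switch abc123", "/rename Trip planning", "hello"] {
        println!("{line:?} -> {:?}", parse_slash_command(line));
    }
}
```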

WebSocket Protocol

The gateway exposes a WebSocket endpoint at /ws. Messages use typed JSON with a type discriminator field.

Client → Server (WsInbound)

| Type | Fields |
| --- | --- |
| user_input | content, channel?, chat_id?, sender_id? |
| create_session | title? |
| list_sessions | include_archived |
| load_session | session_id |
| rename_session | session_id?, title |
| archive_session | session_id? |
| delete_session | session_id? |
| clear_history | chat_id?, session_id? |
| get_slash_commands | |
| ping | |

Server → Client (WsOutbound)

| Type | Fields |
| --- | --- |
| assistant_response | session_id, response, tokens_used?, tool_calls? |
| session_list | sessions[] |
| session_loaded | session_id, messages[] |
| session_created | session_id, title |
| session_renamed | session_id, title |
| session_archived | session_id |
| session_deleted | session_id |
| slash_commands | commands[] |
| error | message |
| pong | |
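
To make the type discriminator concrete, the sketch below encodes three client messages by hand. The message types and field names come from the tables above; the enum layout, the `to_json` method, and the `json_string` helper are assumptions. The project most likely derives this encoding with a JSON library, and the encoder here is written by hand only to stay dependency-free.

```rust
/// A subset of the client-to-server messages listed above.
enum WsInbound {
    UserInput { content: String, chat_id: Option<String> },
    LoadSession { session_id: String },
    Ping,
}

/// Encodes a Rust string as a JSON string literal.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

impl WsInbound {
    /// Serializes the message with its `type` discriminator first.
    fn to_json(&self) -> String {
        match self {
            WsInbound::UserInput { content, chat_id } => {
                let mut out = format!(
                    r#"{{"type":"user_input","content":{}"#,
                    json_string(content)
                );
                // Optional fields are omitted when absent.
                if let Some(id) = chat_id {
                    out.push_str(&format!(r#","chat_id":{}"#, json_string(id)));
                }
                out.push('}');
                out
            }
            WsInbound::LoadSession { session_id } => format!(
                r#"{{"type":"load_session","session_id":{}}}"#,
                json_string(session_id)
            ),
            WsInbound::Ping => r#"{"type":"ping"}"#.to_string(),
        }
    }
}

fn main() {
    let msg = WsInbound::UserInput {
        content: "hello".to_string(),
        chat_id: None,
    };
    println!("{}", msg.to_json());
    println!("{}", WsInbound::Ping.to_json());
}
```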

HTTP Endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Health check; returns {"status":"ok","version":"x.y.z"} |
| GET | /ws | WebSocket upgrade for chat clients |

Testing

# Unit tests (no external dependencies)
cargo test --lib

# Integration tests (require API keys)
cp tests/test.env.example tests/test.env
# Fill in your API keys in tests/test.env
cargo test --test test_integration -- --ignored
cargo test --test test_tool_calling -- --ignored
cargo test --test test_request_format -- --ignored

# Run all tests, including the ignored integration tests
cargo test -- --include-ignored

Integration tests are #[ignore] by default because they make real API calls.

Project Structure

├── src/
│   ├── main.rs              # CLI entrypoint (clap-based subcommands)
│   ├── lib.rs                # Module declarations
│   ├── gateway/              # HTTP/WS server, GatewayState initialization
│   │   ├── mod.rs
│   │   ├── http.rs           # Health endpoint
│   │   └── ws.rs             # WebSocket handler
│   ├── client/               # TUI chat client
│   │   ├── mod.rs
│   │   └── tui/              # Ratatui-based terminal UI
│   ├── channels/             # Channel integrations
│   │   ├── base.rs           # Channel trait
│   │   ├── cli_chat.rs       # CLI WebSocket channel
│   │   ├── feishu.rs         # Feishu/Lark webhook channel
│   │   ├── manager.rs        # ChannelManager
│   │   └── slash_command.rs  # Slash command parser
│   ├── bus/                  # Async message bus
│   │   ├── mod.rs            # MessageBus (tokio mpsc channels)
│   │   ├── message.rs        # Message types
│   │   └── dispatcher.rs     # OutboundDispatcher
│   ├── session/              # Session & dialog management
│   │   ├── mod.rs
│   │   ├── session.rs        # Session, SessionManager
│   │   ├── session_id.rs     # UnifiedSessionId
│   │   ├── commands.rs       # SessionCommand enum
│   │   └── events.rs         # SessionEvent, DialogInfo
│   ├── agent/                # LLM interaction loop
│   │   ├── mod.rs
│   │   ├── agent_loop.rs     # AgentLoop (stateless)
│   │   ├── context_compressor.rs  # Token estimation & summarization
│   │   └── system_prompt.rs  # System prompt builder
│   ├── providers/            # LLM API clients
│   │   ├── mod.rs            # Factory: create_provider()
│   │   ├── traits.rs         # LLMProvider trait
│   │   ├── openai.rs         # OpenAI-compatible client
│   │   └── anthropic.rs      # Anthropic Messages API client
│   ├── tools/                # Agent tools
│   │   ├── mod.rs            # create_default_tools()
│   │   ├── registry.rs       # ToolRegistry
│   │   ├── traits.rs         # Tool trait, ToolResult
│   │   ├── schema.rs         # Cross-provider JSON Schema cleaner
│   │   ├── bash.rs           # Shell command execution
│   │   ├── calculator.rs     # Math expression evaluator
│   │   ├── chat_manager.rs   # Session management tool
│   │   ├── cron.rs           # Cron job management tools
│   │   ├── file_read.rs      # File reader
│   │   ├── file_write.rs     # File writer
│   │   ├── file_edit.rs      # File editor (string substitution)
│   │   ├── get_skill.rs      # Skill loader tool
│   │   ├── http_request.rs   # HTTP request tool
│   │   ├── send_message.rs   # Cross-channel messaging
│   │   └── web_fetch.rs      # Web page fetcher
│   ├── skills/               # Skills loading from markdown files
│   │   └── mod.rs            # SkillsLoader, Skill
│   ├── storage/              # SQLite persistence
│   │   ├── mod.rs            # Storage, schema init
│   │   ├── session.rs        # Session CRUD operations
│   │   ├── message.rs        # Message persistence
│   │   ├── scheduler.rs      # ScheduledJob, JobRun storage
│   │   └── error.rs          # StorageError
│   ├── scheduler/            # Cron scheduler runtime
│   │   ├── mod.rs            # Scheduler, next_run_for_schedule()
│   │   └── types.rs          # Schedule enum (At/Every/Cron)
│   ├── observability/        # Telemetry observer pattern
│   │   └── mod.rs            # Observer trait, ObserverEvent, MultiObserver
│   ├── protocol.rs           # WebSocket message types (WsInbound/WsOutbound)
│   ├── config/               # Config loading & env substitution
│   │   └── mod.rs            # Config, LLMProviderConfig, load_env_file()
│   └── logging.rs            # Tracing subscriber init with file rotation
├── tests/
│   ├── test_integration.rs   # LLM provider integration tests
│   ├── test_tool_calling.rs  # Tool calling integration tests
│   ├── test_request_format.rs # Request format tests
│   ├── test_scheduler.rs     # Scheduler unit tests
│   ├── test.env.example      # Test environment template
│   └── test.env              # Actual test keys (gitignored)
├── reference/                # Third-party reference code (do not modify)
├── resources/                 # Assets embedded in binary
│   └── templates/             # Templates released to ~/.picobot/ on first run
├── config.example.json       # Full config example
└── Cargo.toml

Key Dependencies

| Crate | Purpose |
| --- | --- |
| axum + tokio-tungstenite | HTTP server & WebSocket |
| sqlx (SQLite) | Session/Message/Job persistence |
| reqwest (rustls) | LLM API & external HTTP calls |
| ratatui + crossterm | Terminal UI |
| clap | CLI argument parsing |
| tracing + tracing-subscriber | Structured logging |
| cron + chrono-tz | Cron schedule parsing |
| meval | Mathematical expression evaluation |
| uuid | Session/Dialog ID generation |
| dirs | Platform config directory resolution |