docs: add memory system design document

xiaoxixi 2026-05-07 22:56:19 +08:00
parent 2fe953cdad
commit 28ef813c37

@ -0,0 +1,226 @@
# PicoBot Memory System Design
Date: 2026-05-07
## 1. Overview
Introduce a memory system that allows PicoBot agents to remember user preferences, project context, facts, and conversation history across sessions. The memory system is **unified with the existing context compression pipeline**: compression automatically produces `timeline` memory entries and advances a `last_consolidated_at` pointer to avoid redundant reprocessing.
### Design Principles
- **Compression is memory** (inspired by nanobot): when old messages are compressed, the summary is persisted — not discarded
- **FTS5 only** (no vector embeddings): keyword search via SQLite FTS5, sufficient for current scale
- **Extend existing infrastructure**: reuse `Storage` connection pool, `ContextCompressor`, `SystemPromptBuilder`
- **YAGNI**: no knowledge graph, no response cache, no namespace isolation, no audit trail
## 2. Core Architecture
```
ContextCompressor (existing)            MemoryManager (new)
    │                                       │
    │ compress_if_needed()                  │ store / recall / forget
    │   ├─ LLM summary → inject             │
    │   └─ store(timeline entry) ───────────┘
    │       └─ advance last_consolidated_at

SystemPromptBuilder ── recall(knowledge, limit=5) ──→ inject into system prompt
AgentLoop ── after_turn ──→ memory_store / memory_recall / memory_forget tools
```
## 3. Memory Categories
| Category | Purpose | Written By | Retrieved By |
|----------|---------|-----------|--------------|
| `knowledge` | Long-term facts, preferences, patterns, insights | Agent via `memory_store` tool | FTS5 → injected into system prompt every turn |
| `timeline` | Compressed conversation summaries | ContextCompressor automatically | FTS5 + time-range queries |
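
A minimal sketch of how the two categories could be modeled, assuming the `MemoryCategory` type listed for `types.rs` is a plain enum stored as the strings `knowledge` and `timeline`. The method names here are illustrative, not a confirmed API.

```rust
use std::fmt;
use std::str::FromStr;

/// The two memory categories from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryCategory {
    /// Long-term facts and preferences written by the agent.
    Knowledge,
    /// Compressed conversation summaries written by the compressor.
    Timeline,
}

impl MemoryCategory {
    /// String form stored in the `category` column (assumed mapping).
    pub fn as_str(self) -> &'static str {
        match self {
            MemoryCategory::Knowledge => "knowledge",
            MemoryCategory::Timeline => "timeline",
        }
    }
}

impl fmt::Display for MemoryCategory {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl FromStr for MemoryCategory {
    type Err = String;

    /// Parses the column value back into the enum, rejecting unknown names.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "knowledge" => Ok(MemoryCategory::Knowledge),
            "timeline" => Ok(MemoryCategory::Timeline),
            other => Err(format!("unknown memory category: {other}")),
        }
    }
}
```

Rejecting unknown strings at the parsing boundary keeps a typo in a tool call from silently creating a third category.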
## 4. Storage Schema
### New table: `memories`
Added to the existing `Storage` initialization in `src/storage/mod.rs`:
```sql
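CREATE TABLE IF NOT EXISTS memories (
    id TEXT PRIMARY KEY,
    key TEXT NOT NULL UNIQUE,
    content TEXT NOT NULL,
    category TEXT NOT NULL DEFAULT 'knowledge',
    importance REAL NOT NULL DEFAULT 0.5,
    session_id TEXT,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
);
CREATE VIRTUAL TABLE IF NOT EXISTS memory_fts USING fts5(
    key,
    content,
    content=memories,
    content_rowid=rowid
);
-- External-content FTS5 tables are not updated automatically. Triggers such as
-- these (or equivalent writes from the CRUD layer) keep the index in sync.
CREATE TRIGGER IF NOT EXISTS memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memory_fts(rowid, key, content) VALUES (new.rowid, new.key, new.content);
END;
CREATE TRIGGER IF NOT EXISTS memories_ad AFTER DELETE ON memories BEGIN
    INSERT INTO memory_fts(memory_fts, rowid, key, content) VALUES ('delete', old.rowid, old.key, old.content);
END;
CREATE TRIGGER IF NOT EXISTS memories_au AFTER UPDATE ON memories BEGIN
    INSERT INTO memory_fts(memory_fts, rowid, key, content) VALUES ('delete', old.rowid, old.key, old.content);
    INSERT INTO memory_fts(rowid, key, content) VALUES (new.rowid, new.key, new.content);
END;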
```
### Modified table: `sessions`
```sql
ALTER TABLE sessions ADD COLUMN last_consolidated_at INTEGER;
```
## 5. Unified Compression-Memory Pipeline
### Trigger Conditions
Compression/consolidation fires when **any** of these conditions is met:
| Condition | Value | Rationale |
|-----------|-------|-----------|
| Token usage exceeds 50% of the context window | `context_window / 2` | Primary trigger: the context is getting full |
| N turns accumulated since the last consolidation | `consolidation_turn_threshold`, default 3 | Catch-up for short messages that never reach the token threshold |
| Session idle | `idle_consolidation_minutes`, default 10 | Important for async channels such as Feishu |
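
The three conditions can be sketched as a single check. This is an illustration under assumptions: `TriggerInput`, `TriggerConfig`, and `Trigger` are hypothetical names, and the order in which conditions are tested is arbitrary because any one of them is sufficient.

```rust
/// Hypothetical snapshot of the values the trigger check needs.
pub struct TriggerInput {
    pub tokens_used: usize,
    pub context_window: usize,
    pub turns_since_consolidation: usize,
    pub idle_secs: u64,
}

/// Thresholds; field names mirror the config keys in this document.
pub struct TriggerConfig {
    pub consolidation_turn_threshold: usize,
    pub idle_consolidation_minutes: u64,
}

/// Which condition fired, useful for logging.
#[derive(Debug, PartialEq, Eq)]
pub enum Trigger {
    TokenBudget,
    TurnCount,
    Idle,
}

/// Returns the first matching trigger, or `None` if nothing fires.
pub fn consolidation_trigger(input: &TriggerInput, cfg: &TriggerConfig) -> Option<Trigger> {
    if input.tokens_used > input.context_window / 2 {
        // Primary trigger: more than half of the context window is used.
        Some(Trigger::TokenBudget)
    } else if input.turns_since_consolidation >= cfg.consolidation_turn_threshold {
        // Catch-up trigger for many short turns.
        Some(Trigger::TurnCount)
    } else if input.idle_secs >= cfg.idle_consolidation_minutes * 60 {
        // Idle trigger for async channels.
        Some(Trigger::Idle)
    } else {
        None
    }
}
```

Returning the reason rather than a bare `bool` makes it easy to log which condition caused a consolidation run.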
### Flow
```
compress_if_needed(history, session_id):
  1. Read last_consolidated_at from session
     → Only compress messages after that timestamp
  2. If no messages to compress → return history unchanged
  3. FTS5 recall(user_input, limit=recall_limit, category=knowledge)
     → Inject relevant facts into system prompt
  4. LLM summarization of old messages → [Context Summary]
     → Inject into current conversation
  5. Store summary as timeline entry:
       key: "ctx_{session_id}_{uuid}"
       content: "[YYYY-MM-DD HH:MM] summary text..."
       category: timeline
  6. UPDATE sessions.last_consolidated_at = now()
  7. Return compressed history
```
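
The flow above can be sketched with in-memory stand-ins. Everything here is an assumption made for illustration: the `Message`, `Session`, and `TimelineEntry` types, the `summarize` closure that replaces the LLM call, and the explicit `now` and `entry_id` arguments that keep the function deterministic. Step 3 (recall) and the date prefix on the stored content are left out.

```rust
/// Hypothetical in-memory stand-ins for a stored message and a session row.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub timestamp: i64, // Unix seconds
    pub text: String,
}

#[derive(Debug, Default)]
pub struct Session {
    pub last_consolidated_at: Option<i64>,
}

#[derive(Debug, PartialEq)]
pub struct TimelineEntry {
    pub key: String,
    pub content: String,
}

/// Steps 1, 2, 4, 5, 6 and 7 of the flow. `summarize` stands in for the LLM call.
pub fn compress_if_needed(
    history: &[Message],
    session: &mut Session,
    session_id: &str,
    entry_id: &str, // stands in for the uuid part of the key
    now: i64,
    summarize: impl Fn(&[Message]) -> Result<String, String>,
    timeline: &mut Vec<TimelineEntry>,
) -> Result<Vec<Message>, String> {
    // 1. Only messages after the pointer are candidates.
    let cutoff = session.last_consolidated_at.unwrap_or(i64::MIN);
    let pending: Vec<Message> =
        history.iter().filter(|m| m.timestamp > cutoff).cloned().collect();
    // 2. Nothing new: return the history unchanged.
    if pending.is_empty() {
        return Ok(history.to_vec());
    }
    // 4. Summarize; on failure the pointer is left untouched.
    let summary = summarize(&pending)?;
    // 5. Persist the summary as a timeline entry.
    timeline.push(TimelineEntry {
        key: format!("ctx_{session_id}_{entry_id}"),
        content: summary.clone(),
    });
    // 6. Advance the pointer so these messages are not reprocessed.
    session.last_consolidated_at = Some(now);
    // 7. Already-consolidated messages plus the new summary.
    let mut compressed: Vec<Message> =
        history.iter().filter(|m| m.timestamp <= cutoff).cloned().collect();
    compressed.push(Message { timestamp: now, text: format!("[Context Summary] {summary}") });
    Ok(compressed)
}
```

Because the pointer only advances after a successful summary, a failed LLM call leaves the same messages eligible for the next attempt.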
### `timeline` Entry Format
Each timeline entry follows nanobot's convention:
```
[2026-05-07 14:30] User asked about Rust async patterns. Discussed tokio::select!,
semaphore-based rate limiting, and backpressure strategies. No code was written.
```
This format is grep-friendly and human-readable.
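
A sketch of producing that prefix from a Unix timestamp using only the standard library. It assumes UTC, which the design does not specify; `format_timeline_entry` and `civil_from_days` are hypothetical helper names, and a date/time crate could replace the manual calendar arithmetic.

```rust
/// Converts days since 1970-01-01 into (year, month, day) in the proleptic
/// Gregorian calendar, using the common "civil from days" arithmetic.
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year eras
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Formats a timeline entry as `[YYYY-MM-DD HH:MM] summary` (UTC assumed).
pub fn format_timeline_entry(unix_secs: i64, summary: &str) -> String {
    let (year, month, day) = civil_from_days(unix_secs.div_euclid(86_400));
    let secs_of_day = unix_secs.rem_euclid(86_400);
    format!(
        "[{:04}-{:02}-{:02} {:02}:{:02}] {}",
        year,
        month,
        day,
        secs_of_day / 3_600,
        (secs_of_day % 3_600) / 60,
        summary
    )
}
```

For example, `format_timeline_entry(1_778_164_200, "User asked about Rust async patterns.")` yields an entry starting with `[2026-05-07 14:30]`.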
## 6. Retrieval Strategy
### Automatic Retrieval (every turn)
`SystemPromptBuilder.build_system_prompt()` calls:
```rust
// Argument order and the `MemoryCategory::Knowledge` variant name are assumed.
let entries = memory.recall(&user_message, Some(MemoryCategory::Knowledge), recall_limit);
```
Results are sorted by FTS5 BM25 score and injected as:
```
## Memory Context
- user_prefers_rust: User prefers Rust for all backend projects
- project_picobot_stack: PicoBot uses Rust, axum, sqlx, ratatui, tokio
- user_workflow: User prefers TDD workflow with cargo test --lib
```
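
A minimal sketch of rendering that section, assuming recalled entries arrive as `(key, content)` pairs already ordered by relevance. `render_memory_context` is a hypothetical helper name.

```rust
/// Renders recalled entries in the `## Memory Context` layout shown above.
/// Returns `None` when there is nothing to inject, so the caller can omit
/// the section entirely.
pub fn render_memory_context(entries: &[(&str, &str)]) -> Option<String> {
    if entries.is_empty() {
        return None;
    }
    let mut out = String::from("## Memory Context\n");
    for (key, content) in entries {
        // One bullet per entry: "- key: content".
        out.push_str(&format!("- {key}: {content}\n"));
    }
    Some(out)
}
```

Returning `None` for an empty recall avoids injecting a bare heading into the system prompt.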
### Agent-Initiated Retrieval
The agent calls the `memory_recall` tool with optional `category`, `since`, `until`, and `limit` parameters.
### Fallback
If FTS5 returns no results, fall back to `LIKE '%keyword%'` on the `key` and `content` columns.
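
One detail of the fallback worth sketching: a keyword containing `%` or `_` would act as a wildcard in `LIKE` unless it is escaped. The helpers below are hypothetical names, and the sketch assumes the query is written with an `ESCAPE '\'` clause and a bound parameter.

```rust
/// Builds a `LIKE` pattern for a keyword, escaping `%`, `_`, and the escape
/// character itself so they match literally.
/// Intended for a query such as `... WHERE content LIKE ?1 ESCAPE '\'`.
pub fn like_pattern(keyword: &str) -> String {
    let mut pattern = String::with_capacity(keyword.len() + 2);
    pattern.push('%');
    for ch in keyword.chars() {
        if matches!(ch, '%' | '_' | '\\') {
            pattern.push('\\');
        }
        pattern.push(ch);
    }
    pattern.push('%');
    pattern
}

/// Runs the primary (FTS5) search and calls the fallback only when the
/// primary search returns nothing.
pub fn recall_with_fallback<T>(
    primary: impl FnOnce() -> Vec<T>,
    fallback: impl FnOnce() -> Vec<T>,
) -> Vec<T> {
    let hits = primary();
    if hits.is_empty() {
        fallback()
    } else {
        hits
    }
}
```

Passing the searches as closures means the slower `LIKE` scan is skipped whenever FTS5 already produced hits.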
## 7. Agent Tools
| Tool | Parameters | Description |
|------|-----------|-------------|
| `memory_store` | `key: str`, `content: str`, `category: str`, `importance?: f64` | Write or update a memory entry. The key is a semantic identifier (e.g., "user_language_pref") |
| `memory_recall` | `query: str`, `category?: str`, `since?: i64`, `until?: i64`, `limit?: usize` | Search memories by keyword and optional filters |
| `memory_forget` | `key: str` | Delete a memory entry by key |
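
The semantics of the three tools can be illustrated with an in-memory stand-in. This is a sketch, not the planned SQLite-backed `MemoryManager`: `InMemoryManager` is a hypothetical type, substring matching replaces FTS5, results are ordered by importance rather than BM25, and the `since`/`until` filters are omitted.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct MemoryEntry {
    pub key: String,
    pub content: String,
    pub category: String,
    pub importance: f64,
}

/// Hypothetical in-memory stand-in for the SQLite-backed manager.
#[derive(Default)]
pub struct InMemoryManager {
    entries: HashMap<String, MemoryEntry>,
}

impl InMemoryManager {
    /// `memory_store`: insert or overwrite by key; importance defaults to 0.5.
    pub fn store(&mut self, key: &str, content: &str, category: &str, importance: Option<f64>) {
        let entry = MemoryEntry {
            key: key.to_string(),
            content: content.to_string(),
            category: category.to_string(),
            importance: importance.unwrap_or(0.5),
        };
        self.entries.insert(key.to_string(), entry);
    }

    /// `memory_recall`: case-insensitive substring match on key and content,
    /// highest importance first, ties broken by key.
    pub fn recall(&self, query: &str, category: Option<&str>, limit: usize) -> Vec<&MemoryEntry> {
        let needle = query.to_lowercase();
        let mut hits: Vec<&MemoryEntry> = self
            .entries
            .values()
            .filter(|e| category.map_or(true, |c| e.category == c))
            .filter(|e| {
                e.key.to_lowercase().contains(&needle)
                    || e.content.to_lowercase().contains(&needle)
            })
            .collect();
        hits.sort_by(|a, b| {
            b.importance.total_cmp(&a.importance).then_with(|| a.key.cmp(&b.key))
        });
        hits.truncate(limit);
        hits
    }

    /// `memory_forget`: returns true if an entry was removed.
    pub fn forget(&mut self, key: &str) -> bool {
        self.entries.remove(key).is_some()
    }
}
```

Storing twice under the same key overwrites the earlier entry, which matches the "write or update" wording and the `UNIQUE` constraint on `key`.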
## 8. Error Handling & Degradation
| Scenario | Strategy |
|----------|----------|
| Consolidation LLM call fails | Log warning, increment failure counter, do NOT block main flow |
| Consecutive failures >= `max_failures_before_degrade` (default 3) | Degrade: append a raw message dump to the timeline with a `[RAW]` prefix, then reset the counter |
| FTS5 recall returns empty | Fall back to a `LIKE '%keyword%'` query |
| `memory.enabled = false` | ContextCompressor works normally, no memory writes |
| MemoryManager uninitialized | ContextCompressor still compresses; the memory write path is gated on the optional `MemoryManager` and is skipped |
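
The failure-counting rows can be sketched as a small state machine. `FailureTracker`, `FailureAction`, and `raw_archive` are hypothetical names, and resetting the counter on success is an assumption implied by "consecutive".

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum FailureAction {
    /// Log a warning and carry on with the main flow.
    LogAndContinue,
    /// Threshold reached: archive the raw messages with a `[RAW]` prefix.
    DegradeToRaw,
}

/// Counts consecutive consolidation failures.
pub struct FailureTracker {
    consecutive: usize,
    max_failures_before_degrade: usize,
}

impl FailureTracker {
    pub fn new(max_failures_before_degrade: usize) -> Self {
        Self { consecutive: 0, max_failures_before_degrade }
    }

    /// A successful consolidation breaks the failure streak (assumed behavior).
    pub fn record_success(&mut self) {
        self.consecutive = 0;
    }

    /// Increments the counter; at the threshold it resets and asks for a raw dump.
    pub fn record_failure(&mut self) -> FailureAction {
        self.consecutive += 1;
        if self.consecutive >= self.max_failures_before_degrade {
            self.consecutive = 0;
            FailureAction::DegradeToRaw
        } else {
            FailureAction::LogAndContinue
        }
    }
}

/// Builds the degraded timeline content: the raw messages behind a `[RAW]` prefix.
pub fn raw_archive(messages: &[&str]) -> String {
    format!("[RAW] {}", messages.join("\n"))
}
```

Neither action returns an error to the caller, which keeps the main agent flow unblocked as the table requires.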
## 9. Configuration
```json
{
  "memory": {
    "enabled": true,
    "consolidation_provider": "openai",
    "consolidation_model": "gpt-4o-mini",
    "recall_limit": 5,
    "consolidation_turn_threshold": 3,
    "idle_consolidation_minutes": 10,
    "timeline_retention_days": 90,
    "max_failures_before_degrade": 3
  }
}
```
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `enabled` | bool | `false` | Master switch for memory system |
| `consolidation_provider` | string | — | Provider name for consolidation LLM calls |
| `consolidation_model` | string | — | Model name for consolidation |
| `recall_limit` | usize | `5` | Max knowledge entries injected into system prompt |
| `consolidation_turn_threshold` | usize | `3` | Turns before forced consolidation |
| `idle_consolidation_minutes` | u64 | `10` | Idle time before consolidation trigger |
| `timeline_retention_days` | u64 | `90` | Auto-cleanup age for timeline entries |
| `max_failures_before_degrade` | usize | `3` | Consecutive failures before raw archive fallback |
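
A sketch of a `MemoryConfig` struct whose `Default` implementation mirrors the table. Modeling the provider and model as `Option<String>` is an assumption, since the table lists no default for them, and deserialization is left out to keep the example standard-library only.

```rust
/// Configuration for the memory system; field names follow the config keys.
#[derive(Debug, Clone, PartialEq)]
pub struct MemoryConfig {
    pub enabled: bool,
    /// No default in the table, modeled here as optional (assumption).
    pub consolidation_provider: Option<String>,
    /// No default in the table, modeled here as optional (assumption).
    pub consolidation_model: Option<String>,
    pub recall_limit: usize,
    pub consolidation_turn_threshold: usize,
    pub idle_consolidation_minutes: u64,
    pub timeline_retention_days: u64,
    pub max_failures_before_degrade: usize,
}

impl Default for MemoryConfig {
    /// Defaults taken from the table above; the system is off unless enabled.
    fn default() -> Self {
        Self {
            enabled: false,
            consolidation_provider: None,
            consolidation_model: None,
            recall_limit: 5,
            consolidation_turn_threshold: 3,
            idle_consolidation_minutes: 10,
            timeline_retention_days: 90,
            max_failures_before_degrade: 3,
        }
    }
}
```

With struct update syntax, a caller can override only what differs, for example `MemoryConfig { enabled: true, ..MemoryConfig::default() }`.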
## 10. New Module Structure
```
src/
├── memory/
│ ├── mod.rs # MemoryManager, MemoryConfig
│ ├── types.rs # MemoryEntry, MemoryCategory, ConsolidationResult
│ └── consolidation.rs # Consolidation prompt + LLM call logic
├── storage/
│ └── memory.rs # SQLite CRUD for memories table + FTS5
├── tools/
│ ├── memory_store.rs # memory_store tool
│ ├── memory_recall.rs # memory_recall tool
│ └── memory_forget.rs # memory_forget tool
```
## 11. Integration Points (Existing Files Modified)
| File | Change |
|------|--------|
| `src/lib.rs` | Add `pub mod memory;` |
| `src/config/mod.rs` | Add `MemoryConfig` struct and deserialization |
| `src/storage/mod.rs` | Add `pub mod memory;`, init `memories` table and FTS5 in `init_schema()` |
| `src/storage/session.rs` | Add `last_consolidated_at` column read/write |
| `src/session/session.rs` | Add `last_consolidated_at: Option<i64>` field to Session |
| `src/agent/context_compressor.rs` | Add `memory: Option<Arc<MemoryManager>>` field, write timeline on compress |
| `src/agent/system_prompt.rs` | Add `memory_context` section via `MemoryManager::recall()` |
| `src/agent/agent_loop.rs` | No changes (tools registered via ToolRegistry) |
| `src/tools/mod.rs` | Register `memory_store`, `memory_recall`, `memory_forget` in `create_default_tools()` |
| `src/gateway/mod.rs` | Initialize `MemoryManager` in `GatewayState::new()`, pass to ContextCompressor |
## 12. Implementation Order
| # | Task | Dependencies |
|---|------|-------------|
| 1 | Types: `MemoryEntry`, `MemoryCategory`, `ConsolidationResult` | — |
| 2 | Config: `MemoryConfig` + deserialization | — |
| 3 | Storage: `memories` table + FTS5 + CRUD + search | #1 |
| 4 | `MemoryManager` API | #1, #2, #3 |
| 5 | Session: `last_consolidated_at` field | — |
| 6 | `ContextCompressor` memory integration | #4, #5 |
| 7 | `SystemPromptBuilder` memory context injection | #4 |
| 8 | Agent tools: `memory_store`, `memory_recall`, `memory_forget` | #4 |
| 9 | `GatewayState` initialization wiring | #4, #5, #6 |
| 10 | Unit tests | #1-#9 |