From 28ef813c3790318211e59dbc0f8a117ad993fe93 Mon Sep 17 00:00:00 2001
From: xiaoxixi
Date: Thu, 7 May 2026 22:56:19 +0800
Subject: [PATCH] docs: add memory system design document

---
 docs/plans/2026-05-07-memory-system-design.md | 226 ++++++++++++++++++
 1 file changed, 226 insertions(+)
 create mode 100644 docs/plans/2026-05-07-memory-system-design.md

diff --git a/docs/plans/2026-05-07-memory-system-design.md b/docs/plans/2026-05-07-memory-system-design.md
new file mode 100644
index 0000000..cb53249
--- /dev/null
+++ b/docs/plans/2026-05-07-memory-system-design.md
@@ -0,0 +1,226 @@
+# PicoBot Memory System Design
+
+Date: 2026-05-07
+
+## 1. Overview
+
+Introduce a memory system that allows PicoBot agents to remember user preferences, project context, facts, and conversation history across sessions. The memory system is **unified with the existing context compression pipeline**: compression automatically produces `timeline` memory entries and advances a `last_consolidated_at` pointer to avoid redundant reprocessing.
+
+### Design Principles
+
+- **Compression is memory** (inspired by nanobot): when old messages are compressed, the summary is persisted, not discarded
+- **FTS5 only** (no vector embeddings): keyword search via SQLite FTS5, sufficient for current scale
+- **Extend existing infrastructure**: reuse the `Storage` connection pool, `ContextCompressor`, and `SystemPromptBuilder`
+- **YAGNI**: no knowledge graph, no response cache, no namespace isolation, no audit trail
+
+## 2. Core Architecture
+
+```
+ContextCompressor (existing)         MemoryManager (new)
+       │                                   │
+       │  compress_if_needed()             │  store / recall / forget
+       │    ├─ LLM summary → inject        │
+       │    └─ store(timeline entry) ──────┘
+       │         └─ advance last_consolidated_at
+       │
+SystemPromptBuilder ── recall(knowledge, limit=5) ──→ inject into system prompt
+AgentLoop ── after_turn ──→ memory_store / memory_recall / memory_forget tools
+```
+
+## 3. Memory Categories
+
+| Category | Purpose | Written By | Retrieved By |
+|----------|---------|-----------|--------------|
+| `knowledge` | Long-term facts, preferences, patterns, insights | Agent via `memory_store` tool | FTS5 → injected into system prompt every turn |
+| `timeline` | Compressed conversation summaries | ContextCompressor automatically | FTS5 + time-range queries |
+
+## 4. Storage Schema
+
+### New table: `memories`
+
+Added to the existing `Storage` initialization in `src/storage/mod.rs`:
+
+```sql
+CREATE TABLE IF NOT EXISTS memories (
+    id TEXT PRIMARY KEY,
+    key TEXT NOT NULL UNIQUE,
+    content TEXT NOT NULL,
+    category TEXT NOT NULL DEFAULT 'knowledge',
+    importance REAL NOT NULL DEFAULT 0.5,
+    session_id TEXT,
+    created_at TEXT NOT NULL,
+    updated_at TEXT NOT NULL
+);
+
+CREATE VIRTUAL TABLE IF NOT EXISTS memory_fts USING fts5(
+    key,
+    content,
+    content=memories,
+    content_rowid=rowid
+);
+```
+
+### Modified table: `sessions`
+
+```sql
+ALTER TABLE sessions ADD COLUMN last_consolidated_at INTEGER;
+```
+
+## 5. Unified Compression-Memory Pipeline
+
+### Trigger Conditions
+
+Compression/consolidation fires when **any** of these conditions is met:
+
+| Condition | Value | Rationale |
+|-----------|-------|-----------|
+| Token usage exceeds 50% of the context window | `context_window / 2` | Primary trigger: context is getting full |
+| N turns accumulated without consolidation | 3 (configurable via `consolidation_turn_threshold`) | Catch-up for short messages that never hit the token threshold |
+| Session idle | 10 minutes (configurable via `idle_consolidation_minutes`) | Important for async channels like Feishu |
+
+### Flow
+
+```
+compress_if_needed(history, session_id):
+  1. Read last_consolidated_at from session
+     → Only compress messages after that timestamp
+  2. If no messages to compress → return history unchanged
+  3. FTS5 recall(user_input, limit=recall_limit, category=knowledge)
+     → Inject relevant facts into system prompt
+  4. LLM summarization of old messages → [Context Summary]
+     → Inject into current conversation
+  5. Store summary as timeline entry:
+       key: "ctx_{session_id}_{uuid}"
+       content: "[YYYY-MM-DD HH:MM] summary text..."
+       category: timeline
+  6. UPDATE sessions.last_consolidated_at = now()
+  7. Return compressed history
+```
+
+### `timeline` Entry Format
+
+Each timeline entry follows nanobot's convention:
+```
+[2026-05-07 14:30] User asked about Rust async patterns. Discussed tokio::select!,
+semaphore-based rate limiting, and backpressure strategies. No code was written.
+```
+
+This format is grep-friendly and human-readable.
+
+## 6. Retrieval Strategy
+
+### Automatic Retrieval (every turn)
+
+`SystemPromptBuilder.build_system_prompt()` calls:
+```rust
+memory.recall(query=user_message, limit=recall_limit, category=knowledge)
+```
+
+Results are sorted by FTS5 BM25 score and injected as:
+```
+## Memory Context
+
+- user_prefers_rust: User prefers Rust for all backend projects
+- project_picobot_stack: PicoBot uses Rust, axum, sqlx, ratatui, tokio
+- user_workflow: User prefers TDD workflow with cargo test --lib
+```
+
+### Agent-Initiated Retrieval
+
+The agent uses the `memory_recall` tool with optional `category`, `since`, and `until` parameters.
+
+### Fallback
+
+If FTS5 returns empty results, fall back to a `LIKE '%keyword%'` query on the `key` and `content` columns.
+
+## 7. Agent Tools
+
+| Tool | Parameters | Description |
+|------|-----------|-------------|
+| `memory_store` | `key: str`, `content: str`, `category: str`, `importance?: f64` | Write or update a memory entry. The key is a semantic identifier (e.g., "user_language_pref") |
+| `memory_recall` | `query: str`, `category?: str`, `since?: i64`, `until?: i64`, `limit?: usize` | Search memories by keyword and optional filters |
+| `memory_forget` | `key: str` | Delete a memory entry by key |
+
+## 8. Error Handling & Degradation
+
+| Scenario | Strategy |
+|----------|----------|
+| Consolidation LLM call fails | Log a warning, increment the failure counter, do NOT block the main flow |
+| Consecutive failures >= `max_failures_before_degrade` (default 3) | Degrade: append a raw message dump to the timeline with a `[RAW]` prefix, reset the counter |
+| FTS5 recall returns empty | Fall back to a `LIKE '%keyword%'` query |
+| `memory.enabled = false` | ContextCompressor works normally, no memory writes |
+| MemoryManager uninitialized | ContextCompressor works normally; the memory write path is gated on the manager being present and is skipped |
+
+## 9. Configuration
+
+```json
+{
+  "memory": {
+    "enabled": true,
+    "consolidation_provider": "openai",
+    "consolidation_model": "gpt-4o-mini",
+    "recall_limit": 5,
+    "consolidation_turn_threshold": 3,
+    "idle_consolidation_minutes": 10,
+    "timeline_retention_days": 90,
+    "max_failures_before_degrade": 3
+  }
+}
+```
+
+| Key | Type | Default | Description |
+|-----|------|---------|-------------|
+| `enabled` | bool | `false` | Master switch for memory system |
+| `consolidation_provider` | string | — | Provider name for consolidation LLM calls |
+| `consolidation_model` | string | — | Model name for consolidation |
+| `recall_limit` | usize | `5` | Max knowledge entries injected into system prompt |
+| `consolidation_turn_threshold` | usize | `3` | Turns before forced consolidation |
+| `idle_consolidation_minutes` | u64 | `10` | Idle time before consolidation trigger |
+| `timeline_retention_days` | u64 | `90` | Auto-cleanup age for timeline entries |
+| `max_failures_before_degrade` | usize | `3` | Consecutive failures before raw archive fallback |
+
+## 10. New Module Structure
+
+```
+src/
+├── memory/
+│   ├── mod.rs              # MemoryManager, MemoryConfig
+│   ├── types.rs            # MemoryEntry, MemoryCategory, ConsolidationResult
+│   └── consolidation.rs    # Consolidation prompt + LLM call logic
+├── storage/
+│   └── memory.rs           # SQLite CRUD for memories table + FTS5
+├── tools/
+│   ├── memory_store.rs     # memory_store tool
+│   ├── memory_recall.rs    # memory_recall tool
+│   └── memory_forget.rs    # memory_forget tool
+```
+
+## 11. Integration Points (Existing Files Modified)
+
+| File | Change |
+|------|--------|
+| `src/lib.rs` | Add `pub mod memory;` |
+| `src/config/mod.rs` | Add `MemoryConfig` struct and deserialization |
+| `src/storage/mod.rs` | Add `pub mod memory;`, init `memories` table and FTS5 in `init_schema()` |
+| `src/storage/session.rs` | Add `last_consolidated_at` column read/write |
+| `src/session/session.rs` | Add `last_consolidated_at: Option<i64>` field to Session |
+| `src/agent/context_compressor.rs` | Add optional `memory` field holding a `MemoryManager` handle, write a timeline entry on compress |
+| `src/agent/system_prompt.rs` | Add `memory_context` section via `MemoryManager::recall()` |
+| `src/agent/agent_loop.rs` | No changes (tools registered via ToolRegistry) |
+| `src/tools/mod.rs` | Register `memory_store`, `memory_recall`, `memory_forget` in `create_default_tools()` |
+| `src/gateway/mod.rs` | Initialize `MemoryManager` in `GatewayState::new()`, pass to ContextCompressor |
+
+## 12. Implementation Order
+
+| # | Task | Dependencies |
+|---|------|-------------|
+| 1 | Types: `MemoryEntry`, `MemoryCategory`, `ConsolidationResult` | — |
+| 2 | Config: `MemoryConfig` + deserialization | — |
+| 3 | Storage: `memories` table + FTS5 + CRUD + search | #1 |
+| 4 | `MemoryManager` API | #1, #2, #3 |
+| 5 | Session: `last_consolidated_at` field | — |
+| 6 | `ContextCompressor` memory integration | #4, #5 |
+| 7 | `SystemPromptBuilder` memory context injection | #4 |
+| 8 | Agent tools: `memory_store`, `memory_recall`, `memory_forget` | #4 |
+| 9 | `GatewayState` initialization wiring | #4, #5, #6 |
+| 10 | Unit tests | #1-#9 |
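
The trigger conditions in section 5 can be read as a single pure function. The following is a minimal sketch, not the patch's implementation. `SessionStats`, `TriggerConfig`, `Trigger`, and `consolidation_trigger` are hypothetical names. The sketch assumes that "exceeds" means strictly greater than for the token check, and that the turn and idle thresholds are inclusive.

```rust
use std::time::Duration;

/// Hypothetical snapshot of the values the trigger check needs.
#[derive(Debug, Clone, Copy)]
pub struct SessionStats {
    /// Estimated tokens currently held in the conversation history.
    pub history_tokens: usize,
    /// Size of the model's context window, in tokens.
    pub context_window: usize,
    /// Turns completed since the last consolidation.
    pub turns_since_consolidation: usize,
    /// Time elapsed since the last message in the session.
    pub idle_for: Duration,
}

/// Hypothetical subset of the `memory` config from section 9.
#[derive(Debug, Clone, Copy)]
pub struct TriggerConfig {
    pub consolidation_turn_threshold: usize,
    pub idle_consolidation_minutes: u64,
}

/// Which condition fired, useful for logging.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trigger {
    TokenBudget,
    TurnCount,
    Idle,
}

/// Returns the first matching trigger, or `None` if consolidation can wait.
pub fn consolidation_trigger(stats: &SessionStats, cfg: &TriggerConfig) -> Option<Trigger> {
    // Primary trigger: history uses more than half of the context window.
    if stats.history_tokens > stats.context_window / 2 {
        return Some(Trigger::TokenBudget);
    }
    // Catch-up trigger: enough turns have accumulated (assumed inclusive).
    if stats.turns_since_consolidation >= cfg.consolidation_turn_threshold {
        return Some(Trigger::TurnCount);
    }
    // Idle trigger: the session has been quiet long enough (assumed inclusive).
    if stats.idle_for >= Duration::from_secs(cfg.idle_consolidation_minutes * 60) {
        return Some(Trigger::Idle);
    }
    None
}
```

Keeping the check free of I/O would let the unit tests in task #10 cover all three conditions without a database or an LLM provider.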
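
The degradation rule in section 8 (count consecutive consolidation failures, then archive raw messages with a `[RAW]` prefix and reset) could look roughly like the sketch below. `ConsolidationTracker` and `FailureAction` are hypothetical names. Resetting the counter on success is an assumption drawn from the word "consecutive", and joining raw messages with newlines is also an assumption.

```rust
/// What the caller should do after a failed consolidation attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum FailureAction {
    /// Log a warning and carry on without writing a timeline entry.
    Skip,
    /// Store this raw dump as the timeline entry instead of a summary.
    ArchiveRaw(String),
}

/// Hypothetical tracker for consecutive consolidation failures.
pub struct ConsolidationTracker {
    consecutive_failures: usize,
    max_failures_before_degrade: usize,
}

impl ConsolidationTracker {
    pub fn new(max_failures_before_degrade: usize) -> Self {
        Self { consecutive_failures: 0, max_failures_before_degrade }
    }

    /// Assumption: a successful consolidation clears the failure streak.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
    }

    /// Counts the failure and degrades once the threshold is reached.
    pub fn record_failure(&mut self, raw_messages: &[String]) -> FailureAction {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.max_failures_before_degrade {
            // Reset so the next streak starts from zero, as section 8 states.
            self.consecutive_failures = 0;
            FailureAction::ArchiveRaw(format!("[RAW] {}", raw_messages.join("\n")))
        } else {
            FailureAction::Skip
        }
    }
}
```

Returning an action instead of writing to storage directly keeps the main flow non-blocking: the caller decides when, or whether, to persist the raw archive.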
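
For the `LIKE '%keyword%'` fallback in section 6, one detail worth sketching is that `%` and `_` in user input act as wildcards unless escaped. The helpers below are illustrative only. `like_pattern` and `recall_with_fallback` are hypothetical names, the two search closures stand in for the real storage calls, and the pattern is assumed to be bound to a query of the form `... LIKE ? ESCAPE '\'`.

```rust
/// Builds a `%keyword%` pattern for use with `LIKE ? ESCAPE '\'`,
/// escaping `\`, `%`, and `_` so the keyword is matched literally.
pub fn like_pattern(keyword: &str) -> String {
    let mut pattern = String::with_capacity(keyword.len() + 2);
    pattern.push('%');
    for ch in keyword.chars() {
        if matches!(ch, '\\' | '%' | '_') {
            pattern.push('\\');
        }
        pattern.push(ch);
    }
    pattern.push('%');
    pattern
}

/// Runs the FTS5 search first and falls back to LIKE only on empty results.
/// Both closures are placeholders for the storage layer.
pub fn recall_with_fallback<T>(
    query: &str,
    fts_search: impl Fn(&str) -> Vec<T>,
    like_search: impl Fn(&str) -> Vec<T>,
) -> Vec<T> {
    let hits = fts_search(query);
    if !hits.is_empty() {
        return hits;
    }
    // The fallback receives the escaped pattern, not the raw query.
    like_search(&like_pattern(query))
}
```

Passing the pattern as a bound parameter rather than formatting it into the SQL string also keeps the fallback safe against injection.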