docs: add memory system design document
Commit 28ef813c37 (parent 2fe953cdad)
docs/plans/2026-05-07-memory-system-design.md (new file, 226 lines)

# PicoBot Memory System Design

Date: 2026-05-07

## 1. Overview

Introduce a memory system that lets PicoBot agents remember user preferences, project context, facts, and conversation history across sessions. The memory system is **unified with the existing context compression pipeline**: compression automatically produces `timeline` memory entries and advances a `last_consolidated_at` pointer so that already-summarized messages are not reprocessed.

### Design Principles

- **Compression is memory** (inspired by nanobot): when old messages are compressed, the summary is persisted rather than discarded
- **FTS5 only** (no vector embeddings): keyword search via SQLite FTS5, sufficient for current scale
- **Extend existing infrastructure**: reuse the `Storage` connection pool, `ContextCompressor`, and `SystemPromptBuilder`
- **YAGNI**: no knowledge graph, no response cache, no namespace isolation, no audit trail

## 2. Core Architecture

```
ContextCompressor (existing)          MemoryManager (new)
        │                                     │
        │  compress_if_needed()               │  store / recall / forget
        │    ├─ LLM summary → inject          │
        │    └─ store(timeline entry) ────────┘
        │         └─ advance last_consolidated_at
        │
SystemPromptBuilder ── recall(knowledge, limit=5) ──→ inject into system prompt
AgentLoop ── after_turn ──→ memory_store / memory_recall / memory_forget tools
```

## 3. Memory Categories

| Category | Purpose | Written By | Retrieved By |
|----------|---------|------------|--------------|
| `knowledge` | Long-term facts, preferences, patterns, insights | Agent via `memory_store` tool | FTS5 → injected into system prompt every turn |
| `timeline` | Compressed conversation summaries | ContextCompressor automatically | FTS5 + time-range queries |

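
The two categories map naturally onto a small enum. The following is a minimal sketch, assuming the category is stored as the lowercase strings `knowledge` and `timeline`; the document names a `MemoryCategory` type but does not define it, so the method names and the error type here are illustrative only.

```rust
use std::fmt;
use std::str::FromStr;

/// The two memory categories from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryCategory {
    Knowledge,
    Timeline,
}

impl MemoryCategory {
    /// Lowercase string form (assumed to be what the `category` column holds).
    pub fn as_str(self) -> &'static str {
        match self {
            MemoryCategory::Knowledge => "knowledge",
            MemoryCategory::Timeline => "timeline",
        }
    }
}

impl fmt::Display for MemoryCategory {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl FromStr for MemoryCategory {
    type Err = String;

    /// Rejects anything other than the two known categories.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "knowledge" => Ok(MemoryCategory::Knowledge),
            "timeline" => Ok(MemoryCategory::Timeline),
            other => Err(format!("unknown memory category: {}", other)),
        }
    }
}
```

Parsing at the tool boundary keeps free-form category strings supplied by the agent out of the storage layer.
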
## 4. Storage Schema

### New table: `memories`

Added to the existing `Storage` initialization in `src/storage/mod.rs`:

```sql
CREATE TABLE IF NOT EXISTS memories (
    id          TEXT PRIMARY KEY,
    key         TEXT NOT NULL UNIQUE,
    content     TEXT NOT NULL,
    category    TEXT NOT NULL DEFAULT 'knowledge',
    importance  REAL NOT NULL DEFAULT 0.5,
    session_id  TEXT,
    created_at  TEXT NOT NULL,
    updated_at  TEXT NOT NULL
);

CREATE VIRTUAL TABLE IF NOT EXISTS memory_fts USING fts5(
    key,
    content,
    content=memories,
    content_rowid=rowid
);
```

`memory_fts` is an external-content FTS5 table, so SQLite does not update it automatically: every insert, update, and delete on `memories` must be mirrored into the index, typically with triggers.

### Modified table: `sessions`

```sql
ALTER TABLE sessions ADD COLUMN last_consolidated_at INTEGER;
```

SQLite's `ALTER TABLE ... ADD COLUMN` has no `IF NOT EXISTS` form, so the migration must check for the column first (for example via `PRAGMA table_info(sessions)`) to stay idempotent.

## 5. Unified Compression-Memory Pipeline

### Trigger Conditions

Compression/consolidation fires when **any** of these conditions is met:

| Condition | Value | Rationale |
|-----------|-------|-----------|
| Token usage exceeds 50% of the context window | `context_window / 2` | Primary trigger: context is getting full |
| N turns accumulated without consolidation | 3 (configurable) | Catch-up for short messages that don't hit the token threshold |
| Session idle | 10 minutes (configurable) | Important for async channels like Feishu |

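
The "any of these conditions" rule can be sketched as a single predicate. This is an illustration under stated assumptions, not the existing `ContextCompressor` logic: `SessionStats`, `TriggerConfig`, `Trigger`, and `consolidation_trigger` are hypothetical names, and the exact comparison operators (`>` for the token budget, `>=` for turns and idle time) are one reading of the table.

```rust
use std::time::Duration;

/// Hypothetical snapshot of the session state the triggers inspect.
pub struct SessionStats {
    pub tokens_used: usize,
    pub context_window: usize,
    pub turns_since_consolidation: usize,
    pub idle_for: Duration,
}

/// Configurable thresholds; the defaults mirror the table above.
pub struct TriggerConfig {
    pub turn_threshold: usize,
    pub idle_threshold: Duration,
}

impl Default for TriggerConfig {
    fn default() -> Self {
        Self {
            turn_threshold: 3,
            idle_threshold: Duration::from_secs(10 * 60),
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Trigger {
    TokenBudget,
    TurnCount,
    Idle,
}

/// Returns the first condition that fires, or `None` if none does.
pub fn consolidation_trigger(stats: &SessionStats, cfg: &TriggerConfig) -> Option<Trigger> {
    if stats.tokens_used > stats.context_window / 2 {
        Some(Trigger::TokenBudget)
    } else if stats.turns_since_consolidation >= cfg.turn_threshold {
        Some(Trigger::TurnCount)
    } else if stats.idle_for >= cfg.idle_threshold {
        Some(Trigger::Idle)
    } else {
        None
    }
}
```

Returning which trigger fired, rather than a bare `bool`, makes it easy to log why a consolidation ran.
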
### Flow

```
compress_if_needed(history, session_id):
  1. Read last_consolidated_at from session
     → Only compress messages after that timestamp
  2. If no messages to compress → return history unchanged
  3. FTS5 recall(user_input, limit=recall_limit, category=knowledge)
     → Inject relevant facts into system prompt
  4. LLM summarization of old messages → [Context Summary]
     → Inject into current conversation
  5. Store summary as timeline entry:
       key:      "ctx_{session_id}_{uuid}"
       content:  "[YYYY-MM-DD HH:MM] summary text..."
       category: timeline
  6. UPDATE sessions.last_consolidated_at = now()
  7. Return compressed history
```

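
The flow above can be sketched with in-memory stand-ins. Everything here is an assumption made for illustration: `Message`, `Session`, and `TimelineStore` are simplified placeholders for the real types, the summarizer is a closure instead of an LLM call, a row counter replaces the UUID in the key, `keep_recent` is a guess at how "old messages" are selected, and step 3 (the FTS5 recall) is omitted.

```rust
/// Hypothetical message: a unix-seconds timestamp plus text.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub ts: i64,
    pub text: String,
}

/// Stand-in for the session row; only the pointer matters here.
pub struct Session {
    pub id: String,
    pub last_consolidated_at: Option<i64>,
}

/// Stand-in for timeline rows in `memories`: (key, content).
#[derive(Default)]
pub struct TimelineStore {
    pub rows: Vec<(String, String)>,
}

pub fn compress_if_needed(
    history: Vec<Message>,
    session: &mut Session,
    store: &mut TimelineStore,
    keep_recent: usize,
    now: i64,
    now_label: &str, // "YYYY-MM-DD HH:MM", supplied by the caller
    summarize: impl Fn(&[Message]) -> Option<String>,
) -> Vec<Message> {
    // Step 1: only messages newer than the pointer are candidates.
    let cutoff = session.last_consolidated_at.unwrap_or(i64::MIN);
    let split = history.len().saturating_sub(keep_recent);
    let pending: Vec<Message> = history[..split]
        .iter()
        .filter(|m| m.ts > cutoff)
        .cloned()
        .collect();
    // Step 2: nothing to compress, so return the history unchanged.
    if pending.is_empty() {
        return history;
    }
    // Step 4: summarize; on failure the main flow is not blocked.
    let summary = match summarize(&pending) {
        Some(text) => text,
        None => return history,
    };
    // Step 5: persist the summary as a timeline entry.
    let key = format!("ctx_{}_{}", session.id, store.rows.len());
    store.rows.push((key, format!("[{}] {}", now_label, summary)));
    // Step 6: advance the pointer.
    session.last_consolidated_at = Some(now);
    // Step 7: the summary replaces the old messages; recent ones are kept.
    let mut compressed = vec![Message {
        ts: now,
        text: format!("[Context Summary] {}", summary),
    }];
    compressed.extend(history.into_iter().skip(split));
    compressed
}
```

In this sketch a second call over the same history is a no-op, because no message is newer than the pointer. That is the "avoid redundant reprocessing" property the pointer is meant to provide.
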
### timeline Entry Format

Each timeline entry follows nanobot's convention:
```
[2026-05-07 14:30] User asked about Rust async patterns. Discussed tokio::select!,
semaphore-based rate limiting, and backpressure strategies. No code was written.
```

This format is grep-friendly and human-readable.

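
A small pair of helpers shows how the `[YYYY-MM-DD HH:MM]` prefix could be written and read back. This is a sketch only: the function names are hypothetical, and the date parts are passed in explicitly so the example needs no date-time crate.

```rust
/// Builds "[YYYY-MM-DD HH:MM] summary" from explicit date and time parts.
pub fn format_timeline_entry(date: (u16, u8, u8), time: (u8, u8), summary: &str) -> String {
    format!(
        "[{:04}-{:02}-{:02} {:02}:{:02}] {}",
        date.0,
        date.1,
        date.2,
        time.0,
        time.1,
        summary.trim()
    )
}

/// Splits an entry into (timestamp, text); `None` if the prefix is missing.
pub fn split_timeline_entry(entry: &str) -> Option<(&str, &str)> {
    let rest = entry.strip_prefix('[')?;
    let (stamp, text) = rest.split_once("] ")?;
    // "YYYY-MM-DD HH:MM" is always 16 bytes long.
    if stamp.len() == 16 {
        Some((stamp, text))
    } else {
        None
    }
}
```

Because the timestamp is zero-padded and fixed-width, entries sort chronologically as plain strings.
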
## 6. Retrieval Strategy

### Automatic Retrieval (every turn)

`SystemPromptBuilder.build_system_prompt()` calls:
```rust
// Pseudocode: named arguments are shown for clarity (Rust has no keyword arguments).
memory.recall(query=user_message, limit=recall_limit, category=knowledge)
```

Results are sorted by FTS5 BM25 score and injected as:
```
## Memory Context

- user_prefers_rust: User prefers Rust for all backend projects
- project_picobot_stack: PicoBot uses Rust, axum, sqlx, ratatui, tokio
- user_workflow: User prefers TDD workflow with cargo test --lib
```

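
Rendering the injected block is a plain string-building step. The sketch below assumes recall results arrive as `(key, content)` pairs already ordered by relevance; `render_memory_context` is a hypothetical helper name, not an existing `SystemPromptBuilder` method.

```rust
/// Renders recalled entries in the "## Memory Context" layout shown above.
/// Returns an empty string when nothing was recalled, so no empty section
/// is added to the system prompt.
pub fn render_memory_context(entries: &[(&str, &str)]) -> String {
    if entries.is_empty() {
        return String::new();
    }
    let mut out = String::from("## Memory Context\n\n");
    for (key, content) in entries {
        out.push_str(&format!("- {}: {}\n", key, content));
    }
    out
}
```
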
### Agent-Initiated Retrieval

The agent calls the `memory_recall` tool with optional `category`, `since`, and `until` parameters.

### Fallback

If FTS5 returns no results, fall back to `LIKE '%keyword%'` on the `key` and `content` columns.

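
One detail worth handling in the fallback is that `%` and `_` are wildcards in `LIKE`, so a keyword containing them would match more than intended. The sketch below builds an escaped pattern for a bound parameter used with SQLite's `LIKE ?1 ESCAPE '\'` form, plus a generic "primary, then fallback" combinator. Both function names are hypothetical, and the two searches are passed in as closures so the example runs without a database.

```rust
/// Wraps the keyword in `%...%` and escapes LIKE wildcards with a backslash,
/// for use as a bound parameter in `... LIKE ?1 ESCAPE '\'`.
pub fn like_pattern(keyword: &str) -> String {
    let mut pattern = String::with_capacity(keyword.len() + 2);
    pattern.push('%');
    for c in keyword.chars() {
        if matches!(c, '%' | '_' | '\\') {
            pattern.push('\\');
        }
        pattern.push(c);
    }
    pattern.push('%');
    pattern
}

/// Runs the primary search and calls the fallback only if it found nothing.
pub fn recall_with_fallback<T>(
    primary: impl FnOnce() -> Vec<T>,
    fallback: impl FnOnce() -> Vec<T>,
) -> Vec<T> {
    let hits = primary();
    if hits.is_empty() {
        fallback()
    } else {
        hits
    }
}
```

Taking the fallback as a closure means the slower `LIKE` scan is never executed when FTS5 already produced results.
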
## 7. Agent Tools

| Tool | Parameters | Description |
|------|------------|-------------|
| `memory_store` | `key: str`, `content: str`, `category: str`, `importance?: f64` | Write or update a memory entry. The key is a semantic identifier (e.g., "user_language_pref") |
| `memory_recall` | `query: str`, `category?: str`, `since?: i64`, `until?: i64`, `limit?: usize` | Search memories by keyword and optional filters |
| `memory_forget` | `key: str` | Delete a memory entry by key |

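
The semantics of the three tools can be illustrated with an in-memory stand-in. This is a sketch under several assumptions: `InMemoryStore` is hypothetical, a case-insensitive substring match stands in for FTS5, results are ordered by importance instead of BM25 score, the `since`/`until` filters are omitted, and the fallback limit of 5 is borrowed from `recall_limit`.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct MemoryEntry {
    pub content: String,
    pub category: String,
    pub importance: f64,
}

/// In-memory stand-in for the `memories` table, keyed by the unique `key`.
#[derive(Default)]
pub struct InMemoryStore {
    entries: HashMap<String, MemoryEntry>,
}

impl InMemoryStore {
    /// memory_store: insert or overwrite by key.
    /// Returns true when an existing entry was updated.
    pub fn store(&mut self, key: &str, content: &str, category: &str, importance: Option<f64>) -> bool {
        let entry = MemoryEntry {
            content: content.to_string(),
            category: category.to_string(),
            importance: importance.unwrap_or(0.5), // schema default
        };
        self.entries.insert(key.to_string(), entry).is_some()
    }

    /// memory_recall: keyword match on key and content, optional category filter.
    pub fn recall(&self, query: &str, category: Option<&str>, limit: Option<usize>) -> Vec<(&str, &MemoryEntry)> {
        let q = query.to_lowercase();
        let mut hits: Vec<(&str, &MemoryEntry)> = self
            .entries
            .iter()
            .filter(|(_, e)| category.map_or(true, |c| e.category == c))
            .filter(|(k, e)| k.to_lowercase().contains(&q) || e.content.to_lowercase().contains(&q))
            .map(|(k, e)| (k.as_str(), e))
            .collect();
        // Highest importance first; key as a tiebreak for deterministic output.
        hits.sort_by(|a, b| {
            b.1.importance
                .partial_cmp(&a.1.importance)
                .unwrap_or(Ordering::Equal)
                .then_with(|| a.0.cmp(b.0))
        });
        hits.truncate(limit.unwrap_or(5));
        hits
    }

    /// memory_forget: delete by key. Returns true when something was removed.
    pub fn forget(&mut self, key: &str) -> bool {
        self.entries.remove(key).is_some()
    }
}
```

Storing under an existing key overwrites the entry, which matches the "write or update" wording in the table and the `UNIQUE` constraint on `key`.
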
## 8. Error Handling & Degradation

| Scenario | Strategy |
|----------|----------|
| Consolidation LLM call fails | Log a warning, increment the failure counter, do NOT block the main flow |
| Consecutive failures >= 3 | Degrade: append a raw message dump to the timeline with a `[RAW]` prefix, then reset the counter |
| FTS5 recall returns empty | Fall back to a `LIKE '%keyword%'` query |
| `memory.enabled = false` | ContextCompressor works normally, no memory writes |
| MemoryManager uninitialized | ContextCompressor works normally; the gated memory write path is skipped |

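
The failure-counter rows can be sketched as a small state machine. The names `FailureTracker`, `FailureAction`, and `raw_archive_entry` are hypothetical, and resetting the counter on success is an assumption implied by the word "consecutive" rather than something the table states.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum FailureAction {
    /// Log a warning and carry on with the main flow.
    LogAndContinue,
    /// Threshold reached: archive the raw messages instead of a summary.
    RawArchive,
}

pub struct FailureTracker {
    consecutive: usize,
    max_before_degrade: usize,
}

impl FailureTracker {
    pub fn new(max_before_degrade: usize) -> Self {
        Self { consecutive: 0, max_before_degrade }
    }

    /// A successful consolidation breaks the streak (assumption).
    pub fn record_success(&mut self) {
        self.consecutive = 0;
    }

    /// Counts a failure and reports what the caller should do next.
    pub fn record_failure(&mut self) -> FailureAction {
        self.consecutive += 1;
        if self.consecutive >= self.max_before_degrade {
            self.consecutive = 0; // reset after degrading, as in the table
            FailureAction::RawArchive
        } else {
            FailureAction::LogAndContinue
        }
    }
}

/// Builds the degraded timeline content with the `[RAW]` prefix.
pub fn raw_archive_entry(messages: &[&str]) -> String {
    format!("[RAW] {}", messages.join("\n"))
}
```
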
## 9. Configuration

```json
{
  "memory": {
    "enabled": true,
    "consolidation_provider": "openai",
    "consolidation_model": "gpt-4o-mini",
    "recall_limit": 5,
    "consolidation_turn_threshold": 3,
    "idle_consolidation_minutes": 10,
    "timeline_retention_days": 90,
    "max_failures_before_degrade": 3
  }
}
```

The example enables memory explicitly; as the table below shows, `enabled` defaults to `false`.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `enabled` | bool | `false` | Master switch for memory system |
| `consolidation_provider` | string | none | Provider name for consolidation LLM calls |
| `consolidation_model` | string | none | Model name for consolidation |
| `recall_limit` | usize | `5` | Max knowledge entries injected into system prompt |
| `consolidation_turn_threshold` | usize | `3` | Turns before forced consolidation |
| `idle_consolidation_minutes` | u64 | `10` | Idle time before consolidation trigger |
| `timeline_retention_days` | u64 | `90` | Auto-cleanup age for timeline entries |
| `max_failures_before_degrade` | usize | `3` | Consecutive failures before raw archive fallback |

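
The defaults in the table translate directly into a `Default` implementation. This is a dependency-free sketch: the field names follow the JSON keys, the two provider fields are modeled as `Option<String>` because the table lists no default for them, and the two `Duration` helpers are hypothetical conveniences. The real `MemoryConfig` would presumably also derive a deserializer, which is left out here.

```rust
use std::time::Duration;

#[derive(Debug, Clone, PartialEq)]
pub struct MemoryConfig {
    pub enabled: bool,
    pub consolidation_provider: Option<String>,
    pub consolidation_model: Option<String>,
    pub recall_limit: usize,
    pub consolidation_turn_threshold: usize,
    pub idle_consolidation_minutes: u64,
    pub timeline_retention_days: u64,
    pub max_failures_before_degrade: usize,
}

impl Default for MemoryConfig {
    /// Values taken from the "Default" column above.
    fn default() -> Self {
        Self {
            enabled: false,
            consolidation_provider: None,
            consolidation_model: None,
            recall_limit: 5,
            consolidation_turn_threshold: 3,
            idle_consolidation_minutes: 10,
            timeline_retention_days: 90,
            max_failures_before_degrade: 3,
        }
    }
}

impl MemoryConfig {
    /// Idle time before consolidation, as a `Duration`.
    pub fn idle_timeout(&self) -> Duration {
        Duration::from_secs(self.idle_consolidation_minutes * 60)
    }

    /// Maximum age of timeline entries before cleanup, as a `Duration`.
    pub fn timeline_retention(&self) -> Duration {
        Duration::from_secs(self.timeline_retention_days * 24 * 60 * 60)
    }
}
```

A caller that only wants to switch the feature on could write `MemoryConfig { enabled: true, ..MemoryConfig::default() }` and inherit the remaining defaults.
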
## 10. New Module Structure

```
src/
├── memory/
│   ├── mod.rs            # MemoryManager, MemoryConfig
│   ├── types.rs          # MemoryEntry, MemoryCategory, ConsolidationResult
│   └── consolidation.rs  # Consolidation prompt + LLM call logic
├── storage/
│   └── memory.rs         # SQLite CRUD for memories table + FTS5
├── tools/
│   ├── memory_store.rs   # memory_store tool
│   ├── memory_recall.rs  # memory_recall tool
│   └── memory_forget.rs  # memory_forget tool
```

## 11. Integration Points (Existing Files Modified)

| File | Change |
|------|--------|
| `src/lib.rs` | Add `pub mod memory;` |
| `src/config/mod.rs` | Add `MemoryConfig` struct and deserialization |
| `src/storage/mod.rs` | Add `pub mod memory;`, init `memories` table and FTS5 in `init_schema()` |
| `src/storage/session.rs` | Add `last_consolidated_at` column read/write |
| `src/session/session.rs` | Add `last_consolidated_at: Option<i64>` field to Session |
| `src/agent/context_compressor.rs` | Add `memory: Option<Arc<MemoryManager>>` field, write timeline on compress |
| `src/agent/system_prompt.rs` | Add `memory_context` section via `MemoryManager::recall()` |
| `src/agent/agent_loop.rs` | No changes (tools registered via ToolRegistry) |
| `src/tools/mod.rs` | Register `memory_store`, `memory_recall`, `memory_forget` in `create_default_tools()` |
| `src/gateway/mod.rs` | Initialize `MemoryManager` in `GatewayState::new()`, pass to ContextCompressor |

## 12. Implementation Order

| # | Task | Dependencies |
|---|------|--------------|
| 1 | Types: `MemoryEntry`, `MemoryCategory`, `ConsolidationResult` | none |
| 2 | Config: `MemoryConfig` + deserialization | none |
| 3 | Storage: `memories` table + FTS5 + CRUD + search | #1 |
| 4 | `MemoryManager` API | #1, #2, #3 |
| 5 | Session: `last_consolidated_at` field | none |
| 6 | `ContextCompressor` memory integration | #4, #5 |
| 7 | `SystemPromptBuilder` memory context injection | #4 |
| 8 | Agent tools: `memory_store`, `memory_recall`, `memory_forget` | #4 |
| 9 | `GatewayState` initialization wiring | #4, #5, #6 |
| 10 | Unit tests | #1-#9 |