- 移除 TodoItem 中的 priority、created_at 和 updated_at 字段 - 强制每个任务都必须有唯一 id,且由用户负责生成 - 修改合并模式逻辑,merge=true 下保留未提及的旧任务 - 支持已完成和已取消任务重新激活(状态改回 pending 或 in_progress) - 禁止 in_progress 状态退回到 pending,必须标记为 completed 或 cancelled - 优化状态转换校验,允许特定状态间合法切换 - 简化任务变更消息,移除详细的新增/更新/移除统计 - 更新文档和示例,明确 id 必须由用户生成和使用 - 修复和补充测试,增强状态转换和合并模式验证 - 调整任务时间戳生成逻辑,统一使用当前时间及索引 - 该变更提供更合理的任务状态机械及管理模式,提升稳定性和易用性
13 KiB
知识整理工作流:Analysis
Loaded by states: CONTENT_READ, ISSUE_ANALYSIS, RULE_GENERATION.
This file owns low-confidence partial reads, issue analysis, classification rules, and target tree generation. It MUST NOT create execution plans, ask for execution confirmation, or perform write operations.
Required Context
Before executing rules in this file:
resource_itemsMUST already exist fromlark-drive-workflow-knowledge-organize-discovery.md.- For document partial reads, follow
../../lark-doc/SKILL.mdand../../lark-doc/references/lark-doc-fetch.md. - For sheet / bitable down-drill, follow
../../lark-sheets/SKILL.mdor../../lark-base/SKILL.mdonly when title and path are insufficient.
State: CONTENT_READ
Entry: resource_items exists.
MUST:
- Build
low_confidence_items. - Apply
Low-Confidence Partial Read. - Read only supported docs through
lark-doc-fetch. - Switch to
lark-sheets/lark-baseonly when sheet / bitable title and path are insufficient. - Record read evidence for classification.
- Continue reading low-confidence resources in internal batches until all supported low-confidence resources in the current inventory are processed or a blocker occurs.
- Apply
Analysis Progress Reporting. - Output progress / summary without asking the user to continue between batches.
Exit: low-confidence items are classified or marked needs_review=true.
Low-Confidence Partial Read
Low-confidence resources include:
- 标题为空
- 标题为
test/测试/ 纯数字 / 无意义短词 - 标题、路径、类型之间没有足够分类线索
- 同一标题或相似标题出现在多个候选分类中
- 用户要求按项目 / 客户 / 业务线归类,但标题和路径没有明确项目 / 客户 / 业务线名称
| Condition | Agent MUST Do | Agent MUST NOT Do |
|---|---|---|
| Title / path / type clearly determine classification | Classify directly | Do not perform content read |
| Resource is low-confidence and docs-fetch-supported | Read outline via lark-doc-fetch |
Do not skip partial read |
| Candidate project / customer / business / document-type terms exist | After outline, run keyword partial read with candidate terms | Do not use broad generic keywords |
| Partial read returns usable block id and classification is still unclear | Read the relevant section via lark-doc-fetch |
Do not read the full document |
| Partial read still cannot classify | Set needs_review=true; classify to manual confirmation target |
Do not invent classification |
| Read fails or permission is insufficient | Set needs_review=true; record failure reason |
Do not retry indefinitely |
Partial Read Limits
| Limit | Default |
|---|---|
batch_size |
20 resources per internal batch |
progress_report_interval |
50 low-confidence resources |
max_attempts_per_resource |
3 partial reads: outline, keyword, section |
Batching rules:
- Sort low-confidence resources by impact before reading: root-level loose items, duplicated titles, project/customer ambiguity, then empty or meaningless titles.
- Read supported low-confidence resources across internal batches without asking the user to continue after each batch.
- Process reads in internal batches of
batch_size; do not ask the user between internal batches unless auth, permission, or API errors block progress. - After each internal batch, update
low_confidence_itemswith read evidence orneeds_review=true. - After every
progress_report_intervalprocessed resources, output a progress summary and continue automatically. - If unread low-confidence resources remain because of auth, permission, API, unsupported type, or tool budget blockers, set
partial=true, report unread count, and default remaining unread items toneeds_review=truewith target path set to manual confirmation target. - Never bypass these limits by reading full documents.
Low-Confidence Read Start Notice
When low_confidence_total > 100, output this notice before reading:
低置信度资源较多,共 <low_confidence_total> 项。我会分批做轻量读取并定期汇报进度;不会读取全文,也不会执行移动或创建。
Low-Confidence Read Summary
Use this as progress / final summary output. Do not ask the user to continue unless a blocker occurs.
低置信度内容读取进度
- 低置信度资源总数:<low_confidence_total>
- 已读取:<read_done>/<low_confidence_total>
- 已补充证据并完成分类:<classified_count>
- 暂入待人工确认:<needs_review_count>
- 失败:<failed_count>
继续分析整理问题。
Output this summary:
- After every 50 processed low-confidence resources.
- Once after low-confidence reading finishes.
- About every 60 seconds during long-running reads, even if fewer than 50 additional resources were processed.
Analysis Progress Reporting
Applies to CONTENT_READ, ISSUE_ANALYSIS, and RULE_GENERATION.
Rules:
- For
CONTENT_READ, useLow-Confidence Read Summaryas the progress report format. - For
ISSUE_ANALYSIS, if analysis runs longer than about 60 seconds, output progress about every 60 seconds with current stage, processed resource count when known, detected problem type count when known, and the next analysis step. - For
RULE_GENERATION, if classification rule or target-tree generation runs longer than about 60 seconds, output progress about every 60 seconds with current stage, classified item count when known, unresolved item count when known, and target category / path count when known. - Progress reports MUST be factual and stage-specific. Do not output generic "still running" messages without counts or the current stage.
- Do not ask the user to continue between internal batches unless auth, permission, API, target scope, or environment blockers occur.
- Do not expose internal chain-of-thought, raw tokens, or intermediate rule drafts.
Examples:
分析进度:正在归纳整理问题,已处理 <processed_count>/<resource_count> 项资源,已识别 <problem_type_count> 类问题。继续生成整理思路,不会执行移动或创建。
规则生成进度:正在生成分类规则和目标目录,已归类 <classified_count> 项,待人工确认 <needs_review_count> 项。继续生成完整计划前置数据。
State: ISSUE_ANALYSIS
Entry: resource_items and partial-read evidence are ready.
MUST:
- Detect problems from organization perspective only. Do not generate research conclusions.
- Generate an organization approach based on inventory, low-confidence read evidence, and detected problems.
- Include how non-reused source containers will be handled after their contents are moved.
- Apply
Analysis Progress Reporting. - Output
Inventory And Organization Approach Decision. - Stop and wait for the user to confirm the approach before
RULE_GENERATION.
Problem rules:
| Problem | Detection Rule |
|---|---|
| 根目录堆积 | 根目录直接资源过多,或超过总资源的明显比例 |
| 同类文件分散 | 标题 / 类型相似的资源分布在多个无关路径 |
| 命名不统一 | 同类资源日期、客户、项目命名格式明显不一致 |
| 临时内容过多 | 标题 / 路径含 临时、测试、tmp、draft、转移、未整理 |
| 空目录 | 目录类节点无后代资源 |
| 重复目录 | 目录名归一化后相同或高度相似 |
| 过旧归档内容 | 旧年份资源仍散落在活跃目录 |
MUST output evidence count or example paths. Do not output only abstract judgment.
Problem Pagination
| Output Area | Rule |
|---|---|
| Problem overview | Show at most 5 problem types per page |
| Problem examples | Show at most 3 example paths per problem type |
| Pagination | Affects display only; complete issue_summary MUST remain internal |
Inventory And Organization Approach Decision
盘点与整理思路
盘点结果:
| 指标 | 数量 |
|------|------|
| 总资源数 | |
| 各类型资源数 | |
| 一级目录数量 | |
| 根目录直接资源数 | |
| 空目录数量 | |
| 低置信度资源数 | |
| 已完成低置信度读取 | |
| 待人工确认 | |
| partial | |
共发现 <problem_type_count> 类问题,当前展示第 <page>/<total_pages> 页。
| 问题 | 证据数量 | 样例路径 | 说明 |
|------|----------|----------|------|
整理思路:
- <approach item 1>
- <approach item 2>
- 对证据不足、读取失败或权限不足的资源放入"待人工确认"
- 如存在不再复用的来源目录,内容迁出后将目录本体收起到 `待人工确认/待清理旧目录`,避免整理后一级目录仍杂乱
- 不删除、不重命名、不修改权限
是否基于这个整理思路生成目标目录和移动 / 创建计划?
你可以选择:
1. 基于这个思路生成目标目录和计划
2. 调整整理思路
3. 查看问题详情
4. 取消本次整理
State: RULE_GENERATION
Entry: user confirms the organization approach.
MUST:
- Generate
classification_rules. - Generate
target_tree. - Generate
target_treeto at least two levels; include third level when needed for project / customer / document-type grouping. - Reuse existing clear structure when possible.
- Identify reused top-level containers and non-reused source containers, and set
source_container_disposition. - For non-reused source containers, ensure
target_treeincludes a source-container cleanup target, defaulting to待人工确认/待清理旧目录, unless the user explicitly asks to keep source containers in place. - Ensure target tree can contain every planned
target_path. - Ensure the target tree contains a manual confirmation target named
待人工确认unless the user explicitly provides an equivalent name. - Apply
Analysis Progress Reporting. - Continue to
PLAN_GENERATIONwithout a separate target-tree-only confirmation.
Classification
| Condition | Agent MUST Do |
|---|---|
| Existing structure is clear | Reuse existing directory names and hierarchy |
| Title / path / type is enough | Classify without content read |
| Item remains uncertain after mandatory partial read | Put into manual confirmation target and set needs_review=true |
| Item is temporary / test / draft | Prefer temporary / test target |
| Root has many loose resources | Prefer organizing root-level obvious items first |
| User asks project / customer grouping | Use project / customer names from title, path, and partial read evidence |
| Naming is inconsistent | Report the issue with examples only; do not generate rename actions |
Adaptive Classification
The agent MUST NOT start from a fixed default category list. A fixed taxonomy can bias classification and confuse users when category names or numeric prefixes do not match their resources.
Derive categories from the current resource_items and partial-read evidence:
- First group resources by clear signals from title, current path, type, and mandatory partial-read evidence.
- Prefer category names that appear in the user's own content, such as project names, customer names, business lines, document types, years, or existing folder / Wiki node names.
- Create a category only when there is enough evidence for at least one resource.
- Do not create generic buckets such as archive, temporary, test, meeting, dashboard, or operations unless the current resources contain matching evidence.
- Do not add numeric prefixes to category names unless the user explicitly asks for ordered naming.
- Always keep a manual confirmation target named
待人工确认or an equivalent user-specified name for unresolved items.
Target Tree
target_tree is generated in this state but shown together with the move / create plan in PLAN_GENERATION. Do not stop after displaying a target tree alone.
Analysis Failure Handling
| Failure / Blocker | Agent MUST Do | Agent MUST NOT Do |
|---|---|---|
| Missing API scope | Follow lark-shared permission handling and stop |
Do not retry the same command repeatedly |
| Resource access denied | Stop and follow the main workflow Permission Request Gate |
Do not request permission automatically or in batch |
| Partial document read fails for a low-confidence item | Mark item needs_review=true, record reason, and route to manual confirmation target |
Do not classify by guessing |
| Item remains ambiguous after partial read | Mark needs_review=true and route to manual confirmation target |
Do not invent classification |