Knowledge System

Unified agent self-documentation — typed units, DAG hierarchy, full-content search index with BM25 + dense embeddings

Entry points

  • work_buddy.knowledge

Details

Two parallel knowledge stores share a common KnowledgeUnit base:

  • System docs (knowledge/store/*.json) — directions, system, capability, workflow units
  • Personal knowledge (Obsidian vault) — VaultUnit, user-authored patterns and feedback

Both stores are queried via: knowledge (unified), knowledge_docs (system only), knowledge_personal (vault only), agent_docs (legacy alias).
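As a rough illustration of how a shared base type lets one query surface cover both stores, here is a minimal sketch. The field names and the select() helper are hypothetical and are not taken from model.py or query.py. Only the class names KnowledgeUnit and VaultUnit and the three scope names come from the text above. agent_docs is left out because the text does not say which name it aliases.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeUnit:
    # Hypothetical fields; the real type hierarchy lives in model.py.
    name: str
    kind: str  # e.g. "directions", "system", "capability", "workflow"


@dataclass
class VaultUnit(KnowledgeUnit):
    """Personal, user-authored unit from the Obsidian vault."""


def select(units, scope):
    """Filter a mixed list of units by query scope (illustrative only)."""
    if scope == "knowledge":            # unified: both stores
        return list(units)
    if scope == "knowledge_docs":       # system docs only
        return [u for u in units if not isinstance(u, VaultUnit)]
    if scope == "knowledge_personal":   # vault only
        return [u for u in units if isinstance(u, VaultUnit)]
    raise ValueError(f"unknown scope: {scope!r}")


units = [
    KnowledgeUnit("search-index", "system"),
    VaultUnit("review-feedback", "personal"),
]
docs_only = [u.name for u in select(units, "knowledge_docs")]
vault_only = [u.name for u in select(units, "knowledge_personal")]
```

Because VaultUnit subclasses KnowledgeUnit, the unified scope can return both kinds in one list without callers needing to know which store a unit came from.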

Search index

A dedicated in-memory index (work_buddy/knowledge/index.py) searches the full content of every unit (metadata, summary, and body text), not just the search_phrases() metadata.

Components:

  • BM25 (dual): content-weighted (0.7) + metadata-weighted (0.3) via rank_bm25
  • Dense vectors: full-content embeddings via embedding service (1024-dim)
  • RRF fusion when both are available, BM25-only fallback when embedding is down
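The way these components might combine can be sketched in plain Python. This is not the code in index.py. It assumes that the 0.7/0.3 weights are applied as a linear blend of two BM25 scores per unit, and that fusion uses the standard reciprocal rank formula with the conventional constant k = 60. Neither detail is stated above, and all function names are hypothetical.

```python
def blend_bm25(content_scores, metadata_scores, w_content=0.7, w_meta=0.3):
    """Linear blend of two per-unit BM25 scores (assumed weighting scheme)."""
    ids = set(content_scores) | set(metadata_scores)
    return {
        uid: w_content * content_scores.get(uid, 0.0)
        + w_meta * metadata_scores.get(uid, 0.0)
        for uid in ids
    }


def ranking(scores):
    """Unit ids ordered best first; ties are broken by id for determinism."""
    return [uid for uid, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per unit."""
    fused = {}
    for ranked in rankings:
        for rank, uid in enumerate(ranked, start=1):
            fused[uid] = fused.get(uid, 0.0) + 1.0 / (k + rank)
    return ranking(fused)


def search(content_scores, metadata_scores, dense_scores=None):
    """Fuse BM25 and dense rankings; fall back to BM25 when dense is missing."""
    bm25_ranked = ranking(blend_bm25(content_scores, metadata_scores))
    if dense_scores is None:  # embedding service down
        return bm25_ranked
    return rrf_fuse([bm25_ranked, ranking(dense_scores)])


content = {"a": 2.0, "b": 1.0, "c": 0.0}
meta = {"a": 0.0, "b": 1.0, "c": 3.0}
dense = {"c": 0.9, "a": 0.5, "b": 0.1}

bm25_only = search(content, meta)
fused = search(content, meta, dense)
```

RRF works on ranks rather than raw scores, so the BM25 and cosine-similarity scales never need to be normalized against each other.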

Lifecycle:

  • Built eagerly on MCP registry init (BM25 inline ~50ms, dense in background ~3.5s)
  • Invalidated on invalidate_store() / invalidate_vault()
  • Generation guards prevent stale background threads from writing into a rebuilt index
  • MCP: knowledge_index_rebuild (force rebuild), knowledge_index_status (health check)
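The generation guard can be illustrated with a small sketch. The class and method names here are hypothetical and do not reflect the actual layout of index.py. The idea is that a background builder records the generation it started under, and its result is discarded if the index has been invalidated in the meantime.

```python
import threading


class GuardedIndex:
    """Toy index whose dense vectors are filled in by a background thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._generation = 0
        self.dense = None

    def current_generation(self):
        with self._lock:
            return self._generation

    def invalidate(self):
        """Bump the generation so in-flight builders become stale."""
        with self._lock:
            self._generation += 1
            self.dense = None
            return self._generation

    def publish_dense(self, generation, vectors):
        """Accept vectors only if they were built for the current generation."""
        with self._lock:
            if generation != self._generation:
                return False  # stale builder, drop its result
            self.dense = vectors
            return True


def build_dense_in_background(index, compute):
    """Start a builder thread tagged with the generation seen at launch."""
    generation = index.current_generation()

    def worker():
        index.publish_dense(generation, compute())

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


index = GuardedIndex()
stale_generation = index.current_generation()
index.invalidate()  # a rebuild happens while the old builder is still running
stale_accepted = index.publish_dense(stale_generation, ["old vectors"])

builder = build_dense_in_background(index, lambda: ["fresh vectors"])
builder.join()
```

Checking the generation under the same lock that guards the write keeps the compare-and-publish step atomic, so a slow embedding call cannot overwrite a newer index.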

Progressive disclosure

Units are disclosed at three depths, each adding to the previous one: index (name + children) → summary (+ content.summary) → full (+ content.full with context chain resolution).
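A minimal sketch of the three depths follows, assuming a unit carries a name, a list of child names, and a content mapping with summary and full keys. The Unit class and render() function are hypothetical. Context chain resolution is deliberately omitted because its mechanics are not described here.

```python
from dataclasses import dataclass, field

DEPTHS = ("index", "summary", "full")


@dataclass
class Unit:
    # Hypothetical shape; the real type hierarchy lives in model.py.
    name: str
    children: list = field(default_factory=list)
    content: dict = field(default_factory=dict)


def render(unit, depth="index"):
    """Return progressively more of a unit as depth increases."""
    if depth not in DEPTHS:
        raise ValueError(f"depth must be one of {DEPTHS}")
    out = {"name": unit.name, "children": list(unit.children)}
    if depth in ("summary", "full"):
        out["summary"] = unit.content.get("summary", "")
    if depth == "full":
        # Context chain resolution would be applied here; not modeled.
        out["full"] = unit.content.get("full", "")
    return out


unit = Unit(
    name="knowledge",
    children=["index", "store"],
    content={"summary": "Unified self-documentation.", "full": "Long body text."},
)
at_index = render(unit)
at_summary = render(unit, "summary")
at_full = render(unit, "full")
```

Starting at the index depth keeps responses small, and a caller asks for summary or full only for the units it actually needs.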

Key files

  • knowledge/store/*.json — canonical data
  • work_buddy/knowledge/model.py — type hierarchy
  • work_buddy/knowledge/store.py — loader + cache + invalidation
  • work_buddy/knowledge/index.py — BM25 + dense search index
  • work_buddy/knowledge/search.py — federated search (delegates to index)
  • work_buddy/knowledge/query.py — MCP-facing callables