Work Buddy Knowledge Handbook

This documentation is auto-generated from the knowledge store.

397 units across 8 types:

  • capability: 215
  • concept: 53
  • directions: 61
  • integration: 13
  • reference: 8
  • service: 5
  • system: 21
  • workflow: 21

Architecture

  • Architecture & Repo Structure concept — Repository layout, subsystem organization, and development conventions
  • Artifact System reference — Shared lifecycle infrastructure for any persisted resource — pluggable Storage × Lifecycle × Provenance composition with capability declarations.
  • Data Backups concept — Off-machine snapshot + restore system for work-buddy's vital SQLite databases. Hot-backup -> tarball -> manifest -> GitHub Releases. Tiered retention, gh-CLI driven, integrates with the health Component system.
  • Backup Now Directions directions — Take an immediate manual snapshot of work-buddy's vital SQLite databases as an anchor point before a risky operation.
  • Backup Restore Directions directions — Restore work-buddy's vital SQLite databases from a local or remote snapshot, with manifest validation and migration forward-roll.
  • Capability Registry concept — How capabilities are registered, probed for tool availability, disabled when a probe fails, and recovered cheaply via per-capability re-probe (CP-A3) instead of a full registry rebuild. Authoritative reference for the heavy-vs-light recovery decision.
  • Context Pipeline reference — Unified two-stage context collection + curation. ContextCollector fetches raw JSON from registered sources (git, tasks, projects, chrome + 9 markdown wrappers), ContextCurator renders into depth-adapted markdown or JSON. Feeds LLM prompts (build_triage_context retrofits onto this) and bundle files (collect.py retrofits onto this).
  • Contracts concept — Explicit work commitments — schema, lifecycle, bounded deliverables
  • Control Graph concept — Unified view-model over preferences, requirements, health, and registry. Powers the Settings tab; backs the fix + help systems.
  • Conversation Observability system — Durable session-attributed activity DB for Claude Code: commits, file writes, GitHub PR activity, uncommitted work, observed-session metadata, and optional LLM topic summaries. Replaces ad-hoc per-call JSONL scans in sessions/inspector.py.
  • Data-First Capabilities concept — The Op/capability-declaration split — executable Ops registered by stable ID, inert capability declarations that reference them, and the loader + load-time validator that resolves declarations against the Op registry.
  • Embedding Service service — Local HTTP service on port 5124 providing dense vector embeddings for search and similarity; exposes a symmetric default model plus an asymmetric query/document pair via role-aware client wrappers.
  • Dashboard Event Bus concept — In-process pub/sub + SSE stream + cross-process bridge that powers real-time dashboard updates without a global panel-refresh timer.
  • Feature Cards concept — Component-gated dashboard cards — the reusable pattern for widgets whose existence (and backend work) is justified by an opted-in component.
  • Health System (mental model) concept — Four-layer mental model of work-buddy's health system: do I want this? / is the setup correct? / is it running right now? / how does the user repair it?
  • Components & health checks layer system — Runtime probes: is this component (service, integration, plugin) actually running and reachable right now?
  • Fixers layer system — Per-requirement repair functions: clicking 'Fix' on a failed requirement runs the registered fixer to bring it back to passing.
  • Requirements layer system — Configuration-time validation: does the environment have what a component needs (plugins installed, configs present, secrets reachable, directories created)?
  • Inference concept — Local inference subsystems — admission control, LLM backends, embedding service, and provider dispatch for LM Studio / LM Link.
  • Local Inference Broker reference — Admission-control + priority scheduling + per-call metrics for every local-inference call. Work-buddy is the scheduler of record for LM Studio / LM Link traffic, not LM Studio itself.
  • Knowledge System concept — Unified agent self-documentation — typed units, DAG hierarchy, full-content search index with BM25 + dense embeddings
  • LLM Runner concept — Unified LLM call entry point (LLMRunner, llm_call) with semantic tier enum, normalized LLMResponse, and built-in tier escalation. Replaces the split between run_task (structured output) and llm_with_tools (tool calls).
  • Local LLM With Tools concept — [LEGACY — phase-8 deprecated] Local LM Studio-served models invoke a restricted whitelist of work-buddy MCP tools via /api/v1/chat. Gateway-enforced security via session_acl. Superseded for internal callers by architecture/llm-runner; retained here because the MCP-exposed llm_with_tools capability still uses this path.
  • MarkdownDB concept — Markdown-canonical two-way markdown <-> SQLite sync abstraction. Subclass per entity (FieldSpec list + parse/render); the base supplies orphan handling, the per-field drift loop, LWW conflict resolution, dual-surface mutation, and materialization. Backed by an append-only lww_meta write-provenance sidecar.
  • MCP Server Import Discipline concept — Critical safety constraint: why heavy library imports in capability callables deadlock the MCP server, and the correct pattern to avoid it
  • Schema Migration Ladder concept — Per-DB versioned SQLite migration runner: PRAGMA user_version as authority, _migration_history for audit, AST-based hashing (ignores cosmetic edits, catches behavioral ones, stable across Python versions), downgrade guard, transaction-wrapped apply, baseline-stamp for adopting on legacy DBs.
  • Repository Structure concept — Directory layout — what lives where, with subsystem README pointers
  • Resilience Framework system — Unified fault-mitigation foundation for guarded calls — propagating Deadline, outcome taxonomy, execution seam, composable strategy library, pipeline/registry, and the broker/Obsidian adapters.
  • Async Execution Queue concept — Unified disk-backed queue for three kinds of background work: retries of transient failures, deferred submissions (llm_submit), and scheduled jobs — shared sidecar sweep, per-reason policy (backoff, failure escalation)
  • Summarization Framework concept — Composition-based summarization — Source × Strategy × Store with a shared refresh orchestrator. Two compositions today: conversation sessions (layered disclosure → durable store) and Chrome tabs (flat extraction → TTL cache).
  • Workflow Cancel Directions directions — How to cancel a workflow run — finding the run id, the reason argument, and how cancel relates to the automatic idle sweep.
  • Workflow Run Lifecycle concept — How in-flight workflow runs are bounded and recovered: cancel, the idle-timeout sweep, and restart recovery of the conductor's in-memory active-runs map.
  • Workflow System concept — Workflow execution — DAG, execution policy, auto-run steps, conductor, step result visibility
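
The Schema Migration Ladder entry above names `PRAGMA user_version` as the version authority, a downgrade guard, and transaction-wrapped apply. A minimal sketch of that pattern follows, assuming a hypothetical two-step `MIGRATIONS` list and a `migrate` helper; neither name comes from work-buddy, and the audit table and AST hashing are left out.

```python
import sqlite3

# Hypothetical ladder: entry i upgrades the schema from version i to i + 1.
MIGRATIONS = [
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, text TEXT NOT NULL)",
    "ALTER TABLE tasks ADD COLUMN status TEXT NOT NULL DEFAULT 'open'",
]


def current_version(conn):
    """Read the schema version stored in the database header."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn):
    """Apply pending migrations one by one; return the resulting version.

    Expects a connection opened with isolation_level=None so that the
    explicit BEGIN / COMMIT statements control the transaction.
    """
    version = current_version(conn)
    if version > len(MIGRATIONS):
        # Downgrade guard: the database is newer than this code knows about.
        raise RuntimeError(f"database at v{version}, code knows v{len(MIGRATIONS)}")
    for target in range(version + 1, len(MIGRATIONS) + 1):
        try:
            conn.execute("BEGIN")
            conn.execute(MIGRATIONS[target - 1])
            # The version bump commits or rolls back together with the DDL.
            conn.execute(f"PRAGMA user_version = {target}")
            conn.execute("COMMIT")
        except Exception:
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            raise
    return current_version(conn)
```

Running `migrate` a second time on an up-to-date database is a no-op, because the loop range is empty once `user_version` equals the ladder length.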

Artifacts

  • Artifacts system — Artifacts capabilities and workflows
  • Artifact Cleanup capability — Run TTL-based cleanup over registered artifacts. By default, sweeps every registered artifact (filesystem, llm-cache, messages, notifications, llm-queue, …). Pass name to scope the sweep to a single artifact. Use dry_run=true (default) to preview what would be deleted. The name field is deliberately distinct from artifact_save's type field (which means filesystem subtype) — they live in different namespaces.
  • Artifact Delete capability — Delete an artifact and its metadata by ID.
  • Artifact Get capability — Retrieve an artifact by ID (filename stem). Returns metadata and content (inline if < 50KB, otherwise file path).
  • Artifact List capability — List artifacts in the data store, filtered by type, recency, tags, or session. Sorted by creation time (newest first).
  • Artifact Registry capability — Return the cross-backend artifact-registry introspection map. For each registered artifact, lists its name, storage kind (FilesystemStorage, SqliteRowsStorage, JsonRecordsStorage, …), lifecycle kind (trigger+action+optional retention), provenance kind (SessionTagged or none), declared capabilities, and the MCP operations it exposes. Single source of truth for 'what does this persisted resource look like.'
  • Artifact Save capability — Save an artifact (context bundle, export, report, snapshot, or scratch) to the centralized data store with metadata and TTL-based lifecycle.
  • Commit Record capability — Record structured commit metadata (hash, files, test results, knowledge units updated) as an artifact. Called after a successful git commit to enable enriched commit cards in the dashboard.
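
The Artifact Cleanup entry above describes a TTL-based sweep that previews by default (`dry_run=true`). The sketch below illustrates that preview-then-delete shape only; `ArtifactRecord` and `cleanup` are hypothetical names, and the real capability works over registered storage backends rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class ArtifactRecord:
    """Hypothetical stand-in for one stored artifact's metadata."""
    artifact_id: str
    created_at: datetime
    ttl: timedelta


def cleanup(records, now, dry_run=True):
    """Report expired artifacts; delete them only when dry_run is False."""
    expired = [r.artifact_id for r in records if r.created_at + r.ttl <= now]
    if not dry_run:
        expired_ids = set(expired)
        # Mutate the list in place so the caller sees the deletion.
        records[:] = [r for r in records if r.artifact_id not in expired_ids]
    return {"dry_run": dry_run, "expired": expired}
```

Defaulting to a dry run means a caller has to opt in explicitly before anything is removed, which matches the default described in the entry.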

Automation

  • Automation concept — Lazy-resolution layer that decides how far the agent may take a task (operating tier) and how loudly to resurface it. Pure-function resolvers over stored signals; no I/O, no DB writes — surfaces (dashboard, engage view, audit log) call them per-read.
  • Action contexts (Slice 5a) reference — resolve_who_can_act answers "who can act on this task now?" by consulting the CONTEXT_REGISTRY against the live tool-status cache. Tasks declare agent_required_contexts + user_required_contexts; the resolver returns a WhoCanActDecision with per-side unmet tokens and a handoff-eligible flag.
  • Pickup-time readiness (Slice 7) reference — compute_pickup_readiness pure function -- 5-rule precedence ladder deciding whether a task is ready to execute as-is or should develop first when picked up. Reads creation_effort + user_involvement + provenance + staleness + deadline + has_action_items.
  • Risk model + automation tiers + dynamic resurfacing concept — Operating-tier and dynamic resurfacing-level resolvers. Pure functions reading the four risk dimensions (financial, privacy, accuracy, compute) + three amplifiers (reversibility, regret_potential, inference_uncertainty) against the user's tolerance config. Returns OperatingTierDecision / ResurfacingDecision dataclasses with typed pipeline_blocker per ROADMAP §3.3.

Backups

  • Backups concept — Data-backup capabilities — snapshot, restore, and remote sync of work-buddy's databases
  • Data Backup capability — Take a snapshot of work-buddy's vital SQLite DBs (task_metadata, projects, messages, threads). Hot-backup, tar+gzip, write manifest, optionally push to GitHub Releases. Called by the hourly sidecar cron AND by the user via /wb-backup-now.
  • Data Backup List capability — List local snapshots (and optionally remote ones). Each entry includes snapshot_id, timestamp, size, manual flag, and the manifest summary (commit + schema versions).
  • Data Restore capability — Restore work-buddy's vital SQLite DBs from a snapshot. Validates the manifest (refuses if the snapshot's commit or schema is newer than the running code), unpacks to staging, runs migrations forward, verifies integrity, then atomically swaps into place (the old DBs are moved to .data/db.pre_restore_/ for safety).
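
The Data Backup entry above lists the steps hot-backup, tar+gzip, and write manifest. One way those steps could fit together is sketched below with the standard library's `sqlite3.Connection.backup` and `tarfile`. The `snapshot` function, the manifest fields, and the file layout are assumptions made for illustration; the GitHub Releases push and retention tiers are not shown.

```python
import hashlib
import json
import sqlite3
import tarfile
from pathlib import Path


def snapshot(db_paths, out_dir, snapshot_id):
    """Hot-copy each SQLite DB, write a manifest, and pack a .tar.gz."""
    out_dir = Path(out_dir)
    staging = out_dir / snapshot_id
    staging.mkdir(parents=True, exist_ok=True)
    manifest = {"snapshot_id": snapshot_id, "databases": {}}
    for db_path in map(Path, db_paths):
        copy_path = staging / db_path.name
        src = sqlite3.connect(db_path)
        dst = sqlite3.connect(copy_path)
        try:
            # Online backup API: produces a consistent copy while the
            # source database stays open for other connections.
            src.backup(dst)
            schema_version = dst.execute("PRAGMA user_version").fetchone()[0]
        finally:
            dst.close()
            src.close()
        manifest["databases"][db_path.name] = {
            "sha256": hashlib.sha256(copy_path.read_bytes()).hexdigest(),
            "schema_version": schema_version,
        }
    (staging / "manifest.json").write_text(json.dumps(manifest, indent=2))
    tarball = out_dir / f"{snapshot_id}.tar.gz"
    with tarfile.open(tarball, "w:gz") as tar:
        tar.add(staging, arcname=snapshot_id)
    return tarball
```

Recording a per-database schema version in the manifest is what would let a restore step refuse a snapshot that is newer than the running code, as the Data Restore entry describes.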

Browser

  • Browser integration — Chrome tab triage and browser-integrated workflows
  • Chrome Triage workflow — Triage currently-open Chrome tabs through the unified source pipeline: collect tabs, attach cached Haiku summaries + tag signals, embedding-fused cluster (Louvain over embedding+tag+window-gated proximity), Sonnet-refine cluster boundaries + propose a per-cluster action (close all tabs / group in Chrome / route to tasks / etc.), and spawn a group umbrella thread + group sub-threads with the tabs as ContextItems. The user reviews and approves via the dashboard column grid + per-column action chip.
  • Chrome Triage Directions directions — How to run Chrome tab triage — pipeline overview + the agent's role (confirm completion; user reviews on the dashboard column grid).

Clarify

  • Text-segmenter SubCall system — Generic text-segmentation SubCall. Splits captured prose into distinct matters. Used by inline-capture (right-click 'Send to agent') to detect multi-matter selections. Reusable for any future singular-input pipeline (per-message email triage, etc.).

Context

  • Context concept — Context capabilities and workflows
  • Agent Docs capability — Search and navigate all agent documentation: directions, system docs, capabilities, and workflows. Supports exact path lookup, subtree browsing, and natural language search with hierarchical progressive disclosure.
  • Agent Docs Rebuild capability — Reload the knowledge store from disk. Use after editing store JSON files or after registry changes.
  • Chrome Activity capability — Query Chrome browsing history from the rolling tab ledger. Supports: hot_tabs (ranked by engagement), changes (opened/closed/navigated/engaged/moved), sessions (domain clusters), tabs_at (snapshot at a time), context (tab proximity and window layout), details (full URLs by filter), status (ledger health). Output is compact (no URLs) — use details query for full URLs.
  • Chrome Content capability — Extract full page text from currently-open Chrome tabs. Filter by domain or title substring, or get top-engagement tabs. Free — no LLM calls. Use for single-tab inspection or reading specific page content.
  • Chrome Infer capability — Infer what the user is working on by reading page content from engaged Chrome tabs and analyzing with Haiku. Evaluates provided theories against actual page evidence. Caches results per tab to avoid redundant API calls. ~$0.001/call.
  • Chrome Route To Tasks capability — Walk a Chrome-group thread's tabs and create one task per tab. Each tab's title becomes the task text; the URL goes into a linked summary note.
  • Chrome Route To Umbrella Task capability — Create a single task representing the whole Chrome group. The cluster label becomes the task text; the tabs are listed in the linked summary note.
  • Chrome Tab Close capability — Close specified Chrome tabs by tab ID. Returns count of closed/missing tabs.
  • Chrome Tab Group capability — Create a Chrome tab group or add tabs to an existing group. Returns the group ID.
  • Chrome Tab Move capability — Move Chrome tabs to a specific position or window.
  • Collect And Orient workflow — Generate a fresh context bundle and use it to orient on the user's current work state. This is the primary "what's going on right now?" workflow.
  • Context Collection Directions directions — How to collect context and synthesize an orientation — priority order, flags, contract cross-reference
  • Context Review Directions directions — How to review an existing context bundle — freshness check, same synthesis rules as context-collect, no re-collection
  • Context Block capability — Collect + render a context block from registered sources (git, tasks, projects, chrome, obsidian, obsidian_tasks, obsidian_wellness, calendar, day_planner, session_activity, chat, message, smart, datacore). Structured sources (git / tasks / projects / chrome) emit curated prompt text; the rest wrap legacy collectors. Supports per-source depth, target_date windows, max_chars budget, markdown or JSON output, and cache reuse via max_age_seconds.
  • Context Bundle capability — Run all (or selected) collectors and save a context bundle to disk. Use individual collectors (context_git, context_chat, etc.) when you only need one source.
  • Context Calendar capability — Google Calendar schedule for a given date. Also checks plugin readiness.
  • Context Chat capability — Recent Claude Code conversations and CLI history with tool usage, duration, and outcome snippets
  • Context Chrome capability — Currently open Chrome tabs (requires Chrome extension running)
  • Context Drill Down capability — Expand one item from a context source. Works on structured wave-1 sources that implement drill_down — tasks (field: 'note' / 'line'), git (field: 'full_message' / 'diff_stats'), projects (field: 'description' / 'full'). Wave-2/3 markdown wrappers don't implement drill-down — the prompt already holds their full body at DEEP depth.
  • Context Git capability — Recent git activity across all repos: commits, diffs, dirty trees. Pass annotate=true to tag commits made by agent sessions with their session ID.
  • Context Messages capability — Inter-agent messaging state: pending, recent, unread messages
  • Context Obsidian capability — Obsidian vault summary: journal entries, recently modified notes
  • Context Projects capability — Active projects with identity, state, and trajectory — synthesized from vault directories, STATE.md files in repos, task tags, git activity, and contracts. Filters the rendered output to active projects by default; pass statuses to widen.
  • Context Search capability — Search indexed content (conversations, documents, tabs). Requires IR index — build with ir_index first. Methods: 'substring' (exact match, no embedding service), 'keyword' (BM25), 'semantic' (dense), or comma-delimited combo like 'keyword,semantic' (default, RRF fused).
  • Context Smart capability — Smart Connections context: semantically related notes to active contracts
  • Context Tasks capability — Obsidian task summary: outstanding tasks + recent state changes (last 48h by default)
  • Context Wellness capability — Wellness tracker summary from recent journal entries
  • Datacore Compile Plan capability — Compile a structured JSON query plan into a Datacore query string. Plan keys: target (required), path, tags, tags_any, status, text_contains, exists, frontmatter, child_of, parent, expressions, negate.
  • Datacore Evaluate capability — Evaluate a Datacore expression (e.g. arithmetic, field access).
  • Datacore Fullquery capability — Execute a Datacore query with timing and revision metadata. Same as datacore_query but includes duration_s and revision.
  • Datacore Get Page capability — Get a single vault page by path with Datacore metadata: frontmatter, sections, tags, links, timestamps.
  • Datacore Query capability — Execute a Datacore query against the vault index. Supports @page, @section, @block, @task, @list-item, @codeblock with filters like path(), tags, childof(), parentof(). Returns serialized results.
  • Datacore Run Plan capability — Compile and execute a structured query plan in one step. Preferred over raw datacore_query when building queries programmatically — the plan schema is simpler and validates before execution.
  • Datacore Schema capability — Summarize the vault's Datacore schema: object types, top tags, frontmatter keys, path prefixes. Use before building queries to understand what's available.
  • Datacore Status capability — Check if Datacore plugin is installed, initialized, and queryable. Returns version, index revision, and object type counts.
  • Datacore Validate capability — Validate a Datacore query string without executing it. Returns parse error details if invalid.
  • Dev Mode Toggle capability — Toggle dev mode for the current session. When active, all knowledge queries automatically include dev_notes — development-facing documentation that operational agents don't need. Use True to enable, False to disable, or omit to toggle.
  • Docs Create capability — Create a new unit in the knowledge store. Writes to the appropriate JSON file, updates parent children lists, and validates DAG integrity.
  • Docs Delete capability — Delete a unit from the knowledge store. Cleans up parent/child references.
  • Docs Get capability — [Legacy] Get a knowledge unit by name. Use agent_docs instead.
  • Docs Index capability — [Legacy] Build IR index. Use agent_docs_rebuild instead.
  • Docs Move capability — Move a unit to a new path. Updates all parent/child references across the store.
  • Docs Query capability — [Legacy] Search knowledge units. Use agent_docs instead.
  • Docs Update capability — Update fields on an existing knowledge unit. Only provided fields are changed; omitted fields preserved.
  • Docs Validate capability — Validate the knowledge store: DAG integrity, command-to-store mappings, thinned command format, required fields, kind-specific fields, placeholder duplicates, and parent-child symmetry.
  • Ir Index capability — Build or check the IR search index. Run 'build' to (re)encode dense vectors for indexed documents; 'status' returns per-source counts including dense_eligible_docs (how many docs CAN be encoded) and pending_eligible (real backlog — NOT doc_count vs vector_count, which is misleading because sources like conversation intentionally leave dense_text empty for tool-only spans).
  • Knowledge capability — Search across both system documentation and personal knowledge from the Obsidian vault. Returns results tagged with their source scope (system or personal).
  • Knowledge Index Rebuild capability — Rebuild the knowledge search index. Uses the persistent on-disk cache by default — unchanged units keep their cached vectors, so typical warm rebuilds are <1s. Pass force=true to purge the cache and re-embed everything (slow — 1-3 minutes for the full store).
  • Knowledge Index Status capability — Check the knowledge search index status: whether it's built, unit count, and whether dense vectors are available.
  • Knowledge Mint capability — Create or update a personal knowledge unit in the Obsidian vault. Generates a markdown file with YAML frontmatter. If the file already exists, appends new evidence.
  • Knowledge Personal capability — Search personal knowledge from the Obsidian vault. Includes minted insights, patterns, feedback, preferences. Supports filtering by category and severity.
  • Review Latest Bundle workflow — Read the most recent existing context bundle without re-collecting. Faster than collect-and-orient when a recent bundle already exists.
  • Session Find Uncommitted directions — How to invoke and present uncommitted session results — per-entry format and follow-up suggestion
  • Session Identify directions — Locate a prior Claude Code conversation by topic, then drill into it for the specific turns that matter
  • Session Commits capability — Extract git commits made during Claude Code sessions. Parses raw JSONL for Bash tool calls containing 'git commit' and their results. Scope to one session or scan all recent sessions.
  • Session Expand capability — Full context around a specific message in a session. Returns untruncated text for the target and surrounding messages.
  • Session Get capability — Browse messages in a Claude Code session. Paginated with role/type filtering. Use after context_search finds a session.
  • Session Locate capability — Jump from a context_search hit to the relevant conversation page. Takes a span_index from search result metadata and returns messages centered on that chunk.
  • Session Search capability — Hybrid search within a single session. Uses IR (keyword/semantic/substring) scoped to the session, then resolves chunk hits to message-level results via the span map.
  • Session Uncommitted capability — Find agent sessions that wrote files still present in dirty git state. Answers: 'which sessions wrote code that was never committed?' Cross-references Write/Edit/NotebookEdit tool calls against git status --porcelain across all repos.
  • Session Wb Activity capability — Summary of what a session did through work-buddy's MCP gateway — capabilities invoked, workflows run, errors, key artifacts. Reads from the per-session activity ledger.
  • Session Inspection concept — Random-access into individual Claude Code conversation sessions — browsing, search, context expansion, git commit extraction
  • Vault Recon capability — Diagnostic-grade vault reconnaissance. Returns cross-tabs an agent can reason over to spot recurring conventions: frontmatter state machines (type x status), tag families (depth-3 tree), path-by-type distribution, recent activity by region, cardinality-capped frontmatter values. Single page walk with anti-noise caps.
  • Vault Recon Collect capability — Periodic vault reconnaissance entry point: snapshot the vault via vault_recon, append to a 60-day rolling ledger at .data/vault_recon/snapshots.json, compute deltas against prior snapshots, apply 5 curated significance rules, and write a one-shot type:prompt investigation job to .data/user_jobs/ on each rule firing (deduplicated per (rule, focus) over a 7-day window). Designed to be fired daily by sidecar_jobs/vault-recon.md.
  • Workflow Cancel capability — Cancel a running workflow run — drop it from the in-memory active-runs map, mark its on-disk DAG cancelled (kept for audit), and revoke its consent blanket. Idempotent; a completed run is left untouched.
  • Workflow Create capability — Create a new workflow unit (DAG + step instructions). Use this instead of docs_create for kind='workflow' units — docs_create does not accept workflow-specific fields.
  • Workflow Sweep Idle capability — Cancel active workflow runs that have had no step progress past the idle threshold. Runs automatically on an interval in the MCP gateway; also callable manually (with dry_run) for observability.
  • Workflow Update capability — Update an existing workflow unit. Only provided fields change; omitted fields preserved. 'steps' and 'step_instructions' replace/merge rather than patch individual entries — read the current value, mutate, and pass the whole structure back.
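
The Context Search entry above says the default 'keyword,semantic' combination is "RRF fused". Reading RRF as reciprocal rank fusion, the idea can be sketched as follows. The `rrf_fuse` name and the constant `k=60` (a commonly used default) are assumptions, not details taken from the work-buddy code.

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    Each document scores the sum of 1 / (k + rank) over the lists it
    appears in, with ranks starting at 1. Ties break on document ID.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["a", "b", "c"]   # e.g. a BM25 ranking
semantic_hits = ["b", "d", "a"]  # e.g. a dense-vector ranking
fused = rrf_fuse([keyword_hits, semantic_hits])
# fused == ["b", "a", "d", "c"]
```

Because only ranks are used, the BM25 and embedding scores never need to be put on a common scale before fusing.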

Contracts

  • Contracts system — Contracts capabilities and workflows
  • Active Contracts capability — List all contracts with status=active
  • Analyze Contracts workflow — Review all active contracts, check health, and surface issues for the user.
  • Contract Check Directions directions — How to analyze contracts — health flags, alignment check, per-contract next actions, work-pattern cross-reference
  • Contract Creation Directions directions — How to guide contract creation — interview flow, minimum viable fields, scope checking, WIP awareness, confirmation rules
  • Contract Constraints capability — Get active contracts with their current bottleneck constraints
  • Contract Health capability — Health check report: status counts, overdue, stale, missing fields
  • Contract Wip Check capability — Check if active contract count is within the WIP limit (max 3)
  • Contracts Summary capability — Markdown summary of all contracts with title, status, deadline, progress
  • Create Contract workflow — Guide the user through defining a new contract for a bounded deliverable.
  • Overdue Contracts capability — List contracts past their deadline
  • Stale Contracts capability — List contracts not reviewed in N days (default 7)
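
Two of the checks listed above have concrete thresholds: the WIP limit (max 3 active contracts) and staleness (not reviewed in N days, default 7). A minimal sketch of both, assuming a hypothetical `Contract` record with `status` and `last_reviewed` fields:

```python
from dataclasses import dataclass
from datetime import date, timedelta

WIP_LIMIT = 3  # from the Contract Wip Check description


@dataclass
class Contract:
    """Hypothetical contract record; field names are illustrative."""
    title: str
    status: str
    last_reviewed: date


def wip_check(contracts, limit=WIP_LIMIT):
    """Count active contracts and compare against the WIP limit."""
    active = sum(1 for c in contracts if c.status == "active")
    return {"active": active, "limit": limit, "within_limit": active <= limit}


def stale_contracts(contracts, today, days=7):
    """Return titles of contracts last reviewed more than `days` days ago."""
    cutoff = today - timedelta(days=days)
    return [c.title for c in contracts if c.last_reviewed < cutoff]
```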

Conversation Observability

  • Conversation Observability concept — Durable session-attributed activity derived from Claude Code JSONL sessions — commits, writes, uncommitted work, topic summaries
  • Conversation Observability Get capability — Look up a single observed-session row by session_id, including its metadata (start/end, message_count, span count, tool usage counts). Returns None when the session has not been observed yet.
  • Conversation Observability List capability — List observed Claude Code sessions, sorted by recency. Optional filters: days (recency window), project (one project's sessions only).
  • Conversation Observability Refresh capability — Refresh the conversation_observability DB: observed sessions metadata, session-attributed commits, session-attributed file writes (with dirty-state snapshot), and session-attributed GitHub PR activity. Stale-only by default; pass stale_only=false to force every recent session to re-load.
  • Conversation Observability Summarize capability — DEPRECATED — legacy v1 entry. Generates LLM topic summaries for stale Claude Code sessions in batches. No-ops when conversation_observability.summaries.use_incremental is true (the v2 queue worker handles refresh on the 5-min cadence; see summarization_worker_tick). Preserved for rollback compatibility and as an MCP-callable v1 path; new callers should use summarization_worker_tick or wait for the natural cron drain.
  • Conversation Observability Summary Get capability — DEPRECATED ALIAS — use session_summary_get instead. Same callable, shorter canonical name. Look up the cached tldr + topic summaries for one session_id; returns None when nothing has been summarized yet.
  • Conversation Observability Uncommitted capability — Return the legacy session_uncommitted report from the DB-backed attribution layer. Refreshes first; see also context/session_uncommitted (the thin compat wrapper).
  • Session PRs Get capability — List the GitHub pull-request events (created / merged / closed / reviewed) attributed to one session, detected structurally from gh pr Bash invocations in its JSONL. Read-only.
  • Session Summary Get capability — Look up the cached tldr + topic summaries for one session_id. Returns None when the session hasn't been summarized. Canonical replacement for the verbose conversation_observability_summary_get.
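
The uncommitted-work report described above attributes dirty files to the sessions that wrote them. The core cross-reference can be sketched as a set intersection between each session's written paths and the paths reported by `git status --porcelain`. Both function names are hypothetical, and the parsing is simplified: it assumes porcelain v1 lines of the form `XY path` and strips quotes crudely instead of decoding git's escaped paths.

```python
def dirty_paths(porcelain_output):
    """Extract file paths from `git status --porcelain` (v1) output."""
    paths = set()
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue
        path = line[3:]  # skip the two status characters and the space
        if " -> " in path:
            # Renames are reported as "old -> new"; keep the new path.
            path = path.split(" -> ", 1)[1]
        paths.add(path.strip('"'))
    return paths


def uncommitted_by_session(writes, porcelain_output):
    """Map session ID to the files it wrote that are still dirty.

    `writes` maps a session ID to the paths that session wrote.
    Sessions with no dirty files are omitted from the result.
    """
    dirty = dirty_paths(porcelain_output)
    report = {}
    for session_id, files in writes.items():
        still_dirty = sorted(set(files) & dirty)
        if still_dirty:
            report[session_id] = still_dirty
    return report
```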

Conversations

  • Conversations system — Conversations capabilities and workflows
  • Conversation Management Directions directions — When and how to use agent-user conversations — decision guide, response types, behavioral notes. (Renamed from threads/thread-directions in v5 Stage 1; the threads namespace is reserved for the v5 universal-entity primitive.)
  • Conversation Ask capability — Ask a question in a conversation and optionally wait for the user's response.
  • Conversation Close capability — Close a conversation.
  • Conversation Create capability — Create a new conversation with the user. Opens a chat sidebar on the dashboard.
  • Conversation List capability — List conversations.
  • Conversation Poll capability — Check if the latest question in a conversation has been answered.
  • Conversation Send capability — Send a message in an existing conversation (fire-and-forget, no response expected).

Daily Journal

  • Daily Journal system — Daily journal lifecycle — line-range segmentation, per-thread tag/summary manifest, clustering, routing, rewrite, and update synthesis.
  • Journal Backlog Processing Directions directions — How to run the Running Notes backlog pipeline — cluster review, routing proposals, rewrite presentation
  • Process Backlog workflow — Process today's Running Notes backlog through the unified source pipeline: collect line-range segments, annotate with Haiku-generated tags + summaries, embedding-fused cluster, Sonnet-refine cluster boundaries + propose a per-group action, and spawn a group umbrella thread + group sub-threads with the segments as ContextItems. The user reviews and refines via the dashboard column grid.
  • Segment Notes workflow — Read the Running Notes section from a journal file, identify coherent threads of related content, and annotate the text with inline thread IDs. The raw text is never modified — only HTML comment tags are inserted.
  • Update Journal workflow — Append activity-detected Log entries to an Obsidian journal file.

Dev

  • Development concept — Developmental mode — tools and directions for building, debugging, and modifying work-buddy itself
  • MCP Gateway Design Tenets directions — Five architectural principles for designing capabilities, plus the priming hazard and agentic stub patterns for workflow authoring
  • Dev Doc Update workflow — Review current-session code changes, cross-check against the knowledge store, update units that have gone stale or need creating, then validate store integrity. Enforces scan → propose → confirm → apply → validate → report so doc drift cannot be silently skipped and broken cross-refs cannot silently ship.
  • Dev Doc Update Directions directions — How to run /wb-dev-document — scan current changes, propose knowledge-store edits, confirm, apply via docs_/workflow_ capabilities, validate store integrity, report. Rules for what to check and what not to clobber.
  • Development Mode directions — Enter developmental agent mode — orient on architecture, key locations, and dev workflow for modifying work-buddy itself
  • Dev-Mode Orientation workflow — Forced orientation before dev work — activate dev mode, search the knowledge store for the subsystem being modified, read the code, then declare the prior art found. Only after advancing the step with a non-trivial declaration may the agent proceed with the actual task.
  • Dev PR workflow — Commit work-buddy code changes with test verification, chained doc update, PII scan, cleanup review, and commit metadata recording. Replaces the prose /wb-commit directions.
  • Dev PR Directions directions — How to run /wb-dev-pr — chained doc-update via /wb-dev-document, test verification, PII scan, cleanup review, commit, metadata record, push + PR. Replaces the prose /wb-commit directions (dev/commit).
  • Documentation Architecture directions — Where system documentation lives, what is canonical vs legacy, and how to make documentation changes
  • Durable surfaces — no transient narrative directions — Authoring rule for code and agent docs: describe the system's current behavior, not the journey of how it got there. No commit hashes, branch names, PR numbers, dates, agent-session tags, stage labels, or 'after X' framing in code identifiers, comments, log strings, tests, knowledge units, slash-command text, or CLAUDE.md.
  • Live Testing Directions directions — How to drive a live end-to-end test of an in-progress code change — distinct from unit tests; verifies wiring across MCP server, sidecar, and surfaces with the user in the loop.
  • MCP Registry Reload directions — When and how to reload the MCP gateway capability registry. Two paths: heavy full rebuild (mcp_registry_reload) and light per-capability re-probe (recheck_disabled_capability). Picks the right one for the situation.
  • Obsidian Plugin Integration directions — Build a new Obsidian plugin integration for work-buddy — probe, wrap, package, and optionally collect
  • Session Retrospective directions — Switch to developmental mode and critique/debug this session's execution — then fix what you find
  • Stress Test workflow — Subprocess isolation validation workflow (developer tool). The compute-primes step runs in a subprocess to exercise the gateway's subprocess execution path.

Disclosure

  • Progressive Disclosure system — Unified navigation contract for tree-shaped drillable resources. One MCP capability (drill_tree) walks any registered TreeDrillable at three depths (index/summary/full).
  • Drill Tree capability — Walk a tree-shaped drillable resource at three depths (index|summary|full). Default depth is index — cheapest walk. Today's domains: knowledge (units via agent_docs), summary (summarization framework's per-node store).
  • Walk capability — Universal tree navigation — canonical short name for drill_tree. Walks any registered TreeDrillable at three depths (index | summary | full). Today's domains: knowledge (units), summary (framework per-node store).
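
The three-depth walk described for drill_tree can be pictured with a small sketch. This is illustrative only: the `Node` shape, the `walk` function, and the choice of which fields appear at each depth are assumptions, not the registered TreeDrillable interface.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical node shape; the real TreeDrillable contract is not shown here.
@dataclass
class Node:
    node_id: str
    title: str
    summary: str = ""
    body: str = ""
    children: list[Node] = field(default_factory=list)

DEPTHS = ("index", "summary", "full")

def walk(node: Node, depth: str = "index") -> dict:
    """Render a node and its children at one of three depths.

    Assumed meaning of the depths: index = id and title only,
    summary = index plus summary text, full = summary plus body.
    """
    if depth not in DEPTHS:
        raise ValueError(f"depth must be one of {DEPTHS}, got {depth!r}")
    out = {"id": node.node_id, "title": node.title}
    if depth in ("summary", "full"):
        out["summary"] = node.summary
    if depth == "full":
        out["body"] = node.body
    # Children are rendered at the same depth as their parent.
    out["children"] = [walk(child, depth) for child in node.children]
    return out
```

Defaulting to `index` mirrors the "cheapest walk" default: a caller pays for summaries or bodies only when it asks for them.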

Email

  • Email integration (Thunderbird bridge) integration — How work-buddy reads email via the thunderbird-work-buddy companion extension and feeds it into the triage Review pool
  • Email Triage workflow — Triage recent unread email through the unified source pipeline: fetch via the Thunderbird bridge, synthesise tags from sender/folder/labels, embedding-fused cluster, refine cluster boundaries + propose a per-cluster action (close / create one task per email / create umbrella task), and spawn a group umbrella thread + group sub-threads with the emails as ContextItems. The user reviews and approves via the dashboard's Threads tab.
  • Email Accounts capability — List the email accounts visible through the bridge — only accounts the user has explicitly allowed in the extension's options page are exposed (default-deny).
  • Email Close capability — Mark an email cluster as not actionable — newsletters, automated notifications, etc. Advisory only: dismisses the Thread without touching the underlying mailbox (Thunderbird bridge is read-first in v1).
  • Email Create Tasks capability — Walk an email-cluster thread and create one task per email. The subject becomes the task text; sender + date land in the linked summary note.
  • Email Create Umbrella Task capability — Create a single task representing the whole email cluster. The cluster label becomes the task text; the linked summary note lists every email's subject + sender + date for context.
  • Email Display capability — Open a message in Thunderbird's UI. Useful when the user wants to read it themselves — does not modify anything.
  • Email Get capability — Fetch one email message by its operational handle (provider_message_id + folder_path) — returns the body up to max_body_chars chars plus all summary fields.
  • Email Health capability — Liveness probe for the email bridge. Returns the bridge's /health payload (port, version, allowed-account count). Use this when the user reports email features are missing — it distinguishes 'bridge down' from 'no accounts allowed'.
  • Email Record Into Task capability — File an email cluster as a context section on an existing task's linked note. Use when the cluster is context for ongoing work (replies on an active deliverable, PR-review notifications about a task you're already tracking) rather than a new task. The target task must already have a note attached; this capability does not implicitly create one. Appends a bulleted 'Emails recorded' section listing each email's subject + sender + date.
  • Email triage directions directions — Run one source-pipeline pass over recent email; spawns Threads carrying the agent's per-cluster proposals.

Entities

  • Entities system — Entity registry — a reference-resolution layer for the user's named world. Authored, tagged, federated entity_resolve, append-only reference index.
  • Editing an Entity directions — How to update an entity's name/description, manage tags and aliases, and delete — plus the consent posture on destructive edits.
  • Listing Entities directions — How to browse the entity registry — hierarchical tag filter, presentation, drill-down via entity_get.
  • Creating an Entity directions — How to author a new entity — canonical name, tags, aliases, description, and the post-creation ritual.
  • Resolving an Entity directions — When and how to call entity_resolve — the pull-based lookup an agent uses on an unfamiliar proper noun before asking the user.
  • Entity Add Alias capability — Attach an alias to an entity. Globally unique (one alias, one entity); raises on collision.
  • Entity Add Reference capability — Explicitly append a reference row for an entity. The standard recording path is the side-effect of entity_resolve/create/update; this exists for scripts and dashboard-driven recording.
  • Entity Create capability — Create a new entity with optional description, tags, and aliases. Consent-gated for agent-author writes. Optionally anchors an initial reference if source_path + source_kind are supplied.
  • Entity Delete capability — Hard-delete an entity, cascading through tags, aliases, and references. Consent-gated (both user and agent authors must approve).
  • Entity Get capability — Fetch a single entity by canonical name, alias, or integer id. Returns tags, aliases, and the 5 most-recent reference rows.
  • Entity List capability — List entities ordered by most-recently-updated. Optional hierarchical tag filter: tag='person' returns 'person', 'person/family', 'person/colleague', etc.
  • Entity List References capability — List references for an entity, newest first. Default limit 50 to keep dashboard responses small.
  • Entity Remove Alias capability — Detach an alias from an entity. No-op if not attached.
  • Entity Resolve capability — Federated lookup across the entity store + the project registry. Returns all matches in parallel, flagged by provider. Optionally records a reference when source_path + source_kind are supplied.
  • Entity Set Tags capability — Replace the full tag set on an entity. Pass an empty list to clear. Tags are normalized; exact duplicates and redundant ancestor tags (person when person/family is present) are collapsed before writing.
  • Entity Update capability — Update an entity's canonical name and/or description. Tags + aliases are managed through their own capabilities so a rename PATCH can't accidentally wipe them.
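
The hierarchical tag filter (Entity List) and the ancestor-collapsing rule (Entity Set Tags) above are both prefix checks on slash-separated tags. A minimal sketch follows. The function names are hypothetical, and lower-casing plus whitespace stripping is an assumed form of "normalized"; the handbook does not spell out the actual rules.

```python
def normalize_tags(tags):
    """Normalize tags, then drop duplicates and redundant ancestors.

    Assumed normalization: strip whitespace and stray slashes, lower-case.
    An ancestor such as 'person' is dropped when 'person/family' is present.
    """
    cleaned = {t.strip().strip("/").lower() for t in tags if t.strip().strip("/")}
    kept = [
        t for t in cleaned
        if not any(other.startswith(t + "/") for other in cleaned)
    ]
    return sorted(kept)


def matches_tag(entity_tags, wanted):
    """True when a tag equals `wanted` or sits beneath it in the hierarchy."""
    return any(t == wanted or t.startswith(wanted + "/") for t in entity_tags)
```

Matching on `wanted + "/"` rather than a bare prefix keeps `person` from matching an unrelated tag such as `personal`.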

Features

  • Features concept — User-facing component opt-in/out system and configuration-time requirements.
  • LM Studio Embedding Offload Setup directions — Procedure for offloading work-buddy's document-side passage encoder to LM Studio — download GGUF, verify metadata, run drift test, update config.
  • Feature Preferences directions — How to check feature preferences before recommending or using a component, and how the requirements system differs from runtime health checks.
  • User-authored Scheduled Jobs directions — How a user authors a personal scheduled cron job — file location, frontmatter schema, collision behavior, hot-reload.

Inline

  • Inline Commands system — Framework for triggering agent actions from inside Obsidian via right-click menu or #wb/cmd/* tags
  • Inline Consume Modes concept — Post-execution note mutation behaviors — strip, annotate, replace, leave
  • Inline Commands Directions directions — How to add a new inline command — decorator, context scope, consume mode, persistence
  • Inline Cancel Watcher capability — Cancel a single persistent watcher by ID.
  • Inline Invoke capability — Execute an inline command (menu or #wb/cmd/* tag surface).
  • Inline List Commands capability — List registered inline commands, optionally filtered by surface.
  • Inline List Watchers capability — List all persistent inline watchers.
  • Inline Menu Manifest capability — Manifest of inline commands that expose a right-click menu entry.
  • Inline Sync capability — Reconcile vault #wb/cmd/* tags with the persistent watcher store.
  • Inline Tag Removed capability — Cancel persistent watchers whose tag was removed from a note.
  • Inline Commands Overview concept — Architecture of the inline command framework — surfaces, dispatcher, handler registration
  • Inline Persistent Watchers concept — How #wb/cmd/* tags declared persistent install a PersistentWatcher that survives Obsidian restarts

Journal

  • Journal system — Journal capabilities and workflows
  • Activity Timeline capability — Infer recent activity from journal entries and optionally deeper signals. Returns a structured timeline with events, gaps, and relative timestamps. Use for understanding what happened during a time window.
  • Day Planner capability — Day Planner operations: check plugin status, read current plan, generate schedule from events+tasks, or write plan to journal. Composite: replaces separate check_ready/get_plan/generate/write/resync calls.
  • Hot Files capability — Rank vault files by activity intensity, fusing modification frequency (vault events) with writing intensity (Keep the Rhythm). Hierarchically collapses busy directories to prevent context flooding. Use sub_directory to drill into a specific area.
  • Journal Append To Note capability — Append all items in a journal-group thread as bullets to a single existing vault note. Useful for project-observation clusters.
  • Journal Rewrite Running Notes capability — Remove processed lines from today's daily note. Consent-gated wrapper around journal_backlog.rewrite_running_notes. Umbrella-level cleanup: typically run after all the umbrella's groups have been routed.
  • Journal Route To Considerations capability — Walk a journal-group thread's context items and create one consideration note per item. Each item's label becomes the title; raw text becomes the body.
  • Journal Route To Tasks capability — Walk a journal-group thread's context items and create one task per item in the master task list. Each item's label becomes the task text. Continue-on-error: a single failed item doesn't block the rest.
  • Journal Sign In capability — Read sign-in state (sleep/energy/mood/check-in/motto) and wellness trends, optionally write fields. Composite: replaces separate extract_sign_in + interpret_wellness + write_sign_in calls.
  • Journal State capability — Read journal state: target date, activity window, existing entries
  • Journal Write capability — Append log entries or persist a briefing to the journal. For log entries: pass time/description tuples. For briefing: pass markdown to wrap in a callout.
  • Running Notes capability — Read the Running Notes section from the user's daily journal. This is the primary stream-of-consciousness capture zone where the user records ideas, observations, and notes throughout the day. Supports filtering by date range, last N days, or same-day only. Call with same_day=true for just today's entries, or days=N for recent history.
  • Journal Update Directions directions — How to detect activity and append journal Log entries — format, synthesis rules, approval flow
  • Vault Write At Location capability — Insert content at a specific section in a vault note. Configurable note (path or resolver like 'latest_journal', 'today'), section (header text), and position ('top' or 'bottom' of section). Used by Telegram capture and general-purpose vault writing.
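
The continue-on-error behavior noted for Journal Route To Tasks can be sketched as a loop that records failures instead of aborting. The `route_items` helper, the `create_task` callable, and the result shape are assumptions made for illustration, not the capability's actual signature.

```python
def route_items(items, create_task):
    """Create one task per item; a failed item does not block the rest.

    `items` is a list of dicts with a 'label' key (assumed shape);
    `create_task` is any callable that takes the label and returns a task id.
    """
    created, failed = [], []
    for item in items:
        try:
            created.append(create_task(item["label"]))
        except Exception as exc:  # record the failure and keep going
            failed.append({"label": item["label"], "error": str(exc)})
    return {"created": created, "failed": failed}
```

Returning both lists lets the caller report partial success rather than a single pass/fail result.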

Memory

  • Memory system — Personal memory subsystem — semantic store, mental models, retention, reflection
  • Hindsight Memory Server integration — External Hindsight server providing semantic memory storage and reflection
  • Memory Prune capability — Delete memories from the bank. CONSENT-GATED, IRREVERSIBLE. Call with no args to list documents for review. Then provide document_id to delete a specific document's memories, or memory_type to bulk-delete a category (world/experience/observation).
  • Memory Read capability — Read from personal memory (Hindsight). No LLM cost. Modes: 'search' (default) — semantic + keyword recall, use descriptive topic phrases with specific entity names for best results; 'model' — fetch a mental model by ID; 'recent' — list latest memories.
  • Memory Reflect capability — LLM-powered reasoning over memories. CONSENT-GATED: triggers a server-side LLM call against your Anthropic API key (~1-3K tokens per call). Use memory_read for free retrieval first.
  • Memory Write capability — Store a personal fact, preference, or constraint in memory

Messaging

  • Messaging system — Messaging capabilities and workflows
  • Get Thread capability — Get all messages in a conversation thread
  • Query Messages capability — Query messages by recipient, sender, status, or limit
  • Read Message capability — Fetch a single message with full body content
  • Reply To Message capability — Reply to an existing message
  • Send Message capability — Send a message to another agent or project
  • Update Message Status capability — Update a message's status (e.g., pending → resolved)

Metacognition

  • Metacognition concept — Self-accountability framework — scan for patterns the user has chosen to be held to, apply graduated interventions
  • Blindspot Detection Directions directions — How to scan for active work-pattern blindspots — intervention levels, cascade checking, rewriting template, output format

Morning

  • Morning Routine concept — Configurable morning routine — journal, tasks, contracts, calendar, metacognition
  • Morning Routine Directions directions — How to run the morning routine — sign-in conversation, blindspot scan, synthesis, propose-mits, persist-briefing, day-planner, quality checks
  • Morning Routine workflow — Configurable morning routine that coordinates journal, tasks, contracts, calendar, and metacognition into a single briefing-first flow. Collect everything, then synthesize and act.

Notifications

  • Notification & Consent System system — Multi-surface notifications, requests, and consent — Obsidian, Telegram, Dashboard
  • Consent System directions — How consent-gated operations work — auto-request in gateway, pre-flight bundling, session scope, risk levels
  • Consent List capability — List all consent entries with their status (mode, tier, expiry for temporary grants).
  • Notification List Pending capability — List all pending notifications and requests awaiting user response.
  • Notification Send capability — Send a fire-and-forget notification to the user via all available surfaces (Obsidian, Telegram if enabled). No response expected. Optionally target specific surfaces.
  • Sending Notifications directions — How to send fire-and-forget notifications — parameters, surface rendering, and examples
  • Sending Requests and Consent directions — How to request user decisions — request_send, consent_request, surface rendering, blocking vs non-blocking, and handling responses
  • Request Poll capability — Check/wait for a response to a previously delivered request. Without timeout_seconds: single immediate check. With timeout_seconds: blocks until response or timeout (max recommended: 110s). Response is cleared from Obsidian after reading (one-shot).
  • Request Send capability — Create a request, deliver to all available surfaces, and optionally poll for the user's response. Supports choice, boolean, freeform, and range response types. Without timeout_seconds: non-blocking (returns immediately, use request_poll later). With timeout_seconds: blocks until response or timeout (max recommended: 110s to stay within MCP call limits).
  • Notification Surfaces service — Surface details — Obsidian modals, Telegram messages, Dashboard forms
  • Telegram Bot integration — Telegram bot for mobile access — commands, setup, architecture
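
The two polling modes described for Request Poll (a single immediate check without timeout_seconds, a blocking wait with it) and the one-shot read can be sketched as follows. `ResponseStore` and `poll` are hypothetical stand-ins, not the gateway's implementation.

```python
import time


class ResponseStore:
    """In-memory stand-in for wherever user responses land (hypothetical)."""

    def __init__(self):
        self._responses = {}

    def put(self, request_id, value):
        self._responses[request_id] = value

    def take(self, request_id):
        # One-shot read: the response is removed as it is returned.
        return self._responses.pop(request_id, None)


def poll(store, request_id, timeout_seconds=None, interval=0.01):
    """Check once when no timeout is given; otherwise wait until the deadline."""
    if timeout_seconds is None:
        return store.take(request_id)
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = store.take(request_id)
        if response is not None:
            return response
        if time.monotonic() >= deadline:
            return None  # timed out without a response
        time.sleep(interval)
```

Under this sketch, a caller who wants to stay within the recommended 110s ceiling would pass a `timeout_seconds` at or below that value and fall back to a later non-blocking check if `None` comes back.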

Obsidian

  • Obsidian Integration integration — Obsidian vault integration — bridge, tasks, datacore, smart connections, vault writer
  • Obsidian Bridge integration — HTTP bridge to Obsidian — eval_js, latency handling, timeout retry rules
  • Google Calendar Integration integration — Google Calendar access via Obsidian Google Calendar plugin (stale/unmaintained) + eval_js bridge
  • Datacore Query Directions directions — How to translate user intent into Datacore vault queries — schema-first, decomposition, plan vs raw query, validate-and-repair
  • Day Planner Plugin Integration integration — Obsidian Day Planner plugin (v0.28.0) integration -- plan entry format, settings, runtime surface
  • Keep the Rhythm Integration integration — Writing activity tracking via KTR plugin (v0.2.8) -- per-file word/char deltas in 5-minute buckets
  • Obsidian Retry capability — Synchronous bridge-aware retry for Obsidian-dependent capabilities. Checks bridge health before each attempt, waits between retries, and returns a structured result. Use when you need the result before proceeding (e.g., step 1 of a multi-step task). For fire-and-forget retries, the gateway's automatic background retry handles it.
  • Smart Connections Ecosystem integration — Smart Plugins ecosystem (9 plugins, Pro license) -- SmartEnv runtime, embedding, semantic search, memory pressure
  • Tag Wrangler Integration integration — Tag operations via Tag Wrangler plugin (v0.6.4) + metadataCache -- read, rename, merge, tag pages
  • Obsidian Tasks Plugin Integration integration — Runtime integration with Tasks plugin (v7.23.1) -- cache API, ownership split, mutation pipeline, tag behavior, emoji-aware sync, soft-delete contract
  • Typed Obsidian Exception Hierarchy concept — Typed ObsidianError subclasses raised by the bridge layer; classified by isinstance + error_kind rather than substring matching
  • Vault Event Tracking concept — Event-driven file change tracking for Obsidian -- replaces O(n) mtime scanning, persists in localStorage
  • Picking a Vault Write Path concept — When to use bridge.write_file_raw directly vs. vault_write — the safe/fallback distinction explained
  • Vault Location Writer reference — Section-aware vault writing — insert content at specific locations in notes

Operations

  • Operations concept — How to operate work-buddy — MCP gateway, sessions, Python environment
  • Agent Sessions reference — Session ID setup, agent directories, Python conda environment, and Poetry dependency management
  • MCP Gateway directions — How to discover and call MCP gateway capabilities — the primary interface for agents
  • Retry capability — Retry a previously recorded operation by its ID. Use wb_status() to discover recent/pending operations after a timeout. Operations with retry_policy='manual' cannot be auto-retried. Operations with an active execution lease will be refused to prevent double-dispatch.

Projects

  • Projects system — Project registry — identity, observations, memory, discovery, and lifecycle
  • Deleting a Project directions — Pre-flight steps before deleting a project — confirm slug, explain impact, then call
  • Discovering Project Candidates directions — How to evaluate and triage project_discover candidates — create, alias, or ignore
  • Listing Projects directions — How to list and present projects — grouping by status, detail drill-down via project_get
  • Creating a Project directions — Parameter defaults, slug rules, when to ask vs infer, and post-creation ritual for project_create
  • Recording Project Observations directions — What makes a good observation, slug disambiguation, and existence prerequisite for project_observe
  • Project Add Alias capability — Attach an alternative slug (alias) to a project. Aliases route to the canonical row across capabilities. Writes a revision.
  • Project Add Folder capability — Attach a folder to a project. Writes a revision capturing the new folder set.
  • Project Confirm Description capability — Mark the latest revision as user-confirmed. Use this when a human reviews an LLM-authored description (or other agent edit) and signs off.
  • Project Create capability — Manually create a project. Accepts initial folders + aliases + provenance metadata. Consent-gated.
  • Project Delete capability — Soft-delete a project (set status='deleted'). Row + folders + aliases + revision history are preserved. Consent-gated.
  • Project Discover capability — Discover project candidates from task tags and git repos not yet in the registry. Returns candidates for agent review — evaluate each and use project_create to promote real projects.
  • Project Get capability — Get a single project (resolved via slug or alias) with its folders, aliases, and recent Hindsight memory recall
  • Project List capability — List projects with folders + aliases, ordered by lifecycle status. Soft-deleted rows are filtered by default; pass include_deleted=True to see them.
  • Project Memory capability — Read from the project memory bank (Hindsight-backed). Modes: 'search' (semantic recall, optionally scoped to one project), 'model' (fetch a mental model: project-landscape, active-risks, recent-decisions, inter-project-deps), 'recent' (latest project memories)
  • Project Observe capability — Record an observation about a project — strategic decisions, supervisor feedback, pivots, blockers, or anything that shapes trajectory but wouldn't appear in code or tasks
  • Project Remove Alias capability — Detach an alias from a project. Writes a revision.
  • Project Remove Folder capability — Detach a folder from a project. Writes a revision.
  • Project Revisions List capability — Return revision history for a project, newest first. Each entry snapshots the project state plus folder + alias sets at that revision.
  • Project Set Folder Archived capability — Flip the archived flag on a project folder (mark dormant or active). Writes a revision.
  • Project State At capability — Reconstruct a project's state as of a given timestamp (latest revision ≤ timestamp). Includes folders + aliases as they were then.
  • Project Sync capability — Reconcile project markdown notes (work-buddy/projects/.md) against the projects SQLite registry: propagate out-of-band note edits into the store, create store rows for new notes. Markdown-canonical; never deletes a project. See architecture/markdown-db.
  • Project Update capability — Update a project's identity: name, status, or description. Writes a revision row capturing the change (author + summary).
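
The Project State At rule (latest revision ≤ timestamp) is a sorted lookup. The sketch below assumes revisions are held as `(timestamp, snapshot)` tuples in ascending order; that representation and the `state_at` name are illustrative, not the registry's storage format.

```python
from bisect import bisect_right


def state_at(revisions, timestamp):
    """Return the snapshot of the latest revision with ts <= timestamp.

    `revisions` must be sorted by timestamp ascending (assumed).
    Returns None when the timestamp predates the first revision.
    """
    stamps = [ts for ts, _ in revisions]
    # bisect_right counts the revisions with ts <= timestamp.
    i = bisect_right(stamps, timestamp)
    return revisions[i - 1][1] if i else None
```

Using `bisect_right` rather than `bisect_left` makes a revision written exactly at `timestamp` count as already in effect.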

Routing

  • Routing concept — Information routing workflows — moving captured items to their destinations with user confirmation
  • Route Information workflow — Given a batch of discrete information items (each with an ID, raw text, and optional agent-proposed metadata), present routing recommendations to the user in clusters, get confirmation or correction, and execute the approved routings.
  • Search concept — Universal IR search verb (find) plus the markdown-formatted twin (context_search). Search any indexed source — conversations, summaries, knowledge units, Chrome tabs, task notes, documents, projects — with optional per-source drill.
  • Find capability — Structured IR search across any indexed source. Returns a plain list of hits, or — when drill=True — the funnel shape (stage1_hits + candidate_items + drilled). Subsumes summary_search (which remains as an alias).

Services

  • Services & Infrastructure concept — Sidecar-managed services — dashboard, messaging, embedding, and service pointers
  • Dashboard service — Web dashboard for system observability — Flask service, dev mode, remote access, development rules
  • Dashboard — Automation surfaces concept — Dashboard surfaces that project the operating-tier resolver and who-can-act decision into the Today tab. The earlier Review Queue, Daily Log, and Engage tabs were retired when the Threads tab became the canonical resolution surface; what remains is the Today tab and the per-task Auto column on the Tasks tab.
  • Dashboard Chat Sidebar concept — Reusable right-rail chat surface that slides in beside the main view; mounts the conversation_chat renderer and squishes content via body padding.
  • Dashboard — Costs tab concept — LLM cost / usage view with two complementary sources (per-call internal log + Claude Code transcripts), row-level backend filters, and the Anthropic rate-limit observation chip. The unified llm_costs_query capability reads both sources.
  • Dashboard Form Bridge concept — Schema-driven agent ↔ form interaction subsystem: one MCP capability, one frontend bridge, one contract test asserting schema↔DOM stay in sync.
  • Inter-Agent Messaging service — Inter-agent messaging service for cross-session communication
  • Sidecar Daemon service — Unified process supervisor, cron scheduler, and message-driven job dispatcher

Status

  • Status concept — Status capabilities and workflows
  • Claude Code Usage Scan capability — Scan Claude Code's local transcript JSONLs into the cost cache (~/.claude/projects/*/.jsonl). Incremental by default. Use full_rebuild=true after a pricing or schema change.
  • Dashboard Interact capability — Drive a dashboard form on the user's behalf — fill fields, open the form, click submit, or read current state. Single typed entry point for chat-walkthrough agents; each call is validated against the form's registered FormSchema before anything reaches the frontend. See the brief's structural section for the form_id and field names you can address.
  • Escalation Recent capability — Recent LLM-escalation observability records. Each record is one logical job (one LLMRunner.call OR one adapter-level escalation chain across multiple calls) with its full per-tier attempt list, final outcome, and trace correlation.
  • Feature Status capability — Show which tools, features, and capabilities are available or disabled, and why. Use this to diagnose missing integrations.
  • List Sessions capability — List all known agent sessions with metadata
  • Llm Call capability — Make a single LLM API call (Tier 2 execution). Cheaper than spawning a full agent session. Supports freeform text or structured JSON output via output_schema (inline dict or named schema from work_buddy/llm/schemas/). Routes to Claude via 'tier' or to a local/remote OpenAI-compatible server (LM Studio, vLLM, Ollama) via 'profile'. Handles caching and cost tracking automatically.
  • Llm Costs capability — Check LLM token usage, costs, and breakdown for this session. Shows per-task costs, per-model costs, cache hit rates, and top callers.
  • Llm Costs Query capability — Aggregate LLM cost / usage across one or both data sources (work-buddy's per-call internal log + Claude Code transcripts). Smart parameters: time window (named or ISO range), group_by (project, model, session, day, tool), source filter, min_cost / project / model filters, and previous-window comparison. Single capability covering most cost questions.
  • Llm Submit capability — Asynchronously submit an llm_call for background execution. Returns immediately with an operation_id; the sidecar's retry sweep invokes llm_call with your params and messages the originating session on completion. Use when local inference latency (tens of seconds) would block the caller unnecessarily. For synchronous bounded calls use llm_call. Cloud tier calls are already fast — no point submitting them; profile is therefore required.
  • Llm With Tools capability — Invoke a local model with restricted work-buddy MCP tool access, so it can look things up (projects, tasks, journal, context) while answering. Tool access is limited to a named preset defined in work_buddy/llm/tool_presets.py (currently: 'readonly_safe', 'readonly_context'). No arbitrary tool list accepted at call time — presets are the security boundary. Requires 'profile' and 'tool_preset'.
  • Mcp Registry Reload capability — HEAVY: invalidate and rebuild the capability registry. Re-probes every tool, purges work_buddy.* from sys.modules, rebuilds every capability (~6-8s). Use ONLY after code changes to existing capability callables. For transient tool-probe failures (capability stuck in disabled state), use work_buddy.recovery.recheck_disabled_capability(name) instead — it re-probes only the relevant tools with a 30s cool-down. See architecture/capability-registry.
  • Remote Session Begin capability — Launch or resume a visible Claude Code session in a real terminal window. If session_id or session_name is provided, resumes that session; otherwise starts a new one. Designed for Remote Control (phone app) connection.
  • Remote Session List capability — List resumable Claude Code sessions from ~/.claude/sessions/. Shows session ID, name, cwd, and start time.
  • Service Health capability — Check if the messaging service is running
  • Session Activity capability — Query the session activity ledger — what this agent session has done through work-buddy. Filters by event type, capability, category, status. Returns last N matching entries (newest first).
  • Session Resume capability — Resume an existing Claude Code session in a new local terminal window. No prompt is sent and remote-control is off — the terminal opens directly into the conversation, ready for the user to type. cwd is auto-derived from the session's recorded working directory.
  • Session Summary capability — Compact summary of what this agent session has done — counts by category/capability, errors, mutations, key artifacts created, workflow progress.
  • Setup Help Directions directions — How to present component health diagnostics — structured output format, lead with the fix
  • Setup Wizard Directions directions — How to run the setup wizard — modes, feature preferences, requirements, guided setup
  • Setup Help capability — Diagnose why a component isn't working. Runs automated check sequences that walk dependency chains and stop at the first failure with a root cause and fix suggestion. Use 'all' for an overview of all components, or specify a component ID (e.g. 'hindsight', 'obsidian', 'postgresql') for targeted diagnostics.
  • Setup Wizard capability — Comprehensive setup wizard for work-buddy. Validates bootstrap requirements, checks feature health, manages user preferences (wanted/unwanted features), and provides guided first-time setup. Modes: 'status' (quick overview), 'guided' (interactive walkthrough), 'diagnose' (deep diagnostic for one component), 'preferences' (view/edit).
  • Sidecar Jobs capability — List all scheduled sidecar jobs with their next fire time, heartbeat status, and whether exclusion windows are active.
  • Sidecar Status capability — Check if the sidecar daemon is running and get its current state: supervised services health, scheduler status, and upcoming job schedule.
  • Tailscale Status Directions directions — Check Tailscale VPN status — daemon state, tailnet identity, online peers, Serve config
  • Tailscale Status capability — Check Tailscale VPN status: daemon state, tailnet identity, online peers, and Serve configuration (published ports).
  • User Job Create capability — Author a personal scheduled cron job by writing a .md file under /user_jobs/. Validates the cron expression and refuses to overwrite an existing job. The scheduler hot-reloads (~30s) and starts firing the job. See features/user-jobs for the schema.
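
The User Job Create behaviour described above (validate the cron expression, then refuse to overwrite an existing job file) can be sketched roughly as follows. This is a minimal illustration, not work-buddy's implementation: `is_valid_cron`, `create_user_job`, the simplified five-field check, and the `schedule` front-matter key are all assumptions made for the example. The real schema is documented in features/user-jobs.

```python
import re
from pathlib import Path

# Simplified per-field check: "*", numbers, ranges, steps, and comma lists.
# It does not verify numeric ranges (e.g. minute <= 59); a real validator would.
_FIELD = re.compile(r"^(\*|\d+(-\d+)?)(/\d+)?(,(\*|\d+(-\d+)?)(/\d+)?)*$")


def is_valid_cron(expr: str) -> bool:
    """Return True if expr has five whitespace-separated, well-formed fields."""
    fields = expr.split()
    return len(fields) == 5 and all(_FIELD.match(f) for f in fields)


def create_user_job(jobs_dir: Path, name: str, cron: str, body: str) -> Path:
    """Write <jobs_dir>/<name>.md, refusing to overwrite an existing job."""
    if not is_valid_cron(cron):
        raise ValueError(f"invalid cron expression: {cron!r}")
    jobs_dir.mkdir(parents=True, exist_ok=True)
    path = jobs_dir / f"{name}.md"
    # Mode "x" raises FileExistsError instead of silently overwriting.
    with open(path, "x", encoding="utf-8") as fh:
        # Hypothetical front matter; the actual job schema may differ.
        fh.write(f'---\nschedule: "{cron}"\n---\n\n{body}\n')
    return path
```

Opening the file in exclusive-create mode makes the existence check and the write a single step, so two concurrent creators cannot both succeed.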

Summarization

  • Summarization system — Producer + search surface for content summaries. Per-session TL;DR+topics, per-page flat extracts, and the coarse-to-fine retrieval funnel over them.
  • Summarization Worker Tick capability — Drain the summarization queue once (PRD §6 O2). Picks eligible (cooldown-passed) entries FIFO, bounded by worker_tick_limit and the daily cost budget. Used by the sidecar cron and inline-trigger from /wb-journal-update and /wb-morning. Pass bypass_cooldown=true for explicit user-triggered refresh; bypass_budget=true to override the daily ceiling.
  • Summary Search capability — Coarse-to-fine retrieval funnel over framework summaries. Stage 1 ranks query against summary nodes; stage 2 (optional) drills into raw spans of top items. Each hit carries a drill_node_id ready to hand to drill_tree.
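
The Summarization Worker Tick entry describes a single bounded drain: FIFO order, a cooldown gate, a per-tick limit, and a daily cost budget, with two bypass flags. A minimal sketch of that control flow is shown below. The `QueueEntry` fields, the `drain_once` signature, and measuring the cooldown from enqueue time are assumptions for illustration. Only the parameter names `worker_tick_limit`, `bypass_cooldown`, and `bypass_budget` come from the description above.

```python
from dataclasses import dataclass


@dataclass
class QueueEntry:
    key: str
    enqueued_at: float  # seconds; hypothetical field
    est_cost: float     # estimated cost of summarizing; hypothetical field


def drain_once(queue, now, cooldown_s, worker_tick_limit, budget_left,
               bypass_cooldown=False, bypass_budget=False):
    """Process eligible entries in FIFO order; return (processed_keys, budget_left)."""
    processed = []
    for entry in list(queue):  # iterate over a copy; queue is in arrival order
        if len(processed) >= worker_tick_limit:
            break
        # Entries still cooling down are skipped and stay queued.
        if not bypass_cooldown and now - entry.enqueued_at < cooldown_s:
            continue
        # Stop the tick once the remaining budget cannot cover the next entry.
        if not bypass_budget and entry.est_cost > budget_left:
            break
        budget_left -= entry.est_cost  # may go negative when bypass_budget is set
        queue.remove(entry)
        processed.append(entry.key)
    return processed, budget_left
```

In this sketch a budget shortfall ends the tick rather than skipping ahead to cheaper entries, which keeps the drain strictly FIFO. A real worker might choose differently.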

Tasks

  • Tasks system — Tasks capabilities and workflows
  • Task action items reference — Per-action-item rows attached to a parent task. Each item carries its own risk profile + required contexts + definition_of_done. Safety rule: items with authorship='agent_unapproved' cannot be executed by the agent; is_executable enforces this.
  • Task Assign Directions directions — How to assign a task — presentation format, completion tracking, state change protocol
  • Task Briefing Directions directions — How to present the daily task status summary — concise format, required sections, one next action
  • Task Handoff Directions directions — How to write a structured session handoff prompt and package it as a task note
  • Inline Todos workflow — Find #wb/TODO markers across the vault, present them for triage, execute the freeform instructions, and clean up handled tags.
  • Inline TODOs Directions directions — How to triage and execute #wb/TODO vault markers — batch presentation, execution rules, tag cleanup
  • Namespace Lookup capability — Return the closest existing namespace tags to a single query. Designed for the 'did you mean?' check before minting a brand-new namespace — the agent calls this to confirm a proposed namespace isn't a near-duplicate of something that already exists.
  • Session Tasks Get capability — List the tasks a session was assigned to (the reverse of task→sessions), each enriched with its current text + state. Bridge-independent — reads the SQLite task store, so it works even when Obsidian isn't running. Read-only.
  • Task Completeness workflow — Investigate whether a task was already completed (fully/partially/differently), judge the spirit not the letter, then optionally mark it complete with the correct prior date.
  • Task Me — what should I do right now? workflow — What should I do right now? Loads tasks, calendar, and active contracts; clamps the day plan to the current moment; surfaces 1–2 next-action recommendations; and optionally writes the resulting plan back to the journal Day Planner.
  • Task Me Directions directions — How to run /wb-task-me — the re-runnable engage flow that answers what should I do right now
  • Task New workflow — Interactive task creation with project + namespace-tag inference. Plans the task, enriches with project-registry + tag-universe context, confirms with the user (only when minting a new project, new project subtree, or new namespace), then applies via task_create.
  • Task Creation Directions directions — How to create a task via the task-new workflow — minimal prompting, project + namespace inference, gates only on minting new projects/namespaces
  • Task Read Directions directions — How to read a task without claiming it — inspection-only path that does not write a session assignment
  • Scattered Tasks Directions directions — How to present scattered task results and triage into action categories
  • Task Search Directions directions — How to search tasks: distinguish description-text search (task_search, store-only) from note-body hybrid retrieval (context_search source=task_note)
  • Task Triage workflow — Interactive inbox review: surface tasks that need decisions, collect user input, apply state changes.
  • Task Update Description Directions directions — How to rewrite a task's description text safely: preferred over filesystem-direct edits, atomic against concurrent user edits
  • Task Archive capability — Move completed tasks from the master list to tasks/archive.md. Consent-gated; the prompt shows the exact count and a random 5-title sample so the user approves a concrete scope. Default policy archives tasks completed >= 7 days ago so recent work stays visible. Posts a fire-and-forget summary notification after the move so a bulk archive doesn't happen silently.
  • Task Assign capability — Claim a task for the current session and get full context (text, note, metadata)
  • Task Briefing capability — Daily task status summary with contract constraints, MITs, focused, overdue, stale, suggestions
  • Task Change State capability — Update task metadata: state (not completion), urgency, due date. Cannot set state='done' — use task_toggle for completion.
  • Task Create capability — Create a new task in the master task list. Optionally attach a note file for details/subtasks. The Slice 2 GTD vocabulary (task_kind, density, outcome_text, next_action_text, definition_of_done, creation_effort, user_involvement, creation_provenance, deadline, dependency) is optional; when omitted, the defaults make the task look like a legacy manually-authored task. Agent-driven creators should set creation_provenance (e.g. 'agent_inferred_from_journal') and lower user_involvement.
  • Task Delete capability — Permanently delete a task: remove line, note file, and store record. Consent-gated.
  • Task Namespace Suggest capability — Rank existing namespace tags by relevance to a task text (hybrid BM25+embedding via the shared embedding service; falls back to token overlap). Returns ranked candidates from the existing universe only — it does not propose new namespaces. The calling agent decides whether to apply suggestions, add more, or mint a new namespace.
  • Task Read capability — Read a task's full context (text, note, metadata) without claiming it for the current session
  • Task Review Inbox capability — Get inbox tasks with suggested actions (mit, snooze, kill, needs_date)
  • Task Scattered capability — Find open tasks scattered across the vault outside the master task list. Groups by file with counts. Uses Datacore structural queries.
  • Task Search capability — Search tasks by description text via the SQLite store. Bridge-independent — works even when Obsidian isn't running. Returns task records (full task_metadata rows) ordered most-recently-updated first. For full-text search over task NOTE bodies (the [[uuid|📓]]-linked detail files), use context_search(source='task_note') instead — that's hybrid retrieval over note content; this is exact-text search over the line description.
  • Task Set Tags capability — Replace the user-modifiable tags on an existing task's master-list line. Manages free-form namespace tags (e.g. #admin/uhn), project tags (#projects//...), and opt-in prefixes (#ns/..., #task/...). Pass the complete desired list; anything missing is removed. Preserved: #todo, #tasker/* (mirrors store-owned state — mutate via task_change_state), #wb/todo, #wb/done, wikilinks, 🆔, plugin emojis. Project slugs are validated against the project registry — unknown slugs raise ValueError.
  • Task Stale Check capability — Find forgotten/stale tasks across inbox, snoozed, MIT, and focused
  • Task Sync capability — Compare master task list against SQLite store: detect orphans, create missing store records, report checkbox mismatches
  • Task Toggle capability — Mark a task complete, incomplete, or toggle. Handles checkbox, done date, and store state atomically. Use done=true to complete, done=false to reopen, omit to toggle. Consent-gated.
  • Task Update Description capability — Rewrite the description text on a task line. Preserves checkbox, #todo, #projects/*, namespace tags, wikilinks, 🆔 + ID, plugin emojis (📅, ✅, urgency). Updates the store's description column in lockstep. Use this instead of filesystem-direct edits — it routes through the same consent-aware, retry-aware path as the other mutations and avoids the read-modify-write race on the master task list.
  • Task Triage Directions directions — How to run interactive task triage — presentation rules, actions, summary format
  • Weekly Review workflow — Agentic weekly planning session. The agent assembles the strategic picture,
  • Weekly Review Directions directions — How to run the weekly task review — MIT drafting, WIP enforcement, constraint validation
  • Weekly Review Data capability — Gather all data for the weekly review: contracts, constraints, WIP, tasks, staleness, suggestions
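
The Task Archive entry above combines a selection policy (tasks completed at least 7 days ago) with a consent prompt that shows the exact count and a random 5-title sample. A rough sketch of those two steps follows. The task dictionaries with `done`, `done_date`, and `title` keys, and both function names, are hypothetical stand-ins rather than the real store schema.

```python
import random
from datetime import date, timedelta


def select_archivable(tasks, today, min_age_days=7):
    """Return tasks completed at least min_age_days before today."""
    cutoff = today - timedelta(days=min_age_days)
    return [t for t in tasks if t["done"] and t["done_date"] <= cutoff]


def consent_preview(candidates, sample_size=5, rng=None):
    """Build the data a consent prompt would show: exact count plus a random title sample."""
    rng = rng or random.Random()
    k = min(sample_size, len(candidates))  # random.sample requires k <= population size
    sample = rng.sample(candidates, k)
    return {"count": len(candidates), "sample_titles": [t["title"] for t in sample]}
```

Showing the exact count alongside a sample lets the user approve a concrete scope without reading every title in a large batch.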

Threads

  • Threads — universal-entity primitive system — The Thread is the universal entity for 'context that may need an action'. Replaces the older split between PoolEntry (now folded into states) and ActionItem (now folded into sub-Threads). Task survives as a subclass.
  • Action Catalog (typed lens over capability + workflow registries) concept — Filtered view over the existing capability + workflow registries: entries where is_action=True. Four action kinds: Standard, Improvised, Suggestion, Clarification.
  • Autonomy policy (composed, not enum) concept — Per-Thread policy composed from orthogonal axes. Saved compositions are configuration, not types. Sub-threads override DOWN axis-by-axis only. Stage 5 wired the runtime — auto-advance branch resolvers gate every INFERRING_ → AWAITING__CONFIRMATION transition.
  • Thread event log (canonical state) concept — Every state-affecting operation produces an event. The current-state cache exists for query convenience but events are authoritative.
  • Thread FSM (resolution phase) concept — 14-state FSM that runs from inciting event to terminal. Transitions wired in Stage 2; data structures (state catalog, transition table) land in Stage 1.
  • Threads — parent-child relationship patterns (decompose / group / singular) concept — Three parent-child relationship patterns. Decompose: parent has an action, children FSM-execute, cascade-on-terminal advances parent. Group: umbrella holds N cluster sub-threads with item-level drag-drop reorganization (Chrome / journal / email scans). Singular: umbrella holds N children whose actions render hoisted onto the parent's card so the user sees one thread with N proposals (inline-capture multi-record path).
  • LLM-call priority queue (lives in work_buddy/llm/, not threads/) concept — General infrastructure for dispatching LLM work by priority. Threads enqueue; the queue dispatches. Reusable by any client (scheduled jobs, agents, batch ops). MUST NOT be reimplemented inside the Thread package.
  • Resolution phase (vs. execution) concept — Resolution = the cyclic, human-in-loop FSM where the system decides what to do. Execution = whichever runtime owns the dispatched action; the threads FSM dispatches but does not itself host the action runtime.
  • Run Source Pipeline capability — Run an end-to-end source pipeline: collect raw items, annotate with tags + summary, algorithmically cluster, LLM-refine cluster boundaries + per-cluster action proposals (local-first tier_chain), and spawn a group umbrella thread + group sub-threads with the items as ContextItems. Replaces the per-source journal/chrome/email scan entry points.
  • Thread Defer capability — Defer a thread so it resurfaces at a future time. Sets the cached resurface_at field; the existing Later mechanic re-surfaces the thread when the time arrives.
  • Thread Dismiss capability — Mark a thread as dismissed via the standard FSM transition. For group sub-threads this is the 'do nothing with this cluster' action. For umbrellas it cascades through the existing dismiss flow.
  • Thread Rename capability — Rewrite a thread's title (and description) — used by the action-chip 'Rename' affordance and by the LLM cluster-refinement step when it overrides an algorithmic cluster label.
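
The thread event log entry above states that events are authoritative and the current-state cache exists only for query convenience. One way to picture that relationship is the small event-sourcing sketch below, where the cache can always be rebuilt by replaying the log. The event kinds (`renamed`, `deferred`, `dismissed`), the `ThreadLog` class, and the state keys are illustrative assumptions loosely modelled on the Thread Rename, Thread Defer, and Thread Dismiss capabilities. They are not the actual event schema.

```python
from dataclasses import dataclass, field


def apply_event(state, event):
    """Return a new state dict with one event folded in (pure function)."""
    new = dict(state)
    if event["kind"] == "renamed":
        new["title"] = event["title"]
    elif event["kind"] == "deferred":
        new["resurface_at"] = event["resurface_at"]
    elif event["kind"] == "dismissed":
        new["status"] = "dismissed"
    return new


def replay(events):
    """Rebuild current state from scratch using only the event log."""
    state = {}
    for event in events:
        state = apply_event(state, event)
    return state


@dataclass
class ThreadLog:
    events: list = field(default_factory=list)  # authoritative record
    cache: dict = field(default_factory=dict)   # derived; safe to discard

    def append(self, kind, **payload):
        event = {"kind": kind, **payload}
        self.events.append(event)
        self.cache = apply_event(self.cache, event)
```

Because `apply_event` is the only place state changes, a stale or corrupted cache can be repaired with `log.cache = replay(log.events)`.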

Vault

  • Vault Health Namespace integration — Cross-cutting concerns about the vault as a whole — reconnaissance, drift detection, hygiene checks, schema validation. First member: vault_recon (diagnostic) and the vault-recon collector (periodic discovery loop).
  • Vault Investigation Agent Directions directions — How a spawned investigation agent reasons over a delta detected by the vault-recon collector and surfaces a proposal to the user.
  • Vault Recon Directions directions — How to read vault_recon output and identify recurring conventions worth surfacing.