Skip to content

Vault Recon Directions

How to read vault_recon output and identify recurring conventions worth surfacing.

When to use

user runs /wb-vault-recon, or an agent needs to reason over vault_recon cross-tabs to identify patterns

Slash command: /wb-vault-recon

Directions

Read vault_recon output to identify recurring conventions in the user's vault. Returns cross-tabs an agent can pivot to spot state machines, tag families, hot regions of work.

When to use

  • The user runs /wb-vault-recon and wants a diagnostic peek at vault structure.
  • An investigation agent is reasoning over a fresh recon snapshot to draft a proposal.
  • A future vault-health check needs structural ground truth.

Do NOT use as a substitute for datacore_schema for cheap "is anything queryable" probes. vault_recon is heavier (full page walk + list-item walk, ~2–3s on a 6k-page / 200k-list-item vault, capped at 90s by bridge timeout). Direct invocation via wb_run("vault_recon") may exceed the MCP tool result token limit — prefer reading .data/vault_recon/latest.json written by the collector.

Invocation

`wb_run("vault_recon")`                                       # full vault
`wb_run("vault_recon", {"path_prefix": "repos/electricrag/"})` # region focus
`wb_run("vault_recon", {"activity_days": 14})`                 # tighter activity window

Output structure

Key fields and what they answer:

Field Question it answers
object_types How many pages/sections/tasks/etc. exist (vault-wide; ignores path_prefix).
pages_total, pages_walked Total in vault vs. walked given filter.
top_tags Top-30 normalized top-level tags (page-level union).
frontmatter_keys Top-30 keys with usage counts. The keys are the alphabet of structure.
frontmatter_values[key] For each key, top 20 values + distinct_count + truncated flag. THIS IS WHERE STATE MACHINES SHOW UP. Look for keys like status, state, phase with discrete value sets that look like an enum.
high_cardinality_keys Keys skipped because their value cardinality > 100 (UUIDs, timestamps).
tag_tree Hierarchical tag tree to depth 3 with counts (page-level).
type_by_status Cross-tab {type: {status: count}}. The Kanban view of any state machine.
path_by_type {path: {type: count}} filtered to paths with ≥2 typed pages.
recent_activity_by_path Depth-2 prefix → count of pages with mtime within activity_days.
list_item_top_tags Top-30 inline-tag families on list-items (excluding #todo*). The user's inline-concept stream.
list_item_tag_tree Same as tag_tree but for list-items. Concept-stream substrate for Tier-2 surfacing.
list_item_tagged_total Count of list-items with at least one non-#todo tag.
task_statuses, tasks_total Task statuses (note: redundant with task_metadata.db; included for completeness).

Pattern-recognition heuristics

Spot a frontmatter state machine

Examine frontmatter_values. A state machine looks like: - A key (commonly status, state, phase) with truncated: false and 3–8 distinct values. - Values that look like a workflow: PROPOSED, DESIGNED, COMPLETED or draft, published, archived. - Often paired with a type key and type_by_status cross-tab showing per-type counts in each status.

If you see one: name it. "You have a hypothesis | experiment | thread state machine with PROPOSED → DESIGNED → COMPLETED transitions, mostly under repos/electricrag/."

Spot a tag family

Walk tag_tree (page-level) or list_item_tag_tree (inline). A family looks like a node with multiple children, where the children are themselves substructured. E.g. #mide with children workflow, system, context, meta.

Spot a concept-stream pattern (Tier-2)

list_item_top_tags and list_item_tag_tree capture inline-tagged list-items — the user's running thinking log. Patterns to spot: - A tag prefix appearing 50+ times across list-items = an active concept the user references repeatedly. - A previously-active tag with low recent count = drift candidate. - Co-occurrence within the same line = related concepts.

These are NOT in task_metadata.db (only #todo* tasks are) — list-item tags are concept references, not actions.

Spot a path convention

Look at path_by_type. If a single path holds multiple pages of one type (e.g. repos/electricrag/kb/research/threads holds 8 pages of type=thread), that's a directory naming convention.

Spot a hot region

recent_activity_by_path ranks regions by recent mtime activity. Top 1–2 entries are where the user is working now.

What to do with findings

This directions unit is for the reader of recon output, not for the user-facing slash command response.

  • If invoked from /wb-vault-recon: present the most striking 3–5 findings in plain English, no auto-action.
  • If invoked from the investigation agent (after a delta has triggered escalation): cross-reference with vault/investigation-directions for the proposal protocol.

Do NOT cement anything from a single recon run. Cementing is the user's call (or a follow-up workflow's). This unit only teaches an agent how to see.