Session Retrospective

Switch to developmental mode and critique/debug this session's execution — then fix what you find

When to use

When the user invokes /wb-dev-retro or asks for a session retrospective, post-session critique, or to debug what went wrong during an operational session

Slash command: /wb-dev-retro

Directions

Switch from operational to developmental mode. Critique and debug this session's execution, then fix what you find.

The design rationale behind operational vs developmental agents: operational agents get things done (workflows, tasks, journals). Developmental agents improve the system itself (code, prompts, capabilities). The session retrospective bridges the two — the agent that experienced the friction diagnoses and fixes it, preserving first-person context that would be lost in a handoff to a separate evaluator.

Prerequisite: You must have done operational work in this session (morning routine, task triage, journal update, context collection, etc.). If you haven't, tell the user there's nothing to retrospect on.

Step 1: Critique

Review everything that happened in this session. Be specific, be harsh, and cite evidence (quote the call, the response, the output). You are debugging the system — not just the code, but the behavior, the prompts, the context engineering, the workflow design, and your own decisions.

Look for:

Capability bugs

  • Calls that returned wrong, irrelevant, or excessively broad results
  • Missing capabilities (you had to write raw Python because nothing existed in the gateway)
  • Silent failures or unhelpful error messages
  • Redundant calls (same data fetched multiple ways)

Prompt and context issues

  • Slash commands that didn't give you enough context, or gave you the wrong context
  • Workflow steps that were awkward, out of order, or unnecessary
  • Violations of just-in-time retrieval: context loaded too early (bloat) or too late (missed)
  • Unclear data contracts between workflow steps

Agent behavior issues (debug your own reasoning)

  • Wrong approach — a better-contextualized agent would have avoided this path
  • Repeated mistakes within the session
  • Missed signals that were obvious in retrospect
  • Over-engineering or under-engineering

Output quality

  • Journal entries, tasks, briefings, or user-facing summaries that were poorly structured, too verbose, or missed the point

Format as:

## Session Critique

### Critical (broke something or produced wrong output)
1. **[Category] Short title**: What happened. What should have happened. Evidence.

### Friction (slowed things down or degraded quality)
1. **[Category] Short title**: What happened. What should have happened.

### Minor (polish)
1. **[Category] Short title**: What happened. What should have happened.
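For illustration, a filled-in Critical entry might look like the following. The capability name, query, and counts are hypothetical; the file-type-filter scenario is borrowed from the "What NOT to do" section below.

```
### Critical (broke something or produced wrong output)
1. **[Capability] context_search ignored file types**: The call returned 47
   results including binary files when the journal workflow needed text
   matches only. It should have returned the handful of journal entries
   matching the query. Evidence: `context_search("standup notes")` → 47
   results, many of them `.png`/`.pdf` attachments.
```

Note the shape: what happened, what should have happened, then evidence specific enough that a future agent could reproduce the failure.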

Present this to the user. Ask if they experienced friction you missed or if any of your critiques are off-base.

Step 2: Triage (with the user)

Go through the critique with the user and let them decide what to act on. For each issue, offer these options:

  • Fix now — code change, prompt edit, config tweak. Do it in this session.
  • Debug first — root cause is unclear. Reproduce the issue, investigate, then fix.
  • Create a task — needs deeper work or a fresh session. Use /wb-task-handoff to capture the issue with full context so a future agent can pick it up.
  • Skip — not worth fixing, or not actually a problem.

The user drives this. Don't prescribe how many to fix.

Step 3: Do the work

For each issue the user selected:

Fix now: Read the relevant files, make the change, test if possible (re-run the call that failed), briefly note what you changed.

Debug first: Reproduce the problem — re-run the call or workflow step. Narrow the root cause: is it the capability code, the prompt, the workflow, the data? Then fix and test. If you learn something non-obvious, document it in the relevant knowledge store unit.

Create a task: Use /wb-task-handoff. The handoff should include the specific failure (with evidence from this session), your root cause hypothesis, the files involved, and what you already tried.
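A minimal handoff sketch covering those four elements (all details here are hypothetical, reusing the file-type-filter example from "What NOT to do" below):

```
## Handoff: context_search needs a file-type filter

**Failure**: `context_search("standup notes")` returned 47 results including
binary files; the journal workflow needed text matches only.
**Root cause hypothesis**: The query is passed straight to the index with no
extension filter applied.
**Files involved**: the gateway module implementing context_search.
**Already tried**: Narrowing the query string — fewer results, but binaries
still appeared.
```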

Step 4: Close out

If you made code changes:

  1. Review them holistically: do they introduce new issues?
  2. Summarize what was fixed, what was handed off, and what was skipped.
  3. Call out systemic patterns if you see them (e.g., "three issues were all about context bloat — the JIT pattern needs enforcement across workflows").

What NOT to do

  • Boil the ocean — don't try to fix everything
  • Stay abstract — "search should be better" is not actionable; "context_search returned 47 results including binary files, needs a file-type filter" is
  • Bikeshed — don't spend 30 min on naming when a capability is broken
  • Scope-creep into features — this is about fixing friction, not adding new features
  • Self-congratulate — don't list things that went well; the point is improvement