# Master Plans CLI next-stage guidance command

## Problem

Agents working on task files have no automated way to know which narrative sections are incomplete or what to write next. They either skip sections entirely, write shallow placeholder content, or fill sections out of order — leading to inconsistent task records that are hard to review. The plans CLI has no content-completeness inspection, so the only quality check happens at PR review time when it's expensive to send the agent back. A lightweight advisory command that reports section completeness and prints stage-specific guidance would give agents (and human operators) a fast self-check without requiring a full workflow engine.

## Context

## Possible Solutions

## Plan


This replaces the current generic "Deliverables / Acceptance criteria /
Dependencies" skeleton for execution tasks. The existing skeleton remains
available for planning/scoping tasks that don't follow the narrative flow.

### Template variants (future)

The `task create` command could accept a `--template` flag to select between:
- `default` — current generic skeleton (Deliverables / Acceptance / Dependencies)
- `execution` — narrative-driven template (Problem / Solutions / Plan / ...)
- Custom templates from a `master_plans/templates/` directory

Not in scope for this task, but a natural extension.

## Implementation Progress

**Design Decision: Template-with-links vs. enforced stepper**

### Context

AgentTree (`~/Fivetran/agenttree`) uses a full `agenttree next` state machine:
exit hooks validate each stage (pytest pass, sections filled), the CLI physically
blocks advancement on gate failure, sessions track restart recovery, and each
stage renders a Jinja-templated skill file. It's ~1000+ lines of workflow engine.

Our current system is lighter: task markdown with narrative sections, SKILL.md
files that tell agents what to fill at each phase, and a validator at the end.

### Decision: Template-with-links (not enforced stepper)

**Chosen approach:** Enrich the task template with per-section guidance and skill
links, add a lightweight `plans task check` command for content completeness,
and rely on the existing cbox review / CI gates for enforcement.

**Rejected approach:** Building a `cbox next` / `plans task next-stage` stepper
that enforces stage transitions with exit hooks and session tracking.

### Rationale

1. **The bottleneck is stage quality, not stage skipping.** Agents don't skip
   sections — they write shallow ones. A `next` hook can check if a section
   exists and has N lines, but can't judge content quality. The real quality
   gate is cbox review, which already exists. Template structure gets agents to
   fill the right things; review catches quality problems.

2. **Maintenance burden vs. marginal gain.** A workflow engine couples stage
   definitions, hook code, and session tracking. Every lifecycle change
   (add/reorder/conditionally skip a stage) requires editing YAML config and
   Python hooks. With template-with-links, you edit a markdown file. The skill
   system already provides stage-specific instructions — linking to the right
   skill from each section is the same thing as "render the next stage's skill,"
   minus the machinery.

3. **Existing enforcement is sufficient.** `plans task validate` (frontmatter),
   `cbox review` (code quality), `just ci` (tests/lint), and the SKILL.md
   narrative fill expectations table already cover the gates. The missing piece
   is making the template better, not adding a stepper.

4. **Advisory `check` gives 90% of the value.** A `plans task check` that says
   "your next incomplete section is Possible Solutions, here's what to focus on"
   is a ~50-line function. It provides the same "what should I do next?" signal
   without the session/hook/state-machine overhead.

### When to revisit

Reconsider the enforced stepper if:
- Agents routinely skip sections entirely (not just write shallow ones).
- Multi-agent handoffs require hard gates (agent A must finish planning
  before agent B starts implementation).
- Human approval gates are needed mid-task (not just at PR review).

**Deliverables**

- **`plans task check` command**: Inspect a task file, report section
  completeness (header exists + has substantive content beyond placeholder
  comments), identify next section to fill, print stage-specific guidance.
- **Enriched task template**: Update `task_create` body template with
  per-section guidance blocks and optional skill/doc links.
- **Narrative section guidance map**: Define per-section guidance text
  (what to write, what "done" looks like, link to relevant skill/doc).
- Update SKILL.md files and docs to reference `plans task check`.

**Acceptance criteria**

- `plans task check --path <task>` prints a clear report: which sections
  are filled vs. empty/placeholder, what the next section is, and guidance.
- Exit code: 0 if all sections substantively filled, 1 otherwise.
- New tasks created via `plans task create` have richer per-section
  guidance (not just generic checkboxes).
- Agent workflows (cbox-task SKILL.md, manager SKILL.md) reference
  `plans task check` as a self-check step.

**Implementation sketch**

### `plans task check`

```python
SECTION_GUIDANCE = {
    "## Problem": {
        "prompt": "Describe the gap or issue being solved. What's broken or missing?",
        "done_when": "Contains a clear problem statement (not just a placeholder comment).",
    },
    "## Possible Solutions": {
        "prompt": "List 2-3 options with trade-offs. Include rejected alternatives.",
        "done_when": "Contains at least 2 distinct options with reasoning.",
    },
    "## Chosen Plan": {
        "prompt": "Document the selected approach and why it was chosen.",
        "done_when": "Names the chosen option and explains the rationale.",
    },
    "## Implementation Notes": {
        "prompt": "Technical details, key code changes, design decisions made during implementation.",
        "done_when": "Contains specific file/function references or design notes.",
    },
    "## Review Feedback Log": {
        "prompt": "Append reviewer comments and author responses during review.",
        "done_when": "Contains at least one review entry (or 'No review feedback' if clean pass).",
    },
    "## Summary": {
        "prompt": "Brief problem statement + implementation outcome. Written last.",
        "done_when": "1-3 sentence summary of what was done and why.",
    },
    "## Final Approval": {
        "prompt": "Reviewer sign-off signal.",
        "done_when": "Contains approval or sign-off entry.",
    },
}
```

Logic:

1. Parse the task file and extract the text under each narrative section header.
2. A section is "substantive" if it has more than one non-blank, non-comment line.
3. Walk the sections in canonical order and find the first incomplete one.
4. Print a report: filled/empty status per section, plus the next section and its guidance.

### Enriched task template

Update the `task_create` body to include inline guidance per section.
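For illustration, an enriched section in the generated template might look like this. The wording is hypothetical; the real guidance text would come from the per-section guidance map, and the skill pointer from the existing SKILL.md files.

```markdown
## Problem

<!-- Describe the gap or issue being solved. What's broken or missing?
     Done when: a clear problem statement, not just this placeholder.
     Skill: see the cbox-task SKILL.md narrative fill expectations. -->
```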

**Dependencies**

- Narrative section definitions already exist in `NARRATIVE_SECTIONS` and `_SECTION_BLOCKS` in `plans_cli.py`; extend them, don't duplicate.
- Coordinate with the cbox-task and manager SKILL.md updates.

**Naming**

- Originally `plans task next-stage`; renamed to `plans task check` to reflect its advisory/diagnostic nature (it is not a stepper).

**Future extensions**

- If `check` proves useful, a future `--fix` flag could auto-scaffold missing sections (subsuming the current `init-execution-log`).
- The template-with-links approach can evolve incrementally: each section's guidance and skill links can be improved independently, without touching workflow engine code.

## Review Feedback

- Review cleared