absolute-work

End-to-end, phase-gated software development lifecycle for AI agents. Turns a ticket, task, plan, or migration into a validated design, a dependency-graphed task board, and verified code. Triggers on "build this end-to-end", "plan and build", "break this into tasks", "pick up this ticket", "grill me on this", "run this migration", "absolute-work this", or any multi-step development task. Relentlessly interviews to a shared design, writes a reviewed spec, decomposes into atomic tasks on a persistent markdown board, then peels tasks one safe wave at a time with test-first verification. Handles features, bugs, refactors, greenfield projects, planning breakdowns, and migrations.

Quick Start

Open your terminal or command prompt
Run: npx skills add AbsolutelySkilled/AbsolutelySkilled --skill absolute-work
Start your AI coding agent (Claude Code, Cursor, Gemini CLI, or any supported agent)
The absolute-work skill is now active and ready to use

absolute-work

absolute-work is a production-ready AI agent skill for claude-code, gemini-cli, openai-codex, and mcp. It provides an end-to-end, phase-gated software development lifecycle that turns a ticket, task, plan, or migration into a validated design, a dependency-graphed task board, and verified code.

Quick Facts

| Field | Value |
|-------|-------|
| Category | workflow |
| Version | 0.1.0 |
| Platforms | claude-code, gemini-cli, openai-codex, mcp |
| License | MIT |
| References | 6 deep-dive guides |
| Evals | 14 test cases |

How to Install

Make sure you have Node.js installed on your machine.
Run the following command in your terminal:

npx skills add AbsolutelySkilled/AbsolutelySkilled --skill absolute-work

The absolute-work skill is now available in your AI coding agent (Claude Code, Gemini CLI, OpenAI Codex, etc.).

Overview

absolute-work is one continuous skill that takes any unit of work — a ticket, a task, a plan, a migration — from fuzzy intent to verified code. It replaces the traditional brainstorm-then-build handoff with a single lifecycle that stops at hard gates so you stay in control the whole way.

It relentlessly interviews you to a shared design (one question at a time, codebase-first), writes and self-reviews a spec, decomposes the work into atomic tasks on a persistent local markdown board, then peels those tasks off one safe wave at a time with test-first verification. No external trackers, no silent scope creep, and no code until the design is approved.

The 6 phases

Intake & Brainstorm — deep context scan, codebase-first questioning, adaptive question banks per work type, relentless design interview until mutual 100% confidence
Spec — writes the approved design to docs/plans/, then a separate reviewer subagent grades it on a scored rubric
Decompose & Plan — atomic tasks, a dependency graph, and safe-wave assignment on .absolute-work/board.md
Execute — onion-peel one wave at a time, TDD per task (red → green → refactor)
Verify — binary signals (test/lint/typecheck/build) then an independent scored evaluator
Converge — full suite, summary, manual test steps to exercise the new functionality, close the board, suggest a commit (never auto-commit)

What makes this skill different

Phase-gated — stops and waits for your explicit "go" between every phase. Control over speed.
Safety-first execution — blockers and dependents run sequentially; only provably-independent tasks parallelize. When in doubt, it serializes.
Adaptive intake — detects the work type (feature, bug, refactor, greenfield, planning, migration) and swaps in a tailored question bank. Migrations get first-class handling (call-site inventory, codemods, incremental rollout, rollback).
Spec-driven + test-driven — a reviewed spec before code, tests before implementation, generator-evaluator separation for grading.
Fully local — all state lives in .absolute-work/board.md. No JIRA, no GitHub API, fully portable and resumable across sessions.

Reference Guides

| Guide | Coverage |
|-------|----------|
| Intake Playbook | Adaptive question banks per work type, codebase-first intelligence, design-tree traversal, calibration |
| Migration Playbook | Call-site inventory, codemods, coexistence seams, incremental rollout, backwards-compat, rollback |
| Spec Writing | Spec template, section scaling, writing style, decision log, scored review protocol |
| Board Format | Full `.absolute-work/board.md` spec, status transitions, sequence/wave model, example board |
| Execution Model | DAG patterns, safe-wave (sequential-blocker / parallel-independent) algorithm, agent prompts, conflict handling, failure recovery |
| Verification Framework | TDD per task, verification signals, generator-evaluator protocol, scored rubric, mandatory tail tasks |

Key Principles

Phase gates always — explicit approval between every phase
Codebase before questions — only ask what code can't answer
Relentless until aligned — interview to mutual 100% confidence
Spec before code, tests before implementation
Dependency-first decomposition — a DAG, never a flat list
Safety-first execution — serialize when in doubt
Generator ≠ evaluator — the builder never grades its own work
No silent scope creep, and never auto-commit

Platforms

claude-code
gemini-cli
openai-codex
mcp

Frequently Asked Questions

What is absolute-work?

An end-to-end, phase-gated software development lifecycle for AI agents. It turns a ticket, task, plan, or migration into a validated design, a dependency-graphed task board, and verified code — relentlessly interviewing you to a shared design, writing a reviewed spec, decomposing into atomic tasks on a persistent local markdown board, then peeling tasks off one safe wave at a time with test-first verification.

How is this different from just asking the agent to build something?

absolute-work imposes structure and control. It refuses to write code until the design is approved, it stops at a hard gate between every phase, it tracks everything on a persistent board that survives across sessions, and it verifies each task with an independent evaluator instead of self-grading. The result is fewer wrong turns, no silent scope creep, and a complete audit trail.

Does it integrate with JIRA or GitHub Issues?

No — absolute-work is fully local by design. "Ticket", "task", "planning", and "migration" are intake types that produce tasks in a local .absolute-work/board.md file. There are no external tracker dependencies, so it is portable and works in any repo.

How do I install absolute-work?

Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill absolute-work in your terminal. The skill will be immediately available in your AI coding agent.

What AI agents support absolute-work?

This skill works with claude-code, gemini-cli, openai-codex, and mcp. Install it once and use it across any supported AI coding agent.

Maintainers

@maddhruv

Generated from AbsolutelySkilled

SKILL.md

Absolute Work: End-to-End AI Development Lifecycle

Absolute Work takes any unit of work — a ticket, a task, a plan, a migration — from fuzzy intent to verified code. It is one continuous skill with hard gates between phases: brainstorm a shared design, write and review a spec, decompose into a dependency-graphed task board, then peel tasks off one safe wave at a time with test-first verification. Nothing is assumed, nothing is silently expanded, and no code is written until the design is approved.

The lifecycle has 6 phases: INTAKE & BRAINSTORM → SPEC → DECOMPOSE & PLAN → EXECUTE → VERIFY → CONVERGE

Activation Banner

At the very start of every Absolute Work invocation, before any other output, display this ASCII art banner:

 █████╗ ██████╗ ███████╗ ██████╗ ██╗     ██╗   ██╗████████╗███████╗
██╔══██╗██╔══██╗██╔════╝██╔═══██╗██║     ██║   ██║╚══██╔══╝██╔════╝
███████║██████╔╝███████╗██║   ██║██║     ██║   ██║   ██║   █████╗
██╔══██║██╔══██╗╚════██║██║   ██║██║     ██║   ██║   ██║   ██╔══╝
██║  ██║██████╔╝███████║╚██████╔╝███████╗╚██████╔╝   ██║   ███████╗
╚═╝  ╚═╝╚═════╝ ╚══════╝ ╚═════╝ ╚══════╝ ╚═════╝    ╚═╝   ╚══════╝
██╗    ██╗ ██████╗ ██████╗ ██╗  ██╗
██║    ██║██╔═══██╗██╔══██╗██║ ██╔╝
██║ █╗ ██║██║   ██║██████╔╝█████╔╝
██║███╗██║██║   ██║██╔══██╗██╔═██╗
╚███╔███╔╝╚██████╔╝██║  ██║██║  ██╗
 ╚══╝╚══╝  ╚═════╝ ╚═╝  ╚═╝╚═╝  ╚═╝

Follow the banner immediately with: Entering plan mode — phase-gated lifecycle active

The Phase Gate Rule

Absolute Work STOPS at the end of every phase and waits for the user's explicit "go" before advancing. This is non-negotiable. The phases are:

INTAKE & BRAINSTORM ─┃ gate ┃─ SPEC ─┃ gate ┃─ DECOMPOSE & PLAN ─┃ gate ┃─ EXECUTE ─┃ gate per wave ┃─ VERIFY ─┃ gate ┃─ CONVERGE

At each gate, present what was produced, summarize what comes next, and ask the user to confirm before proceeding. Never chain two phases without an approval in between. Use AskUserQuestion (where available) for every gate and every interview question.

Activation Protocol

Immediately after the banner, enter plan mode before doing anything else:

On platforms with native plan mode (e.g. Claude Code's EnterPlanMode): invoke it immediately.
On platforms without it: simulate plan mode. Complete INTAKE & BRAINSTORM, SPEC, and DECOMPOSE & PLAN fully, write no code, and get explicit approval before EXECUTE.

The first three phases are planning work. No files are created or modified (other than the spec and the board) until the user approves the task graph and execution begins.

Session Resume Protocol

When Absolute Work is invoked and a .absolute-work/board.md already exists in the project root:

Detect: Read the board and determine its status.
Display: Print a compact summary of completed / in-progress / blocked / remaining tasks.
Resume: Pick up from the last incomplete wave — do NOT restart from INTAKE.
Reconcile: If the codebase changed since the last session, diff against the board's expected state and flag conflicts before resuming.

If the board is completed, ask whether to start a new session (archive the old board to .absolute-work/archive/) or review the finished work. Never blow away an existing board without explicit user confirmation.

Codebase Convention Detection

Before INTAKE begins, auto-detect the project's conventions so every phase is grounded in reality, not assumptions.

| Signal | Files to Check |
|--------|----------------|
| Package manager | `package-lock.json` (npm), `yarn.lock`, `pnpm-lock.yaml`, `bun.lockb`, `Cargo.lock`, `go.sum` |
| Language/Runtime | `tsconfig.json`, `pyproject.toml` / `setup.py`, `go.mod`, `Cargo.toml` |
| Test runner | `jest.config.*`, `vitest.config.*`, `pytest.ini`, `.mocharc.*`, test directory patterns |
| Linter/Formatter | `.eslintrc.*`, `eslint.config.*`, `.prettierrc.*`, `ruff.toml`, `.golangci.yml` |
| Build system | `Makefile`, `vite.config.*`, `next.config.*`, `turbo.json` |
| CI/CD | `.github/workflows/`, `.gitlab-ci.yml`, `Jenkinsfile` |
| Available scripts | `scripts` in `package.json`, `Makefile` targets |
| Directory conventions | `src/`, `lib/`, `app/`, `tests/`, `__tests__/`, `spec/` |

Write detected conventions to the board under ## Project Conventions. Reference them in every later phase — especially PLAN and the mandatory verification tail tasks. Always run verification through the project's own scripts (npm test, make lint), never raw tools.
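
As an illustration of the detection step, here is a minimal Python sketch that covers only the package-manager row of the table. The function names and the exact bullet format written under `## Project Conventions` are assumptions made for this example; the skill itself does not prescribe them.

```python
from pathlib import Path
import tempfile

# Lockfile -> package manager, following the "Package manager" row above.
LOCKFILES = {
    "package-lock.json": "npm",
    "yarn.lock": "yarn",
    "pnpm-lock.yaml": "pnpm",
    "bun.lockb": "bun",
    "Cargo.lock": "cargo",
    "go.sum": "go",
}


def detect_package_managers(root: Path) -> list[str]:
    """Return every package manager whose lockfile sits directly under root."""
    return [tool for name, tool in LOCKFILES.items() if (root / name).is_file()]


def conventions_section(root: Path) -> str:
    """Render a board section (hypothetical layout) from what was detected."""
    found = detect_package_managers(root)
    value = ", ".join(found) if found else "not detected"
    return f"## Project Conventions\n- **Package manager**: {value}\n"


# Demo against a throwaway directory that contains only a pnpm lockfile.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "pnpm-lock.yaml").write_text("")
    demo_section = conventions_section(demo_root)

print(demo_section)
```

The other rows (test runner, linter, build system) could be handled the same way with glob patterns instead of exact filenames.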

When to Use This Skill

Use Absolute Work when:

Picking up a ticket or task that needs design before implementation
Multi-step feature development touching 3+ files or components
"Build this end-to-end", "plan and execute this", "break this into tasks"
Greenfield projects, major refactors, or migrations
Planning/breakdown work — turning a vague goal into a sequenced task list
Complex bug fixes spanning multiple systems
The user wants to be grilled on a design before building

Do NOT use Absolute Work when:

Single-file bug fixes or typo corrections where the answer is obvious
Quick questions, code explanations, or pure research
Tasks the user explicitly wants to drive manually

Key Principles

Phase gates always. Stop and get explicit approval between every phase. Control over speed.
Codebase before questions. Search the code first; only ask what code genuinely cannot answer.
Relentless until aligned. Interview one question at a time until BOTH you and the user are 100% confident. Doubt on either side means keep going.
Spec before code. No implementation until a written spec is reviewed and approved.
Dependency-first decomposition. Every task is a node in a DAG, not a flat list.
Safety-first execution. Blockers and dependents run sequentially; only provably-independent tasks parallelize. When in doubt, serialize. (See references/execution-model.md.)
Test-first verification. Every task writes tests before implementation. "Done" means tests pass.
Generator ≠ evaluator. The agent that builds a task does not grade it.
Persistent state. All progress lives in .absolute-work/board.md, surviving across sessions.
No silent scope creep. Everything outside the agreed scope goes to Deferred Work, visible on the board.
Never auto-commit. Suggest a commit; the user commits.

Phase 1: INTAKE & BRAINSTORM (Relentless Design Interview)

Turn fuzzy intent into a shared, bulletproof design. This is a structured interrogation of every assumption, dependency, and design branch — not a casual chat.

The interview directive — operate by this verbatim:

Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.

Ask the questions one at a time.

If a question can be answered by exploring the codebase, explore the codebase instead.

Step 1 — Deep context scan

Read what exists before asking anything: docs/ (README first), root README.md, CLAUDE.md, CONTRIBUTING.md, docs/plans/ (overlapping designs), recent commits (last 10-20), package manifests, top-level structure. Synthesize what matters — do not dump a file listing.

Step 2 — Codebase-first intelligence

Before asking ANY question, check if the codebase answers it. Facts live in code (database, test framework, auth); preferences require asking (visual style, real-time vs batch). When code answers it, state what you found: "I see you're using Prisma with PostgreSQL — I'll design around that." See references/intake-playbook.md.

Step 3 — Detect the work TYPE and adapt

Identify the type and swap in its tailored question bank (full banks in references/intake-playbook.md):

| Type | Focus |
|------|-------|
| Feature | user problem, flow, happy/error paths, scope boundary |
| Bug | repro steps, expected vs actual, blast radius, fix criteria |
| Refactor | pain point, target state, blast radius, test safety net, incremental vs all-at-once |
| Greenfield | problem/user fit, v1 scope, stack, data model, deploy target |
| Planning / breakdown | goal, milestones, sequencing, what ships first |
| Migration | what→what, coexistence, rollback, breaking changes, call-site inventory; load `references/migration-playbook.md` |

Step 4 — Scope assessment

If the request spans multiple independent subsystems, flag it and decompose into sub-projects first; brainstorm the first sub-project through the normal flow. See references/approach-analysis.md.

Step 5 — Relentless interview

One question at a time via AskUserQuestion. Never batch.
Strictly linear — resolve decision A before asking about dependent decision B.
Walk the design tree depth-first — purpose → data model → behavior → UI → edge cases. Every branch has an error/edge-case child; walk it.
Honest options — only propose multiple approaches at a genuine fork; always mark one (Recommended) with rationale tied to project context. When the answer is obvious, present it and briefly say why alternatives were dismissed.
Mutual 100% confidence — after each decision, confirm both sides are sure. Hesitation means probe deeper.

Step 6 — Confidence self-check

Before presenting the design, review every decision: am I 100% sure, or filling gaps with assumptions? Any sub-100% decision → return to the interview. State your confidence to the user.

Step 7 — Design presentation

Present section by section (architecture, components, data flow, error handling, testing), scaled to complexity. Get approval per section. Design for isolation: small units, one clear purpose each, well-defined interfaces. Follow existing patterns; don't fight the codebase.

━━ GATE: user approves the full design before Phase 2. ━━

Phase 2: SPEC (Spec-Driven Development)

Write the approved design to docs/plans/YYYY-MM-DD-<topic>-design.md (clear prose, file paths, code blocks for schemas/interfaces, a Decision Log). Scale sections to complexity.
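
The spec filename pattern can be sketched as a small helper. The slug rule used here (lowercase, non-alphanumeric runs collapsed to hyphens) is an assumption for illustration; the skill only fixes the `docs/plans/YYYY-MM-DD-<topic>-design.md` shape.

```python
import re
from datetime import date


def spec_path(topic: str, today: date) -> str:
    """Build docs/plans/YYYY-MM-DD-<topic>-design.md from a free-form topic."""
    # Assumed slug rule: lowercase, collapse anything non-alphanumeric to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    if not slug:
        raise ValueError("topic must contain at least one letter or digit")
    return f"docs/plans/{today.isoformat()}-{slug}-design.md"


print(spec_path("Auth", date(2026, 6, 3)))
```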

Then run a scored spec review with a separate reviewer subagent (generator-evaluator separation): graded on Completeness, Consistency, Clarity, Scope, Testability (1-5 each).

4.0+ → approved, proceed to user review
3.0-3.9 → fix flagged issues, re-dispatch (max 3 iterations)
< 3.0 → surface to the user immediately

See references/spec-writing.md for the template, scaling rules, and review rubric.
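
The review thresholds above can be read as a small routing function. This sketch assumes the overall score is the plain mean of the five criteria and that a spec still below 4.0 after the third iteration is surfaced to the user; neither detail is stated in this document, so treat both as placeholders for whatever `references/spec-writing.md` specifies.

```python
from statistics import mean

CRITERIA = ("Completeness", "Consistency", "Clarity", "Scope", "Testability")
MAX_ITERATIONS = 3


def spec_review_action(scores: dict[str, int], iteration: int) -> str:
    """Map rubric scores (1-5 each) to 'approve', 'revise', or 'escalate'."""
    for name in CRITERIA:
        if name not in scores or not 1 <= scores[name] <= 5:
            raise ValueError(f"missing or out-of-range score for {name}")
    overall = mean(scores[name] for name in CRITERIA)  # assumed aggregation
    if overall >= 4.0:
        return "approve"
    if overall >= 3.0 and iteration < MAX_ITERATIONS:
        return "revise"
    return "escalate"  # below 3.0, or out of iterations (assumed behavior)


example_scores = dict(zip(CRITERIA, (4, 4, 3, 3, 3)))  # mean 3.4
print(spec_review_action(example_scores, iteration=1))
```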

━━ GATE: user reviews and approves the spec before Phase 3. ━━

Phase 3: DECOMPOSE & PLAN (Build the Task Board)

Break the spec into atomic sub-tasks, build the dependency graph, and write the board.

Decompose

Rules: test tasks separate from code; infra/config before dependents; aim for 5-15 tasks; every graph ends with the three mandatory tail tasks (Self Code Review → Requirements Validation → Full Project Verification — see references/verification-framework.md). Apply the complexity budget: if scope exceeds ~15 M-equivalent tasks, suggest splitting into multiple sessions.

Build the DAG and assign safe waves

Compute each task's depth (max(dependency depth) + 1) and group by depth into waves. Then apply the safety pass: within a wave, only tasks that touch disjoint files and share no interfaces may run in parallel; everything else is serialized. Assign shared-file ownership to a single task. When in doubt, serialize. See references/execution-model.md and references/board-format.md.
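
A minimal sketch of the depth and safety-pass rules described above. It models only file overlap; shared interfaces and single-task ownership of shared files are left out, and the task IDs and file paths are hypothetical.

```python
def assign_waves(deps: dict[str, set[str]]) -> dict[str, int]:
    """Wave = max(wave of each dependency) + 1; tasks with no deps land in wave 1."""
    waves: dict[str, int] = {}
    visiting: set[str] = set()

    def depth(task: str) -> int:
        if task in waves:
            return waves[task]
        if task in visiting:
            raise ValueError(f"dependency cycle through {task}")
        visiting.add(task)
        waves[task] = 1 + max((depth(dep) for dep in deps[task]), default=0)
        visiting.discard(task)
        return waves[task]

    for task in deps:
        depth(task)
    return waves


def safety_pass(wave_tasks: list[str], files: dict[str, set[str]]) -> dict[str, str]:
    """Mark a task 'parallel' only if its files are disjoint from every peer's."""
    run: dict[str, str] = {}
    for task in wave_tasks:
        peer_files = set().union(*(files[o] for o in wave_tasks if o != task))
        run[task] = "parallel" if files[task].isdisjoint(peer_files) else "seq"
    return run


# Hypothetical graph: B and C both depend on A, D depends on B and C.
demo_deps = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
demo_waves = assign_waves(demo_deps)
print(demo_waves)
print(safety_pass(["B", "C"], {"B": {"src/util.py", "src/api.py"}, "C": {"src/util.py"}}))
```

Any overlap serializes both tasks involved, which matches the "when in doubt, serialize" default rather than trying to keep one of them parallel.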

Per-task plan

For each task: files to create/modify, test files (TDD — written first), approach, acceptance criteria, and concrete test cases (happy path, edge, error). Write everything to .absolute-work/board.md. Ask the user during intake whether the board is git-tracked or gitignored.

Present the ASCII dependency graph + wave/sequence plan.

━━ GATE: user approves the task graph before Phase 4. ━━

Phase 4: EXECUTE (Onion-Peel, One Safe Wave at a Time)

Pre-execution snapshot

Before touching any file: ensure the tree is clean (commit or stash), record the current commit hash on the board under ## Rollback Point. This is the safety net for catastrophic failure.
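
One way to take the snapshot, sketched in Python. It relies on two standard git commands (`git status --porcelain` prints nothing on a clean tree, `git rev-parse HEAD` prints the current commit hash); the helper names are hypothetical, and `snapshot()` is defined but not called here because it needs a real repository.

```python
import subprocess
from datetime import datetime, timezone


def git(*args: str) -> str:
    """Run a git command in the current directory and return trimmed stdout."""
    result = subprocess.run(["git", *args], check=True, capture_output=True, text=True)
    return result.stdout.strip()


def rollback_section(commit: str, recorded: str) -> str:
    """Render the board's Rollback Point section."""
    return f"## Rollback Point\n- Pre-execution commit: {commit}\n- Recorded: {recorded}\n"


def snapshot() -> str:
    """Refuse to proceed on a dirty tree, otherwise record HEAD."""
    if git("status", "--porcelain"):
        raise RuntimeError("working tree is not clean; commit or stash first")
    return rollback_section(git("rev-parse", "HEAD"), datetime.now(timezone.utc).isoformat())


print(rollback_section("a1b2c3d", "2026-06-03T10:00:00+00:00"))
```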

Wave loop

for each wave in [Wave 1 … Wave N]:
  partition tasks: sequential (blockers/dependents/shared-file) vs parallel-safe (disjoint, independent)
  run sequential tasks in dependency order, one at a time
  run parallel-safe tasks concurrently (separate agents)
  each task → TDD: write failing tests (red) → implement (green) → refactor → update board
  wave boundary: conflict check + compact progress report
  ━━ GATE: confirm before starting the next wave ━━

Each agent gets a self-contained prompt from the board (conventions + research + plan + acceptance criteria) and the rule: write tests first, stay in scope, report blockers — never work around them. Scope creep: blocking discoveries become new visible tasks; non-blocking ones go to ## Deferred Work. See references/execution-model.md for the agent template, conflict resolution, blocked-task handling, and failure recovery.

Phase 5: VERIFY (Signals + Independent Evaluator)

Every task proves it works before closing, using two layers:

Signals (binary gate) — run the project's test, lint, typecheck, and build scripts. Any failure goes straight back to the generator to fix (up to 2 retries). No skipping tests, no @ts-ignore/eslint-disable/type: ignore to pass checks.
Evaluator (scored gate) — if signals pass, a separate evaluator subagent grades the work against a scored rubric (Correctness, Code Quality, Completeness, Test Coverage, Integration Safety). 4.0+ passes; 3.0-3.9 iterates on specific feedback (max 5); < 3.0 escalates to the user.

S-size tasks may skip the evaluator if all signals pass cleanly; M-size, failed, or shared-interface tasks always get it. See references/verification-framework.md.
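
The two layers and the S-size exception can be sketched as one decision function. The action names it returns are invented for this example, and the retry limit for failed signals (up to 2) is not modeled; only the ordering of the gates and the score thresholds come from the text above.

```python
from typing import Optional

SIGNALS = ("test", "lint", "typecheck", "build")
MAX_EVALUATOR_ITERATIONS = 5


def next_action(
    signals: dict[str, bool],
    size: str,
    score: Optional[float] = None,
    iteration: int = 1,
    previously_failed: bool = False,
    shared_interface: bool = False,
) -> str:
    """Decide what happens to a task after a verification attempt."""
    # Layer 1: binary signals. A missing signal counts as a failure.
    if not all(signals.get(name, False) for name in SIGNALS):
        return "fix-signals"
    # S-size tasks may skip the evaluator unless they failed before
    # or touch a shared interface.
    if size == "S" and not previously_failed and not shared_interface:
        return "done"
    # Layer 2: scored evaluator.
    if score is None:
        return "run-evaluator"
    if score >= 4.0:
        return "done"
    if score >= 3.0 and iteration < MAX_EVALUATOR_ITERATIONS:
        return "iterate"
    return "escalate"


all_green = {name: True for name in SIGNALS}
print(next_action(all_green, "M", score=3.5, iteration=2))
```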

After the final wave, run the full suite and the three mandatory tail tasks.

━━ GATE: present verification results before Phase 6. ━━

Phase 6: CONVERGE (Close Out)

Full suite — run the complete test/lint/build one final time.
Documentation — update any docs that were in scope.
Summary — files changed (with line counts), tests added, key decisions, deferred work.
How to test it — end every session with concrete, copy-pasteable steps the user can run to exercise the added functionality themselves: the exact commands to start the app/script, the inputs or routes to hit (curl, UI clicks, CLI invocation), and the expected output for each. Cover the happy path plus at least one edge/error case. Ground every command in the detected conventions (real scripts, real ports, real file paths) — never invent commands the project does not have.
Close board — mark completed with a timestamp; the board is the audit trail.
Suggest commit — propose a message. Never run git commit yourself.

Gotchas

Chaining phases without a gate. The whole point is control — never advance past a phase boundary without the user's explicit "go".
Asking questions the codebase answers. Search configs, deps, and test files before every question; it erodes trust to ask what the code already states.
Parallel agents editing shared files. Two same-wave tasks editing one utility produce a wave-boundary conflict. Detect shared files in DECOMPOSE, assign ownership, and serialize the rest. Default to sequential when unsure.
Rollback point recorded mid-wave. Capture the commit hash before Wave 1 touches anything, or the snapshot already contains partial changes.
Board marked complete without running checks. The mandatory tail tasks are skipped most often. Never mark completed until the actual test/lint/build output is on the board.
Research skipped for "obvious" tasks. Agents duplicate existing utilities or miss conventions a 2-minute scan would catch. Research every task.
Silent scope creep. Adjacent improvements absorbed mid-task obscure what changed and blow the estimate. Everything outside scope goes to Deferred Work.
Auto-committing. This skill suggests commits; it never runs them.

Anti-Patterns

| Anti-Pattern | Better Approach |
|--------------|-----------------|
| Jumping to code before the spec is approved | Hard gate: no implementation until the spec passes review and the user approves |
| Batching multiple interview questions | One question at a time, depth-first, dependency-resolved |
| Flat task lists without dependencies | Model as a DAG; hidden dependencies cause ordering bugs and conflicts |
| Parallelizing everything for speed | Safety first: only disjoint, independent tasks parallelize; serialize when in doubt |
| Proposing fake alternatives when the answer is obvious | Present the single right answer with rationale; options only at genuine forks |
| Skipping TDD for "simple" changes | Tests are the proof of correctness; write them first, always |
| Self-grading completed work | Dispatch a separate, skeptical evaluator; generators over-praise their own work |
| Massive L-sized tasks | Decompose until every task is S or M |
| Starting fresh when a board exists | Detect, display status, resume from the last incomplete wave |
| Advancing with private doubts | Stop, reason, and either resolve the doubt or surface it as a question |

Output / Response Style

Respond terse like smart caveman. All technical substance stay. Only fluff die.

Persistence

ACTIVE EVERY RESPONSE once triggered. No revert after many turns. No filler drift. Still active if unsure. Off only when user says "stop caveman" or "normal mode".

Rules

Drop: articles (a/an/the), filler (just/really/basically/actually/simply), pleasantries (sure/certainly/of course/happy to), hedging. Fragments OK. Short synonyms (big not extensive, fix not "implement a solution for"). Abbreviate common terms (DB/auth/config/req/res/fn/impl). Strip conjunctions. Use arrows for causality (X -> Y). One word when one word enough.

Technical terms stay exact. Code blocks unchanged. Errors quoted exact.

Pattern: [thing] [action] [reason]. [next step].

Not: "Sure! I'd be happy to help you with that. The issue you're experiencing is likely caused by..." Yes: "Bug in auth middleware. Token expiry check use < not <=. Fix:"

Examples

"Why React component re-render?"

Inline obj prop -> new ref -> re-render. useMemo.

"Explain database connection pooling."

Pool = reuse DB conn. Skip handshake -> fast under load.

Auto-Clarity Exception

Drop caveman temporarily for: security warnings, irreversible action confirmations, multi-step sequences where fragment order risks misread, user asks to clarify or repeats question. Resume caveman after clear part done.

Example — destructive op:

Warning: This will permanently delete the users table, including all of its rows, and cannot be undone.
DROP TABLE users;
Caveman resume. Verify backup exist first.

References

Load a reference only when its phase needs it — they are long and consume context.

references/intake-playbook.md — adaptive question banks per work type, codebase-first intelligence, design-tree traversal, calibration, example sessions
references/migration-playbook.md — first-class migration handling: call-site inventory, codemods, incremental rollout, backwards-compat, rollback
references/spec-writing.md — spec template, section scaling, writing style, decision log, scored review protocol
references/board-format.md — full .absolute-work/board.md spec, statuses, sequence/wave model, example board
references/execution-model.md — DAG patterns, safe-wave (sequential-blocker / parallel-independent) algorithm, agent prompt template, conflict handling, scope-creep and failure recovery
references/verification-framework.md — TDD per task, verification signals, generator-evaluator protocol, scored rubric, mandatory tail tasks

References

board-format.md

Board Format Specification

The .absolute-work/board.md file is the single source of truth for an Absolute Work run. It is the only state — fully local, no external trackers — and is designed to be both human-readable and machine-parseable. It survives across sessions to enable resume, audit, and handoff.

File Location

{project-root}/.absolute-work/board.md

The .absolute-work/ directory may also contain:

board.md — the main board (always present)
archive/board-{timestamp}.md — completed or superseded boards

The user chooses during intake whether .absolute-work/ is git-tracked (audit trail, resume across machines) or gitignored (local working state).

Board Metadata (YAML frontmatter)

---
id: aw-{timestamp}
title: "{brief description of the overall task}"
type: feature | bug | refactor | greenfield | planning | migration
status: intake | spec | decomposing | planning | executing | verifying | converged | completed | abandoned
created: "{ISO 8601}"
updated: "{ISO 8601}"
git_tracked: true | false
evaluator_enabled: true | false
total_tasks: {N}
completed_tasks: {N}
failed_tasks: {N}
current_wave: {N}
total_waves: {N}
---

evaluator_enabled defaults to true for boards with any M-size tasks; set false only when all tasks are S-size.
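
Because the frontmatter is flat `key: value` pairs, a resume step could read it with a few lines of Python. This is a deliberately minimal reader written for this example, not a YAML parser; a real implementation would more likely use a YAML library. The sample board text and helper name are hypothetical.

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Read flat 'key: value' pairs between the opening and closing '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("board does not start with a frontmatter block")
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    raise ValueError("frontmatter block is not closed")


sample_board = """---
id: aw-1717400000
title: "Add user authentication to Next.js app"
status: executing
created: "2026-06-03T10:00:00Z"
total_tasks: 8
completed_tasks: 3
---

## Intake Summary
"""

meta = parse_frontmatter(sample_board)
print(f'{meta["status"]}: {meta["completed_tasks"]}/{meta["total_tasks"]} tasks done')
```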

Board Sections

1. Intake Summary

## Intake Summary
- **Task**: {one-line description}
- **Type**: feature | bug | refactor | greenfield | planning | migration
- **Complexity**: simple | medium | complex
- **Problem**: {what needs to be built/fixed}
- **Success Criteria**: {what "done" looks like}
- **Constraints**: {patterns, libraries, conventions to follow}
- **Dependencies**: {external APIs, services, other work}
- **Edge Cases**: {known edge cases} (if complex)
- **Spec**: docs/plans/{date}-{topic}-design.md
- **Board Persistence**: git-tracked | gitignored

2. Project Conventions

Written during Codebase Convention Detection — package manager, language/runtime, test runner, linter/formatter, build system, available scripts, directory conventions. Referenced by every later phase and by every execution agent's prompt.

3. Task Graph

## Task Graph

### Sub-tasks
| ID | Title | Type | Size | Dependencies | Wave | Run | Status |
|----|-------|------|------|-------------|------|-----|--------|
| AW-001 | {title} | config | S | - | 1 | seq | done |
| AW-002 | {title} | code | M | AW-001 | 2 | seq | in-progress |
| AW-003 | {title} | code | S | AW-001 | 2 | parallel | pending |

### Dependency Graph
{ASCII graph — see execution-model.md}

### Wave Assignments
- **Wave 1** (1 task): AW-001 [sequential — blocker, shared file]
- **Wave 2** (2 tasks): AW-002 [sequential], AW-003 [parallel-safe]

The Run column records the safety decision: seq (blocker/dependent/shared-file → run in order) or parallel (disjoint files, no shared interfaces → may run concurrently).
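
To show why the table is described as machine-parseable, here is a small sketch that reads the Sub-tasks rows and lists the tasks whose dependencies are all done. It assumes cells never contain a literal `|`, and it treats both `pending` and `planned` as "not yet started"; both are simplifications for this example, and the task titles are made up.

```python
def parse_task_rows(table: str) -> list[dict[str, str]]:
    """Turn a pipe table into one dict per task row, keyed by column header."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in table.splitlines()
        if line.strip().startswith("|")
    ]
    header, body = rows[0], rows[2:]  # rows[1] is the |----| separator
    return [dict(zip(header, row)) for row in body]


def ready_tasks(tasks: list[dict[str, str]]) -> list[str]:
    """IDs of unstarted tasks whose dependencies are all marked done."""
    done = {t["ID"] for t in tasks if t["Status"] == "done"}
    ready = []
    for t in tasks:
        deps = [] if t["Dependencies"] == "-" else [d.strip() for d in t["Dependencies"].split(",")]
        if t["Status"] in ("pending", "planned") and all(d in done for d in deps):
            ready.append(t["ID"])
    return ready


sample_table = """
| ID | Title | Type | Size | Dependencies | Wave | Run | Status |
|----|-------|------|------|-------------|------|-----|--------|
| AW-001 | Base config | config | S | - | 1 | seq | done |
| AW-002 | API handler | code | M | AW-001 | 2 | seq | in-progress |
| AW-003 | CLI flag | code | S | AW-001 | 2 | parallel | pending |
| AW-004 | Docs | docs | S | AW-002, AW-003 | 3 | seq | pending |
"""

tasks = parse_task_rows(sample_table)
print(ready_tasks(tasks))
```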

4. Tasks (per-task detail)

## Tasks

### AW-001: {title}
- **Type**: code | test | docs | infra | config
- **Size**: S | M
- **Dependencies**: none | [AW-XXX]
- **Wave**: {N}  **Run**: seq | parallel
- **Status**: {current status}

#### Research Notes
- Key files: {list}
- Reusable code: {functions/utilities to reuse}
- Patterns: {conventions observed}
- Risks: {risks identified}

#### Execution Plan
- Files to create/modify: {list}
- Test files: {list}
- Approach: {brief}
- Acceptance criteria:
  - [ ] {criterion 1}
- Test cases: {happy path, edge, error}

#### Verification
- Signals: PASS | FAIL
- Tests: {passed}/{total} ({new} new)
- Lint: clean | {issues}   Type Check: pass | {errors}   Build: pass | fail
- Evaluator Score: {N.N}/5.0 | skipped (S-size)
- Verdict: PASS | NEEDS WORK | FAIL    Iteration: {N}/5

5. Rollback Point

## Rollback Point
- Pre-execution commit: {hash}
- Recorded: {timestamp}

6. Execution Log

## Execution Log
### Wave 1 — {timestamp}
- Tasks: AW-001 (seq)
- Completed: {timestamp} — Result: all passed

7. Deferred Work

## Deferred Work
- {non-blocking discovery}, found during {task}, not in original scope

8. Convergence Summary

## Convergence Summary
### Files Changed
| File | Action | Lines |
|------|--------|-------|
| src/api/auth.ts | created | +120 |

### Tests Added
- Total new tests: {N}    Coverage: {% if available}

### Key Decisions
- {decision and why}

### Suggested Commit Message
{emoji} {type}: {subject}

{body}

Status Transitions

Board-level

intake → spec → decomposing → planning → executing → verifying → converged → completed
                                                                      └→ abandoned

Task-level

pending → researching → planned → in-progress → verifying → done
                                      │             └→ failed (retry)
                                      └→ blocked

| From | To | Trigger |
|------|----|---------|
| pending | planned | Research + plan written in DECOMPOSE & PLAN |
| planned | in-progress | EXECUTE starts for this task |
| in-progress | verifying | Implementation complete, running checks |
| in-progress | blocked | Dependency failed or external blocker |
| verifying | done | All signals pass and evaluator (if run) passes |
| verifying | failed | Verification failed after max retries |
| blocked | in-progress | Blocker resolved |
Resuming a Board

At the start of any invocation, if .absolute-work/board.md exists:

Read the board; parse frontmatter and current state.
Display a compact status summary.
Identify the current phase and incomplete tasks (anything not done/failed).
Resume from the current point — executing → next unfinished wave; verifying → re-run verification on unverified tasks; planning → finish remaining plans.
Add a "Resumed at {timestamp}" entry to the Execution Log.
Reconcile: if the codebase changed since updated, flag conflicts before continuing.

If the board is completed, ask whether to start fresh (archive to archive/board-{timestamp}.md) or review. Never overwrite a board without explicit confirmation.

Example Board (abbreviated)

---
id: aw-1717400000
title: "Add user authentication to Next.js app"
type: feature
status: executing
git_tracked: false
total_tasks: 8
completed_tasks: 3
current_wave: 2
total_waves: 4
---

## Intake Summary
- **Task**: Add email/password + Google OAuth authentication
- **Type**: feature  **Complexity**: complex
- **Constraints**: NextAuth.js v5, existing Prisma + PostgreSQL
- **Spec**: docs/plans/2026-06-03-auth-design.md
- **Board Persistence**: gitignored

## Task Graph
### Sub-tasks
| ID | Title | Type | Size | Dependencies | Wave | Run | Status |
|----|-------|------|------|-------------|------|-----|--------|
| AW-001 | NextAuth config + providers | config | S | - | 1 | seq | done |
| AW-002 | User + Account Prisma models | config | S | - | 1 | parallel | done |
| AW-003 | Auth API route handler | code | M | AW-001, AW-002 | 2 | seq | in-progress |
| AW-004 | Auth middleware | code | M | AW-001 | 2 | parallel | done |

### Wave Assignments
- **Wave 1** (2): AW-001 [seq — shared config], AW-002 [parallel-safe]
- **Wave 2** (2): AW-003 [seq], AW-004 [parallel-safe]

## Rollback Point
- Pre-execution commit: a1b2c3d

execution-model.md

Execution Model

Absolute Work executes a dependency-graphed task board one safe wave at a time — the onion-peel. The defining rule is safety over speed: blockers and dependents run sequentially; only provably-independent tasks run in parallel. When in doubt, serialize.

Identifying Dependencies

Task B depends on task A if:

B needs code/files that A creates
B imports or uses a function/type/interface A defines
B tests, extends, modifies, or documents A's output
B configures infrastructure A requires

B does NOT depend on A if they modify different files with no shared interfaces and can be tested in isolation.

Dependency checklist (per task pair)

Does B need any file A creates? → dependency
Does B import any symbol A defines? → dependency
Does B test code A writes? → dependency
Can B's tests pass without A complete? → if yes, no dependency

Common DAG Patterns

Linear chain:   AW-001 → AW-002 → AW-003          (no parallelism; each task is a wave)
Fan-out:        AW-001 → {AW-002, AW-003, AW-004}  (W1: 001; W2: the rest)
Fan-in:         {AW-001, AW-002, AW-003} → AW-004  (W1: the three; W2: 004)
Diamond:        001 → {002, 003} → 004             (setup → parallel features → integration)
Independent clusters: A:001→002  B:003→004  C:005  (disconnected sub-graphs)
Layered:        infra → data → logic → ui → tests  (one wave per layer)

The Safety-First Wave Algorithm

Two passes: depth grouping, then a safety partition.

Pass 1 — Depth grouping (topological)

for each task:
  depth = 0 if no dependencies
          else max(dependency.depth) + 1
waves = group tasks by depth     // Wave 1 = depth 0, Wave 2 = depth 1, ...

Waves execute in strict serial order — Wave N+1 never starts until Wave N is fully verified and the user confirms.
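Pass 1 can be sketched in runnable form as follows. This is an illustration under stated assumptions: the DAG is given as a dict from task ID to its dependency IDs, every dependency is itself a key, and `group_into_waves` is a hypothetical name. The cycle check is an addition that the pseudocode above does not mention.

```python
from collections import defaultdict


def group_into_waves(deps: dict[str, list[str]]) -> dict[int, list[str]]:
    """Map wave number (1-based) to task IDs; depth is the longest dependency chain."""
    depth: dict[str, int] = {}

    def resolve(task: str, visiting: frozenset = frozenset()) -> int:
        if task in depth:
            return depth[task]
        if task in visiting:
            raise ValueError(f"dependency cycle through {task}")
        parents = deps[task]
        depth[task] = 0 if not parents else 1 + max(
            resolve(p, visiting | {task}) for p in parents
        )
        return depth[task]

    waves = defaultdict(list)
    for task in deps:
        waves[resolve(task) + 1].append(task)   # Wave 1 = depth 0
    return dict(waves)
```

For the diamond pattern, `AW-001` lands in Wave 1, `AW-002` and `AW-003` in Wave 2, and `AW-004` in Wave 3.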

Pass 2 — Safety partition (within each wave)

Classify every task in the wave as seq (sequential) or parallel:

A task is parallel-safe only if ALL hold:

It touches a disjoint set of files from every other task in the wave
It shares no interfaces/types being defined by another task in the wave
It is not a blocker that a later task in the same wave reads from
Its outcome does not change another task's plan

Otherwise it is seq — run it alone, in dependency order. Default to seq when uncertain. Record the decision in the board's Run column with a one-line reason.
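A rough sketch of the safety partition, assuming each task's plan lists the files it touches and the interfaces it defines or uses. The `Task` fields and `classify` are hypothetical. The fourth condition (one task's outcome changing another's plan) cannot be computed from file lists, so it is modeled here as a manual `certain` flag that forces seq when unset.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    id: str
    files: set[str]                                   # files the task creates or modifies
    defines: set[str] = field(default_factory=set)    # interfaces/types it defines
    uses: set[str] = field(default_factory=set)       # interfaces/types it reads
    certain: bool = True                              # False when independence is in doubt


def classify(wave: list[Task]) -> dict[str, str]:
    """Label each task in the wave as 'parallel' or 'seq'."""
    result = {}
    for task in wave:
        others = [t for t in wave if t.id != task.id]
        safe = task.certain and all(
            task.files.isdisjoint(o.files)            # disjoint files
            and task.uses.isdisjoint(o.defines)       # reads nothing another task defines
            and task.defines.isdisjoint(o.uses)       # is not a blocker for another task
            for o in others
        )
        result[task.id] = "parallel" if safe else "seq"
    return result
```

Two tasks that touch the same file both come out as seq, which matches the default-to-seq rule; the shared-file ownership step then decides which one goes first.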

Shared-file ownership

If two same-wave tasks would touch the same file, assign that file to one owner task; the others treat it as read-only until the owner completes, or get moved to a later wave. This is the single most common source of wave-boundary conflicts — resolve it at decomposition time.

Execution within a wave

run all `seq` tasks first, one at a time, in dependency order
then run `parallel`-safe tasks concurrently (separate agents)
wait for the whole wave to finish → wave boundary checks → GATE → next wave

Running seq tasks first means the riskiest/shared work lands before independent work fans out, so parallel agents build on a settled base.
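The ordering within a wave could look like this in code. It is only a sketch: `run_task` stands in for whatever dispatches an agent (a hypothetical coroutine), and the wave boundary checks and gate are left to the caller.

```python
import asyncio


async def run_wave(seq_tasks, parallel_tasks, run_task):
    """Run seq tasks one at a time, then fan out the parallel-safe ones."""
    results = []
    for task in seq_tasks:                       # assumed to be in dependency order
        results.append(await run_task(task))
    # Parallel-safe tasks start only after every seq task has finished.
    results.extend(await asyncio.gather(*(run_task(t) for t in parallel_tasks)))
    return results
```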

Worktree isolation (optional)

On platforms that support it, run parallel tasks in isolated worktrees when there is any residual risk of file overlap; merge back at the wave boundary. Skip it when tasks touch clearly different directories — the merge overhead isn't worth it.

ASCII Graph Rendering

Task Graph:
  [W1] AW-001 [config: Init structure]           (seq — shared config)
         ├──> [W2] AW-002 [code: DB schema]        (seq)
         └──> [W2] AW-003 [code: API router]       (parallel-safe)

Wave Summary:
  Wave 1 (1): AW-001  [seq]
  Wave 2 (2): AW-002 [seq], AW-003 [parallel]

Pre-Execution Snapshot

Before any file is touched in Wave 1:

Ensure the working tree is clean (commit or stash existing changes).
Record the current commit hash on the board under ## Rollback Point.
If execution goes catastrophically wrong, the user can git reset --hard to this commit.

Record the hash before Wave 1 begins — a mid-wave snapshot already contains partial changes.
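One way to script the snapshot, assuming git is on the PATH and the current directory is inside the repository. `git status --porcelain` and `git rev-parse HEAD` are standard git commands; the `git` and `snapshot` wrappers are hypothetical helpers for illustration.

```python
import subprocess
from typing import Callable


def git(*args: str) -> str:
    """Run a git command in the current directory and return its stdout."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout


def snapshot(run: Callable[..., str] = git) -> str:
    """Return the commit hash to record under `## Rollback Point`."""
    if run("status", "--porcelain").strip():
        raise RuntimeError("working tree is dirty: commit or stash first")
    return run("rev-parse", "HEAD").strip()
```

Passing the runner as a parameter keeps the check testable without a real repository.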

Agent Prompt Template

Each execution agent receives a self-contained prompt from the board:

## Task: {AW-XXX} - {Title}

### Context
{description from the board}

### Project Conventions
{detected conventions — package manager, test runner, linter, directory patterns}

### Research Notes
{key files, reusable code, patterns, risks for this task}

### Execution Plan
- Files to create/modify: {list}
- Test files: {list}
- Approach: {from PLAN}

### Acceptance Criteria
{specific, verifiable conditions}

### Rules
1. Write tests FIRST, watch them fail (red), then implement (green), then refactor.
2. Run lint and type-check on modified files via the project's scripts.
3. Do NOT modify files outside your task scope.
4. Reuse existing utilities named in the research notes — do not reinvent them.
5. If blocked, STOP and report the blocker — never work around it.
6. Report: files changed, tests written, tests passing, any issues.

Template by task type

code → full TDD template above.
test → skip "write tests first" (the task is the tests); write happy/edge/error cases and run against the implementation.
docs → no TDD; follow existing doc style; verify code examples are syntactically valid.
config/infra → verify by running the relevant tool/build; check idempotency.

Wave Boundary Checks

After every task in a wave completes, before the gate:

Conflict check — did any two agents modify the same file? Merge intelligently (prefer the change that better satisfies its acceptance criteria); if unmergeable, present both versions to the user.
Interface compatibility — types defined by one task match those expected by another.
Import resolution — all cross-task imports resolve.
Combined build + tests — run the build and the wave's tests together.
Progress report — print a compact status table:

Wave 2 complete (2/4 waves)
| Task   | Run      | Status | Notes |
|--------|----------|--------|-------|
| AW-003 | seq      | done   |       |
| AW-004 | parallel | done   |       |

Then GATE — confirm before starting the next wave.
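The conflict check at the wave boundary reduces to inverting each agent's "files changed" report. A small sketch, assuming those reports are collected as a dict of task ID to file paths; `find_conflicts` is a hypothetical name and the paths in the usage are illustrative.

```python
from collections import defaultdict


def find_conflicts(changes: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each file touched by more than one task to the sorted task IDs."""
    touched = defaultdict(list)
    for task_id, files in changes.items():
        for path in files:
            touched[path].append(task_id)
    return {p: sorted(ids) for p, ids in touched.items() if len(ids) > 1}
```

An empty result means the wave can go straight to the interface, import, and build checks.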

Scope Creep Guard

Blocking discovery (can't finish the current task without it): add a new visible task to the DAG, place it in the current or next wave, flag it on the board, continue with other tasks.
Non-blocking discovery (nice-to-have, adjacent cleanup): do NOT absorb it. Add it to ## Deferred Work, mention it at CONVERGE. The user decides whether to start a new session.
Never silently expand scope — every DAG addition is visible on the board and called out in the next progress report.

Blocked Tasks

When a task is blocked:
  1. Mark status `blocked` with a reason and the blocking task ID.
  2. Continue executing non-blocked tasks in the wave.
  3. After the wave, reassess:
     - blocker resolved → add to the next wave
     - blocker persists → flag for user attention
     - approachable differently → revise plan and retry

If a Wave N task fails and Wave N+1 tasks depend on it, mark the dependents blocked (not failed); run the non-dependent Wave N+1 tasks normally; unblock dependents if the failure is later fixed.
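Marking dependents as blocked can be sketched as a walk over the dependency map. The text above only covers direct dependents in the next wave; propagating to transitive dependents is an assumption of this sketch, and `mark_blocked` is a hypothetical name.

```python
def mark_blocked(deps: dict[str, list[str]], failed: str) -> set[str]:
    """Return every task that depends, directly or transitively, on `failed`."""
    blocked: set[str] = set()
    changed = True
    while changed:                               # repeat until no new task is added
        changed = False
        for task, parents in deps.items():
            if task not in blocked and any(
                p == failed or p in blocked for p in parents
            ):
                blocked.add(task)
                changed = True
    return blocked
```

Tasks outside the returned set are the non-dependent ones that keep running normally.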

Failure Recovery

Failure	Action	Max Retries
Test failure (code bug)	Fix code, re-run tests	2
Lint/type error	Fix, re-run check	2
Build failure	Find root cause, fix	1
Agent crash/timeout	Restart with same prompt	1
Merge conflict	Resolve, re-verify	1
Fundamental approach failure	Revise plan, flag for user	0 (needs user input)

On retry, append the error to the agent prompt: "Previous attempt failed because: {error}. Fix and retry." When retries are exhausted, mark the task failed, record all attempt logs, flag the user with the rollback hash, and continue with non-dependent tasks. Never bypass tests or checks to force a pass.
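The retry budget and the prompt suffix can be expressed as a small loop. This is a sketch under assumptions: `attempt` is a hypothetical callable that returns `None` on success or a `(kind, error)` pair on failure, the kind keys are made-up labels for the table rows, and retries are counted per failure kind.

```python
MAX_RETRIES = {
    "test_failure": 2,
    "lint_or_type_error": 2,
    "build_failure": 1,
    "agent_crash": 1,
    "merge_conflict": 1,
    "approach_failure": 0,   # needs user input, never retried automatically
}


def run_with_retries(prompt, attempt):
    """Return (status, attempt_logs) where status is 'done' or 'failed'."""
    used = {}
    logs = []
    while True:
        failure = attempt(prompt)
        if failure is None:
            return "done", logs
        kind, error = failure
        logs.append(error)
        if used.get(kind, 0) >= MAX_RETRIES[kind]:
            return "failed", logs                # retries exhausted: flag the user
        used[kind] = used.get(kind, 0) + 1
        prompt += f"\n\nPrevious attempt failed because: {error}. Fix and retry."
```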

Performance Guidelines

Wave size: 1-3 parallel tasks keeps overhead low; 4-6 gives good throughput; at 7 or more, split the wave into sub-waves to limit the failure blast radius.
Each parallel agent consumes context and compute; in constrained environments, cap concurrency at 3-4.
Skip parallelism when a wave has one task, when all tasks touch the same file, or when any dependency isn't fully captured in the DAG (a sign the decomposition needs another pass).
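Splitting an oversized wave is plain chunking. A minimal sketch, assuming the parallel-safe tasks of a wave are already in a list; `split_into_sub_waves` and the default cap of 4 (taken from the 3-4 concurrency guidance) are illustrative choices.

```python
def split_into_sub_waves(tasks: list[str], cap: int = 4) -> list[list[str]]:
    """Chunk a wave's parallel tasks so no sub-wave exceeds `cap` agents."""
    if cap < 1:
        raise ValueError("cap must be at least 1")
    return [tasks[i:i + cap] for i in range(0, len(tasks), cap)]
```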

intake-playbook.md

Intake Playbook

The design interview is the engine of Absolute Work. A relentless, structured interview extracts every requirement, constraint, and edge case before a single line of code is written. This playbook covers design-tree traversal, adaptive question banks per work type, codebase-first intelligence, question calibration, implicit-requirement extraction, and anti-patterns.

Design Tree Traversal

Every unit of work is a tree of decisions. Walk it depth-first, resolving each branch completely before moving to siblings. This prevents half-explored requirements from haunting the implementation later.

Rules

Root first — start with purpose. Why does this exist? What problem does it solve?
Depth before breadth — explore the first child fully before siblings.
Resolve before advancing — a node is resolved when you have a clear answer, a concrete decision, or an explicit deferral. Never leave a node ambiguous.
Backtrack on dead ends — if a branch leads to "we don't need this," mark it explicitly out of scope and backtrack.
Dependency edges — if a node depends on another branch, resolve the blocker first.

Example tree: "Add a commenting system"

commenting-system
├── purpose (who comments? what is commentable?)
├── data-model (schema, threading flat vs nested, storage)
├── permissions (create/edit/delete, moderation)
├── ui (input, list, threading display, empty state)
├── real-time (needed? transport? optimistic updates?)
├── notifications (notify on reply? channel?)
└── edge-cases (deleted parent, deleted post, concurrent edits, spam)

By the time you reach notifications, the threading decision is already resolved, so you know whether replies even exist. Upstream decisions constrain downstream ones — always interview in tree order: purpose → data model → behavior → UI → edge cases.

Adaptive Question Banks by Work Type

Detect the work type, then use its bank. Scale depth to complexity (see Scaling Rules below).

Feature

#	Question	Purpose
1	What is the feature and what user problem does it solve?	Root purpose
2	Who is the target user? Different roles?	Scope actors
3	Walk me through the user flow start to finish.	Map the journey
4	What existing features does this interact with?	Dependency map
5	What does the happy path look like?	Core behavior
6	What happens when things go wrong (network, invalid input, missing data)?	Error handling
7	Any performance requirements (response time, data volume)?	Non-functional
8	Behind a flag or always-on?	Rollout
9	What is explicitly out of scope for this version?	Scope boundary
10	How will we know it works in production?	Observability

Bug

#	Question	Purpose
1	Expected vs actual behavior?	Problem statement
2	How do we reproduce it (steps, environment)?	Reproduction
3	What is the impact and who is affected?	Priority
4	When did it start? Any recent changes?	Root-cause hints
5	Related bugs or known issues?	Context
6	When is this bug considered fixed?	Success criteria

Refactor

#	Question	Purpose
1	What is the specific pain point?	Root cause
2	What does the ideal end state look like?	Target architecture
3	Blast radius — how many files, modules, consumers?	Risk
4	Is there test coverage for the code being changed?	Safety net
5	Incremental or all-or-nothing?	Strategy
6	Downstream consumers or public APIs affected?	Breaking changes
7	Rollback plan if regressions appear?	Safety

Greenfield

#	Question	Purpose
1	What problem does this solve, and for whom?	Problem/user fit
2	The 3-5 core features for v1 — no more?	Scope discipline
3	Tech stack and hard constraints?	Foundation
4	Reference implementations or designs to study?	Prior art
5	High-level data model / core entities?	Data layer
6	Auth and user roles?	Auth model
7	Third-party services or APIs?	External deps
8	Deployment target?	Infrastructure
9	Testing strategy (unit, integration, e2e)?	Quality gates
10	If you could ship only one feature, which?	Prioritization

Planning / Breakdown

Use when the user wants a vague goal turned into a sequenced plan rather than an immediate build.

#	Question	Purpose
1	What is the end goal, stated as an outcome?	North star
2	What are the milestones between here and there?	Sequencing
3	What must ship first to unblock the rest?	Critical path
4	What is already done or in flight?	Current state
5	What are the hard deadlines or constraints?	Boundaries
6	What can be deferred to a later phase?	Scope control

Migration

#	Question	Purpose
1	What is being migrated, and to what? (v2→v3, JS→TS, lib A→B)	Problem statement
2	Full migration or incremental?	Strategy
3	Must old and new coexist during migration?	Constraints
4	Rollback plan if something breaks?	Safety
5	Known breaking changes?	Risk
6	Test coverage of the code being migrated?	Safety net
7	Priority order of modules to migrate?	Sequencing

For migrations, also load migration-playbook.md — it covers call-site inventory, codemods, incremental rollout, and backwards-compatibility in depth.

Codebase-First Intelligence

Before asking the user a question, check whether the codebase already has the answer. Every question the codebase could have answered is a wasted round-trip that erodes trust.

Before Asking About	Search For	Where
Database / ORM	prisma, typeorm, mongoose, drizzle deps + config	`package.json`, `prisma/schema.prisma`, `*.config.*`
Authentication	auth middleware, JWT, session, next-auth, clerk	`middleware/auth`, `lib/auth`, `package.json`
Testing framework	test config + existing test files	`jest.config`, `vitest.config`, `package.json` scripts
State management	stores, context, redux/zustand/jotai	`/store/`, `/context/`, `package.json`
Styling	tailwind/postcss config, styled-components	`tailwind.config`, `.module.css`
Deployment	CI/CD, Dockerfiles, deploy config	`.github/workflows/*`, `Dockerfile`, `vercel.json`
Lint / format	eslint, prettier, biome	`.eslintrc`, `.prettierrc`, `biome.json`

Protocol

For every question you are about to ask:

Can I find it in the codebase? Search; if found, state it and skip the question.
Can I infer it? If the project uses Prisma + Postgres, don't ask "what database?" — state it and ask the deeper question.
Is this a fact or a preference? Facts live in code (test framework). Preferences require asking (desired coverage level, visual style, real-time vs batch).

Question Calibration

Multiple choice vs open-ended

Multiple choice when there are 2-4 known options, the user may not know terminology, or speed matters. Always include a (Recommended) option with rationale.
Open-ended when the answer space is unbounded or you need the user's mental model.

When a question is too broad

If the user would need more than 3 sentences to answer well, split it.

Too broad	Better
"How should the notification system work?"	"In-app, email, or both?" then "Badge, dropdown, or full page?"
"What are the security requirements?"	"Who can access this resource?" then "Do we need rate limiting?"

Rules

One decision per question. If your question contains "and," consider splitting.
No compound conditionals. Resolve X first, then ask the follow-up.
Ground in the codebase. "I see you use Express with middleware routing — should new auth endpoints follow the same pattern?"
Offer a recommendation when you can, tied to project context, not popularity.
Timebox complexity. If a question opens a 20-minute rabbit hole, flag it and offer to defer with a placeholder.

Extracting Implicit Requirements

Users say what they want; they rarely say what they need. Surface hidden requirements as follow-up questions — do not assume.

User Says	Hidden Requirements
"Add notifications"	Channel (in-app/email/push), read state, preferences, digest mode, notification center
"Make it real-time"	Transport, reconnection, optimistic updates, conflict resolution, offline handling
"Add user roles"	Permission model, assignment UI, hierarchy, admin override, audit logging
"Support file uploads"	Max size, formats, virus scanning, storage backend, progress, resume, thumbnails
"Add search"	Full-text vs exact, indexing, debounce, highlighting, facets, empty state, pagination
"Make it work offline"	Sync strategy, conflict resolution, storage limits, cache invalidation, sync status
"Deploy to production"	CI/CD, env config, monitoring, rollback plan

Extraction protocol

Acknowledge the stated requirement.
Surface the 2-3 most architecture-affecting hidden requirements as questions.
Do not dump all hidden requirements at once — prioritize, then circle back to polish.

Scaling Rules

Tier	When	Questions
Simple	1-2 files, clear scope, no external deps	3 — always: problem, success criteria, constraints
Medium	3-5 files / 2+ components, some ambiguity	5 — add: existing code context, dependencies
Complex	6+ files, cross-cutting, greenfield, migration	8-10 — add: edge cases, testing, docs, rollout, priority

Heuristic: count files touched, presence of external deps, scope definition, and whether data migration/backwards-compat is involved. When in doubt, ask one more question, not one fewer.

Anti-Patterns

Asking what the codebase can answer — search configs and deps first.
Batching unrelated questions — the user answers the easy one and skips the hard one. One at a time.
Implementation before purpose — resolve what and why before how. Transport choices without context are coin flips.
Accepting vague answers — "handle errors gracefully" means something different to everyone. Ask for a concrete example.
Skipping error/edge branches — edge cases are where bugs live. Ask "what happens at 0 items? 10,000? on failure?"
Leading questions — "we should use Redis here, right?" confirms your bias. Ask the open question.
Skipping the out-of-scope conversation — without explicit scoping, the feature grows silently.
Interviewing out of order — designing UI before the data model risks designing something the data can't support.

migration-playbook.md

Migration Playbook

Migrations are the highest-risk work type Absolute Work handles: the code already runs in production, real users depend on it, and a botched migration breaks things that worked yesterday. This playbook makes migrations a first-class flow — safe, incremental, and reversible.

The core principle: never big-bang a migration you can do incrementally. Keep old and new coexisting behind a seam, move call sites in small verified batches, and keep a rollback at every step.

Migration Types

Type	Example	Primary Risk
Language/syntax	JS → TS, CommonJS → ESM	Type errors, build config, partial coverage
Library swap	moment → date-fns, Enzyme → Testing Library	API mismatch, behavioral differences
Framework version	React 17 → 18, Next 13 → 14	Breaking changes, deprecated APIs
API/contract	REST v2 → v3, schema change	Consumer breakage, data shape drift
Data/schema	column rename, table split, DB engine	Data loss, downtime, dual-write complexity
Infrastructure	provider A → B, monolith → services	Config drift, cutover coordination

Identify the type during intake — it determines which sections below apply.

Phase A: Call-Site Inventory (before any code)

You cannot safely migrate what you have not counted. Build a complete inventory of everything that touches the thing being migrated.

Steps

Find the surface. Grep for the symbol, import, endpoint, or pattern being migrated. Capture every hit with file:line.
Classify each call site by shape — most migrations have 3-5 distinct usage patterns plus a long tail of one-offs.
Count and bucket. Record totals per pattern on the board. This sizes the migration and reveals whether a codemod is worth writing.
Find the blind spots. Dynamic usage (string-built imports, reflection, config-driven dispatch) won't show up in a grep. List where these could hide.
Map consumers. For API/contract/library migrations, identify external consumers (other services, published packages, clients) that a grep of this repo will miss.

Inventory table (write to the board)

## Migration Inventory
| Pattern | Example call site | Count | Codemod-able? | Notes |
|---------|-------------------|-------|---------------|-------|
| direct import | src/a.ts:12 | 47 | yes | mechanical rename |
| wrapped helper | src/lib/fmt.ts:8 | 1 | n/a | central shim point |
| dynamic dispatch | src/router.ts:30 | 3 | no | manual review |
Total call sites: 51 across 23 files
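The inventory steps can be partly automated. The sketch below scans a source tree and buckets hits by pattern; the two regexes are hypothetical examples for a `moment` library swap, and, as noted under blind spots, string-built or reflective usage will not be caught this way.

```python
import re
from collections import Counter
from pathlib import Path

# Hypothetical patterns for a moment -> date-fns library swap.
PATTERNS = {
    "direct import": re.compile(r"""from ['"]moment['"]"""),
    "dynamic dispatch": re.compile(r"require\(\s*[a-zA-Z_]"),
}


def inventory(root: Path, glob: str = "*.ts"):
    """Return (hits, counts); each hit is a (pattern, 'file:line') pair."""
    hits = []
    for path in sorted(root.rglob(glob)):
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            for name, regex in PATTERNS.items():
                if regex.search(line):
                    location = f"{path.relative_to(root).as_posix()}:{number}"
                    hits.append((name, location))
    return hits, Counter(name for name, _ in hits)
```

The counts feed the Count column of the inventory table, and the first hit per pattern can serve as the example call site.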

Phase B: Choose the Strategy

Strategy	Use When	Trade-off
Incremental (strangler)	Large surface, production code, old+new can coexist	Safest; slower; needs a coexistence seam
Codemod-driven	Many mechanical, uniform call sites	Fast for the uniform 80%; the tail is still manual
Parallel-run / shadow	Behavior must be proven identical (data, money)	Highest confidence; most setup
Big-bang	Tiny surface (< ~10 sites) or hard cutover required	Fast; only safe when blast radius is small and fully tested

Default to incremental unless the surface is genuinely tiny. State the chosen strategy and its rationale on the board and in the spec.

Phase C: Establish the Coexistence Seam

For incremental migrations, old and new must run side by side. Create a seam so call sites can be moved one batch at a time without a flag day.

Adapter/shim — a thin wrapper exposing the new implementation behind the old signature (or vice versa), so call sites switch without changing shape.
Feature flag — gate new behavior so it can be toggled per environment or per user, and reverted instantly.
Dual-write (data migrations) — write to both old and new stores during transition; read from old until new is verified, then flip reads, then stop writing old.
Version negotiation (APIs) — serve both v-old and v-new; deprecate v-old only after consumers move.

The seam is itself a task on the board, and usually a blocker that must complete sequentially before any call-site batch runs.
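To make the dual-write seam concrete, here is a minimal sketch that uses plain dicts as stand-ins for the old and new stores. `DualWriteStore` and its method names are hypothetical; a real seam would wrap actual database clients and handle partial-write failures.

```python
class DualWriteStore:
    """Writes go to both stores; reads come from whichever side is active."""

    def __init__(self, old: dict, new: dict, read_from_new: bool = False):
        self.old, self.new = old, new
        self.read_from_new = read_from_new       # flip once `new` is verified

    def put(self, key, value):
        self.old[key] = value
        self.new[key] = value

    def get(self, key):
        source = self.new if self.read_from_new else self.old
        return source[key]

    def verified(self) -> bool:
        """True when every record in old is present and identical in new."""
        return all(k in self.new and self.new[k] == v for k, v in self.old.items())
```

Reads flip to the new store only after `verified()` holds, and writes to the old store stop in a later, separate step.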

Phase D: Incremental Rollout

Move the surface in small, independently verifiable batches — this is the onion-peel applied to migrations.

for each batch of call sites (grouped by module or pattern):
  1. migrate the batch (codemod for mechanical, manual for the tail)
  2. run tests for the affected modules → must stay green
  3. typecheck / build → must pass
  4. commit-worthy checkpoint (suggest commit; user commits)
  5. update the inventory: migrated N / total

Batch sizing: keep each batch small enough that if it breaks, the cause is obvious. Group by module boundary or by usage pattern. Migrate the central shim/helper first (one change unblocks many), then the mechanical bulk, and the manual tail last.
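The batch loop above might be driven by something like the following sketch. `migrate` and `checks_pass` are hypothetical callbacks standing in for the codemod or manual edit and for the test, typecheck, and build steps; committing stays with the user.

```python
def rollout(batches, migrate, checks_pass):
    """Migrate batches in order; stop at the first batch whose checks fail."""
    total = sum(len(sites) for _, sites in batches)
    migrated = 0
    for name, sites in batches:
        migrate(sites)
        if not checks_pass(name):
            return f"stopped at {name}: migrated {migrated} / {total}"
        migrated += len(sites)
        print(f"checkpoint after {name}: migrated {migrated} / {total}")
    return f"complete: migrated {migrated} / {total}"
```

Stopping at the first failing batch keeps the cause local to that batch, which is the point of small batches.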

Codemod guidance

Write the codemod against the patterns found in the inventory; dry-run it and diff before applying.
Codemods handle the uniform majority; they will not handle dynamic dispatch, comments, or unusual formatting. Always hand-review the tail.
Re-run the inventory grep after the codemod to confirm the count dropped to the expected remainder.

Phase E: Backwards Compatibility

Concern	Handling
External consumers	Keep the old surface working (deprecated) until consumers migrate; announce a removal timeline
Persisted data	New code must read old-format data; migrate data lazily on read or via a background job
Serialized contracts	Version payloads; tolerate missing/extra fields during transition
Public API	Additive changes only during transition; breaking removals happen in a later, separate release

Backwards-compat shims are temporary debt: add a task to the Deferred Work section to remove them once the migration completes and consumers have moved.

Phase F: Rollback Plan

Every migration records how to undo it, at every checkpoint — not just at the end.

Snapshot — the pre-migration commit hash on the board (## Rollback Point).
Per-batch reversibility — each batch is a clean checkpoint the user can revert to.
Flag kill-switch — if behind a feature flag, the rollback is flipping the flag, no redeploy.
Data rollback — for dual-write, document how to stop writing new and resume reading old; for destructive schema changes, ensure a backup exists before the change.
Cutover criteria — define what must be true before the old path is removed (all call sites moved, consumers migrated, monitoring clean for N days).

Never remove the old path in the same step that introduces the new one. Removal is its own task, gated on the cutover criteria being met.

Migration Anti-Patterns

Anti-Pattern	Better Approach
Migrating before inventorying call sites	Grep and count everything first — you can't migrate what you haven't found
Big-bang on a large surface	Incremental batches behind a coexistence seam
Removing the old path alongside adding the new	Coexist first; remove only after cutover criteria are met
Codemod with no dry-run/diff review	Dry-run, diff, then apply; always hand-review the tail
Ignoring dynamic/reflective usage	List blind spots explicitly; grep won't catch string-built dispatch
No rollback until the very end	Every batch is a reversible checkpoint; snapshot before Wave 1
Forgetting external consumers	Keep old contract alive and deprecated until consumers move
Leaving compat shims forever	Track shim removal in Deferred Work, gated on cutover

spec-writing.md

Spec Writing

Reference for producing the design spec during Phase 2. Covers the document template, section scaling rules, writing style, the decision log, and the scored review protocol.

Spec Document Template

Write to docs/plans/YYYY-MM-DD-<topic>-design.md where <topic> is a short kebab-case slug (e.g. 2026-06-03-commenting-system-design.md).

# [Topic] Design Spec

## Summary
<!-- 2-3 sentences. What is being built and why. -->

## Context
<!-- What exists today. Why this change is needed. Link relevant code paths. -->

## Design

### Architecture
<!-- How the pieces fit together. ASCII diagram or description. -->

### Components
<!-- Each new/modified component with its responsibility and file path. -->

### Data Model
<!-- Schemas, tables, types. Code blocks for definitions. -->

### Interfaces / API Surface
<!-- Endpoints, function signatures, event contracts. Code blocks. -->

### Data Flow
<!-- Step-by-step for the key operations. -->

## Error Handling
<!-- Failure modes, retry strategy, user-facing error states. -->

## Testing Strategy
<!-- What to test, how, and at what level (unit/integration/e2e). -->

## Migration Path
<!-- Steps from current to new state. Remove if not applicable. -->

## Open Questions
<!-- Unresolved items. Remove if none remain. -->

## Decision Log
<!-- Key decisions from the interview. See format below. -->
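The file-naming convention given before the template can be generated rather than typed by hand. A small sketch, with `spec_path` as a hypothetical helper and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) as an assumption about what "kebab-case" should accept:

```python
import re
from datetime import date


def spec_path(topic: str, day: date) -> str:
    """Build docs/plans/YYYY-MM-DD-<topic>-design.md from a free-form topic."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return f"docs/plans/{day.isoformat()}-{slug}-design.md"
```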

Section Scaling Rules

Scale depth to complexity. Remove sections that would only say "N/A".

Simple (config change, utility, small fix) — ~1 page

Summary (2-3 sentences), Context (1-2 sentences), Components (bullets of what changes), Data Model / Interfaces only if changed, Testing Strategy (which tests). Skip Architecture, Data Flow, Migration Path, Open Questions, Decision Log.

Medium (new component, endpoint, moderate feature) — 2-3 pages

All core sections at moderate depth: Architecture (brief, no diagram needed), Components (table with name/responsibility/path), full Data Model in a code block, full Interfaces with request/response shapes, Data Flow (numbered steps), Error Handling (table), Testing Strategy (specific cases), Decision Log.

Complex (new system, migration, cross-cutting) — 4-6 pages

Every section at full depth: Architecture with a diagram and component relationships, Components table with dependencies, full schemas with relationships and indexes, all endpoints/functions with full types, primary + secondary data flows, comprehensive error handling with retry logic, test matrix by type, phased Migration Path with rollback, Open Questions with owners, full Decision Log.

Complexity heuristic

Signal	Simple	Medium	Complex
Files touched	1-2	3-7	8+
New components	0	1-2	3+
External deps	0	0-1	2+
Data model changes	none/trivial	new table/type	schema migration
Cross-cutting	no	maybe	yes

Writing Style

Be concrete, not abstract

Bad	Good
"An endpoint for comments"	`POST /api/posts/:postId/comments`
"A component that shows comments"	`src/components/CommentThread.tsx`
"Some database table"	`comments` table: `id`, `post_id`, `author_id`, `body`, `created_at`
"We'll handle errors"	Return `422 { error: "body_required" }` when the body is empty

Include file paths relative to repo root

The auth middleware at src/middleware/auth.ts validates the JWT before the request reaches src/api/comments/create.ts.

Use tables for comparisons and code blocks for interfaces/schemas

interface CreateCommentRequest {
  postId: string;
  body: string;
  parentId?: string; // for threaded replies
}

YAGNI

Remove anything not directly needed: don't spec future phases unless they constrain the current design, don't add "nice to have" sections, don't include sections that only say "N/A", fold one-sentence sections into a neighbor.

Decision Log Format

Record every decision where more than one reasonable option existed. The Rationale column is the most important — it prevents future re-litigation.

Decision	Options Considered	Chosen	Rationale
Database for comments	PostgreSQL, MongoDB, SQLite	PostgreSQL	Already in stack, ACID for threading, full-text search
Comment nesting depth	unlimited, flat, 2-level	2-level	Simple UI, covers 90% of cases, avoids recursive queries
Auth for commenting	anonymous, logged-in, mixed	logged-in only	Reduces spam, simplifies moderation, matches existing auth

Include both decisions the user made explicitly and ones you recommended. Keep each cell to 1-2 sentences.

Scored Spec Review Protocol

After writing the spec, dispatch a separate reviewer subagent (generator-evaluator separation — the agent that wrote the spec does not review it).

Rubric

Criterion	Weight	1 (Fail)	3 (Acceptable)	5 (Excellent)
Completeness	25%	TODOs, missing sections	Required sections present but thin	Every section substantive for its tier
Consistency	20%	Names/types contradict	Mostly consistent, minor mismatches	All names, types, paths match perfectly
Clarity	20%	Ambiguous, needs author to interpret	Clear to someone with project context	An unfamiliar dev can build from it
Scope	15%	Creep or missing agreed features	Covers discussed topics	Exactly what was discussed
Testability	20%	Vague "test the happy path"	Test cases listed but generic	Specific cases with inputs/outputs

Thresholds

Weighted Score	Verdict	Action
4.0 - 5.0	Approved	Proceed to user review
3.0 to < 4.0	Needs Work	Fix flagged issues, re-dispatch (max 3 iterations)
< 3.0	Major Gaps	Surface to the user immediately, do not iterate
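The weighted score and verdict follow directly from the rubric weights and thresholds. A minimal sketch, assuming each criterion is scored as an integer from 1 to 5 and that any score of at least 3.0 but below 4.0 counts as Needs Work; `review` is a hypothetical name.

```python
WEIGHTS = {
    "completeness": 0.25,
    "consistency": 0.20,
    "clarity": 0.20,
    "scope": 0.15,
    "testability": 0.20,
}


def review(scores: dict[str, int]) -> tuple[float, str]:
    """Return (weighted score rounded to 2 places, verdict)."""
    total = round(sum(WEIGHTS[name] * scores[name] for name in WEIGHTS), 2)
    if total >= 4.0:
        verdict = "Approved"
    elif total >= 3.0:
        verdict = "Needs Work"
    else:
        verdict = "Major Gaps"
    return total, verdict
```

For example, scores of 5, 4, 4, 3, 4 give 1.25 + 0.80 + 0.80 + 0.45 + 0.80 = 4.10, which is Approved.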

Reviewer prompt template

You are an independent spec reviewer. Grade this spec skeptically.
Do not give benefit of the doubt on vague sections.

Spec complexity tier: [SIMPLE | MEDIUM | COMPLEX]

--- BEGIN SPEC ---
{spec_content}
--- END SPEC ---

--- BEGIN INTERVIEW CONTEXT ---
{interview_summary}
--- END INTERVIEW CONTEXT ---

Score each criterion 1-5 using the rubric. Output (STRICT):

## Spec Review
- **Completeness**: {score}/5 - {justification}
- **Consistency**: {score}/5 - {justification}
- **Clarity**: {score}/5 - {justification}
- **Scope**: {score}/5 - {justification}
- **Testability**: {score}/5 - {justification}
- **Weighted Score**: {calculated}/5.0
- **Verdict**: Approved | Needs Work | Major Gaps

## Specific Issues (required if score < 4.0)
- [Section]: what is wrong and how to fix it

## What Was Done Well
- {1-2 strengths}

Reviewer approval is necessary but not sufficient — the user gate in Phase 2 is mandatory regardless of the reviewer's verdict.

Example Spec (abbreviated, medium tier)

# Commenting System Design Spec

## Summary
Add threaded comments (one level deep) to blog posts for logged-in users.

## Context
Blog at `src/app/blog/` uses Prisma + PostgreSQL. No commenting today.

## Design
### Architecture
New API routes at `/api/posts/:postId/comments`, new `comments` table,
React components in `src/components/comments/`.

### Components
| Component | Responsibility | File Path |
|---|---|---|
| CommentThread | Renders comments + replies | `src/components/comments/CommentThread.tsx` |
| CommentForm | Input form | `src/components/comments/CommentForm.tsx` |
| comments API | CRUD endpoints | `src/app/api/posts/[postId]/comments/route.ts` |

### Data Model
Comment: id, body, postId, authorId, parentId (nullable), createdAt, updatedAt.
Indexes on [postId, createdAt] and [parentId].

## Testing Strategy
11 tests: 8 integration (CRUD + auth + pagination + nesting), 2 unit, 1 e2e.

## Decision Log
| Decision | Chosen | Rationale |
|---|---|---|
| Nesting depth | 2-level | Avoids recursive queries, covers 90% of cases |
| Pagination | Cursor-based | Reliable with concurrent inserts |

verification-framework.md

Verification Framework

Every task proves it works before closing. Verification runs in two layers — signals (objective, binary) and evaluator (subjective, scored) — with generator-evaluator separation throughout. This reference also defines the three mandatory tail tasks that close every board.

TDD Workflow Per Task

Red → Green → Refactor

RED:      write tests describing the desired behavior → tests FAIL
GREEN:    write the minimum code to pass → tests PASS
REFACTOR: clean up while keeping tests green → tests PASS

Steps

Read the acceptance criteria from the task's plan.
Write test file(s) encoding each criterion as a test case.
Run tests — confirm they FAIL (red proves the tests are meaningful).
Implement to make each test pass, one at a time.
Run tests — confirm they PASS (green).
Refactor — rename, extract, simplify — keeping tests green.
Final run — all tests pass, lint clean, types check.

Test categories per task

Category	What	Priority
Happy path	Primary use case works	Required
Edge cases	Boundaries, empty, nulls	Required
Error handling	Invalid inputs, failure modes	Required
Integration	Interaction with other components	If applicable

Follow the project's existing test-naming convention; if there is none, use describe("Thing", ...) blocks containing it("should X when Y", ...) cases.
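
As an illustration of the three required categories, the sketch below tests a hypothetical `truncate` helper (invented here, not taken from any project in this document) with one happy-path, one edge-case, and one error-handling test. In a real RED step the tests would be written and seen to fail before `truncate` exists; both are shown together only to keep the example runnable.

```python
def truncate(text: str, limit: int) -> str:
    """Hypothetical helper: keep at most `limit` characters of `text`."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return text[:limit]


def test_truncate_shortens_long_text():  # happy path
    assert truncate("hello world", 5) == "hello"


def test_truncate_keeps_empty_string():  # edge case: empty input
    assert truncate("", 5) == ""


def test_truncate_rejects_non_positive_limit():  # error handling
    try:
        truncate("hello", 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError")


# Plain runner so the example works with or without a test framework.
for test in (
    test_truncate_shortens_long_text,
    test_truncate_keeps_empty_string,
    test_truncate_rejects_non_positive_limit,
):
    test()
print("3 tests passed")
```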

Layer 1: Verification Signals (Binary Gate)

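Run signals through the project's own scripts rather than calling tools directly. Where the table shows a bare tool such as `tsc --noEmit`, use the project script that wraps it if one exists.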

Signal	Example command	Required	Notes
Tests	`npm test` / `pytest` / project cmd	Always	All new + existing tests pass
Lint	`npm run lint` / project cmd	Always	Zero new warnings/errors
Type Check	`tsc --noEmit` / `mypy`	If typed	No new type errors
Build	`npm run build` / project cmd	If applicable	Project still builds
Format	`prettier --check` / `black --check`	If configured	Matches project format

Detect available commands from package.json scripts, Makefile, pyproject.toml, and CI config. If ANY signal fails, the task goes straight back to the generator to fix (up to 2 signal retries). Signal failures are unambiguous — no evaluator needed to diagnose a failing test or broken build.

If time-constrained, verify in priority order: Tests → Build → Type Check → Lint → Format.
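
One possible way to detect signals for a Node project, sketched in Python. It assumes the script names `test`, `build`, `typecheck`, `lint`, and `format:check` in package.json; these are common conventions rather than a standard, so treat the mapping as a placeholder to adapt per project. Commands come back in the time-constrained priority order given above.

```python
from __future__ import annotations

import json

# Priority order from the text: Tests, Build, Type Check, Lint, Format.
# The script names on the right are assumed conventions, not a standard.
SIGNAL_SCRIPTS = [
    ("Tests", "test"),
    ("Build", "build"),
    ("Type Check", "typecheck"),
    ("Lint", "lint"),
    ("Format", "format:check"),
]


def detect_signals(package_json_text: str) -> list[tuple[str, str]]:
    """Return (signal, command) pairs for the scripts the project defines."""
    scripts = json.loads(package_json_text).get("scripts", {})
    return [
        (signal, f"npm run {name}")
        for signal, name in SIGNAL_SCRIPTS
        if name in scripts
    ]


# Sample package.json content, invented for illustration.
sample = '{"scripts": {"lint": "eslint .", "test": "vitest run", "build": "tsc -p ."}}'
for signal, command in detect_signals(sample):
    print(f"{signal}: {command}")
```

Signals without a matching script are simply skipped, which mirrors the "If typed", "If applicable", and "If configured" entries in the table.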

Layer 2: The Evaluator (Scored Gate)

If all signals pass, dispatch a separate evaluator subagent. Self-evaluation has systematic bias — generators over-praise their own work and talk themselves out of legitimate issues. A fresh, skeptical context sees the work as a reviewer would.

Complexity gating

Size	Evaluator	Rationale
S (< 50 lines)	Signals only	Binary signals catch most issues in small changes
M (50-200 lines)	Full evaluator	Subjective quality matters at this scale
Any failed task	Mandatory full evaluator	Failure means the task is at the capability edge
Touches shared interfaces	Mandatory full evaluator	Integration risk demands independent review
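
The gating table can be read as a single predicate. A minimal sketch in Python; the function name and parameters are hypothetical, and because the table does not list a size above M, treating changes over 200 lines as evaluator-worthy is an assumption made here:

```python
def needs_full_evaluator(
    lines_changed: int,
    previously_failed: bool = False,
    touches_shared_interface: bool = False,
) -> bool:
    """Decide whether a task gets the full evaluator or signals only."""
    if previously_failed or touches_shared_interface:
        return True  # the two "Mandatory full evaluator" rows
    # S (< 50 lines) is signals-only; M (50-200 lines) gets the evaluator.
    # Assumption: anything larger than M is evaluated as well.
    return lines_changed >= 50


print(needs_full_evaluator(30))
print(needs_full_evaluator(30, touches_shared_interface=True))
print(needs_full_evaluator(120))
```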

Scored rubric (code tasks)

Dimension	Weight	1 (Fail)	3 (Acceptable)	5 (Excellent)
Correctness	30%	Tests fail / criteria unmet	Tests pass, basic criteria met	All criteria incl. edge cases, no regressions
Code Quality	20%	Ignores conventions, unclear	Follows conventions, readable	Clean, idiomatic, extends existing patterns
Completeness	20%	Partial, TODOs left	All stated criteria addressed	Handles implicit requirements too
Test Coverage	15%	No/trivial tests	Happy path tested	Edge, error, and boundary cases tested
Integration Safety	15%	Broken imports/types	Builds, existing tests pass	No warnings, clean integration

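Rubric adaptations by task type:
Test tasks: swap Code Quality for Assertion Quality and raise Test Coverage to 25%.
Docs tasks: Clarity/Accuracy replaces Code Quality; Coverage/Completeness replaces Test Coverage.
Config/infra tasks: raise Integration Safety to 25% and add an Idempotency check.
Whenever a weight is raised, lower the others so the weights still total 100% and the score stays on the 5-point scale.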

Thresholds

Weighted Score	Verdict	Action
4.0 - 5.0	PASS	Proceed
3.0 - 3.99	NEEDS WORK	Generator gets specific feedback, retries
< 3.0	FAIL	Escalate to the user

Evaluator prompt template

## Evaluator Task: {AW-XXX} - {Title}

You are an independent evaluator. Grade this work skeptically and honestly.
Do not assume good intent. Look for gaps, shortcuts, incomplete work, hidden bugs.

### Acceptance Criteria
{criteria — what "done" means}

### Files Modified / Git Diff
{the actual diff of all changes}

### Test Output / Lint / Type / Build Output
{verification signal results}

### Project Conventions
{detected conventions}

Score each dimension 1-5 using the rubric. Output (STRICT):

#### Evaluation: AW-{XXX}
- **Correctness**: {score}/5 - {justification}
- **Code Quality**: {score}/5 - {justification}
- **Completeness**: {score}/5 - {justification}
- **Test Coverage**: {score}/5 - {justification}
- **Integration Safety**: {score}/5 - {justification}
- **Weighted Score**: {calculated}/5.0
- **Verdict**: PASS | NEEDS WORK | FAIL

#### Specific Feedback (required if score < 4.0)
- What is wrong, what the fix looks like, which files need changes.

#### Bugs Found
- {file:line references}

#### What Was Done Well
- {1-2 strengths}

The evaluator receives only outputs and criteria — never the generator's reasoning. This prevents it from rationalizing decisions it should be questioning.

Iterative refinement loop

for iteration in 1..5:
  evaluator grades the work
  if weighted score >= 4.0: PASS, exit
  if iteration >= 3 AND score stagnant/declining: FAIL, escalate with full history
  if score < 3.0 on any iteration: FAIL immediately, escalate
  else: generator makes TARGETED fixes from the feedback (not a rewrite); loop

Track scores per iteration on the board. A dropping score means the generator is thrashing — stop and escalate. On retry the generator receives: per-dimension scores, the specific-feedback section, the bugs list, and the instruction "Fix ONLY what the evaluator flagged."
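
The loop can be sketched as runnable Python. Here `evaluate` and `fix` are hypothetical callables standing in for the evaluator and generator subagents. Two details are assumptions because the pseudocode leaves them open: "stagnant/declining" is read as "no improvement over the previous score", and running out of iterations is treated as a failure that escalates.

```python
from __future__ import annotations

from typing import Callable


def refine(
    evaluate: Callable[[], float],
    fix: Callable[[], None],
    max_iterations: int = 5,
) -> tuple[str, list[float]]:
    """Run the evaluate/fix loop and return (verdict, score history)."""
    history: list[float] = []
    for iteration in range(1, max_iterations + 1):
        score = evaluate()
        history.append(score)
        if score >= 4.0:
            return "PASS", history
        # Assumption: stagnant/declining = not better than the previous score.
        stalled = len(history) >= 2 and score <= history[-2]
        if iteration >= 3 and stalled:
            return "FAIL", history  # escalate with the full history
        if score < 3.0:
            return "FAIL", history  # escalate immediately
        fix()  # targeted fixes from the evaluator feedback, not a rewrite
    # Assumption: exhausting the iterations also escalates.
    return "FAIL", history


# Scores invented for illustration: steady improvement until the pass mark.
scores = iter([3.2, 3.6, 4.1])
print(refine(lambda: next(scores), lambda: None))
```

Returning the history alongside the verdict keeps the per-iteration scores available for the board and for the escalation message.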

Platform fallback

Without subagent support, switch context explicitly: "— EVALUATOR MODE — I evaluate the work I just completed as a different, skeptical reviewer. I do not reference my implementation intent; I judge only what is visible in the code and test output." Weaker than true separation, but strictly better than no gate.

Integration Verification (post-wave)

After a wave, if its tasks share dependencies or feed the next wave: verify import resolution, run the combined test suite (or tests for all files modified in the wave), run the build (watch for circular-dependency warnings), and smoke-test at runtime if applicable. After the final wave, run the FULL suite and build to catch regressions before CONVERGE.

What NOT to Do on Failure

Do NOT suppress or skip failing tests.
Do NOT add @ts-ignore, // eslint-disable, or # type: ignore to pass checks.
Do NOT reduce coverage or modify existing passing tests to accommodate a bug.
Do NOT mark a task done while any signal fails.
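
The suppression rule lends itself to an automated check. The sketch below is a rough heuristic, not part of the skill: it scans the added lines of a unified diff for the three markers named above. The function name and sample diff are invented for illustration.

```python
from __future__ import annotations

SUPPRESSION_MARKERS = ("@ts-ignore", "eslint-disable", "# type: ignore")


def find_new_suppressions(diff_text: str) -> list[str]:
    """Return added diff lines that contain a suppression marker."""
    hits = []
    for line in diff_text.splitlines():
        # Added lines start with "+"; "+++" introduces a file header.
        # Heuristic: an added line whose content begins with "++" is skipped.
        if line.startswith("+") and not line.startswith("+++"):
            if any(marker in line for marker in SUPPRESSION_MARKERS):
                hits.append(line[1:].strip())
    return hits


# Sample diff, invented for illustration.
diff = """\
--- a/src/api.ts
+++ b/src/api.ts
@@ -1,2 +1,3 @@
 const x = load();
+// @ts-ignore
+const y: number = x;
"""
print(find_new_suppressions(diff))
```

Only added lines are inspected, so suppressions that already existed before the task do not count against it.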

Mandatory Tail Tasks

Every task graph ends with these three, in order. They are real tasks on the board with acceptance criteria — not afterthoughts.

Third-to-last — Self Code Review (`review`)

Review all changes since the rollback point. Work the review pyramid bottom-up: Security → Correctness → Performance → Design → Readability → Convention → Testing. Classify each finding as [MAJOR] or [MINOR]. Fix every [MAJOR] finding immediately, along with any [MINOR] finding that is reasonable to fix now, then re-run the review. Acceptance: zero [MAJOR] findings remain, and every [MINOR] finding is documented as fixed or explicitly deferred. Depends on all implementation, test, and docs tasks. If absolute-simplify is installed, run it on the working changes here.

Second-to-last — Requirements Validation (`verify`)

Compare all changes against the original prompt, intake summary, and spec. Verify every success criterion and constraint is satisfied. Acceptance: every success criterion demonstrably met; gaps loop back to EXECUTE until resolved. Depends on the self code review.

Last — Full Project Verification (`verify`)

Run all available checks via the project's package-manager scripts, skipping any not configured: Tests → Lint → Typecheck → Build. Acceptance: all available checks pass; failures are fixed and re-run until green. Do not mark the board completed until every available check passes and its output is recorded on the board. Depends on requirements validation.
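
The dependency chain of the three tail tasks can be sketched as data. The board's actual schema is not shown in this section, so the field names (`id`, `type`, `title`, `depends_on`) and the `tail-*` identifiers below are hypothetical; only the task types, titles, and ordering come from the text.

```python
from __future__ import annotations


def append_tail_tasks(task_ids: list[str]) -> list[dict]:
    """Return the three tail tasks, each depending on the step before it."""
    review = {
        "id": "tail-review",
        "type": "review",
        "title": "Self Code Review",
        # Depends on every implementation, test, and docs task.
        "depends_on": list(task_ids),
    }
    validate = {
        "id": "tail-validate",
        "type": "verify",
        "title": "Requirements Validation",
        "depends_on": [review["id"]],
    }
    full = {
        "id": "tail-verify",
        "type": "verify",
        "title": "Full Project Verification",
        "depends_on": [validate["id"]],
    }
    return [review, validate, full]


for task in append_tail_tasks(["AW-001", "AW-002"]):
    print(task["title"], "<-", task["depends_on"])
```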

Frequently Asked Questions

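What is absolute-work?

absolute-work is an end-to-end, phase-gated software development lifecycle for AI agents. It turns a ticket, task, plan, or migration into a validated design, a dependency-graphed task board, and verified code.
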
How do I install absolute-work?

Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill absolute-work in your terminal. The skill will be immediately available in your AI coding agent.

What AI agents support absolute-work?

absolute-work works with claude-code, gemini-cli, openai-codex, and mcp. Install it once and use it across any supported AI coding agent.

Is absolute-work free?

Yes, absolute-work is completely free and open source under the MIT license. Install it with a single command and start using it immediately.

What is the difference between absolute-work and similar tools?

absolute-work is an AI agent skill that teaches your coding agent specialized workflow knowledge. Unlike standalone tools, it integrates directly into claude-code, gemini-cli, openai-codex and other AI agents.

Can I use absolute-work with Cursor or Windsurf?

absolute-work works with any AI coding agent that supports the skills protocol, including Claude Code, Cursor, Windsurf, GitHub Copilot, Gemini CLI, and 40+ more.