absolute-work
End-to-end, phase-gated software development lifecycle for AI agents. Turns a ticket, task, plan, or migration into a validated design, a dependency-graphed task board, and verified code. Triggers on "build this end-to-end", "plan and build", "break this into tasks", "pick up this ticket", "grill me on this", "run this migration", "absolute-work this", or any multi-step development task. Relentlessly interviews to a shared design, writes a reviewed spec, decomposes into atomic tasks on a persistent markdown board, then peels tasks one safe wave at a time with test-first verification. Handles features, bugs, refactors, greenfield projects, planning breakdowns, and migrations.
Quick Start
- Open your terminal or command prompt
- Run:
  npx skills add AbsolutelySkilled/AbsolutelySkilled --skill absolute-work
- Start your AI coding agent (Claude Code, Cursor, Gemini CLI, or any supported agent)
- The absolute-work skill is now active and ready to use
absolute-work
absolute-work is a production-ready AI agent skill for claude-code, gemini-cli, openai-codex, and mcp. An end-to-end, phase-gated software development lifecycle that turns a ticket, task, plan, or migration into a validated design, a dependency-graphed task board, and verified code.
Quick Facts
| Field | Value |
|---|---|
| Category | workflow |
| Version | 0.1.0 |
| Platforms | claude-code, gemini-cli, openai-codex, mcp |
| License | MIT |
| References | 6 deep-dive guides |
| Evals | 14 test cases |
How to Install
- Make sure you have Node.js installed on your machine.
- Run the following command in your terminal:
  npx skills add AbsolutelySkilled/AbsolutelySkilled --skill absolute-work
- The absolute-work skill is now available in your AI coding agent (Claude Code, Gemini CLI, OpenAI Codex, etc.).
Overview
absolute-work is one continuous skill that takes any unit of work — a ticket, a task, a plan, a migration — from fuzzy intent to verified code. It replaces the traditional brainstorm-then-build handoff with a single lifecycle that stops at hard gates so you stay in control the whole way.
It relentlessly interviews you to a shared design (one question at a time, codebase-first), writes and self-reviews a spec, decomposes the work into atomic tasks on a persistent local markdown board, then peels those tasks off one safe wave at a time with test-first verification. No external trackers, no silent scope creep, and no code until the design is approved.
The 6 phases
- Intake & Brainstorm — deep context scan, codebase-first questioning, adaptive question banks per work type, relentless design interview until mutual 100% confidence
- Spec — writes the approved design to docs/plans/, then a separate reviewer subagent grades it on a scored rubric
- Decompose & Plan — atomic tasks, a dependency graph, and safe-wave assignment on .absolute-work/board.md
- Execute — onion-peel one wave at a time, TDD per task (red → green → refactor)
- Verify — binary signals (test/lint/typecheck/build) then an independent scored evaluator
- Converge — full suite, summary, manual test steps to exercise the new functionality, close the board, suggest a commit (never auto-commit)
What makes this skill different
- Phase-gated — stops and waits for your explicit "go" between every phase. Control over speed.
- Safety-first execution — blockers and dependents run sequentially; only provably-independent tasks parallelize. When in doubt, it serializes.
- Adaptive intake — detects the work type (feature, bug, refactor, greenfield, planning, migration) and swaps in a tailored question bank. Migrations get first-class handling (call-site inventory, codemods, incremental rollout, rollback).
- Spec-driven + test-driven — a reviewed spec before code, tests before implementation, generator-evaluator separation for grading.
- Fully local — all state lives in .absolute-work/board.md. No JIRA, no GitHub API, fully portable and resumable across sessions.
Reference Guides
| Guide | Coverage |
|---|---|
| Intake Playbook | Adaptive question banks per work type, codebase-first intelligence, design-tree traversal, calibration |
| Migration Playbook | Call-site inventory, codemods, coexistence seams, incremental rollout, backwards-compat, rollback |
| Spec Writing | Spec template, section scaling, writing style, decision log, scored review protocol |
| Board Format | Full .absolute-work/board.md spec, status transitions, sequence/wave model, example board |
| Execution Model | DAG patterns, safe-wave (sequential-blocker / parallel-independent) algorithm, agent prompts, conflict handling, failure recovery |
| Verification Framework | TDD per task, verification signals, generator-evaluator protocol, scored rubric, mandatory tail tasks |
Key Principles
- Phase gates always — explicit approval between every phase
- Codebase before questions — only ask what code can't answer
- Relentless until aligned — interview to mutual 100% confidence
- Spec before code, tests before implementation
- Dependency-first decomposition — a DAG, never a flat list
- Safety-first execution — serialize when in doubt
- Generator ≠ evaluator — the builder never grades its own work
- No silent scope creep, and never auto-commit
Tags
sdlc planning task-management tdd migration workflow spec-driven-development brainstorming verification
Platforms
- claude-code
- gemini-cli
- openai-codex
- mcp
Frequently Asked Questions
What is absolute-work?
An end-to-end, phase-gated software development lifecycle for AI agents. It turns a ticket, task, plan, or migration into a validated design, a dependency-graphed task board, and verified code — relentlessly interviewing you to a shared design, writing a reviewed spec, decomposing into atomic tasks on a persistent local markdown board, then peeling tasks off one safe wave at a time with test-first verification.
How is this different from just asking the agent to build something?
absolute-work imposes structure and control. It refuses to write code until the design is approved, it stops at a hard gate between every phase, it tracks everything on a persistent board that survives across sessions, and it verifies each task with an independent evaluator instead of self-grading. The result is fewer wrong turns, no silent scope creep, and a complete audit trail.
Does it integrate with JIRA or GitHub Issues?
No — absolute-work is fully local by design. "Ticket", "task", "planning", and "migration" are intake types that produce tasks in a local .absolute-work/board.md file. There are no external tracker dependencies, so it is portable and works in any repo.
How do I install absolute-work?
Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill absolute-work in your terminal. The skill will be immediately available in your AI coding agent.
What AI agents support absolute-work?
This skill works with claude-code, gemini-cli, openai-codex, and mcp. Install it once and use it across any supported AI coding agent.
Maintainers
Generated from AbsolutelySkilled
SKILL.md
Absolute Work: End-to-End AI Development Lifecycle
Absolute Work takes any unit of work — a ticket, a task, a plan, a migration — from fuzzy intent to verified code. It is one continuous skill with hard gates between phases: brainstorm a shared design, write and review a spec, decompose into a dependency-graphed task board, then peel tasks off one safe wave at a time with test-first verification. Nothing is assumed, nothing is silently expanded, and no code is written until the design is approved.
The lifecycle has 6 phases: INTAKE & BRAINSTORM → SPEC → DECOMPOSE & PLAN → EXECUTE → VERIFY → CONVERGE
Activation Banner
At the very start of every Absolute Work invocation, before any other output, display this ASCII art banner:
█████╗ ██████╗ ███████╗ ██████╗ ██╗ ██╗ ██╗████████╗███████╗
██╔══██╗██╔══██╗██╔════╝██╔═══██╗██║ ██║ ██║╚══██╔══╝██╔════╝
███████║██████╔╝███████╗██║ ██║██║ ██║ ██║ ██║ █████╗
██╔══██║██╔══██╗╚════██║██║ ██║██║ ██║ ██║ ██║ ██╔══╝
██║ ██║██████╔╝███████║╚██████╔╝███████╗╚██████╔╝ ██║ ███████╗
╚═╝ ╚═╝╚═════╝ ╚══════╝ ╚═════╝ ╚══════╝ ╚═════╝ ╚═╝ ╚══════╝
██╗ ██╗ ██████╗ ██████╗ ██╗ ██╗
██║ ██║██╔═══██╗██╔══██╗██║ ██╔╝
██║ █╗ ██║██║ ██║██████╔╝█████╔╝
██║███╗██║██║ ██║██╔══██╗██╔═██╗
╚███╔███╔╝╚██████╔╝██║ ██║██║ ██╗
╚══╝╚══╝ ╚═════╝ ╚═╝ ╚═╝╚═╝ ╚═╝
Follow the banner immediately with: Entering plan mode — phase-gated lifecycle active
The Phase Gate Rule
Absolute Work STOPS at the end of every phase and waits for the user's explicit "go" before advancing. This is non-negotiable. The phases are:
INTAKE & BRAINSTORM ─┃ gate ┃─ SPEC ─┃ gate ┃─ DECOMPOSE & PLAN ─┃ gate ┃─ EXECUTE ─┃ gate per wave ┃─ VERIFY ─┃ gate ┃─ CONVERGE
At each gate, present what was produced, summarize what comes next, and ask the user
to confirm before proceeding. Never chain two phases without an approval in between.
Use AskUserQuestion (where available) for every gate and every interview question.
Activation Protocol
Immediately after the banner, enter plan mode before doing anything else:
- On platforms with native plan mode (e.g. Claude Code's EnterPlanMode): invoke it immediately.
- On platforms without it: simulate plan mode — complete INTAKE & BRAINSTORM and SPEC fully, write no code, and get explicit approval before EXECUTE.
The first three phases are planning work. No files are created or modified (other than the spec and the board) until the user approves the task graph and execution begins.
Session Resume Protocol
When Absolute Work is invoked and a .absolute-work/board.md already exists in the project root:
- Detect: Read the board and determine its status.
- Display: Print a compact summary of completed / in-progress / blocked / remaining tasks.
- Resume: Pick up from the last incomplete wave — do NOT restart from INTAKE.
- Reconcile: If the codebase changed since the last session, diff against the board's expected state and flag conflicts before resuming.
If the board is completed, ask whether to start a new session (archive the old board to
.absolute-work/archive/) or review the finished work. Never blow away an existing board
without explicit user confirmation.
Codebase Convention Detection
Before INTAKE begins, auto-detect the project's conventions so every phase is grounded in reality, not assumptions.
| Signal | Files to Check |
|---|---|
| Package manager | package-lock.json (npm), yarn.lock, pnpm-lock.yaml, bun.lockb, Cargo.lock, go.sum |
| Language/Runtime | tsconfig.json, pyproject.toml / setup.py, go.mod, Cargo.toml |
| Test runner | jest.config.*, vitest.config.*, pytest.ini, .mocharc.*, test directory patterns |
| Linter/Formatter | .eslintrc.*, eslint.config.*, .prettierrc.*, ruff.toml, .golangci.yml |
| Build system | Makefile, vite.config.*, next.config.*, turbo.json |
| CI/CD | .github/workflows/, .gitlab-ci.yml, Jenkinsfile |
| Available scripts | scripts in package.json, Makefile targets |
| Directory conventions | src/, lib/, app/, tests/, __tests__/, spec/ |
Write detected conventions to the board under ## Project Conventions. Reference them in
every later phase — especially PLAN and the mandatory verification tail tasks. Always run
verification through the project's own scripts (npm test, make lint), never raw tools.
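As an illustration of the "Package manager" row in the table above, a detection pass could be as small as the following Python sketch. The names `LOCKFILES` and `detect_package_managers` are hypothetical and not part of the skill, and the tool names attached to Cargo.lock and go.sum are an assumption about what that row implies.

```python
from pathlib import Path

# Lockfile -> package manager, following the "Package manager" row above.
# The tool names for Cargo.lock and go.sum are assumptions of this sketch.
LOCKFILES = {
    "package-lock.json": "npm",
    "yarn.lock": "yarn",
    "pnpm-lock.yaml": "pnpm",
    "bun.lockb": "bun",
    "Cargo.lock": "cargo",
    "go.sum": "go",
}


def detect_package_managers(project_root: str) -> list[str]:
    """Return every package manager whose lockfile exists in project_root."""
    root = Path(project_root)
    return [tool for lockfile, tool in LOCKFILES.items() if (root / lockfile).is_file()]
```

Returning a list rather than a single value keeps polyglot repositories visible (for example a Go service with a yarn-managed frontend), so the conflict can be recorded under ## Project Conventions instead of being silently resolved.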
When to Use This Skill
Use Absolute Work when:
- Picking up a ticket or task that needs design before implementation
- Multi-step feature development touching 3+ files or components
- "Build this end-to-end", "plan and execute this", "break this into tasks"
- Greenfield projects, major refactors, or migrations
- Planning/breakdown work — turning a vague goal into a sequenced task list
- Complex bug fixes spanning multiple systems
- The user wants to be grilled on a design before building
Do NOT use Absolute Work when:
- Single-file bug fixes or typo corrections where the answer is obvious
- Quick questions, code explanations, or pure research
- Tasks the user explicitly wants to drive manually
Key Principles
- Phase gates always. Stop and get explicit approval between every phase. Control over speed.
- Codebase before questions. Search the code first; only ask what code genuinely cannot answer.
- Relentless until aligned. Interview one question at a time until BOTH you and the user are 100% confident. Doubt on either side means keep going.
- Spec before code. No implementation until a written spec is reviewed and approved.
- Dependency-first decomposition. Every task is a node in a DAG, not a flat list.
- Safety-first execution. Blockers and dependents run sequentially; only provably-independent tasks parallelize. When in doubt, serialize. (See references/execution-model.md.)
- Test-first verification. Every task writes tests before implementation. "Done" means tests pass.
- Generator ≠ evaluator. The agent that builds a task does not grade it.
- Persistent state. All progress lives in .absolute-work/board.md, surviving across sessions.
- No silent scope creep. Everything outside the agreed scope goes to Deferred Work, visible on the board.
- Never auto-commit. Suggest a commit; the user commits.
Phase 1: INTAKE & BRAINSTORM (Relentless Design Interview)
Turn fuzzy intent into a shared, bulletproof design. This is a structured interrogation of every assumption, dependency, and design branch — not a casual chat.
The interview directive — operate by this verbatim:
Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.
Ask the questions one at a time.
If a question can be answered by exploring the codebase, explore the codebase instead.
Step 1 — Deep context scan
Read what exists before asking anything: docs/ (README first), root README.md, CLAUDE.md,
CONTRIBUTING.md, docs/plans/ (overlapping designs), recent commits (last 10-20), package
manifests, top-level structure. Synthesize what matters — do not dump a file listing.
Step 2 — Codebase-first intelligence
Before asking ANY question, check if the codebase answers it. Facts live in code
(database, test framework, auth); preferences require asking (visual style, real-time vs batch).
When code answers it, state what you found: "I see you're using Prisma with PostgreSQL — I'll
design around that." See references/intake-playbook.md.
Step 3 — Detect the work TYPE and adapt
Identify the type and swap in its tailored question bank (full banks in references/intake-playbook.md):
| Type | Focus |
|---|---|
| Feature | user problem, flow, happy/error paths, scope boundary |
| Bug | repro steps, expected vs actual, blast radius, fix criteria |
| Refactor | pain point, target state, blast radius, test safety net, incremental vs all-at-once |
| Greenfield | problem/user fit, v1 scope, stack, data model, deploy target |
| Planning / breakdown | goal, milestones, sequencing, what ships first |
| Migration | what→what, coexistence, rollback, breaking changes, call-site inventory — load references/migration-playbook.md |
Step 4 — Scope assessment
If the request spans multiple independent subsystems, flag it and decompose into sub-projects
first; brainstorm the first sub-project through the normal flow. See references/approach-analysis.md.
Step 5 — Relentless interview
- One question at a time via AskUserQuestion. Never batch.
- Strictly linear — resolve decision A before asking about dependent decision B.
- Walk the design tree depth-first — purpose → data model → behavior → UI → edge cases. Every branch has an error/edge-case child; walk it.
- Honest options — only propose multiple approaches at a genuine fork; always mark one (Recommended) with rationale tied to project context. When the answer is obvious, present it and briefly say why alternatives were dismissed.
- Mutual 100% confidence — after each decision, confirm both sides are sure. Hesitation means probe deeper.
Step 6 — Confidence self-check
Before presenting the design, review every decision: am I 100% sure, or filling gaps with assumptions? Any sub-100% decision → return to the interview. State your confidence to the user.
Step 7 — Design presentation
Present section by section (architecture, components, data flow, error handling, testing), scaled to complexity. Get approval per section. Design for isolation: small units, one clear purpose each, well-defined interfaces. Follow existing patterns; don't fight the codebase.
━━ GATE: user approves the full design before Phase 2. ━━
Phase 2: SPEC (Spec-Driven Development)
Write the approved design to docs/plans/YYYY-MM-DD-<topic>-design.md (clear prose, file
paths, code blocks for schemas/interfaces, a Decision Log). Scale sections to complexity.
Then run a scored spec review with a separate reviewer subagent (generator-evaluator separation): graded on Completeness, Consistency, Clarity, Scope, Testability (1-5 each).
- 4.0+ → approved, proceed to user review
- 3.0-3.9 → fix flagged issues, re-dispatch (max 3 iterations)
- < 3.0 → surface to the user immediately
See references/spec-writing.md for the template, scaling rules, and review rubric.
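The score bands above can be read as a small decision function. The sketch below is one possible reading, not the skill's own implementation: it assumes the overall score is the plain average of the five criteria, that `iteration` counts reviews completed so far, and that a spec still in the 3.0-3.9 band after the third review is escalated to the user. None of those three points is stated in the text, and the names `CRITERIA`, `MAX_ITERATIONS`, and `review_verdict` are hypothetical.

```python
from statistics import mean

CRITERIA = ("Completeness", "Consistency", "Clarity", "Scope", "Testability")
MAX_ITERATIONS = 3  # "max 3 iterations" from the rubric above


def review_verdict(scores: dict[str, int], iteration: int) -> str:
    """Map five 1-5 criterion scores to 'approved', 'revise', or 'escalate'."""
    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    overall = mean(scores[name] for name in CRITERIA)  # assumption: unweighted average
    if overall >= 4.0:
        return "approved"
    if overall >= 3.0 and iteration < MAX_ITERATIONS:
        return "revise"
    # Below 3.0, or out of review iterations (the latter is an assumption).
    return "escalate"
```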
━━ GATE: user reviews and approves the spec before Phase 3. ━━
Phase 3: DECOMPOSE & PLAN (Build the Task Board)
Break the spec into atomic sub-tasks, build the dependency graph, and write the board.
Decompose
Each sub-task has: ID (AW-001), Title (action-oriented), Description (2-3 sentences),
Type (code | test | docs | infra | config), Size (S < 50 lines | M 50-200;
no L — decompose further), Dependencies (task IDs).
Rules: test tasks separate from code; infra/config before dependents; aim for 5-15 tasks; every
graph ends with the three mandatory tail tasks (Self Code Review → Requirements Validation →
Full Project Verification — see references/verification-framework.md). Apply the complexity budget:
if scope exceeds ~15 M-equivalent tasks, suggest splitting into multiple sessions.
Build the DAG and assign safe waves
Compute each task's depth (max(dependency depth) + 1) and group by depth into waves. Then apply
the safety pass: within a wave, only tasks that touch disjoint files and share no
interfaces may run in parallel; everything else is serialized. Assign shared-file ownership to a
single task. When in doubt, serialize. See references/execution-model.md and references/board-format.md.
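The depth rule above (max(dependency depth) + 1, then group by depth) can be sketched in a few lines of Python. This is an illustrative sketch only: `assign_waves` is a hypothetical name, tasks with no dependencies are placed at depth 1 to match the board examples, and the safety pass is deliberately left out.

```python
def assign_waves(dependencies: dict[str, list[str]]) -> dict[int, list[str]]:
    """Group task IDs into waves by dependency depth (no dependencies -> wave 1)."""
    depth: dict[str, int] = {}

    def depth_of(task: str, visiting: tuple[str, ...] = ()) -> int:
        if task in depth:
            return depth[task]
        if task in visiting:
            raise ValueError(f"dependency cycle through {task}")
        below = [depth_of(dep, visiting + (task,)) for dep in dependencies[task]]
        depth[task] = max(below, default=0) + 1
        return depth[task]

    waves: dict[int, list[str]] = {}
    for task in sorted(dependencies):
        waves.setdefault(depth_of(task), []).append(task)
    return dict(sorted(waves.items()))
```

For a graph where AW-003 depends on AW-001 and AW-002, and AW-004 depends on AW-001, this yields wave 1 = AW-001, AW-002 and wave 2 = AW-003, AW-004. A cycle raises an error instead of looping, since a cyclic graph is not a DAG and cannot be scheduled.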
Per-task plan
For each task: files to create/modify, test files (TDD — written first), approach, acceptance
criteria, and concrete test cases (happy path, edge, error). Write everything to
.absolute-work/board.md. Ask the user during intake whether the board is git-tracked or gitignored.
Present the ASCII dependency graph + wave/sequence plan.
━━ GATE: user approves the task graph before Phase 4. ━━
Phase 4: EXECUTE (Onion-Peel, One Safe Wave at a Time)
Pre-execution snapshot
Before touching any file: ensure the tree is clean (commit or stash), record the current commit
hash on the board under ## Rollback Point. This is the safety net for catastrophic failure.
Wave loop
for each wave in [Wave 1 … Wave N]:
    partition tasks: sequential (blockers/dependents/shared-file) vs parallel-safe (disjoint, independent)
    run sequential tasks in dependency order, one at a time
    run parallel-safe tasks concurrently (separate agents)
    each task → TDD: write failing tests (red) → implement (green) → refactor → update board
    wave boundary: conflict check + compact progress report
    ━━ GATE: confirm before starting the next wave ━━
Each agent gets a self-contained prompt from the board (conventions + research + plan +
acceptance criteria) and the rule: write tests first, stay in scope, report blockers — never work
around them. Scope creep: blocking discoveries become new visible tasks; non-blocking ones go to
## Deferred Work. See references/execution-model.md for the agent template, conflict
resolution, blocked-task handling, and failure recovery.
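The partition step at the top of the wave loop can be approximated by an overlap check. The Python sketch below assumes each task declares the set of files and interface names it touches (the `touches` mapping and the `partition_wave` name are hypothetical); any task that shares a resource with another task in the same wave is serialized, which is the conservative reading of "when in doubt, serialize".

```python
def partition_wave(touches: dict[str, set[str]]) -> tuple[list[str], list[str]]:
    """Split one wave into (sequential, parallel_safe) task ID lists."""
    sequential: list[str] = []
    parallel_safe: list[str] = []
    for task in sorted(touches):
        # Everything claimed by the other tasks in this wave.
        claimed_by_others: set[str] = set()
        for other, resources in touches.items():
            if other != task:
                claimed_by_others |= resources
        if touches[task] & claimed_by_others:
            sequential.append(task)  # shares a file or interface: serialize
        else:
            parallel_safe.append(task)  # disjoint from every other task
    return sequential, parallel_safe
```

The sequential list comes back in task-ID order as a stand-in for dependency order; tasks grouped into the same wave by depth do not depend on each other, so any stable order is acceptable for this sketch.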
Phase 5: VERIFY (Signals + Independent Evaluator)
Every task proves it works before closing, using two layers:
- Signals (binary gate) — run the project's test, lint, typecheck, and build scripts. Any failure goes straight back to the generator to fix (up to 2 retries). No skipping tests, no @ts-ignore/eslint-disable/type: ignore to pass checks.
- Evaluator (scored gate) — if signals pass, a separate evaluator subagent grades the work against a scored rubric (Correctness, Code Quality, Completeness, Test Coverage, Integration Safety). 4.0+ passes; 3.0-3.9 iterates on specific feedback (max 5); < 3.0 escalates to the user.
S-size tasks may skip the evaluator if all signals pass cleanly; M-size, failed, or
shared-interface tasks always get it. See references/verification-framework.md.
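The evaluator-skip rule can be written as a single predicate. The sketch below is an interpretation rather than the skill's code: `needs_evaluator` is a hypothetical name, and "failed" is read here as "at least one signal failed before a retry fixed it", which the text does not spell out.

```python
def needs_evaluator(size: str, had_signal_failure: bool, shared_interface: bool) -> bool:
    """Return True when a task must go to the evaluator subagent.

    False means the task *may* skip the evaluator, not that it must.
    """
    if size not in ("S", "M"):
        raise ValueError("tasks are S or M; larger work is decomposed further")
    return size == "M" or had_signal_failure or shared_interface
```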
After the final wave, run the full suite and the three mandatory tail tasks.
━━ GATE: present verification results before Phase 6. ━━
Phase 6: CONVERGE (Close Out)
- Full suite — run the complete test/lint/build one final time.
- Documentation — update any docs that were in scope.
- Summary — files changed (with line counts), tests added, key decisions, deferred work.
- How to test it — end every session with concrete, copy-pasteable steps the user can run to exercise the added functionality themselves: the exact commands to start the app/script, the inputs or routes to hit (curl, UI clicks, CLI invocation), and the expected output for each. Cover the happy path plus at least one edge/error case. Ground every command in the detected conventions (real scripts, real ports, real file paths) — never invent commands the project does not have.
- Close board — mark completed with a timestamp; the board is the audit trail.
- Suggest commit — propose a message. Never run git commit yourself.
Gotchas
- Chaining phases without a gate. The whole point is control — never advance past a phase boundary without the user's explicit "go".
- Asking questions the codebase answers. Search configs, deps, and test files before every question; it erodes trust to ask what the code already states.
- Parallel agents editing shared files. Two same-wave tasks editing one utility produce a wave-boundary conflict. Detect shared files in DECOMPOSE, assign ownership, and serialize the rest. Default to sequential when unsure.
- Rollback point recorded mid-wave. Capture the commit hash before Wave 1 touches anything, or the snapshot already contains partial changes.
- Board marked complete without running checks. The mandatory tail tasks are skipped most often. Never mark completed until the actual test/lint/build output is on the board.
- DISCOVER/research skipped for "obvious" tasks. Agents duplicate existing utilities or miss conventions a 2-minute scan would catch. Research every task.
- Silent scope creep. Adjacent improvements absorbed mid-task obscure what changed and blow the estimate. Everything outside scope goes to Deferred Work.
- Auto-committing. This skill suggests commits; it never runs them.
Anti-Patterns
| Anti-Pattern | Better Approach |
|---|---|
| Jumping to code before the spec is approved | Hard gate: no implementation until the spec passes review and the user approves |
| Batching multiple interview questions | One question at a time, depth-first, dependency-resolved |
| Flat task lists without dependencies | Model as a DAG — hidden dependencies cause ordering bugs and conflicts |
| Parallelizing everything for speed | Safety first — only disjoint, independent tasks parallelize; serialize when in doubt |
| Proposing fake alternatives when the answer is obvious | Present the single right answer with rationale; options only at genuine forks |
| Skipping TDD for "simple" changes | Tests are the proof of correctness — write them first, always |
| Self-grading completed work | Dispatch a separate, skeptical evaluator — generators over-praise their own work |
| Massive L-sized tasks | Decompose until every task is S or M |
| Starting fresh when a board exists | Detect, display status, resume from the last incomplete wave |
| Advancing with private doubts | Stop, reason, and either resolve the doubt or surface it as a question |
Output / Response Style
Respond terse like smart caveman. All technical substance stay. Only fluff die.
Persistence
ACTIVE EVERY RESPONSE once triggered. No revert after many turns. No filler drift. Still active if unsure. Off only when user says "stop caveman" or "normal mode".
Rules
Drop: articles (a/an/the), filler (just/really/basically/actually/simply), pleasantries (sure/certainly/of course/happy to), hedging. Fragments OK. Short synonyms (big not extensive, fix not "implement a solution for"). Abbreviate common terms (DB/auth/config/req/res/fn/impl). Strip conjunctions. Use arrows for causality (X -> Y). One word when one word enough.
Technical terms stay exact. Code blocks unchanged. Errors quoted exact.
Pattern: [thing] [action] [reason]. [next step].
Not: "Sure! I'd be happy to help you with that. The issue you're experiencing is likely caused by..."
Yes: "Bug in auth middleware. Token expiry check use < not <=. Fix:"
Examples
"Why React component re-render?"
Inline obj prop -> new ref -> re-render. useMemo.
"Explain database connection pooling."
Pool = reuse DB conn. Skip handshake -> fast under load.
Auto-Clarity Exception
Drop caveman temporarily for: security warnings, irreversible action confirmations, multi-step sequences where fragment order risks misread, user asks to clarify or repeats question. Resume caveman after clear part done.
Example — destructive op:
Warning: This will permanently delete all rows in the users table and cannot be undone.
DROP TABLE users;
Caveman resume. Verify backup exist first.
References
Load a reference only when its phase needs it — they are long and consume context.
- references/intake-playbook.md — adaptive question banks per work type, codebase-first intelligence, design-tree traversal, calibration, example sessions
- references/migration-playbook.md — first-class migration handling: call-site inventory, codemods, incremental rollout, backwards-compat, rollback
- references/spec-writing.md — spec template, section scaling, writing style, decision log, scored review protocol
- references/board-format.md — full .absolute-work/board.md spec, statuses, sequence/wave model, example board
- references/execution-model.md — DAG patterns, safe-wave (sequential-blocker / parallel-independent) algorithm, agent prompt template, conflict handling, scope-creep and failure recovery
- references/verification-framework.md — TDD per task, verification signals, generator-evaluator protocol, scored rubric, mandatory tail tasks
References
board-format.md
Board Format Specification
The .absolute-work/board.md file is the single source of truth for an Absolute Work run. It is
the only state — fully local, no external trackers — and is designed to be both human-readable and
machine-parseable. It survives across sessions to enable resume, audit, and handoff.
File Location
{project-root}/.absolute-work/board.md
The .absolute-work/ directory may also contain:
- board.md — the main board (always present)
- archive/board-{timestamp}.md — completed or superseded boards
The user chooses during intake whether .absolute-work/ is git-tracked (audit trail, resume
across machines) or gitignored (local working state).
Board Metadata (YAML frontmatter)
---
id: aw-{timestamp}
title: "{brief description of the overall task}"
type: feature | bug | refactor | greenfield | planning | migration
status: intake | spec | decomposing | planning | executing | verifying | converged | completed | abandoned
created: "{ISO 8601}"
updated: "{ISO 8601}"
git_tracked: true | false
evaluator_enabled: true | false
total_tasks: {N}
completed_tasks: {N}
failed_tasks: {N}
current_wave: {N}
total_waves: {N}
---
evaluator_enabled defaults to true for boards with any M-size tasks; set false only when all
tasks are S-size.
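Because the metadata is flat `key: value` pairs, a resume step could read it without a YAML library. The following Python sketch assumes exactly that flat shape (no nesting, no multi-line values); `parse_frontmatter` is a hypothetical helper, and a real YAML parser would be the safer choice if the format ever grows.

```python
def parse_frontmatter(board_text: str) -> dict[str, str]:
    """Read the leading '---' block of a board into a dict of strings."""
    lines = board_text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("board does not start with a frontmatter block")
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Split on the first colon only, so ISO 8601 timestamps stay intact.
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip().strip('"')
    raise ValueError("frontmatter block is not closed")
```

All values come back as strings, so counters such as total_tasks or current_wave need an explicit `int(...)` before arithmetic.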
Board Sections
1. Intake Summary
## Intake Summary
- **Task**: {one-line description}
- **Type**: feature | bug | refactor | greenfield | planning | migration
- **Complexity**: simple | medium | complex
- **Problem**: {what needs to be built/fixed}
- **Success Criteria**: {what "done" looks like}
- **Constraints**: {patterns, libraries, conventions to follow}
- **Dependencies**: {external APIs, services, other work}
- **Edge Cases**: {known edge cases} (if complex)
- **Spec**: docs/plans/{date}-{topic}-design.md
- **Board Persistence**: git-tracked | gitignored
2. Project Conventions
Written during Codebase Convention Detection — package manager, language/runtime, test runner, linter/formatter, build system, available scripts, directory conventions. Referenced by every later phase and by every execution agent's prompt.
3. Task Graph
## Task Graph
### Sub-tasks
| ID | Title | Type | Size | Dependencies | Wave | Run | Status |
|----|-------|------|------|-------------|------|-----|--------|
| AW-001 | {title} | config | S | - | 1 | seq | done |
| AW-002 | {title} | code | M | AW-001 | 2 | seq | in-progress |
| AW-003 | {title} | code | S | AW-001 | 2 | parallel | pending |
### Dependency Graph
{ASCII graph — see execution-model.md}
### Wave Assignments
- **Wave 1** (1 task): AW-001 [sequential — blocker, shared file]
- **Wave 2** (2 tasks): AW-002 [sequential], AW-003 [parallel-safe]
The Run column records the safety decision: seq (blocker/dependent/shared-file → run in
order) or parallel (disjoint files, no shared interfaces → may run concurrently).
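To show what "machine-parseable" can mean for the Sub-tasks table, here is a minimal Python sketch that turns the pipe-delimited rows into dictionaries keyed by column name. `parse_task_table` is a hypothetical helper; it assumes the first row is the header, the second is the separator, and no cell contains a literal pipe character.

```python
def parse_task_table(table: str) -> list[dict[str, str]]:
    """Parse a markdown pipe table into one dict per data row."""
    rows = [line.strip() for line in table.splitlines() if line.strip().startswith("|")]
    if len(rows) < 2:
        raise ValueError("expected a header row and a separator row")
    cells = [[cell.strip() for cell in row.strip("|").split("|")] for row in rows]
    header, body = cells[0], cells[2:]  # cells[1] is the |----| separator row
    return [dict(zip(header, row)) for row in body]
```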
4. Tasks (per-task detail)
## Tasks
### AW-001: {title}
- **Type**: code | test | docs | infra | config
- **Size**: S | M
- **Dependencies**: none | [AW-XXX]
- **Wave**: {N} **Run**: seq | parallel
- **Status**: {current status}
#### Research Notes
- Key files: {list}
- Reusable code: {functions/utilities to reuse}
- Patterns: {conventions observed}
- Risks: {risks identified}
#### Execution Plan
- Files to create/modify: {list}
- Test files: {list}
- Approach: {brief}
- Acceptance criteria:
- [ ] {criterion 1}
- Test cases: {happy path, edge, error}
#### Verification
- Signals: PASS | FAIL
- Tests: {passed}/{total} ({new} new)
- Lint: clean | {issues} Type Check: pass | {errors} Build: pass | fail
- Evaluator Score: {N.N}/5.0 | skipped (S-size)
- Verdict: PASS | NEEDS WORK | FAIL Iteration: {N}/5
5. Rollback Point
## Rollback Point
- Pre-execution commit: {hash}
- Recorded: {timestamp}
6. Execution Log
## Execution Log
### Wave 1 — {timestamp}
- Tasks: AW-001 (seq)
- Completed: {timestamp} — Result: all passed
7. Deferred Work
## Deferred Work
- {non-blocking discovery}, found during {task}, not in original scope
8. Convergence Summary
## Convergence Summary
### Files Changed
| File | Action | Lines |
|------|--------|-------|
| src/api/auth.ts | created | +120 |
### Tests Added
- Total new tests: {N} Coverage: {% if available}
### Key Decisions
- {decision and why}
### Suggested Commit Message
{emoji} {type}: {subject}
{body}
Status Transitions
Board-level
intake → spec → decomposing → planning → executing → verifying → converged → completed
└→ abandoned
Task-level
pending → researching → planned → in-progress → verifying → done
                                  │             └→ failed (retry)
                                  └→ blocked
| From | To | Trigger |
|---|---|---|
| pending | planned | Research + plan written in DECOMPOSE & PLAN |
| planned | in-progress | EXECUTE starts for this task |
| in-progress | verifying | Implementation complete, running checks |
| in-progress | blocked | Dependency failed or external blocker |
| verifying | done | All signals pass and evaluator (if run) passes |
| verifying | failed | Verification failed after max retries |
| blocked | in-progress | Blocker resolved |
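The transition table above can be encoded as a lookup so that an illegal status change is rejected before it is written to the board. This is a minimal sketch: the names `TaskStatus`, `ALLOWED_TRANSITIONS`, and `canTransition` are hypothetical, and only the rows listed in the table are encoded (the `researching` state from the diagram has no table row, so it is left out).

```typescript
// Task statuses that appear in the transition table (hypothetical type name).
type TaskStatus =
  | "pending" | "planned" | "in-progress"
  | "verifying" | "blocked" | "done" | "failed";

// Allowed moves, copied row by row from the table; terminal states map to [].
const ALLOWED_TRANSITIONS: Record<TaskStatus, TaskStatus[]> = {
  "pending": ["planned"],
  "planned": ["in-progress"],
  "in-progress": ["verifying", "blocked"],
  "verifying": ["done", "failed"],
  "blocked": ["in-progress"],
  "done": [],
  "failed": [],
};

// Returns true only when the table lists a row for from -> to.
function canTransition(from: TaskStatus, to: TaskStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}
```

A board writer could call `canTransition(current, next)` and refuse the update when it returns false, which keeps a task from jumping straight from pending to done.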
Resuming a Board
At the start of any invocation, if .absolute-work/board.md exists:
- Read the board; parse frontmatter and current state.
- Display a compact status summary.
- Identify the current phase and incomplete tasks (anything not done/failed).
- Resume from the current point — executing → next unfinished wave; verifying → re-run verification on unverified tasks; planning → finish remaining plans.
- Add a "Resumed at {timestamp}" entry to the Execution Log.
- Reconcile: if the codebase changed since the board's updated timestamp, flag conflicts before continuing.
If the board is completed, ask whether to start fresh (archive to archive/board-{timestamp}.md)
or review. Never overwrite a board without explicit confirmation.
Example Board (abbreviated)
---
id: aw-1717400000
title: "Add user authentication to Next.js app"
type: feature
status: executing
git_tracked: false
total_tasks: 8
completed_tasks: 3
current_wave: 2
total_waves: 4
---
## Intake Summary
- **Task**: Add email/password + Google OAuth authentication
- **Type**: feature **Complexity**: complex
- **Constraints**: NextAuth.js v5, existing Prisma + PostgreSQL
- **Spec**: docs/plans/2026-06-03-auth-design.md
- **Board Persistence**: gitignored
## Task Graph
### Sub-tasks
| ID | Title | Type | Size | Dependencies | Wave | Run | Status |
|----|-------|------|------|-------------|------|-----|--------|
| AW-001 | NextAuth config + providers | config | S | - | 1 | seq | done |
| AW-002 | User + Account Prisma models | config | S | - | 1 | parallel | done |
| AW-003 | Auth API route handler | code | M | AW-001, AW-002 | 2 | seq | in-progress |
| AW-004 | Auth middleware | code | M | AW-001 | 2 | parallel | done |
### Wave Assignments
- **Wave 1** (2): AW-001 [seq — shared config], AW-002 [parallel-safe]
- **Wave 2** (2): AW-003 [seq], AW-004 [parallel-safe]
## Rollback Point
- Pre-execution commit: a1b2c3d
execution-model.md
Execution Model
Absolute Work executes a dependency-graphed task board one safe wave at a time — the onion-peel. The defining rule is safety over speed: blockers and dependents run sequentially; only provably-independent tasks run in parallel. When in doubt, serialize.
Identifying Dependencies
Task B depends on task A if:
- B needs code/files that A creates
- B imports or uses a function/type/interface A defines
- B tests, extends, modifies, or documents A's output
- B configures infrastructure A requires
B does NOT depend on A if they modify different files with no shared interfaces and can be tested in isolation.
Dependency checklist (per task pair)
- Does B need any file A creates? → dependency
- Does B import any symbol A defines? → dependency
- Does B test code A writes? → dependency
- Can B's tests pass before A is complete? → if yes, no dependency
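The first two checklist questions are mechanical enough to sketch in code. The example below assumes each task records the files it creates and the symbols it defines; the `TaskFootprint` shape and the `dependsOn` helper are hypothetical, and the test-related questions still need human judgment.

```typescript
// Hypothetical task shape: what a task provides and what it needs.
interface TaskFootprint {
  id: string;
  createsFiles: string[];   // files this task creates
  definesSymbols: string[]; // functions/types/interfaces this task defines
  needsFiles: string[];     // files this task reads, tests, or extends
  importsSymbols: string[]; // symbols this task imports or uses
}

// True when any element of `wanted` appears in `provided`.
function overlaps(wanted: string[], provided: string[]): boolean {
  return wanted.some((item) => provided.indexOf(item) !== -1);
}

// B depends on A if B needs a file A creates or imports a symbol A defines.
function dependsOn(b: TaskFootprint, a: TaskFootprint): boolean {
  return overlaps(b.needsFiles, a.createsFiles)
    || overlaps(b.importsSymbols, a.definesSymbols);
}
```

Running `dependsOn` over every ordered task pair yields the edges of the DAG; pairs where it returns false in both directions are candidates for the same wave.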
Common DAG Patterns
Linear chain: AW-001 → AW-002 → AW-003 (no parallelism; each task is a wave)
Fan-out: AW-001 → {AW-002, AW-003, AW-004} (W1: 001; W2: the rest)
Fan-in: {AW-001, AW-002, AW-003} → AW-004 (W1: the three; W2: 004)
Diamond: 001 → {002, 003} → 004 (setup → parallel features → integration)
Independent clusters: A:001→002 B:003→004 C:005 (disconnected sub-graphs)
Layered: infra → data → logic → ui → tests (one wave per layer)
The Safety-First Wave Algorithm
Two passes: depth grouping, then a safety partition.
Pass 1 — Depth grouping (topological)
for each task:
    depth = 0 if no dependencies
            else max(dependency.depth) + 1
waves = group tasks by depth   // Wave 1 = depth 0, Wave 2 = depth 1, ...
Waves execute in strict serial order — Wave N+1 never starts until Wave N is fully verified and the user confirms.
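The depth-grouping pass can be sketched as a memoized recursion over the dependency list. This is one possible implementation, not the skill's own code: `TaskNode` and `groupIntoWaves` are hypothetical names, and the cycle and unknown-dependency checks are additions the pseudocode does not mention.

```typescript
// Hypothetical minimal task node: an id plus the ids it depends on.
interface TaskNode {
  id: string;
  dependencies: string[];
}

// Pass 1: depth = 0 with no dependencies, else max(dependency depth) + 1.
// Returns waves as arrays of task ids; index 0 is Wave 1.
function groupIntoWaves(tasks: TaskNode[]): string[][] {
  const byId: Record<string, TaskNode> = {};
  for (const task of tasks) byId[task.id] = task;

  const depth: Record<string, number> = {};
  const visiting: Record<string, boolean> = {};

  function depthOf(id: string): number {
    if (depth[id] !== undefined) return depth[id];
    const task = byId[id];
    if (!task) throw new Error("Unknown dependency: " + id);
    if (visiting[id]) throw new Error("Dependency cycle at: " + id);
    visiting[id] = true;
    let result = 0;
    for (const dep of task.dependencies) {
      result = Math.max(result, depthOf(dep) + 1);
    }
    visiting[id] = false;
    depth[id] = result;
    return result;
  }

  const waves: string[][] = [];
  for (const task of tasks) {
    const d = depthOf(task.id);
    while (waves.length <= d) waves.push([]);
    waves[d].push(task.id);
  }
  return waves;
}
```

For the diamond pattern (001 → {002, 003} → 004) this produces three waves: the setup task, the two middle tasks, then the integration task.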
Pass 2 — Safety partition (within each wave)
Classify every task in the wave as seq (sequential) or parallel:
A task is parallel-safe only if ALL hold:
- It touches a disjoint set of files from every other task in the wave
- It shares no interfaces/types being defined by another task in the wave
- It is not a blocker that a later task in the same wave reads from
- Its outcome does not change another task's plan
Otherwise it is seq — run it alone, in dependency order. Default to seq when uncertain.
Record the decision in the board's Run column with a one-line reason.
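The file and interface conditions of the safety partition can be checked mechanically; the blocker and plan-impact conditions cannot, so the sketch below models them as a hand-set `uncertain` flag. All names (`WaveTask`, `partitionWave`, `uncertain`) are hypothetical, and anything not provably independent falls back to seq.

```typescript
// Hypothetical per-task facts gathered at decomposition time.
interface WaveTask {
  id: string;
  files: string[];      // files the task creates or modifies
  interfaces: string[]; // interfaces/types the task defines or changes
  uncertain?: boolean;  // set when blocker or plan-impact questions are unresolved
}

type RunMode = "seq" | "parallel";

// True when the two lists have at least one element in common.
function sharesAny(a: string[], b: string[]): boolean {
  return a.some((item) => b.indexOf(item) !== -1);
}

// Pass 2: a task is parallel only if it is not flagged uncertain and overlaps
// no other task in the wave on files or interfaces; everything else is seq.
function partitionWave(wave: WaveTask[]): Record<string, RunMode> {
  const decision: Record<string, RunMode> = {};
  for (const task of wave) {
    const conflicts = wave.some((other) =>
      other.id !== task.id &&
      (sharesAny(task.files, other.files) ||
        sharesAny(task.interfaces, other.interfaces)));
    decision[task.id] = task.uncertain || conflicts ? "seq" : "parallel";
  }
  return decision;
}
```

Note that when two tasks share a file, both are marked seq; choosing which one runs first remains a decomposition-time decision.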
Shared-file ownership
If two same-wave tasks would touch the same file, assign that file to one owner task; the others treat it as read-only until the owner completes, or get moved to a later wave. This is the single most common source of wave-boundary conflicts — resolve it at decomposition time.
Execution within a wave
run all `seq` tasks first, one at a time, in dependency order
then run `parallel`-safe tasks concurrently (separate agents)
wait for the whole wave to finish → wave boundary checks → GATE → next wave
Running seq tasks first means the riskiest/shared work lands before independent work fans out,
so parallel agents build on a settled base.
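The ordering rule (seq tasks first, then the parallel-safe tasks together) might look like the following in an async runner. This is a sketch under assumptions: `RunnableTask` and `runWave` are hypothetical, `execute` stands in for dispatching an agent, and the input array is assumed to already be in dependency order.

```typescript
// Hypothetical unit of work: a run mode plus an async action supplied by the caller.
interface RunnableTask {
  id: string;
  run: "seq" | "parallel";
  execute: () => Promise<void>;
}

// Runs seq tasks one at a time in the given order, then starts all
// parallel tasks together and waits for the whole wave to finish.
async function runWave(tasks: RunnableTask[]): Promise<void> {
  for (const task of tasks) {
    if (task.run === "seq") await task.execute();
  }
  const parallel = tasks.filter((task) => task.run === "parallel");
  await Promise.all(parallel.map((task) => task.execute()));
  // Wave boundary checks and the GATE would follow here.
}
```

`Promise.all` rejects as soon as one parallel task rejects, so a real runner would probably catch per-task errors and record them on the board instead of aborting the wave.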
Worktree isolation (optional)
On platforms that support it, run parallel tasks in isolated worktrees when there is any residual
risk of file overlap; merge back at the wave boundary. Skip it when tasks touch clearly different
directories — the merge overhead isn't worth it.
ASCII Graph Rendering
Task Graph:
[W1] AW-001 [config: Init structure] (seq — shared config)
├──> [W2] AW-002 [code: DB schema] (seq)
└──> [W2] AW-003 [code: API router] (parallel-safe)
Wave Summary:
Wave 1 (1): AW-001 [seq]
Wave 2 (2): AW-002 [seq], AW-003 [parallel]
Pre-Execution Snapshot
Before any file is touched in Wave 1:
- Ensure the working tree is clean (commit or stash existing changes).
- Record the current commit hash on the board under ## Rollback Point.
- If execution goes catastrophically wrong, the user can git reset --hard to this commit.
Record the hash before Wave 1 begins — a mid-wave snapshot already contains partial changes.
Agent Prompt Template
Each execution agent receives a self-contained prompt from the board:
## Task: {AW-XXX} - {Title}
### Context
{description from the board}
### Project Conventions
{detected conventions — package manager, test runner, linter, directory patterns}
### Research Notes
{key files, reusable code, patterns, risks for this task}
### Execution Plan
- Files to create/modify: {list}
- Test files: {list}
- Approach: {from PLAN}
### Acceptance Criteria
{specific, verifiable conditions}
### Rules
1. Write tests FIRST, watch them fail (red), then implement (green), then refactor.
2. Run lint and type-check on modified files via the project's scripts.
3. Do NOT modify files outside your task scope.
4. Reuse existing utilities named in the research notes — do not reinvent them.
5. If blocked, STOP and report the blocker — never work around it.
6. Report: files changed, tests written, tests passing, any issues.
Template by task type
- code → full TDD template above.
- test → skip "write tests first" (the task is the tests); write happy/edge/error cases and run against the implementation.
- docs → no TDD; follow existing doc style; verify code examples are syntactically valid.
- config/infra → verify by running the relevant tool/build; check idempotency.
Wave Boundary Checks
After every task in a wave completes, before the gate:
- Conflict check — did any two agents modify the same file? Merge intelligently (prefer the change that better satisfies its acceptance criteria); if unmergeable, present both versions to the user.
- Interface compatibility — types defined by one task match those expected by another.
- Import resolution — all cross-task imports resolve.
- Combined build + tests — run the build and the wave's tests together.
- Progress report — print a compact status table:
Wave 2 complete (2/4 waves)
| Task | Run | Status | Notes |
|--------|----------|--------|-------|
| AW-003 | seq | done | |
| AW-004 | parallel | done | |
Then GATE — confirm before starting the next wave.
Scope Creep Guard
- Blocking discovery (can't finish the current task without it): add a new visible task to the DAG, place it in the current or next wave, flag it on the board, continue with other tasks.
- Non-blocking discovery (nice-to-have, adjacent cleanup): do NOT absorb it. Add it to ## Deferred Work, mention it at CONVERGE. The user decides whether to start a new session.
- Never silently expand scope — every DAG addition is visible on the board and called out in the next progress report.
Blocked Tasks
When a task is blocked:
1. Mark status `blocked` with a reason and the blocking task ID.
2. Continue executing non-blocked tasks in the wave.
3. After the wave, reassess:
   - blocker resolved → add to the next wave
   - blocker persists → flag for user attention
   - approachable differently → revise plan and retry
If a Wave N task fails and Wave N+1 tasks depend on it, mark the dependents blocked (not
failed); run the non-dependent Wave N+1 tasks normally; unblock dependents if the failure is
later fixed.
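Marking dependents of a failed task as blocked can be sketched as a small propagation loop. The `BoardTask` shape and `blockDependents` name are hypothetical, and extending the rule to transitive dependents (a task that depends on a blocked task) is an assumption beyond what the text states.

```typescript
// Hypothetical board row: status plus dependency ids.
interface BoardTask {
  id: string;
  status: string;
  dependencies: string[];
}

// Marks every task that depends, directly or transitively, on the failed
// task as "blocked" (not "failed"). Non-dependent tasks are left untouched.
// Returns the ids that were blocked, in the order they were found.
function blockDependents(tasks: BoardTask[], failedId: string): string[] {
  const blocked: string[] = [];
  const tainted: string[] = [failedId];
  let changed = true;
  while (changed) {
    changed = false;
    for (const task of tasks) {
      const alreadyTainted = tainted.indexOf(task.id) !== -1;
      const hitsTainted = task.dependencies.some((dep) => tainted.indexOf(dep) !== -1);
      if (!alreadyTainted && hitsTainted) {
        task.status = "blocked";
        tainted.push(task.id);
        blocked.push(task.id);
        changed = true;
      }
    }
  }
  return blocked;
}
```

Unblocking after a later fix is the reverse step: once the failed task reaches done, the blocked dependents go back into the next wave.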
Failure Recovery
| Failure | Action | Max Retries |
|---|---|---|
| Test failure (code bug) | Fix code, re-run tests | 2 |
| Lint/type error | Fix, re-run check | 2 |
| Build failure | Find root cause, fix | 1 |
| Agent crash/timeout | Restart with same prompt | 1 |
| Merge conflict | Resolve, re-verify | 1 |
| Fundamental approach failure | Revise plan, flag for user | 0 (needs user input) |
On retry, append the error to the agent prompt: "Previous attempt failed because: {error}. Fix
and retry." When retries are exhausted, mark the task failed, record all attempt logs, flag the
user with the rollback hash, and continue with non-dependent tasks. Never bypass tests or checks
to force a pass.
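The retry budgets and the prompt-append rule above could be wired together roughly as follows. This is a synchronous sketch for readability (real agent dispatch would be async); the failure labels, `AttemptResult`, and `runWithRetries` are hypothetical names, while the budgets and the retry message come from the table and paragraph above.

```typescript
// Retry budgets copied from the Failure Recovery table (keys are hypothetical labels).
const MAX_RETRIES: Record<string, number> = {
  "test-failure": 2,
  "lint-or-type-error": 2,
  "build-failure": 1,
  "agent-crash": 1,
  "merge-conflict": 1,
  "approach-failure": 0, // needs user input, never retried automatically
};

interface AttemptResult {
  ok: boolean;
  failure?: string; // one of the MAX_RETRIES keys
  error?: string;   // message to feed into the next prompt
}

// Runs an attempt, retrying within the budget for the reported failure kind.
// `attempt` is a caller-supplied function that takes the prompt and reports back.
function runWithRetries(
  basePrompt: string,
  attempt: (prompt: string) => AttemptResult
): { status: "done" | "failed"; attempts: number; log: string[] } {
  const log: string[] = [];
  let prompt = basePrompt;
  let retriesUsed = 0;
  for (;;) {
    const result = attempt(prompt);
    if (result.ok) return { status: "done", attempts: retriesUsed + 1, log };
    log.push(result.error || "unknown error");
    const budget = MAX_RETRIES[result.failure || ""] || 0;
    if (retriesUsed >= budget) {
      return { status: "failed", attempts: retriesUsed + 1, log };
    }
    retriesUsed += 1;
    prompt = basePrompt + "\nPrevious attempt failed because: " +
      (result.error || "unknown error") + ". Fix and retry.";
  }
}
```

The returned `log` holds every attempt's error, which matches the requirement to record all attempt logs when a task is finally marked failed.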
Performance Guidelines
- Optimal wave size: 1-3 parallel tasks (low overhead); 4-6 (good throughput); 7+ → split into sub-waves to limit failure blast radius.
- Each parallel agent consumes context and compute; in constrained environments, cap concurrency at 3-4.
- Skip parallelism when a wave has one task, when all tasks touch the same file, or when any dependency isn't fully captured in the DAG (a sign the decomposition needs another pass).
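Splitting an oversized wave into sub-waves is a plain chunking step. The helper below is a hypothetical sketch; the default cap of 3 follows the 3-4 concurrency limit suggested above for constrained environments and should be treated as a tunable assumption.

```typescript
// Splits a list of parallel-safe task ids into sub-waves no larger than
// `maxConcurrent`, preserving the original order.
function splitIntoSubWaves(taskIds: string[], maxConcurrent: number = 3): string[][] {
  if (maxConcurrent < 1) throw new Error("maxConcurrent must be at least 1");
  const subWaves: string[][] = [];
  for (let i = 0; i < taskIds.length; i += maxConcurrent) {
    subWaves.push(taskIds.slice(i, i + maxConcurrent));
  }
  return subWaves;
}
```

Seven parallel tasks with the default cap become sub-waves of 3, 3, and 1, so a failure in one sub-wave affects at most three tasks.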
intake-playbook.md
Intake Playbook
The design interview is the engine of Absolute Work. A relentless, structured interview extracts every requirement, constraint, and edge case before a single line of code is written. This playbook covers design-tree traversal, adaptive question banks per work type, codebase-first intelligence, question calibration, implicit-requirement extraction, and anti-patterns.
Design Tree Traversal
Every unit of work is a tree of decisions. Walk it depth-first, resolving each branch completely before moving to siblings. This prevents half-explored requirements from haunting the implementation later.
Rules
- Root first — start with purpose. Why does this exist? What problem does it solve?
- Depth before breadth — explore the first child fully before siblings.
- Resolve before advancing — a node is resolved when you have a clear answer, a concrete decision, or an explicit deferral. Never leave a node ambiguous.
- Backtrack on dead ends — if a branch leads to "we don't need this," mark it explicitly out of scope and backtrack.
- Dependency edges — if a node depends on another branch, resolve the blocker first.
Example tree: "Add a commenting system"
commenting-system
├── purpose (who comments? what is commentable?)
├── data-model (schema, threading flat vs nested, storage)
├── permissions (create/edit/delete, moderation)
├── ui (input, list, threading display, empty state)
├── real-time (needed? transport? optimistic updates?)
├── notifications (notify on reply? channel?)
└── edge-cases (deleted parent, deleted post, concurrent edits, spam)
By the time you reach notifications, the threading decision is already resolved, so you know
whether replies even exist. Upstream decisions constrain downstream ones — always interview in
tree order: purpose → data model → behavior → UI → edge cases.
Adaptive Question Banks by Work Type
Detect the work type, then use its bank. Scale depth to complexity (see Scaling Rules below).
Feature
| # | Question | Purpose |
|---|---|---|
| 1 | What is the feature and what user problem does it solve? | Root purpose |
| 2 | Who is the target user? Different roles? | Scope actors |
| 3 | Walk me through the user flow start to finish. | Map the journey |
| 4 | What existing features does this interact with? | Dependency map |
| 5 | What does the happy path look like? | Core behavior |
| 6 | What happens when things go wrong (network, invalid input, missing data)? | Error handling |
| 7 | Any performance requirements (response time, data volume)? | Non-functional |
| 8 | Behind a flag or always-on? | Rollout |
| 9 | What is explicitly out of scope for this version? | Scope boundary |
| 10 | How will we know it works in production? | Observability |
Bug
| # | Question | Purpose |
|---|---|---|
| 1 | Expected vs actual behavior? | Problem statement |
| 2 | How do we reproduce it (steps, environment)? | Reproduction |
| 3 | What is the impact and who is affected? | Priority |
| 4 | When did it start? Any recent changes? | Root-cause hints |
| 5 | Related bugs or known issues? | Context |
| 6 | When is this bug considered fixed? | Success criteria |
Refactor
| # | Question | Purpose |
|---|---|---|
| 1 | What is the specific pain point? | Root cause |
| 2 | What does the ideal end state look like? | Target architecture |
| 3 | Blast radius — how many files, modules, consumers? | Risk |
| 4 | Is there test coverage for the code being changed? | Safety net |
| 5 | Incremental or all-or-nothing? | Strategy |
| 6 | Downstream consumers or public APIs affected? | Breaking changes |
| 7 | Rollback plan if regressions appear? | Safety |
Greenfield
| # | Question | Purpose |
|---|---|---|
| 1 | What problem does this solve, and for whom? | Problem/user fit |
| 2 | The 3-5 core features for v1 — no more? | Scope discipline |
| 3 | Tech stack and hard constraints? | Foundation |
| 4 | Reference implementations or designs to study? | Prior art |
| 5 | High-level data model / core entities? | Data layer |
| 6 | Auth and user roles? | Auth model |
| 7 | Third-party services or APIs? | External deps |
| 8 | Deployment target? | Infrastructure |
| 9 | Testing strategy (unit, integration, e2e)? | Quality gates |
| 10 | If you could ship only one feature, which? | Prioritization |
Planning / Breakdown
Use when the user wants a vague goal turned into a sequenced plan rather than an immediate build.
| # | Question | Purpose |
|---|---|---|
| 1 | What is the end goal, stated as an outcome? | North star |
| 2 | What are the milestones between here and there? | Sequencing |
| 3 | What must ship first to unblock the rest? | Critical path |
| 4 | What is already done or in flight? | Current state |
| 5 | What are the hard deadlines or constraints? | Boundaries |
| 6 | What can be deferred to a later phase? | Scope control |
Migration
| # | Question | Purpose |
|---|---|---|
| 1 | What is being migrated, and to what? (v2→v3, JS→TS, lib A→B) | Problem statement |
| 2 | Full migration or incremental? | Strategy |
| 3 | Must old and new coexist during migration? | Constraints |
| 4 | Rollback plan if something breaks? | Safety |
| 5 | Known breaking changes? | Risk |
| 6 | Test coverage of the code being migrated? | Safety net |
| 7 | Priority order of modules to migrate? | Sequencing |
For migrations, also load migration-playbook.md — it covers call-site inventory, codemods,
incremental rollout, and backwards-compatibility in depth.
Codebase-First Intelligence
Before asking the user a question, check whether the codebase already has the answer. Every question the codebase could have answered is a wasted round-trip that erodes trust.
| Before Asking About | Search For | Where |
|---|---|---|
| Database / ORM | prisma, typeorm, mongoose, drizzle deps + config | package.json, prisma/schema.prisma, *.config.* |
| Authentication | auth middleware, JWT, session, next-auth, clerk | middleware/auth*, lib/auth*, package.json |
| Testing framework | test config + existing test files | jest.config*, vitest.config*, package.json scripts |
| State management | stores, context, redux/zustand/jotai | **/store/**, **/context/**, package.json |
| Styling | tailwind/postcss config, styled-components | tailwind.config*, *.module.css |
| Deployment | CI/CD, Dockerfiles, deploy config | .github/workflows/*, Dockerfile, vercel.json |
| Lint / format | eslint, prettier, biome | .eslintrc*, .prettierrc*, biome.json |
Protocol
For every question you are about to ask:
- Can I find it in the codebase? Search; if found, state it and skip the question.
- Can I infer it? If the project uses Prisma + Postgres, don't ask "what database?" — state it and ask the deeper question.
- Is this a fact or a preference? Facts live in code (test framework). Preferences require asking (desired coverage level, visual style, real-time vs batch).
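The "can I find it in the codebase?" step can be partly automated by matching installed packages against known signals. The sketch below only reads the dependency fields of a parsed package.json; `KNOWN_SIGNALS`, `PackageJson`, and `detectKnownFacts` are hypothetical names, and the package list is an illustrative subset of the table above, not an exhaustive one.

```typescript
// Maps a topic the interview might ask about to npm package names that answer it.
const KNOWN_SIGNALS: Record<string, string[]> = {
  "Database / ORM": ["prisma", "@prisma/client", "typeorm", "mongoose", "drizzle-orm"],
  "Authentication": ["next-auth"],
  "Testing framework": ["jest", "vitest"],
  "State management": ["redux", "zustand", "jotai"],
  "Styling": ["tailwindcss", "styled-components"],
  "Lint / format": ["eslint", "prettier"],
};

// Shape of the two package.json fields this sketch reads.
interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Returns, per topic, the installed packages that already answer the question.
// Topics with no match are omitted: those are the ones still worth asking about.
function detectKnownFacts(pkg: PackageJson): Record<string, string[]> {
  const installed = Object.keys(pkg.dependencies || {})
    .concat(Object.keys(pkg.devDependencies || {}));
  const facts: Record<string, string[]> = {};
  for (const topic of Object.keys(KNOWN_SIGNALS)) {
    const hits = KNOWN_SIGNALS[topic].filter((name) => installed.indexOf(name) !== -1);
    if (hits.length > 0) facts[topic] = hits;
  }
  return facts;
}
```

A detected fact replaces the question with a statement ("I see Prisma is installed"), leaving interview time for preferences the code cannot reveal.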
Question Calibration
Multiple choice vs open-ended
- Multiple choice when there are 2-4 known options, the user may not know terminology, or speed matters. Always include a (Recommended) option with rationale.
- Open-ended when the answer space is unbounded or you need the user's mental model.
When a question is too broad
If the user would need more than 3 sentences to answer well, split it.
| Too broad | Better |
|---|---|
| "How should the notification system work?" | "In-app, email, or both?" then "Badge, dropdown, or full page?" |
| "What are the security requirements?" | "Who can access this resource?" then "Do we need rate limiting?" |
Rules
- One decision per question. If your question contains "and," consider splitting.
- No compound conditionals. Resolve X first, then ask the follow-up.
- Ground in the codebase. "I see you use Express with middleware routing — should new auth endpoints follow the same pattern?"
- Offer a recommendation when you can, tied to project context, not popularity.
- Timebox complexity. If a question opens a 20-minute rabbit hole, flag it and offer to defer with a placeholder.
Extracting Implicit Requirements
Users say what they want; they rarely say what they need. Surface hidden requirements as follow-up questions — do not assume.
| User Says | Hidden Requirements |
|---|---|
| "Add notifications" | Channel (in-app/email/push), read state, preferences, digest mode, notification center |
| "Make it real-time" | Transport, reconnection, optimistic updates, conflict resolution, offline handling |
| "Add user roles" | Permission model, assignment UI, hierarchy, admin override, audit logging |
| "Support file uploads" | Max size, formats, virus scanning, storage backend, progress, resume, thumbnails |
| "Add search" | Full-text vs exact, indexing, debounce, highlighting, facets, empty state, pagination |
| "Make it work offline" | Sync strategy, conflict resolution, storage limits, cache invalidation, sync status |
| "Deploy to production" | CI/CD, env config, monitoring, rollback plan |
Extraction protocol
- Acknowledge the stated requirement.
- Surface the 2-3 most architecture-affecting hidden requirements as questions.
- Do not dump all hidden requirements at once — prioritize, then circle back to polish.
Scaling Rules
| Tier | When | Questions |
|---|---|---|
| Simple | 1-2 files, clear scope, no external deps | 3 — always: problem, success criteria, constraints |
| Medium | 3-5 files / 2+ components, some ambiguity | 5 — add: existing code context, dependencies |
| Complex | 6+ files, cross-cutting, greenfield, migration | 8-10 — add: edge cases, testing, docs, rollout, priority |
Heuristic: count files touched, presence of external deps, scope definition, and whether data migration/backwards-compat is involved. When in doubt, ask one more question, not one fewer.
Anti-Patterns
- Asking what the codebase can answer — search configs and deps first.
- Batching unrelated questions — the user answers the easy one and skips the hard one. One at a time.
- Implementation before purpose — resolve what and why before how. Transport choices without context are coin flips.
- Accepting vague answers — "handle errors gracefully" means something different to everyone. Ask for a concrete example.
- Skipping error/edge branches — edge cases are where bugs live. Ask "what happens at 0 items? 10,000? on failure?"
- Leading questions — "we should use Redis here, right?" confirms your bias. Ask the open question.
- Skipping the out-of-scope conversation — without explicit scoping, the feature grows silently.
- Interviewing out of order — designing UI before the data model risks designing something the data can't support.
migration-playbook.md
Migration Playbook
Migrations are the highest-risk work type Absolute Work handles: the code already runs in production, real users depend on it, and a botched migration breaks things that worked yesterday. This playbook makes migrations a first-class flow — safe, incremental, and reversible.
The core principle: never big-bang a migration you can do incrementally. Keep old and new coexisting behind a seam, move call sites in small verified batches, and keep a rollback at every step.
Migration Types
| Type | Example | Primary Risk |
|---|---|---|
| Language/syntax | JS → TS, CommonJS → ESM | Type errors, build config, partial coverage |
| Library swap | moment → date-fns, Enzyme → Testing Library | API mismatch, behavioral differences |
| Framework version | React 17 → 18, Next 13 → 14 | Breaking changes, deprecated APIs |
| API/contract | REST v2 → v3, schema change | Consumer breakage, data shape drift |
| Data/schema | column rename, table split, DB engine | Data loss, downtime, dual-write complexity |
| Infrastructure | provider A → B, monolith → services | Config drift, cutover coordination |
Identify the type during intake — it determines which sections below apply.
Phase A: Call-Site Inventory (before any code)
You cannot safely migrate what you have not counted. Build a complete inventory of everything that touches the thing being migrated.
Steps
- Find the surface. Grep for the symbol, import, endpoint, or pattern being migrated. Capture every hit with file:line.
- Classify each call site by shape — most migrations have 3-5 distinct usage patterns plus a long tail of one-offs.
- Count and bucket. Record totals per pattern on the board. This sizes the migration and reveals whether a codemod is worth writing.
- Find the blind spots. Dynamic usage (string-built imports, reflection, config-driven dispatch) won't show up in a grep. List where these could hide.
- Map consumers. For API/contract/library migrations, identify external consumers (other services, published packages, clients) that a grep of this repo will miss.
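The count-and-bucket step can be sketched as a tally over classified grep hits. The `CallSite` and `PatternSummary` shapes and the `summarizeInventory` function are hypothetical; classification into patterns is assumed to have happened already, by hand or by a separate matcher.

```typescript
// One grep hit, already classified by usage pattern (hypothetical shape).
interface CallSite {
  file: string;
  line: number;
  pattern: string;
}

interface PatternSummary {
  pattern: string;
  example: string; // first hit for the pattern, formatted as file:line
  count: number;
}

// Buckets call sites by pattern, keeping the first hit as the example,
// and reports the totals needed for the inventory footer.
function summarizeInventory(sites: CallSite[]): {
  rows: PatternSummary[];
  totalSites: number;
  totalFiles: number;
} {
  const rows: PatternSummary[] = [];
  const files: string[] = [];
  for (const site of sites) {
    if (files.indexOf(site.file) === -1) files.push(site.file);
    const existing = rows.filter((row) => row.pattern === site.pattern)[0];
    if (existing) {
      existing.count += 1;
    } else {
      rows.push({ pattern: site.pattern, example: site.file + ":" + site.line, count: 1 });
    }
  }
  return { rows, totalSites: sites.length, totalFiles: files.length };
}
```

Re-running the same tally after each batch gives the "migrated N / total" figure without recounting by hand.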
Inventory table (write to the board)
## Migration Inventory
| Pattern | Example call site | Count | Codemod-able? | Notes |
|---------|-------------------|-------|---------------|-------|
| direct import | src/a.ts:12 | 47 | yes | mechanical rename |
| wrapped helper | src/lib/fmt.ts:8 | 1 | n/a | central shim point |
| dynamic dispatch | src/router.ts:30 | 3 | no | manual review |
Total call sites: 51 across 23 files
Phase B: Choose the Strategy
| Strategy | Use When | Trade-off |
|---|---|---|
| Incremental (strangler) | Large surface, production code, old+new can coexist | Safest; slower; needs a coexistence seam |
| Codemod-driven | Many mechanical, uniform call sites | Fast for the uniform 80%; the tail is still manual |
| Parallel-run / shadow | Behavior must be proven identical (data, money) | Highest confidence; most setup |
| Big-bang | Tiny surface (< ~10 sites) or hard cutover required | Fast; only safe when blast radius is small and fully tested |
Default to incremental unless the surface is genuinely tiny. State the chosen strategy and its rationale on the board and in the spec.
Phase C: Establish the Coexistence Seam
For incremental migrations, old and new must run side by side. Create a seam so call sites can be moved one batch at a time without a flag day.
- Adapter/shim — a thin wrapper exposing the new implementation behind the old signature (or vice versa), so call sites switch without changing shape.
- Feature flag — gate new behavior so it can be toggled per environment or per user, and reverted instantly.
- Dual-write (data migrations) — write to both old and new stores during transition; read from old until new is verified, then flip reads, then stop writing old.
- Version negotiation (APIs) — serve both v-old and v-new; deprecate v-old only after consumers move.
The seam is itself a task on the board, and usually a blocker that must complete sequentially before any call-site batch runs.
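An adapter/shim seam might look like the following: the old positional signature stays in place and delegates to the new implementation, so call sites can move one batch at a time. Both functions are invented for illustration (`formatDate` as the old surface, `formatDateV2` as the new one); they do not correspond to any real library's API.

```typescript
// New implementation (hypothetical): takes an options object.
function formatDateV2(date: Date, options: { separator: string }): string {
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return [
    String(date.getUTCFullYear()),
    pad(date.getUTCMonth() + 1),
    pad(date.getUTCDate()),
  ].join(options.separator);
}

// Shim: keeps the old positional signature so existing call sites compile
// unchanged, but delegates to the new implementation.
function formatDate(date: Date, separator: string = "-"): string {
  return formatDateV2(date, { separator });
}
```

Once every call site uses `formatDateV2` directly, removing `formatDate` becomes its own task, gated on the inventory count for the old name reaching zero.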
Phase D: Incremental Rollout
Move the surface in small, independently verifiable batches — this is the onion-peel applied to migrations.
for each batch of call sites (grouped by module or pattern):
    1. migrate the batch (codemod for mechanical, manual for the tail)
    2. run tests for the affected modules → must stay green
    3. typecheck / build → must pass
    4. commit-worthy checkpoint (suggest commit; user commits)
    5. update the inventory: migrated N / total
Batch sizing: keep each batch small enough that if it breaks, the cause is obvious. Group by module boundary or by usage pattern. Migrate the central shim/helper first (one change unblocks many), then the mechanical bulk, then the manual tail last.
Codemod guidance
- Write the codemod against the patterns found in the inventory; dry-run it and diff before applying.
- Codemods handle the uniform majority; they will not handle dynamic dispatch, comments, or unusual formatting. Always hand-review the tail.
- Re-run the inventory grep after the codemod to confirm the count dropped to the expected remainder.
Phase E: Backwards Compatibility
| Concern | Handling |
|---|---|
| External consumers | Keep the old surface working (deprecated) until consumers migrate; announce a removal timeline |
| Persisted data | New code must read old-format data; migrate data lazily on read or via a background job |
| Serialized contracts | Version payloads; tolerate missing/extra fields during transition |
| Public API | Additive changes only during transition; breaking removals happen in a later, separate release |
Backwards-compat shims are temporary debt: add a task to the Deferred Work section to remove them once the migration completes and consumers have moved.
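The persisted-data and serialized-contract rows describe a tolerant reader: new code accepts the old shape, ignores extra fields, and fills in defaults. A minimal sketch follows, using an invented `UserRecord` whose old payloads carried `name` and whose new payloads carry `displayName` plus `schemaVersion`; every field name here is an assumption made for illustration.

```typescript
// Current in-memory shape (hypothetical example type).
interface UserRecord {
  id: string;
  displayName: string;
  schemaVersion: number;
}

// Reads both the old payload ({ id, name }) and the new one
// ({ id, displayName, schemaVersion }). Unknown extra fields are ignored,
// and a missing schemaVersion is treated as version 1.
function readUserRecord(raw: Record<string, unknown>): UserRecord {
  const id = raw["id"];
  const newName = raw["displayName"];
  const oldName = raw["name"];
  const version = raw["schemaVersion"];
  if (typeof id !== "string") throw new Error("id is required");
  const displayName =
    typeof newName === "string" ? newName :
    typeof oldName === "string" ? oldName : "";
  return {
    id,
    displayName,
    schemaVersion: typeof version === "number" ? version : 1,
  };
}
```

Lazy migration on read then amounts to writing the returned record back in the new format whenever `schemaVersion` came out as 1.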
Phase F: Rollback Plan
Every migration records how to undo it, at every checkpoint — not just at the end.
- Snapshot — the pre-migration commit hash on the board (## Rollback Point).
- Per-batch reversibility — each batch is a clean checkpoint the user can revert to.
- Flag kill-switch — if behind a feature flag, the rollback is flipping the flag, no redeploy.
- Data rollback — for dual-write, document how to stop writing new and resume reading old; for destructive schema changes, ensure a backup exists before the change.
- Cutover criteria — define what must be true before the old path is removed (all call sites moved, consumers migrated, monitoring clean for N days).
Never remove the old path in the same step that introduces the new one. Removal is its own task, gated on the cutover criteria being met.
Migration Anti-Patterns
| Anti-Pattern | Better Approach |
|---|---|
| Migrating before inventorying call sites | Grep and count everything first — you can't migrate what you haven't found |
| Big-bang on a large surface | Incremental batches behind a coexistence seam |
| Removing the old path alongside adding the new | Coexist first; remove only after cutover criteria are met |
| Codemod with no dry-run/diff review | Dry-run, diff, then apply; always hand-review the tail |
| Ignoring dynamic/reflective usage | List blind spots explicitly; grep won't catch string-built dispatch |
| No rollback until the very end | Every batch is a reversible checkpoint; snapshot before Wave 1 |
| Forgetting external consumers | Keep old contract alive and deprecated until consumers move |
| Leaving compat shims forever | Track shim removal in Deferred Work, gated on cutover |
spec-writing.md
Spec Writing
Reference for producing the design spec during Phase 2. Covers the document template, section scaling rules, writing style, the decision log, and the scored review protocol.
Spec Document Template
Write to docs/plans/YYYY-MM-DD-<topic>-design.md where <topic> is a short kebab-case slug
(e.g. 2026-06-03-commenting-system-design.md).
# [Topic] Design Spec
## Summary
<!-- 2-3 sentences. What is being built and why. -->
## Context
<!-- What exists today. Why this change is needed. Link relevant code paths. -->
## Design
### Architecture
<!-- How the pieces fit together. ASCII diagram or description. -->
### Components
<!-- Each new/modified component with its responsibility and file path. -->
### Data Model
<!-- Schemas, tables, types. Code blocks for definitions. -->
### Interfaces / API Surface
<!-- Endpoints, function signatures, event contracts. Code blocks. -->
### Data Flow
<!-- Step-by-step for the key operations. -->
## Error Handling
<!-- Failure modes, retry strategy, user-facing error states. -->
## Testing Strategy
<!-- What to test, how, and at what level (unit/integration/e2e). -->
## Migration Path
<!-- Steps from current to new state. Remove if not applicable. -->
## Open Questions
<!-- Unresolved items. Remove if none remain. -->
## Decision Log
<!-- Key decisions from the interview. See format below. -->
Section Scaling Rules
Scale depth to complexity. Remove sections that would only say "N/A".
Simple (config change, utility, small fix) — ~1 page
Summary (2-3 sentences), Context (1-2 sentences), Components (bullets of what changes), Data Model / Interfaces only if changed, Testing Strategy (which tests). Skip Architecture, Data Flow, Migration Path, Open Questions, Decision Log.
Medium (new component, endpoint, moderate feature) — 2-3 pages
All core sections at moderate depth: Architecture (brief, no diagram needed), Components (table with name/responsibility/path), full Data Model in a code block, full Interfaces with request/response shapes, Data Flow (numbered steps), Error Handling (table), Testing Strategy (specific cases), Decision Log.
Complex (new system, migration, cross-cutting) — 4-6 pages
Every section at full depth: Architecture with a diagram and component relationships, Components table with dependencies, full schemas with relationships and indexes, all endpoints/functions with full types, primary + secondary data flows, comprehensive error handling with retry logic, test matrix by type, phased Migration Path with rollback, Open Questions with owners, full Decision Log.
Complexity heuristic
| Signal | Simple | Medium | Complex |
|---|---|---|---|
| Files touched | 1-2 | 3-8 | 9+ |
| New components | 0 | 1-2 | 3+ |
| External deps | 0 | 0-1 | 2+ |
| Data model changes | none/trivial | new table/type | schema migration |
| Cross-cutting | no | maybe | yes |
Writing Style
Be concrete, not abstract
| Bad | Good |
|---|---|
| "An endpoint for comments" | POST /api/posts/:postId/comments |
| "A component that shows comments" | src/components/CommentThread.tsx |
| "Some database table" | comments table: id, post_id, author_id, body, created_at |
| "We'll handle errors" | Return 422 { error: "body_required" } when the body is empty |
Include file paths relative to repo root
The auth middleware at src/middleware/auth.ts validates the JWT before the request reaches src/api/comments/create.ts.
Use tables for comparisons and code blocks for interfaces/schemas
interface CreateCommentRequest {
postId: string;
body: string;
parentId?: string; // for threaded replies
}
YAGNI
Remove anything not directly needed: don't spec future phases unless they constrain the current design, don't add "nice to have" sections, don't include sections that only say "N/A", fold one-sentence sections into a neighbor.
Decision Log Format
Record every decision where more than one reasonable option existed. The Rationale column is the most important — it prevents future re-litigation.
| Decision | Options Considered | Chosen | Rationale |
|---|---|---|---|
| Database for comments | PostgreSQL, MongoDB, SQLite | PostgreSQL | Already in stack, ACID for threading, full-text search |
| Comment nesting depth | unlimited, flat, 2-level | 2-level | Simple UI, covers 90% of cases, avoids recursive queries |
| Auth for commenting | anonymous, logged-in, mixed | logged-in only | Reduces spam, simplifies moderation, matches existing auth |
Include both decisions the user made explicitly and ones you recommended. Keep each cell to 1-2 sentences.
Scored Spec Review Protocol
After writing the spec, dispatch a separate reviewer subagent (generator-evaluator separation — the agent that wrote the spec does not review it).
Rubric
| Criterion | Weight | 1 (Fail) | 3 (Acceptable) | 5 (Excellent) |
|---|---|---|---|---|
| Completeness | 25% | TODOs, missing sections | Required sections present but thin | Every section substantive for its tier |
| Consistency | 20% | Names/types contradict | Mostly consistent, minor mismatches | All names, types, paths match perfectly |
| Clarity | 20% | Ambiguous, needs author to interpret | Clear to someone with project context | An unfamiliar dev can build from it |
| Scope | 15% | Creep or missing agreed features | Covers discussed topics | Exactly what was discussed |
| Testability | 20% | Vague "test the happy path" | Test cases listed but generic | Specific cases with inputs/outputs |
Thresholds
| Weighted Score | Verdict | Action |
|---|---|---|
| 4.0 - 5.0 | Approved | Proceed to user review |
| 3.0 to < 4.0 | Needs Work | Fix flagged issues, re-dispatch (max 3 iterations) |
| < 3.0 | Major Gaps | Surface to the user immediately, do not iterate |
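As a worked example of the arithmetic, the weighted score is the sum of each criterion score times its rubric weight. The sketch below uses the weights from the rubric; the function names `weightedScore` and `verdictFor` are hypothetical, rounding to two decimals is an assumption, and a score between 3.9 and 4.0 is assumed to fall under Needs Work.

```typescript
type Criterion = "completeness" | "consistency" | "clarity" | "scope" | "testability";
type Verdict = "Approved" | "Needs Work" | "Major Gaps";

// Weights copied from the rubric table; they sum to 1.0.
const SPEC_WEIGHTS: Record<Criterion, number> = {
  completeness: 0.25,
  consistency: 0.2,
  clarity: 0.2,
  scope: 0.15,
  testability: 0.2,
};

// Sum of score * weight, rounded to two decimals (rounding is an assumption).
function weightedScore(scores: Record<Criterion, number>): number {
  const total = (Object.keys(SPEC_WEIGHTS) as Criterion[]).reduce(
    (sum, c) => sum + scores[c] * SPEC_WEIGHTS[c],
    0,
  );
  return Math.round(total * 100) / 100;
}

function verdictFor(score: number): Verdict {
  if (score >= 4.0) return "Approved";
  if (score >= 3.0) return "Needs Work"; // assumed to include 3.9 < score < 4.0
  return "Major Gaps";
}

// 4*0.25 + 5*0.20 + 3*0.20 + 4*0.15 + 3*0.20 = 3.8
const example = weightedScore({
  completeness: 4, consistency: 5, clarity: 3, scope: 4, testability: 3,
});
console.log(example, verdictFor(example));
```

A spec with strong consistency but thin clarity and testability lands at 3.8 here, so it would go back for fixes instead of reaching the user.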
Reviewer prompt template
You are an independent spec reviewer. Grade this spec skeptically.
Do not give benefit of the doubt on vague sections.
Spec complexity tier: [SIMPLE | MEDIUM | COMPLEX]
--- BEGIN SPEC ---
{spec_content}
--- END SPEC ---
--- BEGIN INTERVIEW CONTEXT ---
{interview_summary}
--- END INTERVIEW CONTEXT ---
Score each criterion 1-5 using the rubric. Output (STRICT):
## Spec Review
- **Completeness**: {score}/5 - {justification}
- **Consistency**: {score}/5 - {justification}
- **Clarity**: {score}/5 - {justification}
- **Scope**: {score}/5 - {justification}
- **Testability**: {score}/5 - {justification}
- **Weighted Score**: {calculated}/5.0
- **Verdict**: Approved | Needs Work | Major Gaps
## Specific Issues (required if score < 4.0)
- [Section]: what is wrong and how to fix it
## What Was Done Well
- {1-2 strengths}
Reviewer approval is necessary but not sufficient: the user gate in Phase 2 is mandatory regardless of the reviewer's verdict.
Example Spec (abbreviated, medium tier)
# Commenting System Design Spec
## Summary
Add threaded comments (one level deep) to blog posts for logged-in users.
## Context
Blog at `src/app/blog/` uses Prisma + PostgreSQL. No commenting today.
## Design
### Architecture
New API routes at `/api/posts/:postId/comments`, new `comments` table,
React components in `src/components/comments/`.
### Components
| Component | Responsibility | File Path |
|---|---|---|
| CommentThread | Renders comments + replies | `src/components/comments/CommentThread.tsx` |
| CommentForm | Input form | `src/components/comments/CommentForm.tsx` |
| comments API | CRUD endpoints | `src/app/api/posts/[postId]/comments/route.ts` |
### Data Model
Comment: id, body, postId, authorId, parentId (nullable), createdAt, updatedAt.
Indexes on [postId, createdAt] and [parentId].
## Testing Strategy
11 tests: 8 integration (CRUD + auth + pagination + nesting), 2 unit, 1 e2e.
## Decision Log
| Decision | Chosen | Rationale |
|---|---|---|
| Nesting depth | 2-level | Avoids recursive queries, covers 90% of cases |
| Pagination | Cursor-based | Reliable with concurrent inserts |
verification-framework.md
Verification Framework
Every task proves it works before closing. Verification runs in two layers — signals (objective, binary) and evaluator (subjective, scored) — with generator-evaluator separation throughout. This reference also defines the three mandatory tail tasks that close every board.
TDD Workflow Per Task
Red → Green → Refactor
RED: write tests describing the desired behavior → tests FAIL
GREEN: write the minimum code to pass → tests PASS
REFACTOR: clean up while keeping tests green → tests PASS
Steps
- Read the acceptance criteria from the task's plan.
- Write test file(s) encoding each criterion as a test case.
- Run tests — confirm they FAIL (red proves the tests are meaningful).
- Implement to make each test pass, one at a time.
- Run tests — confirm they PASS (green).
- Refactor — rename, extract, simplify — keeping tests green.
- Final run — all tests pass, lint clean, types check.
Test categories per task
| Category | What | Priority |
|---|---|---|
| Happy path | Primary use case works | Required |
| Edge cases | Boundaries, empty, nulls | Required |
| Error handling | Invalid inputs, failure modes | Required |
| Integration | Interaction with other components | If applicable |
Follow the project's existing test-naming convention; if none, use
describe("Thing", () => it("should X when Y")).
Layer 1: Verification Signals (Binary Gate)
Run via the project's own scripts — never raw tools.
| Signal | Example command | Required | Notes |
|---|---|---|---|
| Tests | npm test / pytest / project cmd | Always | All new + existing tests pass |
| Lint | npm run lint / project cmd | Always | Zero new warnings/errors |
| Type Check | tsc --noEmit / mypy | If typed | No new type errors |
| Build | npm run build / project cmd | If applicable | Project still builds |
| Format | prettier --check / black --check | If configured | Matches project format |
Detect available commands from package.json scripts, Makefile, pyproject.toml, and CI config.
If ANY signal fails, the task goes straight back to the generator to fix (up to 2 signal retries).
Signal failures are unambiguous — no evaluator needed to diagnose a failing test or broken build.
If time-constrained, verify in priority order: Tests → Build → Type Check → Lint → Format.
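The binary gate can be sketched as two small functions: one that plans which signals to run from the scripts a project defines, and one that reports failures in the priority order given above. This is a sketch under stated assumptions. The script names in `SCRIPT_FOR` (in particular `typecheck` and `format:check`) are hypothetical, and real projects should be inspected for their own names.

```typescript
type Signal = "tests" | "build" | "typecheck" | "lint" | "format";
type Outcome = "pass" | "fail";

// Priority order from the text: Tests, Build, Type Check, Lint, Format.
const PRIORITY: Signal[] = ["tests", "build", "typecheck", "lint", "format"];

// Hypothetical mapping from signals to package.json script names.
const SCRIPT_FOR: Record<Signal, string> = {
  tests: "test",
  build: "build",
  typecheck: "typecheck",
  lint: "lint",
  format: "format:check",
};

// Keep only the signals the project has a script for, in priority order.
function planSignals(
  scripts: Record<string, string>,
): { signal: Signal; command: string }[] {
  return PRIORITY.filter((s) => scripts[SCRIPT_FOR[s]] !== undefined).map((s) => ({
    signal: s,
    command: "npm run " + SCRIPT_FOR[s],
  }));
}

// Binary gate: any failing signal blocks the task. Signals that were not run are ignored.
function signalGate(
  results: Partial<Record<Signal, Outcome>>,
): { ok: boolean; failed: Signal[] } {
  const failed = PRIORITY.filter((s) => results[s] === "fail");
  return { ok: failed.length === 0, failed };
}
```

Calling `planSignals` with the `scripts` object parsed from package.json yields the commands to run; feeding their outcomes to `signalGate` tells the orchestrator whether the task returns to the generator.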
Layer 2: The Evaluator (Scored Gate)
If all signals pass, dispatch a separate evaluator subagent. Self-evaluation has systematic bias — generators over-praise their own work and talk themselves out of legitimate issues. A fresh, skeptical context sees the work as a reviewer would.
Complexity gating
| Size | Evaluator | Rationale |
|---|---|---|
| S (< 50 lines) | Signals only | Binary signals catch most issues in small changes |
| M (50-200 lines) | Full evaluator | Subjective quality matters at this scale |
| Any failed task | Mandatory full evaluator | Failure means the task is at the capability edge |
| Touches shared interfaces | Mandatory full evaluator | Integration risk demands independent review |
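The gating table reduces to a short predicate. In this sketch the `TaskInfo` shape and the function name are hypothetical, and because the table lists no size above 200 lines, the sketch assumes anything at or above 50 changed lines gets the full evaluator.

```typescript
// Hypothetical summary of a task at the moment its signals have passed.
interface TaskInfo {
  linesChanged: number;
  previouslyFailed: boolean;
  touchesSharedInterfaces: boolean;
}

function needsFullEvaluator(t: TaskInfo): boolean {
  // The two "mandatory" rows override size.
  if (t.previouslyFailed || t.touchesSharedInterfaces) return true;
  // S (< 50 lines) is signals-only; larger changes are evaluated (assumption above 200).
  return t.linesChanged >= 50;
}
```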
Scored rubric (code tasks)
| Dimension | Weight | 1 (Fail) | 3 (Acceptable) | 5 (Excellent) |
|---|---|---|---|---|
| Correctness | 30% | Tests fail / criteria unmet | Tests pass, basic criteria met | All criteria incl. edge cases, no regressions |
| Code Quality | 20% | Ignores conventions, unclear | Follows conventions, readable | Clean, idiomatic, extends existing patterns |
| Completeness | 20% | Partial, TODOs left | All stated criteria addressed | Handles implicit requirements too |
| Test Coverage | 15% | No/trivial tests | Happy path tested | Edge, error, and boundary cases tested |
| Integration Safety | 15% | Broken imports/types | Builds, existing tests pass | No warnings, clean integration |
Rubric adaptations: test tasks → swap Code Quality for Assertion Quality, raise Test Coverage
to 25%. docs tasks → Clarity/Accuracy replaces Code Quality; Coverage/Completeness replaces Test
Coverage. config/infra → Integration Safety to 25%, add an Idempotency check.
Thresholds
| Weighted Score | Verdict | Action |
|---|---|---|
| 4.0 - 5.0 | PASS | Proceed |
| 3.0 to < 4.0 | NEEDS WORK | Generator gets specific feedback, retries |
| < 3.0 | FAIL | Escalate to the user |
Evaluator prompt template
## Evaluator Task: {AW-XXX} - {Title}
You are an independent evaluator. Grade this work skeptically and honestly.
Do not assume good intent. Look for gaps, shortcuts, incomplete work, hidden bugs.
### Acceptance Criteria
{criteria — what "done" means}
### Files Modified / Git Diff
{the actual diff of all changes}
### Test Output / Lint / Type / Build Output
{verification signal results}
### Project Conventions
{detected conventions}
Score each dimension 1-5 using the rubric. Output (STRICT):
#### Evaluation: AW-{XXX}
- **Correctness**: {score}/5 - {justification}
- **Code Quality**: {score}/5 - {justification}
- **Completeness**: {score}/5 - {justification}
- **Test Coverage**: {score}/5 - {justification}
- **Integration Safety**: {score}/5 - {justification}
- **Weighted Score**: {calculated}/5.0
- **Verdict**: PASS | NEEDS WORK | FAIL
#### Specific Feedback (required if score < 4.0)
- What is wrong, what the fix looks like, which files need changes.
#### Bugs Found
- {file:line references}
#### What Was Done Well
- {1-2 strengths}
The evaluator receives only outputs and criteria, never the generator's reasoning. This prevents it from rationalizing decisions it should be questioning.
Iterative refinement loop
for iteration in 1..5:
    evaluator grades the work
    if weighted score >= 4.0: PASS, exit
    if iteration >= 3 AND score stagnant/declining: FAIL, escalate with full history
    if score < 3.0 on any iteration: FAIL immediately, escalate
    else: generator makes TARGETED fixes from the feedback (not a rewrite); loop
Track scores per iteration on the board. A dropping score means the generator is thrashing: stop and escalate. On retry the generator receives: per-dimension scores, the specific-feedback section, the bugs list, and the instruction "Fix ONLY what the evaluator flagged."
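The loop's exit conditions can be sketched as a pure function over the score history recorded on the board. Two points are assumptions, not stated in the source: "stagnant/declining" is read as the latest score being no higher than the previous one, and reaching the fifth iteration without a pass is treated as an escalation. The name `decide` is hypothetical.

```typescript
type Decision = "PASS" | "RETRY" | "FAIL";

const MAX_ITERATIONS = 5;

// history holds one weighted score per completed evaluator run, oldest first.
function decide(history: number[]): Decision {
  const iteration = history.length;
  const score = history[iteration - 1];
  if (score === undefined) throw new Error("history needs at least one score");

  if (score >= 4.0) return "PASS";

  // Assumed reading of "stagnant/declining": no improvement over the previous run.
  const previous = history[iteration - 2];
  if (iteration >= 3 && previous !== undefined && score <= previous) return "FAIL";

  if (score < 3.0) return "FAIL";

  // Assumption: running out of iterations escalates instead of looping again.
  return iteration >= MAX_ITERATIONS ? "FAIL" : "RETRY";
}
```

Keeping the decision separate from the dispatch code makes it easy to replay a board's recorded scores and check why a task was escalated.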
Platform fallback
Without subagent support, switch context explicitly: "— EVALUATOR MODE — I evaluate the work I just completed as a different, skeptical reviewer. I do not reference my implementation intent; I judge only what is visible in the code and test output." Weaker than true separation, but strictly better than no gate.
Integration Verification (post-wave)
After a wave, if its tasks share dependencies or feed the next wave: verify import resolution, run the combined test suite (or tests for all files modified in the wave), run the build (watch for circular-dependency warnings), and smoke-test at runtime if applicable. After the final wave, run the FULL suite and build to catch regressions before CONVERGE.
What NOT to Do on Failure
- Do NOT suppress or skip failing tests.
- Do NOT add @ts-ignore, // eslint-disable, or # type: ignore to pass checks.
- Do NOT reduce coverage or modify existing passing tests to accommodate a bug.
- Do NOT mark a task done while any signal fails.
Mandatory Tail Tasks
Every task graph ends with these three, in order. They are real tasks on the board with acceptance criteria — not afterthoughts.
Third-to-last — Self Code Review (review)
Review all changes since the rollback point. Work the review pyramid bottom-up: Security →
Correctness → Performance → Design → Readability → Convention → Testing. Classify findings
[MAJOR] / [MINOR]. Fix all [MAJOR] immediately and reasonable [MINOR]. Re-run after fixes.
Acceptance: zero [MAJOR] remaining; all [MINOR] documented (fixed or explicitly deferred).
Depends on all implementation/test/docs tasks. If absolute-simplify is installed, run it on the
working changes here.
Second-to-last — Requirements Validation (verify)
Compare all changes against the original prompt, intake summary, and spec. Verify every success criterion and constraint is satisfied. Acceptance: every success criterion demonstrably met; gaps loop back to EXECUTE until resolved. Depends on the self code review.
Last — Full Project Verification (verify)
Run all available checks via the project's package-manager scripts, skipping any not configured:
Tests → Lint → Typecheck → Build. Acceptance: all available checks pass; failures are fixed
and re-run until green. Do not mark the board completed until every available check passes and
its output is recorded on the board. Depends on requirements validation.
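The dependency chain of the three tail tasks can be sketched as a helper that appends them to an existing task list. The `BoardTask` shape, the function name, and the zero-padded `AW-` ID format (inferred from the `{AW-XXX}` placeholder in the evaluator template) are assumptions for illustration, not the board's actual schema.

```typescript
// Hypothetical board task shape.
interface BoardTask {
  id: string;
  title: string;
  type: "implement" | "test" | "docs" | "review" | "verify";
  dependsOn: string[];
}

// Assumed ID format: "AW-" plus a three-digit number (valid for fewer than 1000 tasks).
const taskId = (n: number): string => "AW-" + ("00" + n).slice(-3);

function appendTailTasks(tasks: BoardTask[], nextNumber: number): BoardTask[] {
  const review: BoardTask = {
    id: taskId(nextNumber),
    title: "Self Code Review",
    type: "review",
    dependsOn: tasks.map((t) => t.id), // depends on every implementation/test/docs task
  };
  const validation: BoardTask = {
    id: taskId(nextNumber + 1),
    title: "Requirements Validation",
    type: "verify",
    dependsOn: [review.id],
  };
  const fullVerify: BoardTask = {
    id: taskId(nextNumber + 2),
    title: "Full Project Verification",
    type: "verify",
    dependsOn: [validation.id],
  };
  return [...tasks, review, validation, fullVerify];
}
```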
Frequently Asked Questions
How do I install absolute-work?
Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill absolute-work in your terminal. The skill will be immediately available in your AI coding agent.
What AI agents support absolute-work?
absolute-work works with claude-code, gemini-cli, openai-codex, and mcp. Install it once and use it across any supported AI coding agent.
Is absolute-work free?
Yes, absolute-work is completely free and open source under the MIT license. Install it with a single command and start using it immediately.
What is the difference between absolute-work and similar tools?
absolute-work is an AI agent skill that teaches your coding agent specialized workflow knowledge. Unlike standalone tools, it integrates directly into claude-code, gemini-cli, openai-codex and other AI agents.
Can I use absolute-work with Cursor or Windsurf?
absolute-work works with any AI coding agent that supports the skills protocol, including Claude Code, Cursor, Windsurf, GitHub Copilot, Gemini CLI, and 40+ more.