ai-agent-design
Use this skill when designing AI agent architectures, implementing tool use, building multi-agent systems, or creating agent memory. Triggers on AI agents, tool calling, agent loops, ReAct pattern, multi-agent orchestration, agent memory, planning strategies, agent evaluation, and any task requiring autonomous AI agent design.
What is ai-agent-design?
ai-agent-design is a production-ready AI agent skill for claude-code, gemini-cli, and openai-codex. It covers designing AI agent architectures, implementing tool use, building multi-agent systems, and creating agent memory.
Quick Facts
| Field | Value |
|---|---|
| Category | ai-ml |
| Version | 0.1.0 |
| Platforms | claude-code, gemini-cli, openai-codex |
| License | MIT |
How to Install
- Make sure you have Node.js installed on your machine.
- Run the following command in your terminal:
npx skills add AbsolutelySkilled/AbsolutelySkilled --skill ai-agent-design
- The ai-agent-design skill is now available in your AI coding agent (Claude Code, Gemini CLI, OpenAI Codex, etc.).
Overview
AI agents are autonomous LLM-powered systems that perceive their environment, decide on actions, execute tools, observe outcomes, and iterate toward a goal. Effective agent design requires deliberate choices about the loop structure, tool schemas, memory strategy, failure modes, and evaluation methodology.
Tags
agents multi-agent tool-use planning memory orchestration
Platforms
- claude-code
- gemini-cli
- openai-codex
Frequently Asked Questions
What is ai-agent-design?
Use this skill when designing AI agent architectures, implementing tool use, building multi-agent systems, or creating agent memory. Triggers on AI agents, tool calling, agent loops, ReAct pattern, multi-agent orchestration, agent memory, planning strategies, agent evaluation, and any task requiring autonomous AI agent design.
How do I install ai-agent-design?
Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill ai-agent-design in your terminal. The skill will be immediately available in your AI coding agent.
What AI agents support ai-agent-design?
This skill works with claude-code, gemini-cli, openai-codex. Install it once and use it across any supported AI coding agent.
Maintainers
Generated from AbsolutelySkilled
SKILL.md
AI Agent Design
AI agents are autonomous LLM-powered systems that perceive their environment, decide on actions, execute tools, observe outcomes, and iterate toward a goal. Effective agent design requires deliberate choices about the loop structure, tool schemas, memory strategy, failure modes, and evaluation methodology.
When to use this skill
Trigger this skill when the user:
- Designs or implements an agent loop (ReAct, plan-and-execute, reflection)
- Defines tool schemas for LLM function-calling
- Builds multi-agent systems with orchestration (sequential, parallel, hierarchical)
- Implements agent memory (working, episodic, semantic)
- Applies planning strategies like chain-of-thought or task decomposition
- Adds safety guardrails, max-iteration limits, or human-in-the-loop gates
- Evaluates agent behavior, trajectory quality, or task success
- Debugs an agent that loops, hallucinates tools, or gets stuck
Do NOT trigger this skill for:
- Framework-specific agent APIs (use the Mastra or a2a-protocol skill instead)
- Pure LLM prompt engineering with no tool use or autonomy involved
Key principles
Tools over knowledge - agents should act through tools, not hallucinate facts. Every external lookup, write, or side effect belongs in a tool.
Constrain agent scope - give each agent a narrow, well-defined goal. A focused agent with 3 tools outperforms a general agent with 20.
Plan-act-observe loop - structure the core loop as: generate a plan, execute one action, observe the result, update the plan. Never batch unobserved actions.
Fail gracefully with max iterations - every agent loop must have a hard ceiling on steps. When the limit is hit, return a partial result with a clear error message - never loop indefinitely.
Evaluate agent behavior not just output - measure trajectory quality (tool selection accuracy, step efficiency), not only final answer correctness. A correct answer reached via a broken path will fail in production.
Core concepts
Agent loop anatomy
User Input
|
v
[ Planner / Reasoner ] <---- working memory + observations
|
v
[ Action Selection ] ----> tool call OR final answer
|
v
[ Tool Execution ]
|
v
[ Observation ] ----> append to context, loop back

The loop terminates when: (a) the agent produces a final answer, (b) max iterations is reached, or (c) an explicit stop condition triggers.
Tool schemas
Tools are the agent's interface to the world. Each tool needs:
- A precise, action-oriented description (the LLM's primary signal)
- A strict inputSchema (validated before execution)
- An outputSchema (validated before returning to the agent)
- Deterministic, idempotent behavior where possible
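These requirements can be enforced with a thin wrapper around each tool's execute function. The sketch below is illustrative, not part of the skill's API: `Validator` stands in for a Zod schema's `parse` method, and `guardTool` is a hypothetical helper name.

```typescript
// Illustrative sketch: enforce input and output validation around a tool's
// execute function. `Validator` is a stand-in for a Zod schema's `parse`,
// which throws on malformed data.
type Validator<T> = (value: unknown) => T

function guardTool<I, O>(
  validateInput: Validator<I>,
  validateOutput: Validator<O>,
  execute: (input: I) => Promise<O>,
) {
  return async (rawInput: unknown): Promise<O> => {
    const input = validateInput(rawInput) // reject malformed input before any side effect
    const output = await execute(input)
    return validateOutput(output)         // reject malformed output before the agent sees it
  }
}
```

With this shape, a malformed tool call fails loudly at the boundary instead of crashing mid-run with a cryptic error.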
Planning strategies
| Strategy | When to use | Characteristics |
|---|---|---|
| ReAct | Interactive tasks with frequent tool use | Interleaves reasoning and acting; recovers from errors |
| Chain-of-thought (CoT) | Complex reasoning before a single action | Produces a scratchpad; no intermediate observations |
| Plan-and-execute | Long-horizon tasks with predictable subtasks | Upfront decomposition; each step is an independent mini-agent |
| Tree search (LATS) | Tasks where multiple solution paths exist | Explores branches; expensive but highest quality |
| Reflexion | Tasks requiring iterative self-improvement | Agent critiques its own output and retries |
Memory types
| Type | Scope | Storage | Use case |
|---|---|---|---|
| Working memory | Current run | In-context (string/JSON) | Current task state, scratchpad |
| Episodic memory | Per session | DB (keyed by thread/session) | Recall past interactions |
| Semantic memory | Cross-session | Vector store | Long-term knowledge retrieval |
| Procedural memory | Global | Prompt / fine-tune | Baked-in skills and habits |
Multi-agent topologies
| Topology | Structure | Best for |
|---|---|---|
| Sequential | A -> B -> C | Pipelines where each step builds on the last |
| Parallel | A, B, C run concurrently, results merged | Independent subtasks (research, drafting, validation) |
| Hierarchical | Orchestrator -> worker agents | Complex tasks requiring delegation and synthesis |
| Debate | Multiple agents argue, judge decides | High-stakes decisions needing diverse perspectives |
Common tasks
1. Build a ReAct agent loop
interface Tool {
name: string
description: string
execute: (input: unknown) => Promise<unknown>
}
interface AgentStep {
thought: string
action: string
actionInput: unknown
observation: string
}
async function reactAgent(
goal: string,
tools: Tool[],
llm: (prompt: string) => Promise<string>,
maxIterations = 10,
): Promise<string> {
const toolMap = Object.fromEntries(tools.map(t => [t.name, t]))
const toolDescriptions = tools
.map(t => `- ${t.name}: ${t.description}`)
.join('\n')
const history: AgentStep[] = []
for (let i = 0; i < maxIterations; i++) {
const context = history
.map(s => `Thought: ${s.thought}\nAction: ${s.action}[${JSON.stringify(s.actionInput)}]\nObservation: ${s.observation}`)
.join('\n')
const prompt = `You are an agent. Available tools:\n${toolDescriptions}\n\nGoal: ${goal}\n\n${context}\n\nThought:`
const response = await llm(prompt)
if (response.includes('Final Answer:')) {
return response.split('Final Answer:')[1].trim()
}
const actionMatch = response.match(/Action: (\w+)\[(.*)\]/s)
if (!actionMatch) break
const [, actionName, rawInput] = actionMatch
const tool = toolMap[actionName]
if (!tool) {
history.push({ thought: response, action: actionName, actionInput: rawInput, observation: `Error: tool "${actionName}" not found` })
continue
}
let input: unknown
try { input = JSON.parse(rawInput) } catch { input = rawInput }
const observation = await tool.execute(input)
history.push({ thought: response, action: actionName, actionInput: input, observation: JSON.stringify(observation) })
}
return `Max iterations (${maxIterations}) reached. Last state: ${JSON.stringify(history.at(-1))}`
}

2. Define tool schemas
import { z } from 'zod'
// Input and output schemas are the contract between the LLM and your system.
// Keep descriptions action-oriented and specific.
const searchWebSchema = {
name: 'search_web',
description: 'Search the web for current information. Use for facts, news, or data not in training.',
inputSchema: z.object({
query: z.string().describe('Specific search query. Be precise - avoid vague terms.'),
maxResults: z.number().int().min(1).max(10).default(5).describe('Number of results to return'),
}),
outputSchema: z.object({
results: z.array(z.object({
title: z.string(),
url: z.string().url(),
snippet: z.string(),
})),
totalFound: z.number(),
}),
}
const writeFileSchema = {
name: 'write_file',
description: 'Write content to a file on disk. Overwrites if file exists.',
inputSchema: z.object({
path: z.string().describe('Absolute file path'),
content: z.string().describe('Full file content to write'),
encoding: z.enum(['utf-8', 'base64']).default('utf-8'),
}),
outputSchema: z.object({
success: z.boolean(),
bytesWritten: z.number(),
}),
}

3. Implement agent memory
interface WorkingMemory {
goal: string
completedSteps: string[]
currentPlan: string[]
facts: Record<string, string>
}
interface EpisodicStore {
save(sessionId: string, entry: { role: string; content: string }): Promise<void>
load(sessionId: string, limit?: number): Promise<Array<{ role: string; content: string }>>
}
class AgentMemory {
private working: WorkingMemory
private episodic: EpisodicStore
private sessionId: string
constructor(goal: string, episodic: EpisodicStore, sessionId: string) {
this.working = { goal, completedSteps: [], currentPlan: [], facts: {} }
this.episodic = episodic
this.sessionId = sessionId
}
updatePlan(steps: string[]): void {
this.working.currentPlan = steps
}
markStepComplete(step: string): void {
this.working.completedSteps.push(step)
this.working.currentPlan = this.working.currentPlan.filter(s => s !== step)
}
storeFact(key: string, value: string): void {
this.working.facts[key] = value
}
async persist(role: string, content: string): Promise<void> {
await this.episodic.save(this.sessionId, { role, content })
}
async loadHistory(limit = 20) {
return this.episodic.load(this.sessionId, limit)
}
serialize(): string {
return JSON.stringify(this.working, null, 2)
}
}

4. Design multi-agent orchestration
For detailed implementations of sequential pipelines, parallel fan-out with synthesis, and hierarchical orchestration patterns, see references/orchestration-patterns.md.
5. Add guardrails and safety limits
interface GuardrailConfig {
maxIterations: number
maxTokensPerStep: number
allowedToolNames: string[]
forbiddenPatterns: RegExp[]
timeoutMs: number
}
class GuardedAgentRunner {
private config: GuardrailConfig
private iterationCount = 0
private startTime = Date.now()
constructor(config: GuardrailConfig) {
this.config = config
}
checkIterationLimit(): void {
if (++this.iterationCount > this.config.maxIterations) {
throw new Error(`Agent exceeded max iterations (${this.config.maxIterations})`)
}
}
checkTimeout(): void {
if (Date.now() - this.startTime > this.config.timeoutMs) {
throw new Error(`Agent timed out after ${this.config.timeoutMs}ms`)
}
}
validateToolCall(toolName: string, input: string): void {
if (!this.config.allowedToolNames.includes(toolName)) {
throw new Error(`Tool "${toolName}" is not in the allowed list`)
}
for (const pattern of this.config.forbiddenPatterns) {
if (pattern.test(input)) {
throw new Error(`Tool input matches forbidden pattern: ${pattern}`)
}
}
}
async runStep<T>(step: () => Promise<T>): Promise<T> {
this.checkIterationLimit()
this.checkTimeout()
return step()
}
}

6. Implement planning with decomposition
For detailed plan-and-execute implementation with topological task ordering and dependency resolution, see references/orchestration-patterns.md.
7. Evaluate agent performance
interface AgentTrace {
steps: Array<{
thought: string
toolName?: string
toolInput?: unknown
observation?: string
}>
finalAnswer: string
tokensUsed: number
durationMs: number
}
interface EvalResult {
passed: boolean
score: number // 0-1
details: string[]
}
function evaluateTrace(trace: AgentTrace, expected: {
answer: string
requiredTools?: string[]
maxSteps?: number
answerValidator?: (answer: string) => boolean
}): EvalResult {
const details: string[] = []
const scores: number[] = []
// Answer correctness
const answerCorrect = expected.answerValidator
? expected.answerValidator(trace.finalAnswer)
: trace.finalAnswer.toLowerCase().includes(expected.answer.toLowerCase())
scores.push(answerCorrect ? 1 : 0)
details.push(`Answer correct: ${answerCorrect}`)
// Tool coverage
if (expected.requiredTools) {
const usedTools = new Set(trace.steps.map(s => s.toolName).filter(Boolean))
const covered = expected.requiredTools.filter(t => usedTools.has(t))
const toolScore = covered.length / expected.requiredTools.length
scores.push(toolScore)
details.push(`Tools covered: ${covered.length}/${expected.requiredTools.length}`)
}
// Efficiency (step count)
if (expected.maxSteps) {
const stepScore = Math.max(0, 1 - (trace.steps.length - 1) / expected.maxSteps)
scores.push(stepScore)
details.push(`Steps used: ${trace.steps.length} (max: ${expected.maxSteps})`)
}
const score = scores.reduce((a, b) => a + b, 0) / scores.length
return { passed: score >= 0.7, score, details }
}

Anti-patterns
| Anti-pattern | Problem | Fix |
|---|---|---|
| Monolithic agent | One agent does everything; context explodes and tool selection degrades | Split into specialist agents with narrow charters |
| Unbounded loops | No maxIterations ceiling; agent hallucinates progress forever | Always set a hard iteration limit; return partial result on breach |
| Vague tool descriptions | LLM picks the wrong tool because descriptions overlap or are too general | Write action-oriented, specific descriptions; test with diverse prompts |
| Synchronous observation batching | Multiple tool calls before observing results; agent acts on stale state | Strictly interleave: one action, one observation, then re-plan |
| No input validation | Tool receives malformed input; crashes mid-run with cryptic errors | Validate with Zod (or equivalent) before executing; return structured errors |
| Evaluating only final output | Agent reached correct answer through a broken trajectory; won't generalize | Evaluate full traces: tool selection accuracy, redundant steps, error recovery |
Gotchas
Missing maxIterations causes infinite loops - An agent with no ceiling on iterations will loop indefinitely when it gets confused, hallucinates a tool name, or enters a reasoning cycle. Always set a hard limit (10-20 for most tasks) and return a partial result with a clear message when it's hit. Never rely on the LLM deciding to stop.
Vague tool descriptions cause wrong tool selection - The tool description field is the primary signal the LLM uses to pick a tool. Descriptions that overlap ("get data" vs "fetch information") cause the agent to pick randomly. Write descriptions as action-oriented imperatives with specific use cases and clear exclusions.
Batching tool calls without observing breaks reasoning - Generating multiple tool calls before processing their results means the agent acts on stale state. The plan-act-observe loop must be strictly sequential: one action, one observation, re-plan. Parallel tool calls are only safe for truly independent queries.
Context window exhaustion mid-run - Long agent runs accumulate observation history that eventually exceeds the model's context window. Without a summarization or truncation strategy, the agent silently loses early context and starts making inconsistent decisions. Implement working memory summarization when history exceeds ~70% of the context budget.
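One way to implement that summarization trigger is sketched below. The `summarize` LLM callback and the rough 4-characters-per-token estimate are both assumptions for illustration, not fixed parts of this skill.

```typescript
// Minimal sketch: when accumulated history exceeds ~70% of the context
// budget, replace the oldest half with an LLM-generated summary.
interface HistoryEntry { role: string; content: string }

// Rough heuristic (~4 chars per token); use a real tokenizer in production
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

async function compactHistory(
  history: HistoryEntry[],
  budgetTokens: number,
  summarize: (text: string) => Promise<string>,
): Promise<HistoryEntry[]> {
  const used = history.reduce((n, e) => n + estimateTokens(e.content), 0)
  if (used < budgetTokens * 0.7) return history // under budget, no-op
  const cut = Math.floor(history.length / 2)
  const summary = await summarize(
    history.slice(0, cut).map(e => e.content).join('\n'),
  )
  return [
    { role: 'system', content: `Summary of earlier steps: ${summary}` },
    ...history.slice(cut),
  ]
}
```

Call this before each planner invocation so the model always sees recent observations verbatim and older ones in compressed form.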
Multi-agent trust boundaries - When an orchestrator delegates to worker agents, the worker's output is untrusted input to the orchestrator. An adversarial document processed by a worker agent can inject instructions into the orchestrator's context (prompt injection). Always sanitize worker outputs before incorporating them into the orchestrator's reasoning context.
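A minimal sanitization sketch along these lines is shown below. The pattern list and the `sanitizeWorkerOutput` helper are illustrative assumptions - real deployments need broader injection filtering than two regexes.

```typescript
// Hypothetical sketch: treat worker output as untrusted. Strip lines that
// look like injected directives, then frame the rest as quoted data so the
// orchestrator reasons about it rather than obeying it.
const DIRECTIVE_PATTERNS = [
  /ignore (all )?(previous|prior) instructions/i,
  /^\s*(system|assistant)\s*:/i,
]

function sanitizeWorkerOutput(agentId: string, output: string): string {
  const cleaned = output
    .split('\n')
    .filter(line => !DIRECTIVE_PATTERNS.some(p => p.test(line)))
    .join('\n')
  // Delimiters mark the span as data, not instructions
  return `<worker id="${agentId}">\n${cleaned}\n</worker>`
}
```

Filtering alone is not sufficient against determined injection; pair it with the delimiting shown here and with instructions to the orchestrator to never execute directives found inside worker spans.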
References
For detailed content on agent patterns and architectures, read:
- references/agent-patterns.md - ReAct, plan-and-execute, reflexion, LATS, multi-agent debate - full catalog with design considerations
- references/orchestration-patterns.md - Multi-agent orchestration (sequential, parallel, hierarchical) and plan-and-execute with task decomposition
Only load the reference file when the current task requires detailed pattern selection or architectural comparison.
References
agent-patterns.md
Agent Patterns Catalog
A catalog of production-proven agent architectures. Each pattern includes the core loop, when to use it, implementation considerations, and known failure modes.
1. ReAct (Reason + Act)
Paper: "ReAct: Synergizing Reasoning and Acting in Language Models" (Yao et al., 2022)
How it works
The agent interleaves reasoning traces and actions in a single context window:
Thought: I need to find the population of Tokyo.
Action: search_web[{"query": "Tokyo population 2024"}]
Observation: Tokyo has a population of approximately 13.96 million in the city proper.
Thought: Now I have the data. I can answer the question.
Final Answer: Tokyo's population is approximately 13.96 million.

Each step is appended to the context, giving the agent full visibility into its own reasoning history.
When to use
- Tasks requiring frequent external lookups (search, APIs, file reads)
- Interactive tasks where errors need mid-run correction
- Debugging-friendly workflows (the thought chain is readable)
- General-purpose agents where task structure is unknown upfront
Implementation notes
- Parse the LLM output to extract Action: and Final Answer: markers
- Validate tool names and inputs before execution; return structured errors as observations
- Set maxIterations (typically 10-15 for complex tasks)
- Include the tool list and their descriptions in the initial system prompt
Failure modes
- Looping: Agent repeats the same action after receiving the same observation. Fix: track action/observation pairs and break on duplicates.
- Tool hallucination: Agent invokes a tool that doesn't exist. Fix: strict tool name validation; return "tool not found" as observation.
- Context overflow: Long observation chains fill the window. Fix: summarize old observations; use a sliding window.
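The duplicate-tracking fix for the looping failure mode might be sketched like this (`makeLoopDetector` is a hypothetical helper, not from the paper):

```typescript
// Track action/input/observation triples; signal an abort when the same
// triple repeats more than maxRepeats times.
function makeLoopDetector(maxRepeats = 2) {
  const seen = new Map<string, number>()
  return (action: string, input: unknown, observation: string): boolean => {
    const key = JSON.stringify([action, input, observation])
    const count = (seen.get(key) ?? 0) + 1
    seen.set(key, count)
    return count > maxRepeats // true = abort the loop
  }
}
```

Inside the ReAct loop, call the detector after each observation and return a partial result when it fires, rather than burning the remaining iteration budget on repeats.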
2. Plan-and-Execute
Paper: "Plan-and-Solve Prompting" (Wang et al., 2023)
How it works
Two-phase architecture:
- Planner - generates a full task decomposition upfront
- Executor - runs each subtask independently, optionally in parallel
// Phase 1: plan
const plan = await planner.generate(`
Goal: Research and summarize AI trends for Q1 2025
Output: A JSON list of tasks with dependencies
`)
// plan = [
// { id: "t1", description: "Search for AI news Jan 2025", dependsOn: [] },
// { id: "t2", description: "Search for AI news Feb 2025", dependsOn: [] },
// { id: "t3", description: "Summarize findings", dependsOn: ["t1", "t2"] },
// ]
// Phase 2: execute
for (const task of topologicalSort(plan)) {
const context = getCompletedResults(task.dependsOn)
results[task.id] = await executor.generate(task.description, context)
}

When to use
- Long-horizon tasks with predictable subtask structure (research, report generation)
- Workflows where subtasks are independent and can parallelize
- When you need human review of the plan before execution begins
Implementation notes
- Planner output should be structured (JSON) for reliable parsing
- Use dependency tracking for parallel execution of independent tasks
- Allow the plan to be revised if a subtask fails (re-plan from failure point)
- Executor agents should be stateless - pass all necessary context explicitly
Failure modes
- Over-planning: LLM generates too many trivial subtasks. Fix: ask planner to keep plan to N steps max.
- Stale plan: Initial plan doesn't account for information discovered during execution. Fix: add a re-planning step after each execution phase.
- Dependency deadlock: Circular dependencies in the plan. Fix: validate the DAG before execution; detect cycles.
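The cycle-detection fix can be sketched with Kahn's algorithm, the standard topological-sort approach; `findCycle` and `PlannedTask` are illustrative names:

```typescript
interface PlannedTask { id: string; dependsOn: string[] }

// Returns the ids stuck on (or behind) a cycle, or [] when the plan is a valid DAG.
function findCycle(tasks: PlannedTask[]): string[] {
  const indegree = new Map<string, number>(
    tasks.map(t => [t.id, t.dependsOn.length]),
  )
  const dependents = new Map<string, string[]>()
  for (const t of tasks)
    for (const dep of t.dependsOn)
      dependents.set(dep, [...(dependents.get(dep) ?? []), t.id])
  // Start from tasks with no dependencies, peel layer by layer
  const queue = tasks.filter(t => t.dependsOn.length === 0).map(t => t.id)
  let processed = 0
  while (queue.length > 0) {
    const id = queue.shift()!
    processed++
    for (const next of dependents.get(id) ?? []) {
      const d = indegree.get(next)! - 1
      indegree.set(next, d)
      if (d === 0) queue.push(next)
    }
  }
  if (processed === tasks.length) return []
  return [...indegree.entries()].filter(([, d]) => d > 0).map(([id]) => id)
}
```

Run this on the planner's output before execution and either re-plan or fail fast when it returns a non-empty list.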
3. Reflexion
Paper: "Reflexion: Language Agents with Verbal Reinforcement Learning" (Shinn et al., 2023)
How it works
The agent evaluates its own output and iteratively improves through verbal reflection:
Attempt 1 -> Output -> Evaluate -> "Output was too brief, missing key metrics"
Attempt 2 -> Output -> Evaluate -> "Good coverage but incorrect calculation in section 3"
Attempt 3 -> Output -> Evaluate -> "Passes all criteria" -> DONE

Each reflection is stored in an "episodic memory buffer" and injected into the next attempt's context.
When to use
- Tasks with a clear quality evaluator (unit tests, rubrics, validators)
- Writing or code generation where iterative refinement is natural
- Tasks where first-pass quality is often insufficient
Implementation notes
async function reflexionAgent(
task: string,
evaluator: (output: string) => Promise<{ passed: boolean; feedback: string }>,
agent: (task: string, memory: string[]) => Promise<string>,
maxAttempts = 3,
): Promise<string> {
const memory: string[] = []
for (let attempt = 0; attempt < maxAttempts; attempt++) {
const output = await agent(task, memory)
const { passed, feedback } = await evaluator(output)
if (passed) return output
memory.push(`Attempt ${attempt + 1} feedback: ${feedback}`)
}
throw new Error(`Failed after ${maxAttempts} attempts`)
}

- The evaluator can be another LLM, a programmatic test, or a human
- Memory accumulates across attempts - keep it concise (summarize if needed)
- Set a hard maxAttempts to prevent infinite refinement
Failure modes
- Feedback oscillation: Agent improves one aspect and degrades another. Fix: use a multi-criteria evaluator that scores each dimension independently.
- Evaluator bias: LLM evaluator is too lenient or inconsistent. Fix: use programmatic validators where possible (tests, schemas).
4. LATS (Language Agent Tree Search)
Paper: "Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models" (Zhou et al., 2023)
How it works
Applies Monte Carlo Tree Search (MCTS) to agent trajectories:
Root (initial state)
├── Branch A: search_web -> read_page -> summarize
│ ├── Branch A1: different search query
│ └── Branch A2: different page selection
└── Branch B: read_file -> extract_data -> validate
    └── Branch B1: different extraction strategy

Each node is scored by a value function (LLM-based or heuristic). The search expands the most promising branches, backtracks from dead ends, and selects the highest-scoring complete trajectory.
When to use
- Tasks with a clear success metric where maximizing quality is worth the compute cost
- Complex reasoning tasks (math, code generation) with multiple valid solution paths
- When other single-trajectory methods consistently fail
Implementation notes
interface TreeNode {
state: string // current context/observations
action: string // action taken to reach this node
parent: TreeNode | null
children: TreeNode[]
visits: number
value: number // cumulative score
}
// UCB1 selection: balance exploration vs exploitation
function ucb1Score(node: TreeNode, explorationConstant = 1.4): number {
if (node.visits === 0) return Infinity
const exploitation = node.value / node.visits
const exploration = explorationConstant * Math.sqrt(Math.log(node.parent!.visits) / node.visits)
return exploitation + exploration
}

- Expensive: each branch requires LLM calls. Use for high-value tasks only.
- Value function quality is critical - a bad evaluator leads to poor branch selection.
- Implement beam search as a simpler alternative when full MCTS is too costly.
Failure modes
- Compute explosion: Branching factor too high. Fix: limit branching factor to 2-3 per node; prune low-score branches early.
- Value function gaming: Agent finds outputs that score well but aren't actually correct. Fix: use diverse evaluation criteria.
5. Multi-Agent Debate
Paper: "Improving Factuality and Reasoning in Language Models through Multiagent Debate" (Du et al., 2023)
How it works
Multiple agents independently produce answers, then iteratively critique and refine each other's responses. A judge (or consensus) produces the final answer.
Round 1:
Agent A: "The capital is Paris"
Agent B: "The capital is Lyon"
Agent C: "The capital is Paris"
Round 2 (each sees all round 1 answers):
Agent A: "I maintain Paris - it's clearly the capital"
Agent B: "After review, I agree Paris is correct"
Agent C: "Paris. Agent B was incorrect initially"
Judge: "Consensus: Paris" -> Final Answer

When to use
- High-stakes factual questions where hallucination risk is high
- Complex reasoning where diverse perspectives reduce blind spots
- Tasks where a single LLM consistently makes the same systematic error
Implementation notes
async function multiAgentDebate(
question: string,
agents: Array<(question: string, context: string) => Promise<string>>,
rounds = 2,
judge: (question: string, responses: string[]) => Promise<string>,
): Promise<string> {
let responses = await Promise.all(agents.map(a => a(question, '')))
for (let round = 1; round < rounds; round++) {
const context = responses.map((r, i) => `Agent ${i + 1}: ${r}`).join('\n')
responses = await Promise.all(agents.map(a => a(question, context)))
}
return judge(question, responses)
}

- Use agents with different system prompts or temperatures to ensure diversity
- Typically 2-3 rounds is sufficient; diminishing returns after that
- Judge can be a separate LLM or a majority-vote function
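The majority-vote judge mentioned in the last note can be a plain function (an illustrative sketch; ties fall back to the first agent's response):

```typescript
// Pick the most frequent response after trivial normalization.
// For free-form answers, a separate LLM judge is usually more robust.
function majorityVote(responses: string[]): string {
  const counts = new Map<string, { original: string; n: number }>()
  for (const r of responses) {
    const key = r.trim().toLowerCase()
    const entry = counts.get(key) ?? { original: r, n: 0 }
    entry.n++
    counts.set(key, entry)
  }
  let best = { original: responses[0], n: 0 }
  for (const entry of counts.values()) if (entry.n > best.n) best = entry
  return best.original
}
```

Exact-match voting only works when agents produce short, canonical answers; for longer outputs, normalize more aggressively or fall back to an LLM judge.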
Failure modes
- Echo chamber: Agents converge too quickly and reinforce each other's errors. Fix: use agents with different base prompts or models; force disagreement in round 1.
- Indecisive judge: Judge fails to pick between evenly split responses. Fix: instruct the judge to always select one answer with explicit reasoning.
Pattern Selection Guide
| Situation | Recommended Pattern |
|---|---|
| Interactive task, errors need mid-run correction | ReAct |
| Long task, subtasks are known upfront, parallelizable | Plan-and-Execute |
| Output quality matters, a validator exists | Reflexion |
| Maximize quality regardless of compute cost | LATS |
| High-stakes facts, hallucination risk is critical concern | Multi-Agent Debate |
| Simple one-shot task, no iteration needed | Single LLM call (no agent loop) |
Combining Patterns
Patterns compose. Common combinations:
- Plan-and-Execute + ReAct: Each executor step is itself a ReAct loop
- Reflexion + Multi-Agent Debate: Debate evaluates each reflexion attempt
- Plan-and-Execute + LATS: Planner uses tree search; executor uses ReAct
- Hierarchical + Debate: Orchestrator spawns debaters, synthesizes consensus
Start with the simplest pattern that can solve the task. Add complexity only when benchmarking shows the simpler pattern falls short.
orchestration-patterns.md
Multi-Agent Orchestration and Planning Patterns
Multi-agent orchestration
interface AgentResult {
agentId: string
output: string
success: boolean
}
type AgentFn = (input: string, context: string) => Promise<AgentResult>
// Sequential pipeline - each agent feeds the next
async function sequentialPipeline(
agents: Array<{ id: string; fn: AgentFn }>,
initialInput: string,
): Promise<AgentResult[]> {
const results: AgentResult[] = []
let current = initialInput
for (const { id, fn } of agents) {
const context = results.map(r => `${r.agentId}: ${r.output}`).join('\n')
const result = await fn(current, context)
results.push(result)
if (!result.success) break // fail fast
current = result.output
}
return results
}
// Parallel fan-out with synthesis
async function parallelFanOut(
workers: Array<{ id: string; fn: AgentFn }>,
synthesizer: AgentFn,
input: string,
): Promise<AgentResult> {
const workerResults = await Promise.allSettled(
workers.map(({ id, fn }) => fn(input, ''))
)
const outputs = workerResults
.filter((r): r is PromiseFulfilledResult<AgentResult> => r.status === 'fulfilled')
.map(r => r.value)
const synthesisInput = outputs.map(r => `[${r.agentId}]: ${r.output}`).join('\n\n')
return synthesizer(synthesisInput, input)
}
// Hierarchical: orchestrator delegates to specialists
async function hierarchical(
orchestrator: AgentFn,
specialists: Record<string, AgentFn>,
goal: string,
): Promise<string> {
// Orchestrator plans which specialists to invoke
const plan = await orchestrator(goal, JSON.stringify(Object.keys(specialists)))
const lines = plan.output.split('\n').filter(l => l.startsWith('DELEGATE:'))
const delegations = await Promise.all(
lines.map(line => {
const [, agentId, task] = line.match(/DELEGATE:(\w+):(.+)/) ?? []
const specialist = specialists[agentId]
return specialist ? specialist(task, goal) : Promise.resolve({ agentId, output: 'agent not found', success: false })
})
)
return orchestrator(
`Synthesize these specialist outputs into a final answer for: ${goal}`,
delegations.map(d => `${d.agentId}: ${d.output}`).join('\n'),
).then(r => r.output)
}

Planning with task decomposition
interface Task {
id: string
description: string
dependsOn: string[]
status: 'pending' | 'running' | 'done' | 'failed'
result?: string
}
async function planAndExecute(
goal: string,
planner: (goal: string) => Promise<Task[]>,
executor: (task: Task, context: Record<string, string>) => Promise<string>,
): Promise<Record<string, string>> {
const tasks = await planner(goal)
const results: Record<string, string> = {}
// Topological execution respecting dependencies
while (tasks.some(t => t.status === 'pending')) {
const ready = tasks.filter(
t => t.status === 'pending' && t.dependsOn.every(dep => results[dep] !== undefined)
)
if (ready.length === 0) {
const stuck = tasks.filter(t => t.status === 'pending')
throw new Error(`Deadlock: tasks ${stuck.map(t => t.id).join(', ')} cannot proceed`)
}
// Run independent ready tasks in parallel
await Promise.all(
ready.map(async task => {
task.status = 'running'
try {
results[task.id] = await executor(task, results)
task.status = 'done'
} catch (err) {
task.status = 'failed'
results[task.id] = `Error: ${String(err)}`
}
})
)
}
return results
}