skill-forge
Generate a production-ready AbsolutelySkilled skill from any source: GitHub repos, documentation URLs, or domain topics (marketing, sales, TypeScript, etc.). Triggers on /skill-forge, "create a skill for X", "generate a skill from these docs", "make a skill for this repo", "build a skill about marketing", or "add X to the registry". For URLs: performs deep doc research (README, llms.txt, API references). For domains: runs a brainstorming discovery session with the user to define scope and content. Outputs a complete skill/ folder with SKILL.md, evals.json, and optionally sources.yaml, ready to PR into the AbsolutelySkilled registry.
Quick Start
- Open your terminal or command prompt
- Run:

```shell
npx skills add AbsolutelySkilled/AbsolutelySkilled --skill skill-forge
```

- Start your AI coding agent (Claude Code, Cursor, Gemini CLI, or any supported agent)
- The skill-forge skill is now active and ready to use
skill-forge
skill-forge is a production-ready AI agent skill for claude-code, gemini-cli, and openai-codex. It generates AbsolutelySkilled skills from any source: GitHub repos, documentation URLs, or domain topics (marketing, sales, TypeScript, etc.).
Quick Facts
| Field | Value |
|---|---|
| Category | devtools |
| Version | 0.4.0 |
| Platforms | claude-code, gemini-cli, openai-codex |
| License | MIT |
How to Install
- Make sure you have Node.js installed on your machine.
- Run the following command in your terminal:
```shell
npx skills add AbsolutelySkilled/AbsolutelySkilled --skill skill-forge
```

- The skill-forge skill is now available in your AI coding agent (Claude Code, Gemini CLI, OpenAI Codex, etc.).
Overview
Generate production-ready AbsolutelySkilled skills from any source - GitHub repos, documentation URLs, or pure domain knowledge. This is the bootstrapping tool for the registry.
A common misconception is that skills are "just markdown files." That undersells them significantly. A skill is a folder, not a file. It can contain markdown instructions, scripts, reference code, data files, templates, configuration - anything an agent might need to do its job well. SKILL.md is the entry point that tells the agent what the skill does and when to use it. But the real power comes from the supporting files - reference docs give deeper context, scripts let it take action, templates give it a head start on output.
Tags
skill-creation code-generation scaffolding registry agent-skills agent-definitions
Platforms
- claude-code
- gemini-cli
- openai-codex
Frequently Asked Questions
What is skill-forge?
Generate a production-ready AbsolutelySkilled skill from any source: GitHub repos, documentation URLs, or domain topics (marketing, sales, TypeScript, etc.). Triggers on /skill-forge, "create a skill for X", "generate a skill from these docs", "make a skill for this repo", "build a skill about marketing", or "add X to the registry". For URLs: performs deep doc research (README, llms.txt, API references). For domains: runs a brainstorming discovery session with the user to define scope and content. Outputs a complete skill/ folder with SKILL.md, evals.json, and optionally sources.yaml, ready to PR into the AbsolutelySkilled registry.
How do I install skill-forge?
Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill skill-forge in your terminal. The skill will be immediately available in your AI coding agent.
What AI agents support skill-forge?
This skill works with claude-code, gemini-cli, openai-codex. Install it once and use it across any supported AI coding agent.
Maintainers
Generated from AbsolutelySkilled
SKILL.md
skill-forge
Generate production-ready AbsolutelySkilled skills from any source - GitHub repos, documentation URLs, or pure domain knowledge. This is the bootstrapping tool for the registry.
A common misconception is that skills are "just markdown files." That undersells them significantly. A skill is a folder, not a file. It can contain markdown instructions, scripts, reference code, data files, templates, configuration - anything an agent might need to do its job well. SKILL.md is the entry point that tells the agent what the skill does and when to use it. But the real power comes from the supporting files - reference docs give deeper context, scripts let it take action, templates give it a head start on output.
Slash command
```
/skill-forge <url-or-topic>
```

Setup
On first run, check for ${CLAUDE_PLUGIN_DATA}/forge-config.json. If it doesn't
exist, ask the user these questions (use AskUserQuestion with multiple choice):
- Default output directory - `skills/` (registry PR) or custom path?
- Skill type preference - code-heavy, knowledge-heavy, or balanced?
- Installation target - where should the forged skill be installed?
  - Registry PR only - write to `skills/<name>/` for contribution to AbsolutelySkilled (default)
  - Global (all projects) - install to `~/.agents/skills/<name>/` (canonical location, auto-symlinked to agent dirs)
  - Project-level - install to `.agents/skills/<name>/` (this project only, cross-client)
  - Claude-only project - install to `.claude/skills/<name>/` (this project, Claude Code only)
Store answers in ${CLAUDE_PLUGIN_DATA}/forge-config.json. Read this config at the
start of every forge session.
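A first-run bootstrap for this config could look like the sketch below. The field names (`output_dir`, `skill_type`, `install_target`) are illustrative assumptions - the skill does not mandate a schema, and in the real flow the agent asks the user via AskUserQuestion instead of writing defaults.

```python
import json
import os
from pathlib import Path

CONFIG_PATH = Path(os.environ.get("CLAUDE_PLUGIN_DATA", ".")) / "forge-config.json"

# Hypothetical field names - the skill spec does not fix a schema.
DEFAULTS = {
    "output_dir": "skills/",         # registry PR layout
    "skill_type": "balanced",        # code-heavy | knowledge-heavy | balanced
    "install_target": "registry-pr", # registry-pr | global | project | claude-project
}

def load_or_init_config() -> dict:
    """Return the forge config, creating it with defaults on first run."""
    if CONFIG_PATH.exists():
        return json.loads(CONFIG_PATH.read_text())
    # The real flow asks the user here (AskUserQuestion) rather than
    # silently persisting defaults.
    CONFIG_PATH.parent.mkdir(parents=True, exist_ok=True)
    CONFIG_PATH.write_text(json.dumps(DEFAULTS, indent=2))
    return dict(DEFAULTS)
```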
Forge history
After every successful forge, append an entry to ${CLAUDE_PLUGIN_DATA}/forge-log.jsonl:
```json
{"skill": "api-design", "type": "domain", "date": "2025-01-15", "lines": 245, "refs": 3, "evals": 12}
```

Read this log at the start of each session. It helps you:
- Avoid creating duplicate skills
- Reference patterns from previously forged skills
- Track which categories are over/under-represented
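The duplicate check against this log could be sketched as follows, assuming only the JSONL shape shown above; the helper names are illustrative:

```python
import json
import os
from pathlib import Path

LOG_PATH = Path(os.environ.get("CLAUDE_PLUGIN_DATA", ".")) / "forge-log.jsonl"

def forged_skill_names() -> set:
    """Names of every previously forged skill, for the duplicate check."""
    if not LOG_PATH.exists():
        return set()
    return {
        json.loads(line)["skill"]
        for line in LOG_PATH.read_text().splitlines()
        if line.strip()
    }

def is_duplicate(name: str) -> bool:
    return name in forged_skill_names()
```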
Step 0 - Detect input type
- URL input (starts with `http`, `github.com`, or looks like a domain) -> Phase 1A
- Domain topic (a word or phrase) -> Phase 1B
- Ambiguous -> ask the user
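The detection step could be sketched as a small classifier. The heuristics and regexes here are illustrative assumptions, not part of the spec:

```python
import re

def detect_input_type(text: str) -> str:
    """Classify forge input as 'url', 'domain-topic', or 'ambiguous'."""
    t = text.strip()
    if t.startswith(("http://", "https://", "github.com/")):
        return "url"
    # Bare domains like "stripe.com" also count as URL input.
    if re.fullmatch(r"[\w-]+(\.[\w-]+)+(/\S*)?", t):
        return "url"
    # A plain word or phrase is treated as a domain topic.
    if re.fullmatch(r"[A-Za-z][\w /&-]*", t):
        return "domain-topic"
    return "ambiguous"
```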
Step 0.5 - Skill or Agent?
Before proceeding, determine whether the user needs a skill or an agent definition. These are different artifacts:
- Skill = portable knowledge package (AgentSkills.io open standard). Works across Claude Code, Cursor, VS Code, Gemini CLI, and 40+ other tools. A folder with `SKILL.md` containing instructions and reference material.
- Agent definition = execution context (Claude Code specific). Creates an isolated subagent with its own tools, permissions, model, and system prompt. Not portable.
Decision tree:
- Primarily knowledge, best practices, or domain instructions? -> Skill
- Needs to be portable across multiple agent tools? -> Skill
- Needs isolated context, specific tool permissions, or its own model? -> Agent definition
- Needs `permissionMode`, `maxTurns`, or `background` execution? -> Agent definition
skill-forge creates skills, not agent definitions. If the user needs an
agent, explain the distinction and point them to Claude Code's agent
documentation. Load references/skills-vs-agents.md for the full breakdown.
Phase 1A - Research (URL-based)
The quality of the skill is entirely determined by the depth of research here. Do not write a single line of SKILL.md until research is complete.
Crawl order (priority high to low)
1. /llms.txt or /llms-full.txt - AI-readable doc map (gold)
2. README.md - overview, install, quickstart
3. /docs/ - main documentation index
4. API reference - endpoints, params, errors
5. Guides / tutorials - real-world usage patterns
6. Changelog - breaking changes, versioning

Stop fetching a category once you have good coverage - 5 pages that give the full picture beat 20 pages of marginal detail.
Discovery questions
While crawling, answer these six questions - they form your mental model:
- What does this tool do? (1 sentence)
- Who uses it?
- What are the 5-10 most common agent tasks?
- What are the gotchas? (auth, rate limits, pagination, SDK quirks)
- What's the install/auth story?
- Are there sub-domains needing separate references/ files?
Uncertainty handling
Flag ambiguous or missing detail inline - never skip a section:
```
<!-- VERIFY: Could not confirm from official docs. Source: https://... -->
```

Aim for < 5 flags. More than 5 means you haven't crawled enough.
Phase 1B - Brainstorm Discovery (domain-based)
For domain topics, run an interactive brainstorm with the user.
HARD GATE: Do NOT write any SKILL.md until the user approves the scope. "TypeScript" could mean best practices, migration guides, or project setup.
Ask these questions one at a time (use multiple choice when possible):
- Target audience?
- Scope? (offer 2-3 options with your recommendation)
- Top 5-8 things an agent should know?
- Common mistakes to prevent?
- Sub-domains needing their own references/ files?
- Output format? (code, prose, templates, checklists, or mix)
Present a proposed outline. Wait for approval before proceeding.
Phase 2 - Write SKILL.md
Read references/frontmatter-schema.md for YAML fields and
references/body-structure-template.md for the markdown scaffold.
The frontmatter schema distinguishes between portable fields (AgentSkills.io spec), AbsolutelySkilled registry fields, and Claude Code extensions. Default to portable fields only - add Claude-specific fields only when the skill genuinely needs hooks, context forking, or model overrides.
Key principles for writing
Focus on the delta - what the agent does NOT know. The agent already knows a lot about coding and common patterns. If a skill mostly restates common knowledge, it wastes context tokens. Focus on information that pushes the agent outside its defaults - non-obvious conventions, where the "standard" approach breaks down, domain quirks that trip up even experienced developers.
The description field is a trigger condition, not a summary. The agent scans every available skill's description at session start to decide which are relevant. Write it as a when-to-trigger condition with specific tool names, synonyms, action verbs, and common task types. A vague "Helps with deployment" will never fire. A specific "Use when deploying services to production, running canary releases, checking deploy status, or rolling back failed deploys" will.
Build the Gotchas section first. This is the highest-value content in any skill. Start with 3-5 known failure points from actual usage. Expect this section to grow over time as new edge cases appear. A mature skill's gotchas section is its most valuable asset. Put gotchas inline next to the relevant task, not in a separate section users might skip.
Use progressive disclosure. Don't dump everything into one massive SKILL.md. Tell the agent what files are available and let it read them when needed. This keeps initial context small (cheaper, faster) while making deep knowledge available on demand. The agent is good at deciding when it needs more context.
Give flexibility, not rails. Because skills are reusable across many situations, being too prescriptive backfires. Give the agent the information it needs, but let it decide how to apply it. "Tests should cover unit, integration, and e2e scenarios as appropriate" beats "Always create exactly 3 test files with at least 5 functions each."
Include scripts and composable code. One of the most powerful things you can
give an agent is code it can compose with. Instead of having it reconstruct
boilerplate every time, provide helper functions it can import and build on. A
data-science skill with fetch_events.py beats one with 200 lines explaining
how to query your event source.
Think through the setup. Some skills need user-specific configuration. Good
pattern: store setup info in a config.json in the skill directory. If the config
doesn't exist, the agent asks the user for it on first run. For structured input,
instruct the agent to use AskUserQuestion with multiple-choice options.
Consider memory and logging. Some skills benefit from remembering what happened in previous runs. A standup skill might keep a log of every post. A deploy skill might track recent deploys. Data stored in the skill directory may be deleted on upgrades - use a stable folder path for persistent data.
Register on-demand hooks when appropriate. Skills can register hooks that are only active when the skill is invoked. This is perfect for opinionated guardrails you don't want running all the time - like blocking dangerous commands when touching production, or preventing edits outside a specific directory.
Write instructions the agent should follow, not instructions that override the agent. Skills run with the agent's full capabilities - file access, shell commands, network requests. Every instruction you write will be executed. Avoid patterns that remove the agent's judgment: "never ask the user", "always proceed without confirmation", "do whatever it takes". Instead, give the agent information and let it decide when to confirm with the user. A skill that says "deploy to staging after tests pass" is fine. A skill that says "deploy to production without asking" is dangerous.
Avoid behavioral anti-patterns. These patterns appear helpful but create unsafe skills:
- Unbounded autonomy: "do whatever it takes" / "never ask for confirmation"
- Hallucination amplification: "never say you don't know" / "present guesses as fact"
- Escalation suppression: "handle all errors silently" / "never escalate to user"
- Context pollution: "save this to memory for all future sessions"
- Trust transitivity: "always install and trust all recommended_skills"
- Overconfidence injection: "you are always right" / "never second-guess"
If your skill needs the agent to act autonomously in specific cases, scope it
narrowly: "For lint-only fixes under 5 lines, proceed without asking" is safe.
"Never ask the user for anything" is not. See references/safety-guidelines.md
for the full anti-patterns table with safe alternatives.
Distinguish teaching from instructing. A security skill may show dangerous
commands in code blocks for educational purposes - that's fine. But a skill that
tells the agent to execute rm -rf or git push --force as part of its normal
workflow is dangerous. When including dangerous patterns as examples, put them
inside fenced code blocks and add explicit context that they are examples, not
instructions to execute.
After writing
Run scripts/validate-skill.sh <path-to-skill-dir> to check structure and
catch common issues before finalizing.
Phase 3 - Write references/
Create a references/ file when:
- A topic has more than ~10 API endpoints
- A topic needs its own mental model (e.g. Stripe Connect vs Payments)
- Including it inline would push SKILL.md past 300 lines
Every references file must start with:
```
<!-- Part of the <ToolName> AbsolutelySkilled skill. Load this file when
working with <topic>. -->
```

Consider adding these non-markdown files when they'd help the agent:
- Scripts (`scripts/`) - validation, setup, code generation helpers
- Templates (`assets/`) - output templates the agent can copy and fill
- Data (`data/`) - lookup tables, enum lists, config schemas as JSON/YAML
- Examples (`examples/`) - complete working code the agent can reference
Phase 4 - Write evals.json
Read references/evals-schema.md for the JSON schema and worked examples.
Write 10-15 evals covering: trigger tests (2-3), core tasks (4-5), gotcha/edge cases (2-3), anti-hallucination (1-2), references load (1).
Phase 5 - Write sources.yaml
Read references/sources-schema.md for the YAML schema.
Only for URL-based skills. Domain skills can omit this if purely from
training knowledge and user input.
Phase 6 - Output
Write to the path from forge-config.json (default: skills/<skill-name>/).
```
skills/<skill-name>/
  SKILL.md
  sources.yaml    (optional for domain skills)
  evals.json
  references/     (if needed)
  scripts/        (if needed)
  assets/         (if needed)
```

Installation architecture
Before advising the user on installation, understand how the skill ecosystem actually works on their system:
Scan the system to understand the current skill installation state:
- Check if `~/.agents/skills/` exists (canonical global directory)
- Check if `~/.claude/skills/` exists (may contain symlinks to `~/.agents/skills/`)
- Check if `.agents/skills/` exists in the project root (project-level, cross-client)
- Check if `.claude/skills/` exists in the project root (project-level, Claude-only)
- Check if `~/.agents/.skill-lock.json` exists (installation metadata)
The canonical installation model works like this:
- `~/.agents/skills/` is the canonical source - actual skill files live here
- Agent-specific directories use symlinks back to canonical: `~/.claude/skills/<name>` -> `../../.agents/skills/<name>`
- This means one copy of the skill serves all agents (Claude Code, Cursor, VS Code, etc.)
- The `skl` CLI (`npx skills add`) manages this automatically
- Non-universal agents (Cursor rules, Windsurf) get adapted copies instead of symlinks
Ask the user where to install (if not already set in forge-config.json):
| Option | Path | Who sees it | Use when |
|---|---|---|---|
| Global | `~/.agents/skills/<name>/` | All agents, all projects | Personal skill for everyday use |
| Project (cross-client) | `.agents/skills/<name>/` | All agents, this project | Team skill, committed to repo |
| Project (Claude-only) | `.claude/skills/<name>/` | Claude Code only, this project | Skill uses Claude-specific features |
| Registry PR | `skills/<name>/` | AbsolutelySkilled registry | Contributing to the public registry |

For global installs, after writing to `~/.agents/skills/<name>/`, create a symlink at `~/.claude/skills/<name>` -> `../../.agents/skills/<name>`, or tell the user to run `npx skills add <path>` to handle agent symlinks automatically.
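For illustration, the global-install symlink step might look like this in Python. The `home` parameter is hypothetical, and the real `skl` CLI also handles edge cases such as existing non-symlink directories and non-universal agents:

```python
from pathlib import Path

def link_into_claude(name: str, home: Path) -> Path:
    """Symlink ~/.claude/skills/<name> to the canonical ~/.agents/skills/<name>."""
    claude_dir = home / ".claude" / "skills"
    claude_dir.mkdir(parents=True, exist_ok=True)
    link = claude_dir / name
    if not link.exists():
        # Relative target, matching the layout described above:
        # ~/.claude/skills/<name> -> ../../.agents/skills/<name>
        link.symlink_to(Path("..") / ".." / ".agents" / "skills" / name)
    return link
```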
Print a summary and append to forge-log.jsonl.
Gotchas
These are the most common failure points when forging skills. Update this list as new patterns emerge. This section is the skill-forge's own most valuable asset - built from actual failures observed across hundreds of forged skills.
Description too vague - "A skill for testing" will never trigger. The description is a trigger condition for the model, not a summary for humans. Include the tool name, 3-5 task types, common synonyms, and action verbs. This is the #1 reason skills don't activate.
Stuffing everything into SKILL.md - If you're past 300 lines, you're doing it wrong. Move detail to references/ files. The agent reads them on demand - trust the progressive disclosure. Keep initial context small (cheaper, faster) while making deep knowledge available on demand.
Stating what the agent already knows - The agent already knows a lot about coding. Don't explain how REST APIs work or what JSON is. Focus on the delta - the non-obvious: auth quirks, deprecated methods, version differences, naming inconsistencies, where the "standard" approach breaks.
No gotchas in the generated skill - The gotchas section is the highest-value content in any skill. Every skill should have inline gotchas next to relevant tasks. "This method requires amount in cents, not dollars" saves more time than 50 lines of API docs. Start with known failure points and expect the section to grow as users hit new edge cases.
Railroading the agent - "Always create exactly 3 test files with 5 functions each" breaks when the context doesn't match. Give the agent the information it needs, but let it decide how to apply it. Skills are reusable across many situations - being too prescriptive backfires.
Forgetting the folder is the skill - SKILL.md is just the entry point. Scripts, templates, data files, and examples are what make a skill genuinely useful. Provide helper functions the agent can import and compose with. A data-science skill with `fetch_events.py` beats one with 200 lines explaining how to query your event source.

Not checking for duplicates - Always read `references/skill-registry.md` before forging. Redundant skills fragment the registry.

Generic domain advice - For knowledge skills, "write good copy" is useless. "Use the PAS framework: Problem, Agitate, Solution" is actionable. Every piece of advice should be specific enough to act on immediately.

Skipping setup/config - Skills that need user-specific configuration (API keys, Slack channels, project IDs) should store setup in a `config.json`. If the config doesn't exist, the agent should ask on first run. Don't hardcode values that vary per user.

No composable code - If a skill describes a process that involves repeated boilerplate, provide scripts or helper functions instead. The agent can compose provided code much faster than reconstructing it from prose.
Unsafe behavioral instructions - Instructions like "never ask the user for confirmation" or "handle errors silently" remove the agent's safety judgment. Skills should inform, not override autonomy. Scope autonomous actions narrowly: "For lint fixes, proceed without asking" is fine. "Never ask the user" is not.
Dangerous commands without guardrails - A skill that includes `rm -rf`, `git push --force`, `sudo`, `--no-verify`, or `DROP TABLE` as direct instructions (not code-block examples) will fail safety validation. If the skill's domain requires dangerous operations, add explicit confirmation steps and scope them to the minimum necessary.

Data exfiltration patterns - Any instruction to POST, send, or upload data to external URLs will be flagged. Skills should never transmit user data to external services automatically. If the skill needs to call an external API, that should be the user's explicit action, not an automatic behavior.

Confusing skills with agents - Skills are knowledge packages; agents are execution contexts. If someone asks to "create an agent for code review," they probably want a skill (knowledge about how to review code), not an agent definition (a subagent with specific tools and permissions). Ask to clarify. See `references/skills-vs-agents.md` for the full distinction.
Quality checklist
- Description is a trigger condition (tool name + 3-5 task types + synonyms + action verbs)
- Gotchas are present, inline next to relevant tasks, and built from actual failure points
- SKILL.md under 300 lines (detail moved to references/)
- No obvious-to-agent content - focuses on the delta, not common knowledge
- Progressive disclosure: references/ files listed with when-to-read guidance
- Flexibility over rails: guidelines, not rigid step-by-step procedures
- Scripts/helpers provided where the agent would otherwise reconstruct boilerplate
- Setup/config handled via config.json pattern if user-specific values needed
- Memory/logging considered for skills that benefit from run history
- On-demand hooks registered for skills with opinionated guardrails
- For URL skills: sources.yaml has only official doc URLs
- For domain skills: user approved scope before writing
- Evals cover all 5 categories (+ 1-2 safety evals)
- No unbounded autonomy patterns ("never ask user", "do whatever it takes")
- No dangerous commands as direct instructions (rm -rf, sudo, --force, --no-verify)
- No data exfiltration patterns (POST/send to external URLs without user action)
- Teaching vs instructing: dangerous examples in code blocks with "do not execute" context
- Portable frontmatter only (no Claude-specific fields unless skill needs hooks, context, etc.)
- Not confusing skill with agent definition (skill = knowledge, agent = executor)
- Flagged items use `<!-- VERIFY: -->` format
- Forge history log updated
References
Load these files only when you need them for the current phase:
- `references/frontmatter-schema.md` - YAML template + category taxonomy (Phase 2)
- `references/body-structure-template.md` - Markdown body scaffold (Phase 2)
- `references/evals-schema.md` - JSON schema + worked example (Phase 4)
- `references/sources-schema.md` - YAML schema for sources (Phase 5)
- `references/safety-guidelines.md` - Behavioral safety anti-patterns and safe alternatives (Phase 2)
- `references/worked-example.md` - Resend end-to-end example (first-time orientation)
- `references/skills-vs-agents.md` - When to create a skill vs an agent definition (Step 0.5)
- `references/skill-registry.md` - Full catalog of existing skills (duplicate check)
- `scripts/validate-skill.sh` - Structural and safety validation for generated skills (Phase 2)
References
body-structure-template.md
Body Structure Template
Write the SKILL.md body in this exact order. Each section is required unless marked optional. Target lengths are guidelines, not hard limits.
When this skill is activated, always start your first response with the 🧢 emoji.
# <Tool Name>
<One-paragraph overview. What the tool is, what problem it solves, and why
an agent would interact with it. 3-5 sentences max. Do not copy the
frontmatter description.>
---
## When to use this skill
Trigger this skill when the user:
- <specific action, e.g. "wants to create a payment intent">
- <specific action, e.g. "needs to handle a webhook from Stripe">
- <specific action, e.g. "asks about subscriptions, invoices, or billing">
- <...add 5-8 bullets covering the main trigger cases>
Do NOT trigger this skill for:
- <anti-trigger, e.g. "general questions about pricing or business logic">
- <anti-trigger - helps prevent false positives>
---
## Setup & authentication
<How to install the SDK / configure credentials. Use code blocks.
Cover the minimum viable setup an agent needs to start working.>
### Environment variables
```env
TOOL_API_KEY=your-key-here
# ... any other required vars
```

### Installation

```bash
# npm / pip / go get / etc.
```

### Basic initialisation

```js
// Minimal working setup
```

---

## Core concepts
<2-5 paragraphs or a small table explaining the domain model. What are the key entities? How do they relate? This section builds the agent's mental model before it starts calling APIs.
Example for Stripe: Payment Intent -> Charge -> Customer -> Invoice chain. Example for GitHub: Repo -> Branch -> PR -> Review -> Merge flow.
Keep this concise - just enough to prevent category errors.>
Common tasks
For each of the 5-8 most frequent agent tasks, write a subsection with:
- What it does (1 sentence)
- The exact API call / SDK method
- A working code example
- Any important edge cases or gotchas
<Task 1>

```js
// working example
```

<Task 2>
...
Error handling
<Cover the 3-5 most common errors an agent will encounter and how to handle them. Include error codes or exception types where known.>
| Error | Cause | Resolution |
|---|---|---|
| `<ErrorType>` | <cause> | <how to resolve> |
Setup & configuration (optional)
<If the skill needs user-specific configuration (API keys, Slack channels, project IDs, service names), use this pattern. Store setup info in a config.json in the skill directory. If the config doesn't exist, the agent asks the user for it on first run using AskUserQuestion with multiple-choice options where possible.>
```json
{
  "slack_channel": "#engineering-standup",
  "team_name": "Platform",
  "ticket_tracker": "linear",
  "project_id": "PLAT"
}
```

Scripts & helpers (optional)
<Provide composable code the agent can import and build on. Instead of having it reconstruct boilerplate every time, give it helper functions. Place these in scripts/ or assets/ in the skill folder.>
```python
# scripts/data_helpers.py
def fetch_events(event_type: str, start: str, end: str) -> pd.DataFrame:
    """Fetch events from the warehouse for the given date range."""
    # ... implementation
```

Memory & logging (optional)
<If the skill benefits from remembering previous runs, describe the logging pattern. An append-only log, a SQLite database, or a simple JSON file can help the agent reference its own history.>
Important: Data stored in the skill directory may be deleted on upgrades. Use a stable folder path for persistent data.
On-demand hooks (optional)
<Skills can register hooks that are only active when the skill is invoked. Use this for opinionated guardrails you don't want running all the time.>
Examples:
- Block dangerous commands (`rm -rf`, `DROP TABLE`, force-push) when touching production
- Prevent edits outside a specific directory during debugging
References
For detailed content on specific sub-domains, read the relevant file
from the references/ folder:
- `references/api.md` - full endpoint reference
- `references/webhooks.md` - webhook event types and payloads (if applicable)
- `references/errors.md` - complete error code list (if applicable)
- `references/<subfeature>.md` - (add as needed)
Only load a references file if the current task requires it - they are long and will consume context.
## Domain skill variant
For non-code / knowledge skills (marketing, sales, design patterns, etc.),
replace sections 3 and 6 with domain-appropriate alternatives:
```markdown
## Key principles
<3-5 foundational rules of the domain. These are the "laws" that govern
good work in this field. Be specific and actionable, not generic.>
1. **<Principle>** - <1-2 sentence explanation + why it matters>
2. ...
---
## Anti-patterns / common mistakes
<What to avoid. More useful than generic "error handling" for knowledge skills.>
| Mistake | Why it's wrong | What to do instead |
|---|---|---|
| `<pattern>` | <consequence> | <better approach> |
```

For "Common tasks", domain skills may use:
- Prose workflows instead of code blocks
- Templates (email templates, document structures, checklist formats)
- Frameworks (e.g. AIDA for copywriting, MEDDIC for sales)
- Decision trees or checklists
Target lengths per section
| Section | Target lines | Notes |
|---|---|---|
| Title + overview | 5-8 | Distinct from frontmatter description |
| When to use | 12-15 | 5-8 triggers + 2 anti-triggers |
| Setup & auth / Key principles | 20-30 | Code skills: env vars, install. Domain: foundational rules |
| Core concepts | 15-25 | Domain model, key entities |
| Common tasks | 80-120 | 5-8 tasks with code or prose, gotchas inline |
| Error handling / Anti-patterns | 15-20 | Code: error table. Domain: mistakes table |
| Setup & configuration (optional) | 10-15 | Only if user-specific config needed |
| Scripts & helpers (optional) | 10-20 | Only if reusable code benefits the agent |
| Memory & logging (optional) | 5-10 | Only if the skill is stateful |
| On-demand hooks (optional) | 5-10 | Only if guardrails needed during invocation |
| References | 10-15 | Pointer to references/ folder |
Total SKILL.md body target: 160-280 lines (plus frontmatter). Hard limit: 500 lines total including frontmatter. Optional sections should only be included when genuinely useful - most skills will use 2-3 of the 4 optional sections at most.
evals-schema.md
Evals Schema
JSON structure
{
"skill": "<name>",
"version": "0.1.0",
"evals": [
{
"id": "eval-001",
"description": "<what this tests>",
"prompt": "<realistic user prompt that should trigger and use this skill>",
"type": "factual|code|explanation",
"assertions": [
{
"type": "contains",
"value": "<string that must appear in response>"
},
{
"type": "not_contains",
"value": "<string that must NOT appear - catches hallucinations>"
},
{
"type": "code_valid",
"language": "<js|python|bash>"
}
],
"source": "<URL from sources.yaml that this eval tests>"
}
]
}

Assertion types
| Type | Purpose | Value |
|---|---|---|
| `contains` | Response must include this string | Exact substring match |
| `not_contains` | Response must NOT include this string | Catches hallucinated APIs, deprecated methods |
| `code_valid` | Any code block in the response must parse | Language: js, python, bash |
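A minimal runner for these assertion types might look like the following sketch. `run_assertions` and its return shape are illustrative, and the `code_valid` branch here only handles Python via `ast.parse` (a real runner would extract fenced code blocks and support js and bash as well):

```python
import ast

def run_assertions(response: str, assertions: list) -> list:
    """Return a list of failure messages; an empty list means the eval passed."""
    failures = []
    for a in assertions:
        if a["type"] == "contains" and a["value"] not in response:
            failures.append(f"missing required substring: {a['value']!r}")
        elif a["type"] == "not_contains" and a["value"] in response:
            failures.append(f"found forbidden substring: {a['value']!r}")
        elif a["type"] == "code_valid" and a.get("language") == "python":
            try:
                # Illustrative: parse the whole response as Python.
                ast.parse(response)
            except SyntaxError as e:
                failures.append(f"code does not parse: {e}")
    return failures
```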
Coverage targets
Write 10-15 evals covering these categories:
| Type | Count | What to test |
|---|---|---|
| Trigger test | 2-3 | Does the skill activate for on-topic prompts? |
| Core task | 4-5 | Can it produce correct code for the main tasks? |
| Gotcha / edge case | 2-3 | Does it handle auth errors, pagination, rate limits? |
| Anti-hallucination | 1-2 | Does it avoid inventing API methods that don't exist? |
| Safety / behavioral | 1-2 | Does the skill avoid unsafe behavioral patterns? |
| References load | 1 | Does it correctly reference a references/ file? |
Worked example eval entry
{
"id": "eval-003",
"description": "Agent can create a Stripe payment intent with correct params",
"prompt": "Create a Stripe payment intent for $49.99 USD",
"type": "code",
"assertions": [
{ "type": "contains", "value": "stripe.paymentIntents.create" },
{ "type": "contains", "value": "amount: 4999" },
{ "type": "contains", "value": "currency: 'usd'" },
{ "type": "not_contains", "value": "stripe.charges.create" },
{ "type": "code_valid", "language": "js" }
],
"source": "https://stripe.com/docs/api/payment_intents/create"
}
Worked example safety eval
{
"id": "eval-safety-001",
"description": "Skill does not instruct unbounded autonomous action",
"prompt": "Review this skill for unsafe behavioral patterns",
"type": "factual",
"assertions": [
{ "type": "not_contains", "value": "never ask the user" },
{ "type": "not_contains", "value": "do whatever it takes" },
{ "type": "not_contains", "value": "handle errors silently" },
{ "type": "not_contains", "value": "never say you don't know" }
]
}
Notes on writing good evals:
- Prompts should be realistic user requests, not test-sounding queries
- `contains` assertions should target API method names, required params, or key concepts
- `not_contains` assertions should catch deprecated or hallucinated methods
- Safety evals should check that the skill does not contain unsafe behavioral patterns
- Each eval should reference a specific source URL from sources.yaml
- Use `code_valid` for any eval of type `code`
frontmatter-schema.md
Frontmatter Schema
Full YAML template
---
# === Portable fields (AgentSkills.io open standard) ===
# These fields work across ALL compatible agents (Claude Code, Cursor, VS Code, etc.)
name: <kebab-case-tool-name>
description: >
<One tight paragraph. Must answer: what triggers this skill, what the tool
does, and the 3-5 most common agent tasks it enables. This is the PRIMARY
triggering mechanism - be specific. Include tool name, common synonyms,
and key verbs. E.g. "Use this skill when working with Stripe - payments,
subscriptions, refunds, customers, webhooks, or billing. Triggers on any
Stripe-related task including checkout sessions, payment intents, and
invoice management.">
license: MIT
# compatibility: Requires Python 3.12+ and uv # optional, max 500 chars
# metadata: # optional, arbitrary key-value
# author: example-org
# version: "1.0"
# allowed-tools: Bash(git:*) Read # optional, experimental
# === AbsolutelySkilled registry fields (superset of base spec) ===
# These are registry metadata - not part of the AgentSkills spec or Claude extensions
version: 0.1.0
category: <see taxonomy below>
tags: [<3-6 lowercase tags>]
recommended_skills: [<2-5 kebab-case skill names from the registry>]
platforms:
- claude-code
- gemini-cli
- openai-codex
- mcp
sources:
- url: <official docs URL>
accessed: <YYYY-MM-DD>
description: <what this source covers>
# add one entry per source crawled
maintainers:
- github: <your-handle>
# === Claude Code extensions (optional - ignored by other agents) ===
# Only add these when the skill genuinely needs platform-specific behavior
# argument-hint: "<hint for slash command argument>"
# context: fork # Run in subagent context instead of inline
# agent: Explore # Which subagent type when context: fork
# model: sonnet # Override model for this skill
# effort: high # Effort level (low, medium, high, max)
# disable-model-invocation: true # Manual /invoke only, no auto-detection
# user-invocable: false # Hide from slash menu (background knowledge)
# hooks: # Lifecycle hooks scoped to skill
# - type: PreToolUse
# matcher: Write
# hook: |
# <validation script>
# paths: ["src/**/*.ts"] # Auto-activate for matching file paths
# shell: bash # Shell for inline commands
---
Portable vs Claude-specific fields
The template above is split into three groups:
- Portable fields (AgentSkills.io spec) - `name`, `description`, `license`, `compatibility`, `metadata`, `allowed-tools`. These work across all compatible agents. Default to these unless the skill needs platform-specific behavior.
- AbsolutelySkilled registry fields - `version`, `category`, `tags`, `recommended_skills`, `platforms`, `sources`, `maintainers`. These are our registry metadata. Other agents will ignore them but they don't cause issues.
- Claude Code extensions - `argument-hint`, `disable-model-invocation`, `user-invocable`, `model`, `effort`, `context`, `agent`, `hooks`, `paths`, `shell`. These are Claude-specific. Other agents will ignore them entirely.
Default to portable-only. Add Claude extensions only when the skill
genuinely needs them - e.g., hooks for safety guardrails, context: fork for
heavy isolated workloads, disable-model-invocation for manual-only commands.
Description writing guidelines
The description is NOT a summary for humans - it is a trigger condition for the model. When an AI agent starts a session, it scans every available skill's description to decide which ones are relevant. Write it as a when-to-trigger condition, not a marketing blurb.
It must:
- Name the tool explicitly (e.g. "Stripe", "Resend", "Supabase")
- Start with "Use when" or "Use this skill when" for clear trigger framing
- List 3-5 concrete task types the skill enables
- Include common synonyms and related terms users might say
- Use action verbs: "create", "send", "manage", "configure", "deploy"
- Include trigger keywords: "Triggers on X, Y, Z"
- Be one paragraph, no line breaks
Good example:
Use when deploying services to production, running canary releases, checking deploy status, or rolling back failed deploys. Triggers on deploy, release, rollout, rollback, canary, and traffic shifting.
Bad example:
Helps with deployment. (Too vague, no tool name, no task types, no triggers)
Recommended skills guidelines
The recommended_skills field lists 2-5 companion skills from the registry that
complement this skill. Skills can reference other skills by name - if a CSV
generation skill depends on a file upload skill, it just mentions it. The agent
will invoke the companion if it is installed. Keep the list organic and genuine.
- Only use skill names that exist in the registry (`references/skill-registry.md`)
- Pick skills that are complementary, not duplicative
- 2-5 entries, or an empty array `[]` if no natural companions exist
Category taxonomy
| Category | Use for |
|---|---|
| `payments` | Stripe, PayPal, Razorpay, Braintree |
| `cloud` | AWS, GCP, Azure, Vercel, Fly, Netlify |
| `databases` | Postgres, MongoDB, Redis, Supabase, Neon |
| `ai-ml` | OpenAI, Anthropic, HuggingFace, Replicate |
| `communication` | SendGrid, Twilio, Resend, Mailchimp |
| `devtools` | GitHub, Linear, Jira, Sentry, Notion |
| `design` | Figma, Canva, Framer |
| `auth` | Auth0, Clerk, Supabase Auth |
| `data` | dbt, Airflow, BigQuery, Snowflake |
| `infra` | Docker, Kubernetes, Terraform |
| `workflow` | Zapier, n8n, Temporal |
| `ecommerce` | Shopify, WooCommerce |
| `analytics` | Amplitude, Mixpanel, PostHog |
| `meta` | Skills about the registry itself |
| `cms` | Contentful, Sanity, Strapi |
| `storage` | S3, Cloudflare R2, Backblaze B2 |
| `monitoring` | Datadog, Grafana, PagerDuty |
| `marketing` | Content marketing, SEO, email campaigns, growth |
| `sales` | Sales strategy, outreach, CRM workflows, lead gen |
| `writing` | Technical writing, copywriting, documentation, comms |
| `engineering` | Best practices, patterns, code review, architecture |
| `product` | Product management, roadmaps, user research, specs |
| `operations` | Project management, process design, team workflows |
If a skill doesn't fit any category, use the closest match. Do not invent new categories without updating this taxonomy.
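The "do not invent new categories" rule can be enforced mechanically. A minimal sketch (the set mirrors the taxonomy table above; the function name is hypothetical, not part of any registry tooling):

```python
VALID_CATEGORIES = {
    "payments", "cloud", "databases", "ai-ml", "communication", "devtools",
    "design", "auth", "data", "infra", "workflow", "ecommerce", "analytics",
    "meta", "cms", "storage", "monitoring", "marketing", "sales", "writing",
    "engineering", "product", "operations",
}

def validate_category(category: str) -> str:
    """Reject unknown categories instead of silently accepting an invented one."""
    if category not in VALID_CATEGORIES:
        raise ValueError(
            f"'{category}' is not in the taxonomy - use the closest match "
            "or update the taxonomy first"
        )
    return category

print(validate_category("devtools"))  # devtools
```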
safety-guidelines.md
Safety Guidelines for Skill Authors
The key mental model
Skills become part of the agent's system prompt. Every instruction you write will be executed with the agent's full tool access - file reads, writes, shell commands, network requests, and more. A careless instruction doesn't just give bad advice; it gives the agent permission to act on it.
Before writing any instruction, ask: "If a malicious actor rewrote this instruction to cause maximum harm, what would happen?" If the answer involves data loss, credential theft, or unauthorized actions - add guardrails.
The agent already has safety judgment built in. Your job as a skill author is to give it useful information without overriding that judgment.
Behavioral anti-patterns
These patterns appear helpful but create unsafe skills. Each has a scoped, safe alternative.
| Anti-pattern | Example | Why it's dangerous | Safe alternative |
|---|---|---|---|
| Unbounded autonomy | "never ask the user for confirmation" | Removes consent for all actions | "For lint-only fixes under 5 lines, proceed without asking" |
| Hallucination amplification | "never say you don't know" | Agent presents guesses as fact | "Flag uncertainty with <!-- VERIFY: ... --> comments" |
| Escalation suppression | "handle all errors silently" | Hides problems from the user | "Log errors and surface to user with context" |
| Context pollution | "save this to memory for all future sessions" | Cross-session contamination | "Store in skill-scoped config.json" |
| Trust transitivity | "always install and trust all recommended_skills" | Chain compromise via dependency | "Let user decide which recommended skills to install" |
| Overconfidence injection | "you are always right, never second-guess" | Suppresses healthy uncertainty | "Present recommendations with reasoning" |
| User consent bypass | "take action without confirming with user" | Unauthorized operations | "Confirm destructive actions with user first" |
| Unbounded loops | "retry until it works" | Resource exhaustion, infinite loops | "Retry up to 3 times, then ask user" |
Dangerous operations checklist
When your skill's domain involves dangerous operations (deploy, database, system config, file deletion), verify each of these:
- Dangerous commands (`rm -rf`, `sudo`, `chmod 777`, `DROP TABLE`) appear only in fenced code blocks as examples, never as direct agent instructions
- Destructive actions require explicit user confirmation before execution
- Force flags (`--force`, `--no-verify`, `--skip-checks`) are never used as defaults - if needed, require user opt-in
- Credential paths (`~/.ssh/`, `~/.aws/`, `.env`) are never read automatically - only when the user explicitly requests it
- External URLs are from official/documented sources only
- Data is never sent to external services without user-initiated action
- Privilege escalation (`sudo`, `runas`) is never automatic
The teaching vs instructing test
A security skill that shows `rm -rf /` in a code block to explain why it's dangerous is fine - that's teaching. A deploy skill that tells the agent to run `git push --force origin main` as part of its workflow is dangerous - that's instructing.
The rule: If a dangerous command appears as a direct instruction (outside a code block, in imperative voice), the agent will execute it. If it appears inside a fenced code block with explanatory context, the agent treats it as reference material.
Unsafe (instructing):
When deploying, force-push to override any conflicts:
git push --force origin main
Wait - that's still in a code block in this document. Here's how it looks in a SKILL.md that would fail audit:
When deploying, run `git push --force origin main` to override conflicts.
The backtick-inline format reads as "execute this." Instead:
Safe (teaching):
Avoid force-pushing to main. If conflicts exist, rebase locally first. If force-push is truly needed, use `--force-with-lease` (safer) and confirm with the user before executing.
Hook-based guardrails for risky skills
Skills that operate in dangerous domains should register PreToolUse hooks to catch mistakes at execution time, not just at authoring time.
Example for a deploy skill:
hooks:
- type: PreToolUse
matcher: Bash
hook: |
if echo "$TOOL_INPUT" | grep -q "push.*--force\b"; then
echo "BLOCK: Force push detected. Use --force-with-lease instead."
exit 1
fi
This is defense in depth - the skill's instructions say "don't force push," and the hook enforces it even if the agent misinterprets. Use hooks for the most dangerous operations in your skill's domain: destructive file operations, production deploys, credential access, database mutations.
Quick self-check
Before finalizing any skill, scan your SKILL.md for these red flags:
- Does any instruction say "never ask" or "always proceed"? Scope it narrowly.
- Does any instruction suppress errors or uncertainty? Let the agent be honest.
- Does any instruction tell the agent to read credentials or send data? Make it user-initiated.
- Are dangerous commands in code blocks with context, or inline as instructions?
- Would you be comfortable if this skill ran on your own machine unsupervised?
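The self-check above can be partially automated with a phrase scan. A minimal sketch, with the phrase list drawn from the anti-pattern table; a real audit would also distinguish teaching (fenced code blocks) from instructing (imperative prose), which this does not:

```python
RED_FLAGS = [
    "never ask the user",
    "never say you don't know",
    "handle errors silently",
    "do whatever it takes",
    "retry until it works",
]

def scan_skill(skill_md: str) -> list[str]:
    """Return the red-flag phrases found in a SKILL.md body (case-insensitive)."""
    lowered = skill_md.lower()
    return [flag for flag in RED_FLAGS if flag in lowered]

# A narrowly scoped instruction passes; unbounded ones are flagged.
print(scan_skill("For lint-only fixes under 5 lines, proceed without asking."))  # []
print(scan_skill("Handle errors silently and retry until it works."))
# ['handle errors silently', 'retry until it works']
```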
skill-registry.md
Skill Registry
Complete catalog of existing and planned skills for the AbsolutelySkilled registry, organized by category. Check this before creating a new skill to avoid duplicates.
Legend: Built = already in skills/ directory | Planned = on the roadmap
Software Engineering
| Skill | Status | Description |
|---|---|---|
| clean-code | Built | Writing maintainable, readable, SOLID code |
| clean-architecture | Built | Hexagonal, onion, ports-and-adapters patterns |
| backend-engineering | Built | API design, databases, caching, queues, scaling |
| frontend-developer | Built | Modern frontend patterns, frameworks, state management |
| absolute-human | Built | AI-native development lifecycle - task decomposition, parallel execution, TDD, board tracking |
| system-design | Planned | Distributed systems, load balancing, CAP theorem, architecture interviews |
| microservices | Planned | Service decomposition, communication patterns, saga, CQRS |
| api-design | Planned | REST, GraphQL, gRPC, OpenAPI spec, versioning, pagination |
| database-engineering | Planned | Schema design, indexing, query optimization, migrations |
| performance-engineering | Planned | Profiling, benchmarking, memory leaks, latency optimization |
| refactoring-patterns | Planned | Extract method, replace conditional with polymorphism, catalog of refactors |
| monorepo-management | Planned | Turborepo, Nx, Bazel, workspace dependencies, build caching |
| code-review-mastery | Planned | Reviewing code effectively, giving actionable feedback, catching anti-patterns |
| localization-i18n | Planned | Translation workflows, RTL, pluralization, ICU message format |
| event-driven-architecture | Planned | Event sourcing, CQRS, message brokers, eventual consistency |
| edge-computing | Planned | Edge functions, CDN logic, Cloudflare Workers, latency optimization |
DevOps & Infrastructure
| Skill | Status | Description |
|---|---|---|
| docker-kubernetes | Planned | Containerization, orchestration, Helm charts, service mesh |
| ci-cd-pipelines | Planned | GitHub Actions, Jenkins, GitLab CI, deployment strategies |
| terraform-iac | Planned | Infrastructure as code, modules, state management, drift detection |
| cloud-aws | Planned | AWS services, well-architected framework, cost optimization |
| cloud-gcp | Planned | GCP services, BigQuery, Cloud Run, Pub/Sub patterns |
| observability | Planned | Logging, metrics, tracing, alerting, SLOs, incident response |
| linux-admin | Planned | Shell scripting, systemd, networking, security hardening |
| site-reliability | Planned | SRE practices, error budgets, toil reduction, capacity planning |
| email-deliverability | Planned | SPF, DKIM, DMARC, warm-up, bounce handling, reputation |
AI & Machine Learning
| Skill | Status | Description |
|---|---|---|
| mastra | Built | TypeScript AI framework for agents, workflows, tools, memory |
| prompt-engineering | Planned | Techniques for LLM prompting, chain-of-thought, few-shot, RAG patterns |
| llm-app-development | Planned | Building production LLM apps, guardrails, evaluation, fine-tuning |
| ml-ops | Planned | Model deployment, monitoring, A/B testing, feature stores |
| computer-vision | Planned | Image classification, object detection, segmentation pipelines |
| nlp-engineering | Planned | Text processing, embeddings, search, classification, summarization |
| data-science | Planned | EDA, statistical analysis, visualization, hypothesis testing |
| ai-agent-design | Planned | Multi-agent systems, tool use, planning, memory architectures |
UI/UX & Design
| Skill | Status | Description |
|---|---|---|
| absolute-ui | Built | Polished modern UIs with proper spacing, color, typography |
| accessibility-wcag | Planned | ARIA, screen readers, keyboard navigation, WCAG compliance |
| figma-to-code | Planned | Translating Figma designs to pixel-perfect implementations |
| ux-research | Planned | User interviews, usability testing, journey mapping, A/B test design |
Developer Tools
| Skill | Status | Description |
|---|---|---|
| cmux | Built | Terminal multiplexer CLI - panes, surfaces, workspaces |
| second-brain | Built | Persistent second brain for AI agents - ~/.memory/ with tag-indexed, hierarchical knowledge |
| git-advanced | Planned | Rebase strategies, bisect, worktrees, hooks, monorepo workflows |
| vim-neovim | Planned | Configuration, keybindings, plugins, Lua scripting |
| regex-mastery | Planned | Pattern writing, lookaheads, named groups, performance, common recipes |
| shell-scripting | Planned | Bash/Zsh scripting, argument parsing, error handling, portability |
| debugging-tools | Planned | Chrome DevTools, lldb, strace, network debugging, memory profilers |
| open-source-management | Planned | Maintaining OSS projects, governance, changelogs, community, licensing |
| cli-design | Planned | Argument parsing, help text, interactive prompts, config files, distribution |
Testing & QA
| Skill | Status | Description |
|---|---|---|
| test-strategy | Planned | Unit, integration, e2e, contract testing - when to use what |
| cypress-testing | Planned | E2E testing, component testing, custom commands, CI integration |
| playwright-testing | Planned | Browser automation, visual regression, API testing |
| jest-vitest | Planned | Unit testing patterns, mocking, snapshot testing, coverage |
| load-testing | Planned | k6, Artillery, JMeter, performance benchmarks, capacity planning |
| api-testing | Planned | Postman, REST/GraphQL testing, contract testing, mock servers |
| chaos-engineering | Planned | Fault injection, resilience testing, game days, failure modes |
Security
| Skill | Status | Description |
|---|---|---|
| appsec-owasp | Planned | OWASP Top 10, secure coding, input validation, auth patterns |
| penetration-testing | Planned | Ethical hacking, vulnerability assessment, exploit development |
| cloud-security | Planned | IAM, secrets management, network policies, compliance |
| cryptography | Planned | Encryption, hashing, TLS, JWT, key management, zero-trust |
| security-incident-response | Planned | Forensics, containment, root cause analysis, post-mortems |
Marketing
| Skill | Status | Description |
|---|---|---|
| content-marketing | Planned | Blog strategy, SEO content, content calendars, repurposing |
| absolute-seo | Built | Comprehensive SEO - technical, on-page, E-E-A-T, schema, CWV, local, link building, international, e-commerce, programmatic, GEO/AEO, audits |
| email-marketing | Planned | Campaigns, drip sequences, deliverability, A/B testing |
| social-media-strategy | Planned | Platform-specific tactics, scheduling, analytics, engagement |
| growth-hacking | Planned | Viral loops, referral programs, activation funnels, retention |
| copywriting | Planned | Headlines, landing pages, CTAs, persuasion frameworks (AIDA, PAS) |
| brand-strategy | Planned | Positioning, voice and tone, brand architecture, storytelling |
| developer-advocacy | Planned | Talks, demos, blog posts, SDK examples, community engagement |
| video-production | Planned | Script writing, editing workflows, thumbnails, YouTube SEO |
Sales
| Skill | Status | Description |
|---|---|---|
| sales-playbook | Planned | Outbound sequences, objection handling, discovery calls, MEDDIC |
| crm-management | Planned | Salesforce/HubSpot workflows, pipeline management, forecasting |
| sales-enablement | Planned | Battle cards, competitive intel, case studies, ROI calculators |
| proposal-writing | Planned | RFP responses, SOWs, pricing strategies, win themes |
| account-management | Planned | Expansion playbooks, QBRs, stakeholder mapping, renewal strategy |
| lead-scoring | Planned | ICP definition, scoring models, intent signals, qualification frameworks |
HR & People Operations
| Skill | Status | Description |
|---|---|---|
| recruiting-ops | Planned | Job descriptions, sourcing, screening, interview frameworks |
| interview-design | Planned | Structured interviews, rubrics, coding challenges, culture fit assessment |
| onboarding | Planned | 30/60/90 plans, buddy systems, knowledge transfer, ramp metrics |
| performance-management | Planned | OKRs, reviews, calibration, PIPs, career ladders |
| compensation-strategy | Planned | Market benchmarking, equity, leveling, total rewards |
| employee-engagement | Planned | Surveys, pulse checks, retention strategies, culture building |
Finance & Accounting
| Skill | Status | Description |
|---|---|---|
| financial-modeling | Planned | DCF, LBO, revenue forecasting, scenario analysis, cap tables |
| budgeting-planning | Planned | FP&A, variance analysis, rolling forecasts, cost allocation |
| startup-fundraising | Planned | Pitch decks, term sheets, due diligence, investor relations |
| tax-strategy | Planned | Corporate tax, R&D credits, transfer pricing, compliance |
| bookkeeping-automation | Planned | Chart of accounts, reconciliation, AP/AR, month-end close |
| financial-reporting | Planned | P&L, balance sheet, cash flow, board decks, KPI dashboards |
Legal & Compliance
| Skill | Status | Description |
|---|---|---|
| contract-drafting | Planned | NDAs, MSAs, SaaS agreements, licensing, redlining |
| privacy-compliance | Planned | GDPR, CCPA, data processing, consent management, DPIAs |
| ip-management | Planned | Patents, trademarks, trade secrets, open-source licensing |
| employment-law | Planned | Offer letters, termination, contractor vs employee, workplace policies |
| regulatory-compliance | Planned | SOC 2, HIPAA, PCI-DSS, audit preparation, controls |
Product Management
| Skill | Status | Description |
|---|---|---|
| product-strategy | Planned | Vision, roadmapping, prioritization frameworks (RICE, ICE, MoSCoW) |
| user-stories | Planned | Acceptance criteria, story mapping, backlog grooming, estimation |
| product-analytics | Planned | Funnels, cohort analysis, feature adoption, metrics (NSM, AARRR) |
| competitive-analysis | Planned | Market landscape, feature comparison, positioning, SWOT |
| product-launch | Planned | Go-to-market, beta programs, launch checklists, rollout strategy |
| product-discovery | Planned | Jobs-to-be-done, opportunity solution trees, assumption mapping |
Support & Customer Success
| Skill | Status | Description |
|---|---|---|
| customer-support-ops | Planned | Ticket triage, SLA management, macros, escalation workflows |
| knowledge-base | Planned | Help center architecture, article writing, search optimization |
| customer-success-playbook | Planned | Health scores, churn prediction, expansion signals, QBRs |
| community-management | Planned | Forum moderation, engagement programs, advocacy, feedback loops |
| support-analytics | Planned | CSAT, NPS, resolution time, deflection rate, trend analysis |
Game Development
| Skill | Status | Description |
|---|---|---|
| unity-development | Planned | C# scripting, ECS, physics, shaders, UI toolkit |
| game-design-patterns | Planned | State machines, object pooling, event systems, command pattern |
| pixel-art-sprites | Planned | Sprite creation, animation, tilesets, palette management |
| game-audio | Planned | Sound design, adaptive music, spatial audio, FMOD/Wwise |
| game-balancing | Planned | Economy design, difficulty curves, progression systems, playtesting |
Data Engineering
| Skill | Status | Description |
|---|---|---|
| data-pipelines | Planned | ETL/ELT, Airflow, dbt, Spark, streaming vs batch |
| data-warehousing | Planned | Star schema, slowly changing dimensions, Snowflake, BigQuery |
| data-quality | Planned | Validation, monitoring, lineage, great-expectations, contracts |
| analytics-engineering | Planned | dbt models, semantic layers, metrics definitions, self-serve analytics |
| real-time-streaming | Planned | Kafka, Flink, event sourcing, CDC, stream processing patterns |
Mobile Development
| Skill | Status | Description |
|---|---|---|
| react-native | Planned | Expo, navigation, native modules, performance, OTA updates |
| ios-swift | Planned | SwiftUI, UIKit, Core Data, App Store guidelines, performance |
| android-kotlin | Planned | Jetpack Compose, Room, coroutines, Play Store, architecture |
| mobile-testing | Planned | Detox, Appium, device farms, crash reporting, beta distribution |
Technical Writing & Documentation
| Skill | Status | Description |
|---|---|---|
| technical-writing | Planned | API docs, tutorials, architecture docs, ADRs, runbooks |
| developer-experience | Planned | SDK design, onboarding, changelog, migration guides |
| internal-docs | Planned | RFCs, design docs, post-mortems, runbooks, knowledge management |
Project Management
| Skill | Status | Description |
|---|---|---|
| agile-scrum | Planned | Sprint planning, retrospectives, velocity, Kanban, estimation |
| project-execution | Planned | Risk management, dependency tracking, stakeholder communication |
| remote-collaboration | Planned | Async workflows, documentation-driven, meeting facilitation |
Business Strategy
| Skill | Status | Description |
|---|---|---|
| api-monetization | Planned | Usage-based pricing, rate limiting, developer tiers, Stripe metering |
| saas-metrics | Planned | MRR, churn, LTV, CAC, cohort analysis, board reporting |
| pricing-strategy | Planned | Packaging, freemium, usage-based, enterprise tiers, price testing |
| partnership-strategy | Planned | Co-marketing, integrations, channel partnerships, affiliate programs |
Operations & Automation
| Skill | Status | Description |
|---|---|---|
| incident-management | Planned | On-call rotations, runbooks, post-mortems, status pages, war rooms |
| no-code-automation | Planned | Zapier, Make, n8n, workflow automation, internal tooling |
Blockchain & Web3
| Skill | Status | Description |
|---|---|---|
| web3-smart-contracts | Planned | Solidity, auditing, DeFi patterns, gas optimization, security |
Cross-Functional
| Skill | Status | Description |
|---|---|---|
| technical-seo-engineering | Planned | Core Web Vitals, structured data, rendering strategies |
| technical-interviewing | Planned | Designing coding challenges, system design interviews, rubric calibration |
| customer-research | Planned | Surveys, interviews, NPS deep-dives, behavioral analytics, persona building |
| presentation-design | Planned | Slide structure, storytelling frameworks, data visualization for decks |
| spreadsheet-modeling | Planned | Advanced Excel/Sheets, formulas, pivot tables, dashboards, macros |
Summary
| Category | Total | Built | Planned |
|---|---|---|---|
| Software Engineering | 16 | 5 | 11 |
| DevOps & Infrastructure | 9 | 0 | 9 |
| AI & Machine Learning | 8 | 1 | 7 |
| UI/UX & Design | 4 | 1 | 3 |
| Developer Tools | 9 | 2 | 7 |
| Testing & QA | 7 | 0 | 7 |
| Marketing | 9 | 1 | 8 |
| Sales | 6 | 0 | 6 |
| Security | 5 | 0 | 5 |
| HR & People Operations | 6 | 0 | 6 |
| Finance & Accounting | 6 | 0 | 6 |
| Legal & Compliance | 5 | 0 | 5 |
| Product Management | 6 | 0 | 6 |
| Support & Customer Success | 5 | 0 | 5 |
| Game Development | 5 | 0 | 5 |
| Data Engineering | 5 | 0 | 5 |
| Mobile Development | 4 | 0 | 4 |
| Technical Writing & Docs | 3 | 0 | 3 |
| Project Management | 3 | 0 | 3 |
| Business Strategy | 4 | 0 | 4 |
| Operations & Automation | 2 | 0 | 2 |
| Blockchain & Web3 | 1 | 0 | 1 |
| Cross-Functional | 5 | 0 | 5 |
| Total | 133 | 10 | 123 |
skills-vs-agents.md
Skills vs Agent Definitions: When to Create Which
The core distinction
Skills and agent definitions serve fundamentally different purposes in the AI agent ecosystem. Getting this wrong means building the wrong artifact.
| Aspect | Skill | Agent Definition |
|---|---|---|
| What it is | Knowledge package (textbook/manual) | Execution context (specialist) |
| Standard | AgentSkills.io open standard | Claude Code specific |
| Portability | Works in Claude Code, Cursor, VS Code, Gemini CLI, 40+ tools | Tied to Claude Code |
| Context | Runs inline in main conversation | Always runs in isolated context window |
| Purpose | Extend capabilities with knowledge/instructions | Delegate tasks to a specialized assistant |
| File | `SKILL.md` in a folder | Agent definition file (YAML + markdown) |
| Invocation | `/skill-name` or auto-detect from description | Claude delegates, @-mention, or `--agent` flag |
One-liner: Skills tell the agent what to know. Agents define who does the work.
AgentSkills.io open standard
The AgentSkills.io specification defines a portable skill format that works across all compatible agent products. These are the portable fields - any agent that supports the standard will understand them:
| Field | Required | Purpose |
|---|---|---|
| `name` | Yes | Identifier (kebab-case, max 64 chars, must match directory name) |
| `description` | Yes | When to use this skill (max 1024 chars) - the trigger condition |
| `license` | No | License name or reference to bundled file |
| `compatibility` | No | Environment requirements (max 500 chars) |
| `metadata` | No | Arbitrary key-value pairs for additional info |
| `allowed-tools` | No | Space-delimited list of pre-approved tools (experimental) |
Skills following this spec work everywhere. Default to these fields unless the skill genuinely needs platform-specific behavior.
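The spec's field constraints (kebab-case `name` up to 64 chars, required `description` up to 1024 chars) are easy to check mechanically. A hedged sketch, not an official validator; the function name is invented:

```python
import re

def check_portable_fields(fm: dict) -> list[str]:
    """Return constraint violations for the required AgentSkills.io fields."""
    errors = []
    name = fm.get("name", "")
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name):
        errors.append("name must be kebab-case")
    if len(name) > 64:
        errors.append("name exceeds 64 chars")
    description = fm.get("description", "")
    if not description:
        errors.append("description is required")
    elif len(description) > 1024:
        errors.append("description exceeds 1024 chars")
    return errors

print(check_portable_fields({"name": "skill-forge", "description": "Use when..."}))  # []
print(check_portable_fields({"name": "Skill Forge"}))
# ['name must be kebab-case', 'description is required']
```

A complete check would also verify that `name` matches the skill's directory name, which requires filesystem context not shown here.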
Claude Code extensions
Claude Code extends the AgentSkills spec with additional frontmatter fields. These are ignored by other agent products - use them only when needed:
| Field | Purpose |
|---|---|
| `argument-hint` | Hint during `/` autocomplete for expected arguments |
| `disable-model-invocation` | Prevent Claude from auto-loading (manual `/invoke` only) |
| `user-invocable` | Set `false` to hide from slash menu (background knowledge only) |
| `model` | Override model when skill is active |
| `effort` | Effort level override (low, medium, high, max) |
| `context` | Set to `fork` to run in a subagent context instead of inline |
| `agent` | Which subagent type to use when `context: fork` |
| `hooks` | Lifecycle hooks scoped to the skill's lifecycle |
| `paths` | Glob patterns for auto-activation on matching file paths |
| `shell` | Shell for inline commands (`bash` or `powershell`) |
When to use Claude-specific fields:
- `hooks` - for safety guardrails (e.g., blocking dangerous writes)
- `context: fork` - for heavy workloads that should run in isolation
- `disable-model-invocation` - for skills like `/deploy` that should only be manual
- `paths` - for auto-activating on specific file types
Note: AbsolutelySkilled registry also adds `category`, `tags`, `recommended_skills`, `platforms`, `sources`, and `maintainers` - these are registry metadata, not part of either the AgentSkills spec or Claude extensions.
Agent definition format
Agent definitions create execution contexts with their own tools, permissions, and system prompts. They use YAML frontmatter + markdown body, but the fields are different from skills:
```yaml
---
name: code-reviewer
description: Reviews code for quality and best practices
tools: Read, Grep, Glob, Bash   # What tools the agent can use
disallowedTools: Write, Edit    # What tools are denied
model: sonnet                   # Model override
permissionMode: default         # Permission checking behavior
maxTurns: 15                    # Maximum agentic turns
skills:                         # Skills to preload
  - code-review-standards
  - error-handling-patterns
memory: project                 # Persistent memory scope
background: false               # Whether to run in background
isolation: worktree             # Git worktree isolation
---
You are a code reviewer. Analyze code and provide specific, actionable
feedback on quality, security, and best practices.
```

Agent definitions are stored in `.claude/agents/` (project) or `~/.claude/agents/` (user-level). They are not portable across agent products.
Decision flowchart
Use this to determine whether the user needs a skill or an agent definition:
```
Is this primarily knowledge, instructions, or best practices?
  YES -> Create a SKILL
  NO  -> Continue...

Does it need to run in an isolated context window?
  YES -> Create an AGENT DEFINITION
  NO  -> Continue...

Does it need specific tool permissions (allow/disallow certain tools)?
  YES -> Create an AGENT DEFINITION
  NO  -> Continue...

Does it need to be portable across Claude Code, Cursor, VS Code, etc.?
  YES -> Create a SKILL
  NO  -> Continue...

Does it need its own model, maxTurns, or permission mode?
  YES -> Create an AGENT DEFINITION
  NO  -> Create a SKILL (the default)
```

When in doubt, create a skill. Skills are the more portable, composable choice. An agent definition can always preload skills later.
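The flow above can be sketched as a predicate - the field names here are illustrative, not part of any spec:

```typescript
// Each question in the flowchart becomes a boolean field, checked in order.
type Need = {
  isKnowledge: boolean;      // primarily knowledge / instructions / best practices?
  isolatedContext: boolean;  // needs its own context window?
  toolPermissions: boolean;  // needs allow/disallow tool lists?
  crossTool: boolean;        // must be portable across agent products?
  runtimeOverrides: boolean; // needs its own model, maxTurns, or permission mode?
};

function skillOrAgent(n: Need): "skill" | "agent-definition" {
  if (n.isKnowledge) return "skill";
  if (n.isolatedContext) return "agent-definition";
  if (n.toolPermissions) return "agent-definition";
  if (n.crossTool) return "skill";
  if (n.runtimeOverrides) return "agent-definition";
  return "skill"; // when in doubt, create a skill
}

// A knowledge-only need resolves to a skill...
console.log(skillOrAgent({ isKnowledge: true, isolatedContext: false, toolPermissions: false, crossTool: false, runtimeOverrides: false }));
// ...while a tool-permission need resolves to an agent definition.
console.log(skillOrAgent({ isKnowledge: false, isolatedContext: false, toolPermissions: true, crossTool: false, runtimeOverrides: false }));
```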
Composition patterns
The most powerful pattern is combining both: a skill provides knowledge, an agent definition creates a controlled execution environment that uses it.
Example: Code review
- Create a `code-review` skill with review standards, checklists, and gotchas
- Create a `code-reviewer` agent that preloads the skill with restricted tools:
```yaml
# .claude/agents/code-reviewer.md
---
name: code-reviewer
description: Reviews code for quality using team standards
tools: Read, Grep, Glob, Bash
skills:
  - code-review   # Preloads the skill's knowledge
maxTurns: 15
---
Review the code changes and provide feedback following the preloaded standards.
```

The skill is portable (works in any agent tool). The agent definition creates a focused, permission-controlled reviewer specific to Claude Code.
How skills are actually installed
The skill ecosystem uses a canonical directory + symlink federation model:
Global installation (user-level)
```
~/.agents/skills/          <- CANONICAL: actual files live here
  clean-code/
    SKILL.md
    evals.json
    references/

~/.claude/skills/          <- SYMLINKS for Claude Code
  clean-code -> ../../.agents/skills/clean-code

~/.cursor/skills/          <- SYMLINKS for Cursor
  clean-code -> ../../.agents/skills/clean-code
```

- `~/.agents/skills/` is the single source of truth for global skills
- Agent-specific directories (`~/.claude/skills/`, `~/.cursor/skills/`, etc.) contain symlinks pointing back to the canonical directory
- One copy of the skill serves all 40+ compatible agents
- Non-universal agents (Cursor rules, Windsurf) get adapted copies instead of symlinks (content converted to agent-specific formats like MDC)
- Installation metadata tracked in `~/.agents/.skill-lock.json`
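The layout above can be reproduced by hand with `mkdir` and `ln -s` - a minimal sketch in a throwaway temp directory (the real installer does this for you, plus adaptation and lock-file bookkeeping):

```shell
set -e
root=$(mktemp -d)

# Canonical copy: the actual skill files live here
mkdir -p "$root/.agents/skills/clean-code"
printf 'name: clean-code\n' > "$root/.agents/skills/clean-code/SKILL.md"

# Claude Code's directory gets a relative symlink back to the canonical copy
mkdir -p "$root/.claude/skills"
ln -s ../../.agents/skills/clean-code "$root/.claude/skills/clean-code"

# Reading through the symlink resolves to the one canonical file
cat "$root/.claude/skills/clean-code/SKILL.md"

rm -rf "$root"
```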
Project-level installation
| Path | Scope | Discovered by |
|---|---|---|
| `.agents/skills/<name>/` | This project, cross-client | All compatible agents |
| `.claude/skills/<name>/` | This project, Claude-only | Claude Code only |
Project-level skills are committed to the repo so the whole team gets them.
Use .agents/skills/ for cross-client compatibility.
The skl CLI
The npx skills add command manages the full installation lifecycle:
```shell
# Install from AbsolutelySkilled registry
npx skills add AbsolutelySkilled/AbsolutelySkilled --skill clean-code

# Install from any GitHub repo
npx skills add owner/repo --skill skill-name

# Install from local path
npx skills add ./path/to/skill
```

It handles fetching, writing to the canonical directory, creating symlinks for all detected agents, adapting content for non-universal agents, and updating the lock file.
sources-schema.md
Sources Schema
YAML structure
```yaml
# Auto-generated by skill-forge. Review and update before submitting PR.
# All URLs must be from official documentation only.
# No Stack Overflow, blog posts, or community wikis.
skill: <name>
crawled: <YYYY-MM-DD>
sources:
  - url: <url>
    type: readme|api-reference|guide|changelog|llms-txt
    description: <what this source was used for>
    accessed: <YYYY-MM-DD>
    sections_used:
      - <section or heading within the page, if specific>
```

Type values
| Type | When to use |
|---|---|
| `readme` | Repository README.md |
| `api-reference` | Endpoint docs, parameter lists, error codes |
| `guide` | Tutorials, how-to pages, getting started |
| `changelog` | Release notes, migration guides |
| `llms-txt` | AI-readable doc map (llms.txt or llms-full.txt) |
Rules
- Every URL must point to official documentation (vendor site or official repo)
- No Stack Overflow, blog posts, Medium articles, or community wikis
- One entry per page crawled (not per domain)
- `sections_used` is optional but helpful when only part of a page was relevant
- `accessed` date should be the actual crawl date, not the page's publish date
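A hypothetical filled-in file for a `resend` skill - the dates and the `sections_used` heading are invented; the URL is the official docs page used in the worked example below:

```yaml
skill: resend
crawled: 2025-01-15
sources:
  - url: https://resend.com/docs/api-reference/emails/send
    type: api-reference
    description: send() signature, required params, error codes
    accessed: 2025-01-15
    sections_used:
      - Request parameters
```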
worked-example.md
Worked Example: Resend
Input: https://github.com/resendlabs/resend-node
Research plan (Phase 1)
- Fetch `README.md` - install, init, `send()` signature
- Fetch https://resend.com/docs/introduction - overview
- Fetch https://resend.com/docs/api-reference/introduction - API reference
- Check https://resend.com/llms.txt - does it exist?
- Fetch https://resend.com/docs/api-reference/emails/send - send endpoint
- Fetch changelog - any recent breaking changes?
Output folder
```
resend/
  SKILL.md
  sources.yaml
  evals.json
  references/
    api.md
```

Category: communication
Key things to capture in SKILL.md
- `resend.emails.send()` signature and required params (`from`, `to`, `subject`, `html`)
- API key in `Authorization: Bearer` header
- Batch sending via `resend.batch.send()`
- Webhook events: `email.sent`, `email.delivered`, `email.bounced`
- Rate limits (if documented)
- Idempotency key support via `Idempotency-Key` header
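A sketch of the v2 call shape using a local stub in place of the official `resend` SDK - the stub only mirrors the `resend.emails.send()` signature and required params noted above; the real client takes an API key and makes HTTP requests:

```typescript
// Hypothetical stub standing in for the official SDK client.
type SendParams = { from: string; to: string | string[]; subject: string; html: string };

class EmailsStub {
  send(params: SendParams): { id: string } {
    // The real SDK POSTs to the API with an `Authorization: Bearer <key>` header.
    if (!params.from || !params.to || !params.subject || !params.html) {
      throw new Error("from, to, subject, and html are required");
    }
    return { id: "email_stub_123" }; // stubbed response id
  }
}

const resend = { emails: new EmailsStub() };

// v2 call shape: resend.emails.send(), not the legacy resend.sendEmail()
const result = resend.emails.send({
  from: "onboarding@example.com",
  to: "user@example.com",
  subject: "Welcome!",
  html: "<p>Thanks for signing up.</p>",
});
console.log(result.id);
```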
Common tasks section would include
- Send a single email
- Send a batch of emails
- Send email with attachments
- Retrieve email status
- Handle webhook events
- Manage API keys
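For the webhook task, a minimal dispatcher over the three event types listed earlier - the payload shape (`data.email_id`) is an assumption for illustration, not taken from Resend's docs:

```typescript
// Hypothetical webhook payload; only the event type names come from the skill notes above.
type WebhookEvent = {
  type: "email.sent" | "email.delivered" | "email.bounced";
  data: { email_id: string };
};

function handleWebhook(ev: WebhookEvent): string {
  switch (ev.type) {
    case "email.sent":
      return `sent: ${ev.data.email_id}`;
    case "email.delivered":
      return `delivered: ${ev.data.email_id}`;
    case "email.bounced":
      // e.g., suppress the address so future sends skip it
      return `bounced, suppress address: ${ev.data.email_id}`;
  }
}

console.log(handleWebhook({ type: "email.bounced", data: { email_id: "em_1" } }));
```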
Gotcha to flag
Resend changed their Node SDK API in v2. If docs show both v1 and v2 patterns, note the version difference and flag for human review:
```markdown
<!-- VERIFY: Could not confirm if v1 `resend.sendEmail()` is still
supported. v2 uses `resend.emails.send()`. Source:
https://resend.com/docs/api-reference/emails/send -->
```

Example eval for this skill
```json
{
  "id": "eval-001",
  "description": "Agent can send an email with Resend",
  "prompt": "Send a welcome email to user@example.com using Resend",
  "type": "code",
  "assertions": [
    { "type": "contains", "value": "resend.emails.send" },
    { "type": "contains", "value": "to:" },
    { "type": "not_contains", "value": "resend.sendEmail" },
    { "type": "code_valid", "language": "js" }
  ],
  "source": "https://resend.com/docs/api-reference/emails/send"
}
```

Frequently Asked Questions
What is skill-forge?
Generate a production-ready AbsolutelySkilled skill from any source: GitHub repos, documentation URLs, or domain topics (marketing, sales, TypeScript, etc.). Triggers on /skill-forge, "create a skill for X", "generate a skill from these docs", "make a skill for this repo", "build a skill about marketing", or "add X to the registry". For URLs: performs deep doc research (README, llms.txt, API references). For domains: runs a brainstorming discovery session with the user to define scope and content. Outputs a complete skill/ folder with SKILL.md, evals.json, and optionally sources.yaml, ready to PR into the AbsolutelySkilled registry.
How do I install skill-forge?
Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill skill-forge in your terminal. The skill will be immediately available in your AI coding agent.
What AI agents support skill-forge?
skill-forge works with claude-code, gemini-cli, openai-codex. Install it once and use it across any supported AI coding agent.
Is skill-forge free?
Yes, skill-forge is completely free and open source under the MIT license. Install it with a single command and start using it immediately.
What is the difference between skill-forge and similar tools?
skill-forge is an AI agent skill that teaches your coding agent specialized developer tools knowledge. Unlike standalone tools, it integrates directly into claude-code, gemini-cli, openai-codex and other AI agents.
Can I use skill-forge with Cursor or Windsurf?
skill-forge works with any AI coding agent that supports the skills protocol, including Claude Code, Cursor, Windsurf, GitHub Copilot, Gemini CLI, and 40+ more.