system-design
Use this skill when designing distributed systems, architecting scalable services, preparing for system design interviews, or making infrastructure decisions. Triggers on load balancing, CAP theorem, sharding, replication, caching strategies, message queues, microservices architecture, database selection, rate limiting, and any task requiring high-level system architecture decisions.
system-design
system-design is a production-ready AI agent skill for claude-code, gemini-cli, openai-codex. It covers designing distributed systems, architecting scalable services, preparing for system design interviews, and making infrastructure decisions.
Quick Facts
| Field | Value |
|---|---|
| Category | engineering |
| Version | 0.1.0 |
| Platforms | claude-code, gemini-cli, openai-codex |
| License | MIT |
How to Install
- Make sure you have Node.js installed on your machine.
- Run the following command in your terminal:
npx skills add AbsolutelySkilled/AbsolutelySkilled --skill system-design
- The system-design skill is now available in your AI coding agent (Claude Code, Gemini CLI, OpenAI Codex, etc.).
Overview
A practical framework for designing distributed systems and architecting scalable services. This skill covers the core building blocks - load balancers, databases, caches, queues, and CDNs - plus the trade-off reasoning required to use them well. It is built around interview scenarios because they compress the full design process into a repeatable structure you can also apply in real-world architecture decisions. Agents can use this skill to work through any system design problem from capacity estimation through detailed component design.
Tags
architecture distributed-systems scalability infrastructure design
Platforms
- claude-code
- gemini-cli
- openai-codex
Frequently Asked Questions
What is system-design?
Use this skill when designing distributed systems, architecting scalable services, preparing for system design interviews, or making infrastructure decisions. Triggers on load balancing, CAP theorem, sharding, replication, caching strategies, message queues, microservices architecture, database selection, rate limiting, and any task requiring high-level system architecture decisions.
How do I install system-design?
Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill system-design in your terminal. The skill will be immediately available in your AI coding agent.
What AI agents support system-design?
This skill works with claude-code, gemini-cli, openai-codex. Install it once and use it across any supported AI coding agent.
Maintainers
Generated from AbsolutelySkilled
SKILL.md
System Design
A practical framework for designing distributed systems and architecting scalable services. This skill covers the core building blocks - load balancers, databases, caches, queues, and CDNs - plus the trade-off reasoning required to use them well. It is built around interview scenarios because they compress the full design process into a repeatable structure you can also apply in real-world architecture decisions. Agents can use this skill to work through any system design problem from capacity estimation through detailed component design.
When to use this skill
Trigger this skill when the user:
- Asks "how would you design X?" where X is a product or service
- Needs to choose between SQL and NoSQL databases
- Is evaluating load balancing, sharding, or replication strategies
- Asks about the CAP theorem or consistency vs availability trade-offs
- Is designing a caching strategy (what to cache, where, how to invalidate)
- Needs to estimate traffic, storage, or bandwidth for a system
- Is preparing for a system design interview
- Asks about rate limiting, API gateways, or CDN placement
Do NOT trigger this skill for:
- Line-level code review or specific algorithm implementations (use a coding skill)
- DevOps/infrastructure provisioning details like Terraform or Kubernetes manifests
Key principles
Start simple and justify complexity - Design the simplest system that satisfies the requirements. Introduce each new component (queue, cache, shard) only when you can name the specific constraint it solves. Complexity is a cost, not a feature.
Network partitions will happen - choose C or A - CAP theorem says distributed systems must sacrifice either consistency or availability during a partition. You cannot avoid partitions (P is not a choice). Pick CP for financial and inventory data; pick AP for feeds, caches, and preferences.
Scale horizontally, partition vertically - Stateless services scale out behind a load balancer. Data scales by separating hot from cold paths: read replicas before sharding, sharding before multi-region. Vertical scaling buys time; horizontal scaling buys headroom.
Design for failure at every layer - Every service will go down. Every disk will fill. Design fallback behavior before the happy path. Timeouts, retries with backoff, circuit breakers, and bulkheads are not optional refinements - they are table stakes.
Single responsibility for components - A component that does two things will be bad at both. Load balancers balance load. Caches serve reads. Queues decouple producers from consumers. Mixing responsibilities creates invisible coupling that makes the system fragile under load.
Core concepts
System design assembles six core building blocks. Each solves a specific problem.
Load balancers distribute requests across backend instances. L4 balancers route by TCP/IP; L7 balancers route by HTTP path, headers, and cookies. Use L7 for HTTP services. Algorithms: round-robin (default), least-connections (when request latency varies), consistent hashing (when you need sticky routing, e.g., cache affinity).
Caches reduce read latency and database load. Sit in front of the database. Patterns: cache-aside (default), write-through (strong consistency), write-behind (high write throughput, tolerate loss). Key concerns: TTL, invalidation strategy, and stampede prevention. Redis is the default; Memcached only when pure key-value at massive scale.
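The cache-aside pattern named above can be sketched in a few lines. This is a minimal illustration, not a production client: a plain dict stands in for Redis, and the `CacheAside` class and its names are invented for this example.

```python
import time

class CacheAside:
    """Cache-aside: read through the cache, fall back to the DB on a miss."""
    def __init__(self, db, ttl_seconds=86_400):
        self.db = db          # stand-in for the source of truth
        self.cache = {}       # stand-in for Redis: key -> (value, expires_at)
        self.ttl = ttl_seconds

    def get(self, key):
        entry = self.cache.get(key)
        if entry and entry[1] > time.time():
            return entry[0]   # cache hit
        value = self.db[key]  # cache miss: read the source of truth
        self.cache[key] = (value, time.time() + self.ttl)
        return value

    def write(self, key, value):
        self.db[key] = value
        self.cache.pop(key, None)  # invalidate; the next read repopulates

db = {"abc123": "https://example.com/long"}
store = CacheAside(db)
store.get("abc123")                              # miss -> reads DB, populates cache
store.write("abc123", "https://example.com/new") # write-through to DB, delete from cache
print(store.get("abc123"))                       # repopulated after invalidation
```

Note the write path deletes rather than updates the cached value, matching the "DELETE then let the next read repopulate" rule later in this skill.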
Databases are the source of truth. SQL for structured data with ACID transactions; NoSQL for scale, flexible schemas, or specific access patterns. Read replicas for read-heavy workloads. Sharding for write-heavy workloads that exceed one node.
Message queues decouple producers from consumers and absorb traffic spikes. Use for async work, fan-out events, and unreliable downstream dependencies. Always configure a dead-letter queue. SQS for AWS-native work; Kafka for high-throughput event streaming or replay.
CDNs cache static assets and edge-terminate TLS close to users. Reduces origin load and cuts latency for geographically distributed users. Use for images, JS/CSS, and any content with high read-to-write ratio.
API gateways enforce cross-cutting concerns - auth, rate limiting, request logging, TLS termination - at a single entry point. Never build a custom gateway; use Kong, Envoy, or a managed provider.
Common tasks
Design a URL shortener
Clarifying questions: Read-heavy or write-heavy? Need analytics? Custom slugs? Global or single-region?
Components:
- API service (stateless, horizontally scaled) behind L7 load balancer
- Key generation service - pre-generate Base62 short codes in batches and store in a pool; avoids hot write path
- Database - a relational DB works at moderate scale; switch to Cassandra for multi-region or >100k writes/sec
- Cache (Redis) - store short_code -> long_url mappings; TTL 24 hours; cache-aside
Redirect flow: Client hits CDN -> cache hit returns 301/302 -> cache miss reads DB -> populates cache -> returns redirect.
Scale signal: 100M URLs stored, 10B reads/day -> cache hit rate must be >99% to protect the DB.
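For the short codes themselves, a common sketch is Base62-encoding an integer drawn from the pre-generated pool. The function below is illustrative, not part of any library:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

# 62^6 ~ 56.8 billion codes fit in 6 characters -- ample headroom for 100M URLs
print(base62_encode(125))  # "21" (2*62 + 1)
```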
Design a rate limiter
Algorithm choices:
- Token bucket (default) - allows bursts up to bucket capacity; fills at a constant rate. Best for user-facing APIs.
- Fixed window - simple counter per time window. Prone to burst at window edge.
- Sliding window log - exact, but memory-intensive.
- Sliding window counter - approximation using two fixed windows. Good balance.
Storage: Redis with atomic INCR and EXPIRE. Single Redis node is enough up to ~50k RPS per rule; use Redis Cluster for more.
Placement: In the API gateway (preferred) or as middleware. Always return X-RateLimit-Remaining and Retry-After headers with 429 responses.
Distributed concern: With multiple gateway nodes, the counter must be centralized (Redis) - local counters undercount.
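A minimal token-bucket sketch, single-process for clarity; in the distributed setup above the same refill-and-decrement logic would run atomically in Redis (e.g. as a Lua script):

```python
import time

class TokenBucket:
    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity        # max burst size
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=3, refill_rate=1.0)  # burst of 3, 1 req/sec sustained
results = [bucket.allow() for _ in range(5)]
print(results)  # first 3 allowed, then rejected until the bucket refills
```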
Design a notification system
Components:
- Notification API - accepts events from internal services
- Router service - reads user preferences and determines channels (push, email, SMS)
- Channel-specific workers (separate services) - dequeued from per-channel queues
- Template service - renders notification copy
- Delivery tracking - records sent/delivered/failed per notification
Queue design: One queue per channel (push-queue, email-queue, sms-queue). Isolates failure - SMS provider outage does not back up email delivery.
Critical path vs non-critical path:
- OTP and security alerts: synchronous, priority queue
- Marketing and social notifications: async, best-effort, can be batched
Design a chat system
Protocol: WebSockets for real-time bidirectional messaging. Long-polling as fallback for restrictive networks.
Storage split:
- Message history: Cassandra, keyed by (channel_id, timestamp). Append-only, high write throughput, easy time-range queries.
- User presence and metadata: Redis (in-memory, fast reads).
- User and channel info: PostgreSQL (relational, ACID).
Fanout: When a user sends a message, the server writes to the DB and then publishes to a pub/sub channel (Redis Pub/Sub or Kafka). Each recipient's connection server subscribes to relevant channels and pushes to the WebSocket.
Scale concern: Connection servers are stateful (WebSockets). Route users to the same connection server with consistent hashing. Use a service mesh for connection server discovery.
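The consistent-hashing routing mentioned above can be sketched as a hash ring with virtual nodes. This is a minimal version under assumed names; production rings also handle node membership changes and replication:

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.ring = []  # sorted list of (hash, node); vnodes smooth distribution
        for node in nodes:
            for i in range(vnodes):
                self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        """Walk clockwise to the first vnode at or after the key's hash."""
        idx = bisect.bisect(self.keys, self._hash(key)) % len(self.keys)
        return self.ring[idx][1]

ring = HashRing(["conn-1", "conn-2", "conn-3"])
# the same user always lands on the same connection server
assert ring.node_for("user:42") == ring.node_for("user:42")
```

When a connection server is added or removed, only the keys adjacent to its vnodes move, which is exactly why consistent hashing suits stateful WebSocket routing.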
Choose between SQL vs NoSQL
Use this decision table:
| Need | Choose |
|---|---|
| ACID transactions across multiple entities | SQL |
| Complex joins and ad-hoc queries | SQL |
| Strict schema with referential integrity | SQL |
| Horizontal write scaling beyond single node | NoSQL (Cassandra, DynamoDB) |
| Flexible or evolving schema | NoSQL (MongoDB, DynamoDB) |
| Graph traversals | Graph DB (Neo4j) |
| Time-series data at high ingestion rate | TimescaleDB or InfluxDB |
| Key-value at very high throughput | Redis or DynamoDB |
Default: Start with PostgreSQL. It handles far more scale than most teams expect and its JSONB column covers flexible-schema needs up to moderate scale. Migrate to specialized stores when you have a measured bottleneck.
Estimate system capacity
Use the following rough constants in back-of-envelope estimates:
| Metric | Value |
|---|---|
| Seconds per day | ~86,400 (~100k rounded) |
| Bytes per ASCII character | 1 |
| Average tweet/post size | ~300 bytes |
| Average image (compressed) | ~300 KB |
| Average video (1 min, 720p) | ~50 MB |
| QPS from 1M DAU, 10 actions/day | ~115 QPS |
Process:
- Clarify scale (DAU, requests per user per day)
- Derive QPS: (DAU * requests_per_day) / 86400
- Derive peak QPS: average QPS * 2-3x
- Derive storage: writes_per_day * record_size * retention_days
- Derive bandwidth: peak QPS * average_response_size
State assumptions explicitly. Interviewers care about your reasoning, not the exact number.
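The process above is simple arithmetic; a helper like this (names are illustrative, and it treats every action as a stored record for simplicity) keeps the assumptions explicit:

```python
def estimate(dau, actions_per_user_per_day, record_bytes, retention_days,
             peak_multiplier=3, response_bytes=500):
    """Back-of-envelope capacity estimate; every input is a stated assumption."""
    requests_per_day = dau * actions_per_user_per_day
    avg_qps = requests_per_day / 86_400
    peak_qps = avg_qps * peak_multiplier
    storage_bytes = requests_per_day * record_bytes * retention_days
    bandwidth_bps = peak_qps * response_bytes
    return {
        "avg_qps": round(avg_qps),
        "peak_qps": round(peak_qps),
        "storage_tb": storage_bytes / 1e12,
        "peak_bandwidth_mb_s": bandwidth_bps / 1e6,
    }

# 1M DAU at 10 actions/day -> ~116 average QPS (the table above rounds to ~115)
print(estimate(dau=1_000_000, actions_per_user_per_day=10,
               record_bytes=300, retention_days=365))
```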
Design caching strategy
Step 1 - Identify what to cache:
- Expensive reads that change infrequently (user profiles, product catalog)
- Computed aggregations (dashboard stats, leaderboards)
- Session tokens and auth lookups
Do NOT cache: frequently mutated data, financial balances, anything requiring strong consistency.
Step 2 - Choose pattern:
- Default: cache-aside with TTL
- Strong read-after-write: write-through
- High write throughput, loss acceptable: write-behind
Step 3 - Define invalidation:
- TTL expiry for most cases
- Explicit DELETE on write for cache-aside
- Never try to update a cached value in-place; DELETE then let the next read repopulate
Step 4 - Prevent stampede:
- Use a distributed lock (Redis SETNX) for high-traffic keys
- Add jitter to TTLs (base TTL +/- 10-20%) to spread expiry
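The stampede-prevention step above combines two small mechanisms, sketched here with a threading lock standing in for Redis SETNX (function names are illustrative):

```python
import random
import threading

def jittered_ttl(base_ttl: float, jitter_fraction: float = 0.15) -> float:
    """Spread expiries: base TTL +/- jitter so hot keys don't expire together."""
    return base_ttl * (1 + random.uniform(-jitter_fraction, jitter_fraction))

_locks = {}                      # key -> lock; stand-in for per-key Redis SETNX
_locks_guard = threading.Lock()  # protects the lock registry itself

def recompute_once(key, compute):
    """Only the caller holding the key's lock recomputes; others wait, then hit cache."""
    with _locks_guard:
        lock = _locks.setdefault(key, threading.Lock())
    with lock:
        return compute()

ttl = jittered_ttl(86_400)       # 24h +/- 15%
assert 86_400 * 0.85 <= ttl <= 86_400 * 1.15
```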
Anti-patterns / common mistakes
| Mistake | Why it's wrong | What to do instead |
|---|---|---|
| Designing without clarifying requirements | You optimize for the wrong bottleneck and miss key constraints | Always spend 5 minutes on scope: scale, consistency needs, latency SLAs |
| Sharding before replication | Sharding is complex and expensive; replication + caching handles most read bottlenecks | Add read replicas and caching first; only shard when writes are the bottleneck |
| Shared database between services | Creates hidden coupling; one service's slow query can kill another | One database per service; expose data through APIs or events |
| Cache without invalidation plan | Stale reads cause data inconsistency; cache-DB drift grows silently | Define TTL and invalidation triggers before adding any cache |
| Ignoring the tail: all QPS estimates as average | p99 latency matters more than p50; a 2x peak multiplier is the minimum | Always model peak QPS (2-3x average) and design capacity for it |
| Single point of failure at every layer | Load balancer with no standby, single queue broker, one region | Identify SPOFs explicitly; add redundancy for any component whose failure kills the system |
Gotchas
CAP theorem is about partitions, not a free choice - You cannot "choose" to sacrifice partition tolerance. P is always present in distributed systems. The real choice is between C and A when a partition occurs. Framing it as a three-way trade-off is wrong.
Caching invalidation is the hard part, not caching itself - Most designs add Redis without defining when data becomes stale. The moment a cache-aside entry is written, define the exact condition that invalidates it. "We'll figure that out later" causes stale reads in production.
Read replicas have replication lag - Writes go to the primary; reads from replicas may be 10-100ms stale. If you route reads to replicas immediately after writes (e.g., "create, then fetch profile"), users will see the old version. Use read-after-write consistency or route critical reads to primary.
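One common way to get read-after-write consistency is to pin a user's reads to the primary for a short window after any write, longer than the worst expected replication lag. A minimal sketch under assumed names:

```python
import time

class ReplicaRouter:
    """Route reads to a replica unless the user wrote recently (pin-to-primary)."""
    def __init__(self, pin_seconds=1.0):
        self.pin_seconds = pin_seconds  # should exceed worst-case replication lag
        self.last_write = {}            # user_id -> timestamp of last write

    def record_write(self, user_id):
        self.last_write[user_id] = time.monotonic()

    def target_for_read(self, user_id):
        wrote_at = self.last_write.get(user_id)
        if wrote_at and time.monotonic() - wrote_at < self.pin_seconds:
            return "primary"            # their own write may not have replicated yet
        return "replica"

router = ReplicaRouter(pin_seconds=1.0)
router.record_write("u1")
print(router.target_for_read("u1"))  # primary, right after the write
print(router.target_for_read("u2"))  # replica -- no recent write
```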
Consistent hashing does not eliminate hotspots - If one key receives dramatically more requests than others (celebrity user, viral post), consistent hashing still routes all requests for that key to the same shard. Solve with key-based sharding variants like adding a suffix, or cache at a higher layer.
Message queues do not guarantee exactly-once delivery - SQS standard queues deliver at-least-once; consumers must be idempotent. Kafka can deliver exactly-once within a single cluster but not across network boundaries. Design consumers to handle duplicate messages before relying on queue semantics.
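An idempotent consumer, as the gotcha above requires, can be sketched with a dedup store; here a set stands in for something persistent like a Redis SET with TTL:

```python
processed = set()   # stand-in for a persistent dedup store (e.g. Redis with TTL)
side_effects = []   # stands in for the real downstream work

def handle(message):
    """Process an at-least-once message safely: duplicates become no-ops."""
    msg_id = message["id"]
    if msg_id in processed:
        return "duplicate-skipped"
    side_effects.append(message["payload"])  # do the real work first
    processed.add(msg_id)                    # mark done only after the work succeeds
    return "processed"

print(handle({"id": "m1", "payload": "charge $5"}))  # processed
print(handle({"id": "m1", "payload": "charge $5"}))  # duplicate-skipped
assert side_effects == ["charge $5"]                 # the work ran exactly once
```

Note the ordering: marking the ID done before the work completes would turn at-least-once into at-most-once on a crash.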
References
For detailed frameworks and opinionated defaults, read the relevant file from the references/ folder:
- references/interview-framework.md - step-by-step interview process (RESHADED), time allocation, common follow-up questions, and how to communicate trade-offs
Only load the references file when the task requires it - it is long and will consume context.
References
interview-framework.md
System Design Interview Framework
A repeatable process for structuring system design interviews at FAANG and equivalent companies. The framework is called RESHADED and gives you a step-by-step structure so you never lose your place under pressure.
The RESHADED Framework
| Step | Letter | What you do |
|---|---|---|
| 1 | R - Requirements | Clarify functional and non-functional requirements |
| 2 | E - Estimation | Back-of-envelope capacity estimates |
| 3 | S - Storage | Choose data model and database type |
| 4 | H - High-level design | Draw the major components and data flow |
| 5 | A - APIs | Define the public-facing or internal API contract |
| 6 | D - Detail | Deep dive into 1-2 components the interviewer cares about |
| 7 | E - Evaluation | Identify bottlenecks, SPOFs, and trade-offs |
| 8 | D - Distinctive features | Scale, fault tolerance, or advanced features |
Step 1: Requirements (5 minutes)
Never start drawing boxes until you have nailed the requirements. Most failed interviews are lost in the first 5 minutes by jumping straight to solutions.
Functional requirements
Ask: "What should the system do? What are the core features?"
Focus on the minimum required - do not expand scope yourself. For a URL shortener:
- Users can shorten a URL
- Users can be redirected via the short URL
- (Optional) Users can see click analytics
Non-functional requirements
Ask these specifically - interviewers expect you to raise them:
| Question | Why it matters |
|---|---|
| What is the expected scale? (DAU, QPS) | Determines if a single machine works or you need distribution |
| What is the read/write ratio? | Drives caching strategy and DB choice |
| What is the latency SLA? | Determines where caches are needed |
| Is strong consistency required or is eventual OK? | Drives CAP choice |
| What is the availability target? (99.9%? 99.99%?) | Determines redundancy level |
| Geographic distribution needed? | Determines if multi-region is in scope |
| Data retention period? | Affects storage estimates |
Functional vs non-functional checklist
- Core user actions defined (create, read, update, delete what?)
- Expected scale confirmed (DAU, peak QPS)
- Consistency requirement confirmed (strong vs eventual)
- Latency SLA noted
- Out-of-scope features explicitly named
Step 2: Estimation (5 minutes)
Do this out loud. Interviewers assess your ability to reason about numbers, not get them exactly right.
Reference constants
| Metric | Value |
|---|---|
| Seconds per day | 86,400 (use 100,000 to round up) |
| Bytes per ASCII char | 1 byte |
| Average URL | 100 bytes |
| Average user profile | 1 KB |
| Average tweet/post | 300 bytes |
| Average image (compressed JPEG) | 300 KB |
| Average HD video (1 minute) | 50 MB |
| 1 TB | 10^12 bytes |
| 1 PB | 10^15 bytes |
Estimation process
1. QPS: (DAU * actions_per_user_per_day) / 86,400
2. Peak QPS: average QPS * 2-3x
3. Storage: writes_per_day * record_size * retention_days
4. Bandwidth: peak QPS * average_response_size
5. Cache: storage * hot_data_fraction (typically 20% of data = 80% of reads)
Example - URL shortener
Scale: 100M DAU, 1 write per 10 users per day, 100 reads per write
Writes: (100M / 10) / 86400 = 10M/day = ~116 writes/sec
Reads: 116 * 100 = 11,600 reads/sec
Storage: 10M records/day * 100 bytes * 365 days * 5 years = ~1.8 TB
Cache: ~20% of URLs serve ~80% of reads -> cache a few hundred GB of hot URLs
State your assumptions explicitly: "I'm assuming 100 bytes per URL and 5-year retention."
Step 3: Storage (3-5 minutes)
Define the data model first, then pick the technology.
Data model
List the main entities and their core fields:
User: user_id (PK), email, created_at
URL: short_code (PK), original_url, user_id (FK), created_at, expires_at
Click: click_id (PK), short_code (FK), timestamp, ip, referrer
Database selection decision table
| Requirement | Choice |
|---|---|
| ACID transactions, complex joins, well-defined schema | PostgreSQL (default) |
| Horizontal write scaling (>100K writes/sec) | Cassandra or DynamoDB |
| Flexible/evolving schema + JSON documents | MongoDB |
| Key-value at very high throughput | Redis or DynamoDB |
| Full-text search | Elasticsearch |
| Graph traversals (social networks) | Neo4j |
| Time-series (metrics, IoT) | InfluxDB or TimescaleDB |
Default: PostgreSQL. Change only with a specific measured reason.
Sharding strategy (if needed)
| Strategy | Best for | Risk |
|---|---|---|
| Range-based | Time-series data | Hot partitions if traffic is recent-biased |
| Hash-based | Even distribution needed | Cannot do range queries across shards |
| Directory-based | Complex routing logic | Directory itself becomes SPOF |
Step 4: High-Level Design (10 minutes)
Draw the system on a whiteboard (or describe components clearly). Include:
- Client (mobile, web, CLI)
- DNS - not usually a design concern, but note CDN here
- CDN - for static assets and geographically distributed reads
- Load balancer - L7 for HTTP services
- API servers - stateless, horizontally scalable
- Cache - Redis, positioned between API servers and database
- Database - primary + read replicas
- Message queue - if async processing is needed
- Workers - consumers of the queue
Describe the data flow for the two most critical use cases (e.g., write a URL, redirect a URL). Say what happens at each hop.
Step 5: APIs (3-5 minutes)
Define the external API contracts. Be specific about HTTP method, path, and payload shape.
POST /api/v1/urls
Body: { "original_url": "https://...", "custom_slug": "optional" }
Response 201: { "short_url": "https://sho.rt/abc123", "expires_at": "..." }
GET /{short_code}
Response 301: Location: https://original-url.com
Response 404: { "error": "not found" }
GET /api/v1/urls/{short_code}/stats
Response 200: { "clicks": 1234, "last_clicked_at": "..." }
Note: use 301 (permanent) redirects to save server load via browser caching. Use 302 (temporary) if you need to track every click server-side.
Step 6: Detail (10 minutes)
Pick 1-2 components and go deep. Let the interviewer guide which one. Common deep-dive areas:
Deep dive: key generation
Naive approach: generate a random 6-char Base62 code on each write. Problem: hash collisions as the dataset grows.
Better approach: pre-generation pool
- A background worker generates Base62 codes offline and stores them in a key_pool table (status: unused/used).
- On each write request, the API server picks one key from the pool (atomic SELECT + UPDATE to mark it used).
- Keep two copies: used_keys and unused_keys for fast failover.
Benefit: no collision checking at write time; sub-millisecond key assignment.
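The SELECT + UPDATE claim can be sketched with SQLite standing in for the key_pool table (table and column names follow the sketch above; atomicity here relies on a single connection, where a real deployment would use SELECT ... FOR UPDATE or an atomic UPDATE ... RETURNING):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE key_pool (short_code TEXT PRIMARY KEY, used INTEGER DEFAULT 0)")
conn.executemany("INSERT INTO key_pool (short_code) VALUES (?)",
                 [("abc123",), ("def456",), ("ghi789",)])

def claim_key(conn):
    """Claim one unused key inside a transaction; returns None when the pool is empty."""
    with conn:  # one transaction: select a free key, mark it used
        row = conn.execute(
            "SELECT short_code FROM key_pool WHERE used = 0 LIMIT 1").fetchone()
        if row is None:
            return None
        conn.execute("UPDATE key_pool SET used = 1 WHERE short_code = ?", (row[0],))
        return row[0]

k1 = claim_key(conn)
k2 = claim_key(conn)
print(k1, k2)  # two distinct codes, each now marked used
```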
Deep dive: caching strategy
Read path:
1. Check Redis: GET short_code
2. Hit: return URL, record impression asynchronously
3. Miss: query PostgreSQL, cache result with TTL=24h, return URL
Write path:
1. Insert to PostgreSQL
2. Do NOT pre-populate cache (lazy loading - cache-aside)
3. Hot URLs will self-populate after the first redirect
Invalidation: when a URL is deleted or expires, DEL short_code from Redis and let the next read repopulate it naturally.
Step 7: Evaluation (5 minutes)
Walk through three questions:
1. Where are the single points of failure?
For each component, ask: "what happens if this dies?"
- Load balancer -> add standby LB with failover (AWS ELB handles this)
- Primary DB -> add replica; promote on failure (< 30 sec with auto-failover)
- Redis -> add replica; cache misses fall back to DB (system degrades gracefully)
- Message queue -> managed service (SQS/Kafka) has built-in replication
2. Where are the bottlenecks?
Use your Step 2 estimates:
- 11,600 reads/sec -> Redis at < 1ms is the right tool; DB alone can't handle this
- 116 writes/sec -> PostgreSQL handles this easily; no sharding needed
3. What trade-offs did you make?
Name at least two:
- "I chose 301 redirects over 302 to reduce server load, which means we can't track every click without client-side instrumentation."
- "I'm using eventual consistency for click analytics to avoid blocking the redirect flow on write."
Step 8: Distinctive Features (5 minutes)
Push to the next level if time allows. Pick one:
Multi-region active-active
- Route users to the nearest region via GeoDNS
- Replicate URL metadata with async replication (AP - eventual consistency)
- Use CRDT counters for click analytics to merge without conflicts
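The CRDT counter mentioned above is typically a grow-only counter (G-Counter): each region increments its own slot, and merging takes per-region maxima, so replicas converge without conflicts. A minimal sketch:

```python
class GCounter:
    """Grow-only CRDT counter: per-region slots; merge takes element-wise max."""
    def __init__(self, region):
        self.region = region
        self.counts = {}  # region -> highest count observed from that region

    def increment(self, n=1):
        self.counts[self.region] = self.counts.get(self.region, 0) + n

    def merge(self, other):
        for region, count in other.counts.items():
            self.counts[region] = max(self.counts.get(region, 0), count)

    def value(self):
        return sum(self.counts.values())

us, eu = GCounter("us-east"), GCounter("eu-west")
us.increment(3)
eu.increment(2)
us.merge(eu)  # merge is commutative, associative, and idempotent
eu.merge(us)
assert us.value() == eu.value() == 5
```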
Fault tolerance
- Add a circuit breaker on the DB pool; serve a "please try again" page when the DB is unhealthy rather than queuing connections until OOM
- Use bulkhead pattern: analytics writes go through a separate connection pool; an analytics storm cannot exhaust the redirect connection pool
Hot key problem
- A viral URL can exceed one Redis node's throughput
- Solution: replicate the hot key to N Redis nodes; read from a random one
- Or: local in-process cache (LRU, 100 entries, 10-second TTL) on each API server
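The local in-process cache option can be sketched as an LRU bounded to a fixed size with a per-entry TTL, matching the numbers above (class name is illustrative):

```python
import time
from collections import OrderedDict

class TTLLRUCache:
    """Tiny in-process cache: bounded size (LRU eviction) plus per-entry TTL."""
    def __init__(self, max_entries=100, ttl_seconds=10.0):
        self.max_entries = max_entries
        self.ttl = ttl_seconds
        self.data = OrderedDict()  # key -> (value, expires_at), in LRU order

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self.data[key]         # expired: treat as a miss
            return None
        self.data.move_to_end(key)     # mark as recently used
        return value

    def put(self, key, value):
        self.data[key] = (value, time.monotonic() + self.ttl)
        self.data.move_to_end(key)
        if len(self.data) > self.max_entries:
            self.data.popitem(last=False)  # evict the least recently used entry

cache = TTLLRUCache(max_entries=2, ttl_seconds=10)
cache.put("hot", "https://viral.example")
cache.put("warm", "x")
cache.get("hot")       # touch "hot" so "warm" becomes the LRU entry
cache.put("new", "y")  # evicts "warm"
assert cache.get("warm") is None and cache.get("hot") is not None
```

Because each API server keeps its own copy, a viral key's traffic is absorbed locally and the 10-second TTL bounds staleness.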
Time allocation
Total typical interview: 45-60 minutes.
| Step | Time |
|---|---|
| Requirements (R) | 5 min |
| Estimation (E) | 3-5 min |
| Storage (S) | 3-5 min |
| High-level design (H) | 10 min |
| APIs (A) | 3-5 min |
| Detail (D) | 10-15 min |
| Evaluation (E) | 5 min |
| Distinctive features (D) | 5 min |
| Buffer / interviewer questions | 5 min |
Rule: Never spend more than 15 minutes on high-level design without the interviewer's prompting. Move to detail early - that's where most marks are given.
Common follow-up questions and how to answer them
| Question | Key points to hit |
|---|---|
| "How does this scale to 10x traffic?" | Identify which component breaks first; add the right layer (cache, replica, shard) |
| "What happens if the database goes down?" | Describe read replica failover, circuit breaker behavior, graceful degradation |
| "How do you handle duplicate requests?" | Idempotency key on writes; atomic check-and-set in Redis |
| "How do you handle hot keys in the cache?" | Local in-process cache + jitter on TTLs |
| "What if a message is delivered twice?" | Idempotent consumers with a deduplication store |
| "How would you monitor this system?" | RED metrics per service, alerting on SLO burn rate, distributed tracing |
Common design patterns by problem type
| Problem type | Key patterns |
|---|---|
| URL shortener | Pre-generation key pool, cache-aside, 301 redirect |
| Feed (Twitter/Instagram) | Fan-out on write (small accounts) + fan-out on read (celebrities), Redis sorted sets |
| Typeahead / autocomplete | Trie in Redis, prefix hash, tiered caching |
| Distributed counter | Redis INCR, approximate counting (HyperLogLog), eventual consistency |
| Distributed lock | Redis SETNX with expiry, Redlock for multi-node |
| Leaderboard | Redis sorted sets (ZADD, ZRANGE) |
| Search | Elasticsearch with inverted index; Kafka for real-time indexing pipeline |
| Video/image upload | Direct S3 upload with presigned URL; metadata in PostgreSQL; CDN for delivery |
| Payment system | Idempotency keys, ACID transactions, CP database, event sourcing for audit |
Interview anti-patterns to avoid
- Jumping to microservices before the problem demands it - start with a monolith unless requirements clearly show independent scaling needs
- Designing in silence - narrate every decision; interviewers score your thinking, not just the diagram
- Over-engineering the happy path, ignoring failure modes - explicitly name what happens when each component fails
- Picking exotic tech to impress - using Cassandra for 1000 QPS is wrong; PostgreSQL is the right answer
- Refusing to make trade-offs - everything is a trade-off; say so, then commit to one option and justify it
- Ignoring the non-functional requirements you agreed on - if you said p99 < 100ms, every component decision must serve that constraint