technical-interviewing
Use this skill when designing coding challenges, structuring system design interviews, building interview rubrics, calibrating evaluation criteria, or creating hiring loops. Triggers on interview question design, coding assessment creation, system design prompt writing, rubric building, interviewer training, candidate evaluation, and any task requiring structured technical assessment.
technical-interviewing is a production-ready AI agent skill for claude-code, gemini-cli, and openai-codex. It covers designing coding challenges, structuring system design interviews, building interview rubrics, calibrating evaluation criteria, and creating hiring loops.
Quick Facts
| Field | Value |
|---|---|
| Category | operations |
| Version | 0.1.0 |
| Platforms | claude-code, gemini-cli, openai-codex |
| License | MIT |
How to Install
- Make sure you have Node.js installed on your machine.
- Run the following command in your terminal:
npx skills add AbsolutelySkilled/AbsolutelySkilled --skill technical-interviewing
- The technical-interviewing skill is now available in your AI coding agent (Claude Code, Gemini CLI, OpenAI Codex, etc.).
Overview
Technical interviewing is both a skill and a system. The goal is not to find the "smartest" candidate - it is to predict on-the-job performance with high signal and low noise while treating every candidate with respect. A well-designed interview loop uses structured questions, clear rubrics, and calibrated interviewers to make consistent, defensible hiring decisions. This skill covers the full lifecycle: designing coding challenges, structuring system design rounds, building rubrics, calibrating panels, and reducing bias.
Tags
interviewing hiring rubrics coding-challenges system-design
Platforms
- claude-code
- gemini-cli
- openai-codex
Frequently Asked Questions
What is technical-interviewing?
Use this skill when designing coding challenges, structuring system design interviews, building interview rubrics, calibrating evaluation criteria, or creating hiring loops. Triggers on interview question design, coding assessment creation, system design prompt writing, rubric building, interviewer training, candidate evaluation, and any task requiring structured technical assessment.
How do I install technical-interviewing?
Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill technical-interviewing in your terminal. The skill will be immediately available in your AI coding agent.
What AI agents support technical-interviewing?
This skill works with claude-code, gemini-cli, openai-codex. Install it once and use it across any supported AI coding agent.
Maintainers
Generated from AbsolutelySkilled
SKILL.md
Technical Interviewing
Technical interviewing is both a skill and a system. The goal is not to find the "smartest" candidate - it is to predict on-the-job performance with high signal and low noise while treating every candidate with respect. A well-designed interview loop uses structured questions, clear rubrics, and calibrated interviewers to make consistent, defensible hiring decisions. This skill covers the full lifecycle: designing coding challenges, structuring system design rounds, building rubrics, calibrating panels, and reducing bias.
When to use this skill
Trigger this skill when the user:
- Wants to design a coding challenge or take-home assignment for a specific role
- Needs to create a system design interview question with follow-ups
- Asks to build a scoring rubric or evaluation criteria for interviews
- Wants to structure a full interview loop (phone screen through onsite)
- Needs to calibrate interviewers or run a calibration session
- Asks about reducing bias in technical assessments
- Wants to evaluate a candidate's performance against a rubric
- Needs interviewer training materials or shadow guides
Do NOT trigger this skill for:
- Preparing as a candidate for interviews (use system-design or algorithm skills)
- General HR hiring workflows not specific to technical assessment
Key principles
Structure over gut feel - Every question must have a rubric before it is used. "I'll know a good answer when I see it" is not a rubric. Define what strong, acceptable, and weak look like in advance. Research consistently finds structured interviews roughly twice as predictive of job performance as unstructured ones.
Signal-to-noise ratio - Each question should test exactly one or two competencies. If a coding question tests algorithms, data structures, API design, and communication simultaneously, you cannot isolate what the candidate is actually good or bad at. Separate the signals.
Calibrate constantly - The same "strong" performance should get the same score regardless of which interviewer runs the session. Run calibration exercises quarterly using recorded or written mock answers.
Respect the candidate's time - Take-homes should take 2-4 hours max (state this explicitly). Onsite loops should not exceed 4-5 hours. Every minute of the candidate's time should produce meaningful signal.
Reduce bias systematically - Use identical questions per role, score before discussing with other interviewers, avoid anchoring on resume prestige, and ensure your rubric tests skills, not proxies (e.g. "uses our preferred framework" is a proxy, not a skill).
Core concepts
The interview funnel
Every technical hiring loop follows a narrowing funnel. Each stage should have a clear purpose and avoid re-testing what was already assessed:
| Stage | Purpose | Duration | Signal |
|---|---|---|---|
| Resume screen | Baseline qualifications | 2-5 min | Experience match |
| Phone screen | Communication + baseline coding | 30-45 min | Can they code at all? |
| Technical deep-dive | Core competency for the role | 45-60 min | Domain strength |
| System design | Architecture thinking (senior+) | 45-60 min | Scope, trade-offs |
| Culture/values | Team fit, collaboration style | 30-45 min | Working style |
Question types
- Algorithmic - Data structures, complexity analysis. Best for junior/mid roles. Risk: over-indexes on contest skills vs real work.
- Practical coding - Build a small feature, debug existing code, extend an API. Better signal for day-to-day work.
- System design - Design a URL shortener, notification system, rate limiter. Best for senior+ roles. Tests breadth and trade-off reasoning.
- Code review - Review a PR with intentional issues. Tests reading skill and communication.
- Take-home - Larger project done asynchronously. Best signal but highest candidate time cost.
Rubric anatomy
Every rubric has four components:
- Competency - What you are testing (e.g. "API design")
- Levels - Typically 4: Strong Hire, Hire, No Hire, Strong No Hire
- Behavioral anchors - Concrete examples of what each level looks like
- Must-haves vs nice-to-haves - Which criteria are required vs bonus
Common tasks
Design a coding challenge
Start with the role requirements, not a clever problem. Work backward:
- Identify 1-2 core competencies the role needs daily
- Design a problem that requires those competencies to solve
- Create 3 difficulty tiers: base case, standard, extension
- Write the rubric before finalizing the problem
- Test-solve it yourself and time it (multiply by 1.5-2x for candidates)
Template:
PROBLEM: <Title>
LEVEL: Junior / Mid / Senior
TIME: <X> minutes
COMPETENCIES TESTED: <1-2 specific skills>
PROMPT:
<Clear problem statement with examples>
BASE CASE (must complete):
<Minimum viable solution criteria>
STANDARD (expected for hire):
<Additional requirements showing solid understanding>
EXTENSION (differentiates strong hire):
<Follow-up that tests depth or edge case thinking>
RUBRIC:
Strong Hire: Completes standard + extension, clean code, discusses trade-offs
Hire: Completes standard, reasonable code quality, handles prompts on edge cases
No Hire: Completes base only, significant code quality issues
Strong No Hire: Cannot complete base case, fundamental misunderstandings
Create a system design question
Good system design questions are open-ended with clear scaling dimensions:
- Pick a system the candidate likely understands as a user
- Define initial constraints (users, QPS, data volume)
- Prepare 4-6 follow-up dimensions to probe depth
- Write what "good" looks like at each stage
Follow-up dimensions to prepare:
- Scale: "Now handle 10x the traffic"
- Reliability: "A database node goes down - what happens?"
- Consistency: "Two users edit the same document simultaneously"
- Cost: "The CEO says infrastructure costs are too high"
- Latency: "P99 latency must be under 200ms"
- Security: "How do you handle authentication and authorization?"
Build a scoring rubric
For each competency being assessed:
COMPETENCY: <Name>
WEIGHT: <High / Medium / Low>
STRONG HIRE (4):
- <Specific observable behavior>
- <Specific observable behavior>
HIRE (3):
- <Specific observable behavior>
- <Specific observable behavior>
NO HIRE (2):
- <Specific observable behavior>
STRONG NO HIRE (1):
- <Specific observable behavior>
Always use behavioral anchors (what you observed), not trait labels ("smart", "passionate"). "Identified the race condition without prompting and proposed a lock-based solution" is a behavioral anchor. "Seemed smart" is not.
Structure a full interview loop
Map each stage to a unique competency. Never duplicate signals:
ROLE: <Title, Level>
TOTAL STAGES: <N>
Stage 1 - Phone Screen (45 min)
Interviewer type: Any engineer
Format: Practical coding
Tests: Baseline coding ability, communication
Question: <Specific question or question bank ID>
Stage 2 - Technical Deep-Dive (60 min)
Interviewer type: Domain expert
Format: Domain-specific coding
Tests: <Role-specific competency>
Question: <Specific question>
Stage 3 - System Design (60 min) [Senior+ only]
Interviewer type: Senior+ engineer
Format: Whiteboard / virtual whiteboard
Tests: Architecture thinking, trade-off reasoning
Question: <Specific question>
Stage 4 - Culture & Collaboration (45 min)
Interviewer type: Cross-functional partner
Format: Behavioral + scenario-based
Tests: Communication, conflict resolution, ownership
Run a calibration session
Calibration aligns interviewers on what each rubric level means:
- Select 3-4 real or mock candidate responses (anonymized)
- Have each interviewer score independently using the rubric
- Reveal scores simultaneously (avoid anchoring)
- Discuss disagreements - focus on which rubric criteria were interpreted differently
- Update rubric language where ambiguity caused divergence
- Document decisions as "calibration notes" appended to the rubric
Target: interviewers should agree within 1 point on a 4-point scale at least 80% of the time.
Design a take-home assignment
Take-homes must balance signal quality with respect for candidate time:
- State the expected time explicitly (2-4 hours)
- Provide a starter repo with boilerplate already set up
- Define submission format and evaluation criteria upfront
- Include a README template for candidates to explain their approach
- Grade with a rubric, not vibes
- Offer a live follow-up to discuss the submission (15-30 min)
Anti-patterns / common mistakes
| Mistake | Why it's wrong | What to do instead |
|---|---|---|
| No rubric before interviews | Every interviewer uses different criteria; inconsistent decisions | Write and distribute rubric before any candidate is interviewed |
| Asking trivia questions | Tests memorization, not ability; alienates strong candidates | Ask problems that require reasoning, not recall |
| "Culture fit" as veto | Becomes a proxy for demographic similarity | Define specific values and behaviors you are testing for |
| Same question for all levels | Junior and senior roles need different signal | Adjust complexity and expected depth per level |
| Discussing candidates before scoring | First opinion anchors everyone else | Score independently, then debrief |
| Marathon interviews (6+ hours) | Candidate fatigue degrades signal; disrespects their time | Cap at 4-5 hours including breaks |
| Only testing algorithms | Most roles never use graph traversal; poor signal for day-to-day work | Match question type to actual job tasks |
| No interviewer training | Untrained interviewers ask leading questions, give inconsistent hints | Run shadow sessions and calibration quarterly |
Gotchas
Rubrics written after interviewing are not rubrics - If interviewers define what "good" looks like after seeing a candidate's answer, they are post-hoc rationalizing, not evaluating. Write rubric anchors before the first candidate session, not after.
Hints are part of the rubric, not a kindness - Unscripted hints produce wildly different interviews across candidates. Standardize hints: define at what point in the problem you offer a hint, what the hint is, and score separately whether the candidate needed it.
Take-home time estimates are always underestimated by designers - When you build the take-home, you already know the answer. Multiply your time estimate by 2-3x for candidates approaching it cold. A 4-hour take-home that actually takes 8-10 hours destroys candidate experience and trust.
Debrief sequencing affects outcomes more than debrief content - If the hiring manager or a senior engineer speaks first in the debrief, everyone else's scores shift toward theirs. Use independent written submissions before any discussion to prevent anchoring.
"Culture fit" rejections require the same documentation as technical rejections - Vague "not a culture fit" is legally and ethically risky. If a candidate is rejected for collaboration or communication, document the specific observable behaviors from the rubric, not the general feeling.
References
For detailed guidance on specific topics, read the relevant file from
the references/ folder:
- references/system-design-questions.md - Library of system design questions organized by level with expected discussion points and rubric anchors
- references/coding-challenge-patterns.md - Coding challenge templates organized by competency signal (API design, data modeling, debugging, concurrency)
- references/rubric-calibration.md - Step-by-step calibration session guide with sample scoring exercises and facilitator script
References
coding-challenge-patterns.md
Coding Challenge Patterns
Organizing by signal, not by topic
Don't organize your question bank by data structure ("graph questions", "tree questions"). Organize by the competency you want to assess. This lets you pick the right question for the role, not just the right difficulty.
Pattern: API Design
Signal: Can the candidate design clean interfaces, handle edge cases, and think about consumers of their code?
Example: Design a key-value store API
Level: Mid
Time: 45 min
Prompt: Implement a key-value store class with get, set, delete, and
keys methods. Then extend it to support TTL (time-to-live) on keys.
Base case: Basic CRUD operations work correctly
Standard: TTL works, expired keys are not returned
Extension: Implement lazy vs active expiration, discuss trade-offs
Rubric anchors:
- Strong Hire: Clean API surface, handles edge cases (get expired key, delete nonexistent key), discusses memory implications of lazy expiration
- No Hire: API works but no thought given to edge cases or consumers
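A minimal Python sketch of what a "standard"-tier solution might look like, using lazy expiration (the class and method names are illustrative, not a reference solution):

```python
import time

class KVStore:
    """Key-value store with per-key TTL. Uses lazy expiration:
    expired entries are purged only when touched by get() or keys()."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = time.monotonic() + ttl if ttl is not None else None
        self._data[key] = (value, expires_at)

    def _expired(self, key):
        _, expires_at = self._data[key]
        return expires_at is not None and time.monotonic() >= expires_at

    def get(self, key, default=None):
        if key in self._data:
            if self._expired(key):
                del self._data[key]  # lazy expiration on read
                return default
            return self._data[key][0]
        return default

    def delete(self, key):
        # Deleting a missing key is a no-op; report whether anything was removed.
        return self._data.pop(key, None) is not None

    def keys(self):
        live = []
        for k in list(self._data):
            if self._expired(k):
                del self._data[k]  # purge expired keys as we scan
            else:
                live.append(k)
        return live
```

The lazy approach keeps writes O(1) but lets expired entries occupy memory until they are touched; an "extension"-tier discussion would contrast this with an active background sweep.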
Example: Build a middleware pipeline
Level: Senior
Time: 60 min
Prompt: Implement a middleware system where functions can be chained and each can modify a request/response or short-circuit.
Base case: Sequential middleware execution
Standard: Support for async middleware, error handling middleware
Extension: Conditional middleware (route-based), middleware ordering
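The base case can be sketched in a few lines of Python (the `Pipeline` name and the request-dict shape are illustrative assumptions, not part of the prompt):

```python
class Pipeline:
    """Sequential middleware pipeline: each middleware receives the request
    and a next_ callable. It may modify the request, short-circuit by
    returning without calling next_, or wrap the downstream response."""

    def __init__(self):
        self._middlewares = []

    def use(self, middleware):
        self._middlewares.append(middleware)
        return self  # allow chaining: pipe.use(a).use(b)

    def handle(self, request, handler):
        # Build the chain from the inside out so middlewares run in add order.
        next_ = handler
        for mw in reversed(self._middlewares):
            next_ = (lambda m, nxt: lambda req: m(req, nxt))(mw, next_)
        return next_(request)

# Example middlewares (hypothetical):
def logger(req, next_):
    req.setdefault("trace", []).append("logger")
    return next_(req)

def auth(req, next_):
    if not req.get("token"):
        return {"status": 401}  # short-circuit: downstream never runs
    return next_(req)
```

A strong candidate would note the closure-capture pitfall in the loop (hence the double lambda) and discuss how error-handling middleware wraps `next_` in try/except.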
Pattern: Data Modeling
Signal: Can the candidate choose appropriate data structures and model relationships between entities?
Example: Design a permission system
Level: Mid-Senior
Time: 45 min
Prompt: Model a role-based access control system. Users belong to organizations, have roles, and roles grant permissions on resources.
Base case: User-role-permission model with basic check function
Standard: Hierarchical roles (admin inherits editor permissions), resource scoping
Extension: Attribute-based overrides, permission caching strategy
Rubric anchors:
- Strong Hire: Considers inheritance, discusses denormalization for performance, handles the "admin of org A should not see org B" case
- No Hire: Flat user-to-permission mapping, no consideration of scale
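A sketch of the "standard" tier in Python, covering role inheritance and org scoping (role names and the data shapes are hypothetical):

```python
# Each role lists the roles it inherits from, and the permissions it grants directly.
ROLE_INHERITS = {"admin": ["editor"], "editor": ["viewer"], "viewer": []}
ROLE_PERMS = {"admin": {"delete"}, "editor": {"write"}, "viewer": {"read"}}

def effective_perms(role):
    """Permissions granted by a role plus everything it inherits."""
    perms = set(ROLE_PERMS.get(role, set()))
    for parent in ROLE_INHERITS.get(role, []):
        perms |= effective_perms(parent)
    return perms

def can(user, action, org):
    """Org-scoped check: a role in org A grants nothing in org B."""
    role = user["roles"].get(org)  # roles are keyed by organization
    return role is not None and action in effective_perms(role)
```

Keying roles by organization directly handles the "admin of org A should not see org B" case the rubric calls out; the extension tier would memoize `effective_perms` as a caching strategy.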
Example: Model an event calendar
Level: Mid
Time: 45 min
Prompt: Design data models for a calendar app supporting one-time events, recurring events, and event modifications (cancel one occurrence, change time of one occurrence).
Base case: One-time events with CRUD
Standard: Recurring events with RRULE-style patterns
Extension: Exceptions to recurring events (modify/cancel individual occurrences)
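One plausible shape for the extension tier is "rule plus exceptions": the series stores a recurrence rule, and cancellations/overrides are keyed by the original occurrence time. A hedged sketch (field names are assumptions):

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class Event:
    """A recurring event stores a rule plus per-occurrence exceptions, so
    cancelling or moving one occurrence never rewrites the whole series."""
    title: str
    start: datetime
    rrule: Optional[str] = None                  # e.g. "FREQ=WEEKLY" (RRULE-style)
    cancelled: set = field(default_factory=set)   # original occurrence datetimes
    overrides: dict = field(default_factory=dict) # original datetime -> replacement Event

def resolve(event, occurrence):
    """Resolve one occurrence of a series against its exceptions:
    None if cancelled, the override if modified, otherwise the series itself."""
    if occurrence in event.cancelled:
        return None
    return event.overrides.get(occurrence, event)
```

A strong answer discusses why this beats materializing every occurrence as a row (unbounded series, cheap edits) and what index the `occurrence -> exception` lookup needs.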
Pattern: Debugging & Code Reading
Signal: Can the candidate read unfamiliar code, identify issues, and reason about behavior?
Example: Fix the race condition
Level: Senior
Time: 30 min
Prompt: Present a 50-line function with a subtle race condition (e.g. check-then-act on a shared counter). Ask the candidate to identify the bug, explain the failure scenario, and fix it.
What to prepare:
- The buggy code (should look reasonable at first glance)
- 2-3 specific failure scenarios to discuss
- Multiple valid fixes (mutex, atomic operations, redesign)
Rubric anchors:
- Strong Hire: Identifies the race condition quickly, explains a concrete interleaving that causes failure, proposes multiple fixes with trade-offs
- Hire: Identifies the issue with some hints, proposes a working fix
- No Hire: Cannot identify the issue even with hints
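A condensed version of the check-then-act pattern and the lock-based fix, usable as a seed when preparing the buggy snippet (class names are illustrative):

```python
import threading

class Counter:
    """Buggy: check-then-act on shared state. Two threads can both pass the
    count < limit check before either increments, exceeding the limit."""

    def __init__(self, limit):
        self.count, self.limit = 0, limit

    def try_increment(self):
        if self.count < self.limit:   # check ...
            self.count += 1           # ... then act: not atomic together
            return True
        return False

class SafeCounter(Counter):
    """Fix: hold a lock across the check and the act so they form one atomic step."""

    def __init__(self, limit):
        super().__init__(limit)
        self._lock = threading.Lock()

    def try_increment(self):
        with self._lock:
            return super().try_increment()
```

The concrete interleaving to discuss: threads A and B both read `count == limit - 1`, both pass the check, both increment, and `count` ends at `limit + 1`. Atomic operations or redesigning the API to return-and-reserve are the other valid fixes to probe.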
Example: Review this pull request
Level: Mid-Senior
Time: 30 min
Prompt: Present a 100-line PR with 3-5 intentional issues of varying severity (one security issue, one logic bug, one style issue, one performance concern, one missing test).
Rubric anchors:
- Strong Hire: Catches security and logic issues, prioritizes feedback by severity
- Hire: Catches most issues, reasonable feedback quality
- No Hire: Only catches style issues, misses the security/logic bugs
Pattern: Concurrency & Async
Signal: Does the candidate understand parallel execution, synchronization, and async patterns?
Example: Implement a connection pool
Level: Senior
Time: 60 min
Prompt: Build a generic connection pool with max size, checkout/checkin, timeout on checkout, and health checking.
Base case: Fixed-size pool with blocking checkout
Standard: Configurable max size, timeout, idle connection cleanup
Extension: Health checking, connection recycling, metrics
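The base case reduces to a thread-safe queue of pre-opened connections; a sketch assuming a hypothetical `factory` callable that opens one connection:

```python
import queue

class Pool:
    """Fixed-size connection pool: checkout blocks until a connection is
    free or the timeout elapses; checkin returns a connection to the pool."""

    def __init__(self, factory, size):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(factory())  # eagerly open all connections

    def checkout(self, timeout=None):
        try:
            # Queue.get blocks when the pool is empty, which gives us
            # the waiting behavior for free.
            return self._q.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("no connection available")

    def checkin(self, conn):
        self._q.put(conn)
```

The standard and extension tiers grow from here: lazy creation up to a max size, an idle-timestamp per connection for cleanup, and a health check on checkout that discards and replaces dead connections.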
Example: Build a rate-limited task queue
Level: Mid-Senior
Time: 45 min
Prompt: Implement a task queue that processes at most N tasks per second, with configurable concurrency.
Base case: Serial execution with rate limiting
Standard: Concurrent execution within rate limit
Extension: Priority levels, retry with backoff
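The serial base case is small enough to write in full; a sketch that spaces task starts by a fixed interval (the function name is illustrative):

```python
import time

def run_rate_limited(tasks, per_second):
    """Serial base case: run each task in order, sleeping so that task
    starts are spaced at least 1/per_second apart."""
    interval = 1.0 / per_second
    results = []
    next_start = time.monotonic()
    for task in tasks:
        delay = next_start - time.monotonic()
        if delay > 0:
            time.sleep(delay)       # wait until this task's start slot
        results.append(task())
        next_start += interval      # schedule the next slot
    return results
```

Note that scheduling from `next_start` rather than "now" keeps the long-run rate honest even when individual tasks take time; the standard tier replaces the serial loop with a worker pool gated by the same pacing logic.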
General rubric template for coding challenges
| Dimension | Strong Hire (4) | Hire (3) | No Hire (2) | Strong No Hire (1) |
|---|---|---|---|---|
| Problem solving | Breaks down problem systematically, identifies edge cases early | Reasonable approach, handles main cases | Struggles with approach, needs significant hints | Cannot make progress even with hints |
| Code quality | Clean, readable, well-named, idiomatic | Functional with minor style issues | Disorganized, hard to follow | Does not produce working code |
| Communication | Thinks aloud clearly, explains trade-offs, asks good questions | Communicates approach, responds to prompts | Mostly silent, unclear explanations | Cannot articulate their thinking |
| Testing mindset | Proactively discusses test cases, boundary conditions | Tests when prompted, covers main cases | No consideration of testing | Does not understand what testing means in context |
| Extension ability | Elegantly extends to harder variant, code structure supports change | Can extend with some refactoring | Extension requires rewrite | Cannot extend beyond base case |
rubric-calibration.md
Rubric Calibration Guide
Why calibrate
Without calibration, interviewers develop personal definitions of "strong" and "weak." One interviewer's "Hire" is another's "Strong Hire." This inconsistency means your hiring bar depends on which interviewer the candidate gets - not on the candidate's actual ability. Calibration fixes this by aligning everyone on what each rubric level looks like in practice.
Target: Interviewers agree within 1 point on a 4-point scale at least 80% of the time after calibration.
Calibration session format
Frequency: Quarterly, or when onboarding 3+ new interviewers
Duration: 90 minutes
Participants: 4-8 interviewers who use the same question
Facilitator: Interview program lead or senior interviewer
Pre-session preparation
- Select 3-4 candidate responses to calibrate on:
- 1 clearly strong response
- 1 clearly weak response
- 2 borderline responses (these generate the most useful discussion)
- Anonymize all responses (remove candidate names, companies, schools)
- Distribute the rubric being calibrated (not the responses) 1 week before
- Prepare scoring sheets with the rubric criteria pre-filled
Session agenda
0:00 - 0:05 Context setting
- State the goal: align on what each rubric level means
- Remind: no "right" answers - we're calibrating the rubric, not the candidates
0:05 - 0:15 Review rubric together
- Read each level's behavioral anchors aloud
- Ask: "Any criteria that are unclear or ambiguous?"
- Note ambiguous items (these are calibration opportunities)
0:15 - 0:30 Score Response #1 (clear strong)
- Everyone reads silently and scores independently (5 min)
- Facilitator collects scores simultaneously (raise fingers or digital poll)
- Discuss: "Why did you give this score? Which criteria did you weight?"
- Outcome: should be mostly aligned; if not, rubric needs clarification
0:30 - 0:50 Score Response #2 (borderline)
- Same process: read, score independently, reveal simultaneously
- This is where disagreements surface
- For each disagreement: "Which specific rubric criterion led to different scores?"
- Update rubric language where ambiguity caused divergence
0:50 - 1:10 Score Response #3 (borderline)
- Same process
- Focus on whether rubric updates from Response #2 help alignment
1:10 - 1:20 Score Response #4 (clear weak)
- Quick alignment check
- Discuss: "What separates No Hire from Strong No Hire?"
1:20 - 1:30 Wrap-up
- Summarize rubric changes decided during session
- Assign someone to update the canonical rubric document
- Schedule next calibration session
Facilitator guidelines
Preventing anchoring
The biggest risk in calibration is anchoring - where one person's opinion influences everyone else.
Rules:
- Never ask "Who wants to go first?" - use simultaneous reveal
- If a senior person speaks first, explicitly ask others to share before responding
- Use anonymous digital polling if available (Slido, Google Forms)
- If someone says "I agree with [person]" - ask them to articulate their own reasoning independently
Handling persistent disagreement
If two interviewers consistently disagree by 2+ points:
- Identify which rubric criteria they weight differently
- Ask: "Which of these criteria is more predictive of on-the-job performance?"
- If the group can't resolve it, add specificity to the rubric (more behavioral anchors, clearer must-haves vs nice-to-haves)
- Document the disagreement and revisit with data after 1-2 hiring cycles
Common calibration discoveries
| Discovery | Resolution |
|---|---|
| "Clean code" means different things to different people | Add specific examples: "functions under 20 lines, no nested callbacks deeper than 2 levels" |
| Some interviewers penalize for asking questions | Clarify: questions show maturity, not weakness. Add to rubric as positive signal |
| Interviewers give bonus points for using specific tech | Remove technology-specific criteria. Test concepts, not brand loyalty |
| "Communication" is scored inconsistently | Split into sub-criteria: explains approach before coding, responds to hints, asks clarifying questions |
| Strong candidates who are nervous score low | Add explicit note: evaluate peak demonstrated ability, not average. Nerves are not signal |
New interviewer onboarding
Shadow program (2-4 weeks)
Week 1: Observe 2 interviews
- Shadow an experienced interviewer
- Take notes using the rubric
- After each: compare your score with the interviewer's and discuss gaps
Week 2: Reverse shadow 2 interviews
- You run the interview, experienced interviewer observes
- Experienced interviewer gives feedback on:
- Question delivery and pacing
- Hint-giving calibration (too many? too few?)
- Score accuracy vs rubric
Week 3-4: Solo with review
- Run interviews independently
- Submit scores with written justification
- Interview lead reviews justifications for first 3-4 interviews
- Graduate to independent when scores align within 1 point consistently
Common new interviewer mistakes
| Mistake | Coaching point |
|---|---|
| Giving too many hints | Let candidate struggle for 2-3 minutes before hinting. Struggling is signal. |
| Not giving enough hints | If stuck for 5+ minutes, a targeted hint prevents wasting the whole session |
| Asking leading questions | "Would you use a hash map here?" is leading. "How would you optimize lookup?" is not |
| Scoring based on personality | Introverts who code well should score the same as extroverts who code well |
| Comparing to themselves | "I would have done X" is not a rubric criterion. Only score against the rubric |
| Writing vague feedback | "Seemed okay" is not useful. Write specific observations: "Solved base case in 15 min, needed 2 hints for extension, did not discuss edge cases" |
Measuring calibration effectiveness
Track these metrics over time:
- Inter-rater reliability (IRR): Percentage of interviews where all interviewers agree within 1 point. Target: 80%+
- Score distribution: If one interviewer gives 90% "Hire" and another gives 50%, there is a calibration problem
- Offer-to-accept ratio by interviewer: If candidates who interview with specific interviewers accept less often, investigate the experience quality
- New hire performance correlation: Do interview scores predict 6-month performance ratings? If not, the rubric (not just calibration) needs work
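The IRR metric above is straightforward to compute from debrief scores; a sketch assuming scores are collected per interview session (the data shape is hypothetical):

```python
def inter_rater_reliability(sessions):
    """Fraction of sessions where every interviewer's score is within
    1 point of every other's, on the 4-point scale.
    sessions maps a session id to the list of scores it received."""
    agree = sum(
        1 for scores in sessions.values()
        if max(scores) - min(scores) <= 1
    )
    return agree / len(sessions)
```

Comparing the max and min score per session is equivalent to checking every pair, and the 80% target from the calibration section applies to the returned fraction.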
system-design-questions.md
System Design Question Library
Question selection principles
- Match question complexity to candidate level (mid vs senior vs staff)
- Choose domains the candidate likely understands as a user
- Prepare 4-6 follow-up dimensions per question
- Write expected discussion points before using in interviews
Mid-level questions (45 min)
Design a URL shortener
Initial constraints: 100M URLs created/month, 10:1 read-to-write ratio
Expected discussion:
- Hashing strategy (MD5 truncation, base62 encoding, counter-based)
- Storage: key-value store for fast lookups
- Redirect: 301 vs 302 and caching implications
- Analytics: click counting, geographic data
Follow-ups:
- Custom aliases and collision handling
- Expiration and cleanup
- Rate limiting creation
Strong signal: Candidate discusses trade-offs between hash collision rate and URL length without prompting.
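For the counter-based strategy, base62 encoding is the piece candidates usually sketch on the board; a reference version to calibrate against (alphabet ordering is a convention, not fixed):

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62(n):
    """Encode a monotonically increasing integer id as a base62 short code.
    7 characters cover 62**7 (about 3.5 trillion) URLs, and distinct ids
    produce distinct codes, so there are no collisions by construction."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))
```

This makes the trade-off in the "strong signal" concrete: hashing gives opaque codes but needs collision handling, while a counter gives collision-free short codes but leaks creation order and needs a distributed id generator at scale.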
Design a paste bin
Initial constraints: 5M pastes/day, 10:1 read-to-write, max 10MB per paste
Expected discussion:
- Object storage for paste content vs database
- Content-addressable storage for deduplication
- Expiration policies (TTL-based)
- Access control (public, unlisted, private)
Follow-ups:
- Syntax highlighting as a service
- Versioning / edit history
- Abuse prevention (spam, malware)
Senior questions (60 min)
Design a notification system
Initial constraints: 50M users, supports push, email, SMS, in-app
Expected discussion:
- Channel abstraction and routing logic
- Priority queue for urgent vs batched notifications
- User preference storage and opt-out handling
- Template engine for personalization
- Delivery tracking and retry logic
Follow-ups:
- Rate limiting per user (no notification spam)
- Cross-channel deduplication
- Real-time in-app with WebSocket fallback to polling
- Digest mode (batch low-priority into daily summary)
Strong signal: Candidate separates the ingestion pipeline from delivery pipeline and discusses backpressure handling.
Design a rate limiter
Initial constraints: API gateway processing 100K req/sec, per-user limits
Expected discussion:
- Token bucket vs sliding window vs fixed window
- Distributed counting (Redis, consistent hashing)
- Header communication (X-RateLimit-Remaining, Retry-After)
- Differentiated limits by endpoint and user tier
Follow-ups:
- Distributed rate limiting across multiple data centers
- Graceful degradation under extreme load
- Rate limit key design (IP, API key, user ID, composite)
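A single-node token bucket is a useful whiteboard anchor before the distributed-counting discussion; a sketch (per-key instances would live in a dict or Redis in practice):

```python
import time

class TokenBucket:
    """Token bucket for one rate-limit key: tokens refill at rate per
    second up to capacity. A request is allowed iff a whole token is
    available, so bursts up to capacity pass while the long-run rate
    converges to the configured rate."""

    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity            # start full: allow an initial burst
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill based on elapsed time, clamped to capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The follow-ups map directly onto this sketch: distributed counting moves the `tokens`/`last` state into Redis (typically via a Lua script for atomicity), and the remaining-token count feeds the X-RateLimit-Remaining header.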
Design a chat application
Initial constraints: 10M DAU, 1:1 and group chat, message persistence
Expected discussion:
- WebSocket connection management and reconnection
- Message ordering (per-conversation sequence numbers)
- Fan-out: write-time vs read-time for group messages
- Presence system (online/offline/typing indicators)
- Message storage and retrieval pagination
Follow-ups:
- End-to-end encryption implications on server architecture
- Media message handling (images, files)
- Search across message history
- Read receipts at scale
Staff+ questions (60 min)
Design a distributed task scheduler
Initial constraints: 10M scheduled tasks, at-least-once execution, sub-second precision
Expected discussion:
- Task storage and indexing by execution time
- Partition strategy for horizontal scaling
- Leader election or leaderless coordination
- At-least-once vs exactly-once semantics
- Dead letter queue for failed tasks
- Clock skew handling across nodes
Follow-ups:
- Multi-region deployment with region-affinity
- Task dependency graphs (DAG execution)
- Dynamic priority adjustment
- Observability: how do you know a task was dropped?
Strong signal: Candidate proactively discusses failure modes and recovery without being asked.
Design a collaborative document editor
Initial constraints: Google Docs-style, 100 concurrent editors per document
Expected discussion:
- Conflict resolution: OT vs CRDT trade-offs
- Operation log and transformation pipeline
- Cursor and selection synchronization
- Document storage (snapshots + operation log)
- Permission model (owner, editor, viewer, commenter)
Follow-ups:
- Offline editing and sync
- Version history and rollback
- Comments and suggestions as separate CRDT
- Performance with very large documents (100K+ characters)
Rubric template for system design
| Competency | Strong Hire (4) | Hire (3) | No Hire (2) | Strong No Hire (1) |
|---|---|---|---|---|
| Requirements gathering | Asks clarifying questions, identifies core vs nice-to-have, states assumptions | Asks some questions, identifies main requirements | Jumps to solution without clarifying | Cannot articulate what the system should do |
| High-level design | Clear component diagram, explains data flow, justifies choices | Reasonable architecture with minor gaps | Vague or missing components, hand-wavy connections | No coherent design emerges |
| Deep-dive ability | Proactively dives into hardest component, discusses trade-offs in detail | Can deep-dive when prompted, shows reasonable depth | Stays surface-level even when prompted | Cannot explain any component in detail |
| Scalability | Identifies bottlenecks, proposes concrete solutions with numbers | Acknowledges scale challenges, proposes some solutions | Vague "just add more servers" without specifics | Does not consider scale |
| Trade-off reasoning | Articulates multiple options with pros/cons, makes justified choice | Sees some trade-offs when prompted | Binary thinking ("this is the right way") | Cannot articulate any trade-offs |