load-testing
Use this skill when load testing services, benchmarking API performance, planning capacity, or identifying bottlenecks under stress. Triggers on k6, Artillery, JMeter, load testing, stress testing, soak testing, spike testing, performance benchmarks, throughput testing, and any task requiring load or performance testing.
load-testing is a production-ready AI agent skill for claude-code, gemini-cli, openai-codex. Load testing services, benchmarking API performance, planning capacity, or identifying bottlenecks under stress.
Quick Facts
| Field | Value |
|---|---|
| Category | engineering |
| Version | 0.1.0 |
| Platforms | claude-code, gemini-cli, openai-codex |
| License | MIT |
How to Install
- Make sure you have Node.js installed on your machine.
- Run the following command in your terminal:
npx skills add AbsolutelySkilled/AbsolutelySkilled --skill load-testing
- The load-testing skill is now available in your AI coding agent (Claude Code, Gemini CLI, OpenAI Codex, etc.).
Overview
A practitioner's guide to load testing production services. This skill covers test design, k6 implementation, CI integration, results analysis, and capacity planning with an emphasis on when each test type is appropriate and what to measure. Designed for engineers who need to validate performance before and after launches.
Tags
load-testing k6 performance benchmarking stress-testing capacity
Platforms
- claude-code
- gemini-cli
- openai-codex
Frequently Asked Questions
What is load-testing?
Use this skill when load testing services, benchmarking API performance, planning capacity, or identifying bottlenecks under stress. Triggers on k6, Artillery, JMeter, load testing, stress testing, soak testing, spike testing, performance benchmarks, throughput testing, and any task requiring load or performance testing.
How do I install load-testing?
Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill load-testing in your terminal. The skill will be immediately available in your AI coding agent.
What AI agents support load-testing?
This skill works with claude-code, gemini-cli, openai-codex. Install it once and use it across any supported AI coding agent.
Maintainers
Generated from AbsolutelySkilled
SKILL.md
Load Testing
A practitioner's guide to load testing production services. This skill covers test design, k6 implementation, CI integration, results analysis, and capacity planning with an emphasis on when each test type is appropriate and what to measure. Designed for engineers who need to validate performance before and after launches.
When to use this skill
Trigger this skill when the user:
- Writes a k6, Artillery, JMeter, or Gatling test script
- Plans a load, stress, soak, or spike test campaign
- Benchmarks API throughput or latency
- Defines performance SLOs or pass/fail thresholds
- Integrates load tests into CI/CD pipelines
- Analyzes load test results to find bottlenecks
- Plans capacity for an upcoming traffic event (launch, sale, campaign)
Do NOT trigger this skill for:
- Unit or integration tests that don't involve concurrent load (use a testing skill)
- Frontend performance (Lighthouse, Core Web Vitals - use a frontend performance skill)
Key principles
Test in production-like environments - A load test against a single-instance staging box with seeded data tells you nothing about your production fleet. Match CPU/memory ratios, replica counts, and dataset sizes. Synthetic data that doesn't reflect production cardinality produces misleading results.
Define pass/fail criteria before testing - Decide what "passing" means before you run the first request. "P95 latency < 300ms, error rate < 0.1%, RPS >= 500" is a pass/fail criterion. "It felt fast" is not. Set thresholds in code so tests fail automatically in CI.
Ramp up gradually - Never go from 0 to peak load instantly. A sudden spike obscures whether failure was caused by the ramp itself or sustained load. Use stages: warm up, ramp to target, hold steady, ramp down. A gradual ramp mirrors real traffic and gives infrastructure time to autoscale.
Test with realistic data and scenarios - A test that hits a single cached endpoint with the same user ID is not a load test; it is a cache benchmark. Use parameterized data (real user IDs, varied payloads), model the full user journey, and include think time between requests to simulate realistic concurrency.
Automate load tests in CI - Load tests only provide value if they run consistently. Gate every deployment with a smoke-level load test. Run full stress and soak tests on a schedule (nightly or pre-release). Fail the build on threshold violations. Trends over time catch regressions earlier than one-off runs.
Core concepts
Test types
| Type | Goal | Duration | VU shape |
|---|---|---|---|
| Smoke | Verify the test script works; baseline sanity | 1-2 min | 1-5 VUs, constant |
| Load | Validate behavior at expected production traffic | 15-30 min | Ramp to target, hold |
| Stress | Find the breaking point; measure degradation curve | 30-60 min | Ramp beyond expected until failure |
| Soak | Detect memory leaks, connection pool exhaustion, drift | 2-24 hours | Hold at 70-80% capacity |
| Spike | Simulate sudden traffic surge (marketing event, viral post) | 10-20 min | Instant jump to 5-10x, then drop |
Choose the test type based on what question you're trying to answer - not habit. Most teams only run load tests and miss soak and spike scenarios where real incidents happen.
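The VU shapes in the table map directly onto k6 `stages` arrays. A minimal sketch (plain JavaScript, runnable outside k6) that builds them; the specific durations and multipliers are illustrative defaults, not prescriptions:

```javascript
// Build k6-style `stages` arrays for the common test types.
// Durations and targets here are illustrative; tune them to your traffic model.
function buildStages(type, target) {
  switch (type) {
    case 'smoke':
      return [{ duration: '1m', target: Math.min(target, 5) }];
    case 'load':
      return [
        { duration: '2m', target },      // ramp to expected traffic
        { duration: '15m', target },     // hold
        { duration: '2m', target: 0 },   // ramp down
      ];
    case 'stress':
      return [
        { duration: '2m', target },
        { duration: '5m', target: target * 2 }, // push beyond expected
        { duration: '5m', target: target * 4 }, // keep pushing toward failure
        { duration: '2m', target: 0 },
      ];
    case 'soak':
      return [
        { duration: '5m', target: Math.round(target * 0.75) }, // ~70-80% capacity
        { duration: '4h', target: Math.round(target * 0.75) },
        { duration: '5m', target: 0 },
      ];
    case 'spike':
      return [
        { duration: '10s', target: target * 5 }, // near-instant jump
        { duration: '3m', target: target * 5 },
        { duration: '10s', target: 0 },          // sharp drop
      ];
    default:
      throw new Error(`unknown test type: ${type}`);
  }
}
```

Generating the shapes from one helper keeps the test-type intent (ramp, hold, drop) explicit instead of hand-copied between scripts.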
Key metrics
| Metric | What it measures | Typical target |
|---|---|---|
| RPS / throughput | Requests per second the system handles | Depends on expected traffic |
| P50 / P95 / P99 latency | Response time distribution | P99 < 2x your SLO |
| Error rate | % of requests returning 4xx/5xx | < 0.1% under load |
| Time to first byte (TTFB) | Server processing latency | Proxy for backend work |
| Checks passed % | Business logic assertions in the test | 100% expected |
Always track percentiles (p95, p99), not averages. An average of 100ms with a p99 of 5000ms means 1 in 100 users waits 5 seconds - that is a bad service.
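The average-vs-percentile trap is easy to demonstrate with arithmetic. A small plain-JS sketch using nearest-rank percentiles and a made-up latency sample:

```javascript
// Nearest-rank percentile over a latency sample (ms).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  return sorted[Math.ceil((p / 100) * sorted.length) - 1];
}

// 98 fast requests plus 2 pathological ones: the average looks healthy,
// but the p99 exposes the tail that real users actually hit.
const latencies = [...Array(98).fill(50), 5000, 5000];
const mean = latencies.reduce((a, b) => a + b, 0) / latencies.length; // 149ms
const p99 = percentile(latencies, 99); // 5000ms
```

A 149ms average would pass most gut checks; a 5000ms p99 would not, and the p99 is what your worst-served users experience.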
Think time
Think time (or "sleep") is the pause between requests that a virtual user takes to simulate a real user reading a page or filling in a form. Without think time, virtual users fire requests as fast as possible, which does not reflect real traffic patterns and saturates the system unrealistically. Add 1-3 seconds of jittered sleep between steps - for example sleep(Math.random() * 2 + 1) in k6, or randomIntBetween(1, 3) from the k6-utils jslib.
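A hand-rolled jitter helper looks like this (plain JS sketch; in a k6 script you would pass the result to `sleep()` - k6's jslib utils also ship a similar `randomIntBetween`):

```javascript
// Jittered think-time helper. Equivalent to the sleep(Math.random() * 2 + 1)
// pattern used in the scripts below, generalized to any range.
function thinkTime(minSec, maxSec) {
  return minSec + Math.random() * (maxSec - minSec);
}

// Sample it many times to see the spread stays inside the requested range.
const pauses = Array.from({ length: 1000 }, () => thinkTime(1, 3));
const allInRange = pauses.every((t) => t >= 1 && t < 3);
```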
Virtual users vs RPS
Virtual users (VUs) model concurrent users - each VU executes the full scenario loop. RPS is a result of VU count, think time, and iteration duration.
Open vs closed workload models:
- Closed (VU-based): Fixed pool of VUs, each completes a request before starting the next. System naturally caps throughput. Best for session-based applications.
- Open (arrival rate): New requests arrive at a fixed rate regardless of system state. Queues build under saturation. Best for stateless APIs and microservices.
k6 supports both: vus/duration (or the ramping-vus executor) for closed, and the constant-arrival-rate/ramping-arrival-rate executors for open.
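The throughput ceiling of the closed model is simple arithmetic: each VU must finish one full iteration (response time plus think time) before starting the next, so max RPS = VUs / iteration duration. A sketch with illustrative numbers:

```javascript
// Throughput ceiling of a closed (VU-based) workload.
function closedModelMaxRps(vus, thinkTimeSec, avgResponseSec) {
  return vus / (thinkTimeSec + avgResponseSec);
}

// 50 VUs with 5s think time and 200ms responses cap out near 9.6 RPS,
// regardless of how much more the system under test could handle.
const ceiling = closedModelMaxRps(50, 5, 0.2);
```

This is why VU-based scripts with long think times can make a healthy system look underloaded - the test, not the system, is the bottleneck.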
Common tasks
Write a basic load test
// k6 basic load test - smoke then load
import http from 'k6/http';
import { sleep, check } from 'k6';

export const options = {
  stages: [
    { duration: '30s', target: 10 }, // ramp up
    { duration: '1m', target: 10 },  // hold
    { duration: '15s', target: 0 },  // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<300'], // 95% of requests under 300ms
    http_req_failed: ['rate<0.01'],   // less than 1% errors
  },
};

export default function () {
  const res = http.get('https://api.example.com/health');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}

Run with k6 run script.js. Add --out json=results.json to export raw data.
Implement ramping scenarios - stages
// k6 staged ramp - warm up, load, stress, cool down
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 20 },  // warm up to expected load
    { duration: '5m', target: 20 },  // hold at expected load
    { duration: '2m', target: 100 }, // ramp to stress level
    { duration: '5m', target: 100 }, // hold under stress
    { duration: '2m', target: 200 }, // push further
    { duration: '3m', target: 200 }, // hold to find saturation point
    { duration: '2m', target: 0 },   // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(99)<1000'],
    http_req_failed: ['rate<0.05'],
  },
};

export default function () {
  http.get('https://api.example.com/products');
  sleep(Math.random() * 2 + 1); // think time: 1-3s
}

Watch metrics during the stress phase. The point where p99 latency inflects upward or error rate climbs is your saturation point.
Test API endpoints with checks and thresholds
// k6 with structured checks and per-endpoint thresholds
import http from 'k6/http';
import { check, group, sleep } from 'k6';

export const options = {
  vus: 50,
  duration: '5m',
  thresholds: {
    'http_req_duration{endpoint:list}': ['p(95)<200'],
    'http_req_duration{endpoint:detail}': ['p(95)<400'],
    'http_req_failed': ['rate<0.01'],
    'checks': ['rate>0.99'],
  },
};

const BASE_URL = 'https://api.example.com';

export default function () {
  group('list products', () => {
    const res = http.get(`${BASE_URL}/products`, {
      tags: { endpoint: 'list' },
    });
    check(res, {
      'list: status 200': (r) => r.status === 200,
      'list: has items': (r) => JSON.parse(r.body).items.length > 0,
    });
  });
  sleep(1);
  group('product detail', () => {
    const res = http.get(`${BASE_URL}/products/42`, {
      tags: { endpoint: 'detail' },
    });
    check(res, {
      'detail: status 200': (r) => r.status === 200,
      'detail: has price': (r) => JSON.parse(r.body).price !== undefined,
    });
  });
  sleep(Math.random() * 2 + 1);
}

Tag requests by endpoint so thresholds and dashboards are segmented - an aggregate p95 across all endpoints hides slow outliers.
Simulate realistic user journeys
// k6 multi-step user journey with shared data
import http from 'k6/http';
import { check, sleep } from 'k6';
import { SharedArray } from 'k6/data';

// Load test data once, shared across VUs
const users = new SharedArray('users', () =>
  JSON.parse(open('./data/users.json'))
);

export const options = {
  stages: [
    { duration: '1m', target: 30 },
    { duration: '3m', target: 30 },
    { duration: '1m', target: 0 },
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const user = users[Math.floor(Math.random() * users.length)];

  // Step 1: Login
  const loginRes = http.post('https://api.example.com/auth/login', JSON.stringify({
    email: user.email,
    password: user.password,
  }), { headers: { 'Content-Type': 'application/json' } });
  check(loginRes, { 'login: status 200': (r) => r.status === 200 });
  const token = JSON.parse(loginRes.body).token;
  const authHeaders = { headers: { Authorization: `Bearer ${token}` } };
  sleep(1);

  // Step 2: Browse catalog
  const listRes = http.get('https://api.example.com/products', authHeaders);
  check(listRes, { 'browse: status 200': (r) => r.status === 200 });
  sleep(Math.random() * 3 + 1); // user reads the list

  // Step 3: Add to cart
  const cartRes = http.post('https://api.example.com/cart', JSON.stringify({
    product_id: 42, quantity: 1,
  }), { headers: { ...authHeaders.headers, 'Content-Type': 'application/json' } });
  check(cartRes, { 'cart: status 201': (r) => r.status === 201 });
  sleep(2);
}

Use SharedArray to avoid loading large data files per-VU. Model real think time between steps - a user takes seconds between actions, not milliseconds.
Stress test to find breaking point
// k6 stress test with open arrival rate model
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  scenarios: {
    stress: {
      executor: 'ramping-arrival-rate',
      startRate: 10, // 10 req/s at start
      timeUnit: '1s',
      preAllocatedVUs: 50,
      maxVUs: 500,
      stages: [
        { duration: '2m', target: 50 },  // ramp to 50 req/s
        { duration: '3m', target: 100 }, // ramp to 100 req/s
        { duration: '3m', target: 200 }, // ramp to 200 req/s - find saturation
        { duration: '2m', target: 50 },  // check recovery
      ],
    },
  },
  thresholds: {
    // Test continues even on failure - we want to observe breakdown
    http_req_duration: [{ threshold: 'p(95)<2000', abortOnFail: false }],
    http_req_failed: [{ threshold: 'rate<0.10', abortOnFail: false }],
  },
};

export default function () {
  const res = http.get('https://api.example.com/search?q=laptop');
  check(res, { 'status 200': (r) => r.status === 200 });
  sleep(0.5);
}

Use abortOnFail: false during stress tests - you want to observe the degradation curve, not abort at the first threshold breach. The breaking point is the RPS where error rate exceeds tolerance or latency becomes unusable.
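Locating that breaking point can be mechanized: collect per-stage observations of load, p99 latency, and error rate, then report the first level that violates your tolerances. A plain-JS sketch with made-up sample numbers:

```javascript
// Return the first load level where p99 latency or error rate breaches
// tolerance, or null if none does. Samples would come from per-stage k6
// summaries; the numbers below are illustrative.
function findSaturationPoint(samples, maxP99Ms, maxErrorRate) {
  return samples.find((s) => s.p99Ms > maxP99Ms || s.errorRate > maxErrorRate) ?? null;
}

const samples = [
  { rps: 50,  p99Ms: 180,  errorRate: 0.000 },
  { rps: 100, p99Ms: 320,  errorRate: 0.001 },
  { rps: 200, p99Ms: 2400, errorRate: 0.08 }, // latency inflects here
];

const breakingPoint = findSaturationPoint(samples, 2000, 0.05);
```

With a 2000ms p99 / 5% error tolerance, the sample above saturates at 200 req/s - the level to report as the system's practical ceiling.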
Set up k6 in CI/CD
# .github/workflows/load-test.yml
name: Load Test
on:
  push:
    branches: [main]
  schedule:
    - cron: '0 2 * * *' # nightly soak test
jobs:
  smoke-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install k6
        run: |
          sudo gpg -k
          sudo gpg --no-default-keyring \
            --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
            --keyserver hkp://keyserver.ubuntu.com:80 \
            --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
          echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] \
            https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
          sudo apt-get update && sudo apt-get install k6
      - name: Run smoke test
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
          K6_CLOUD_TOKEN: ${{ secrets.K6_CLOUD_TOKEN }}
        run: k6 run --env BASE_URL=$BASE_URL tests/smoke.js
      - name: Upload results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: k6-results
          path: results.json

Gate PRs on smoke tests (1-5 VUs, 2 min). Run full load tests on merge to main. Run soak tests nightly. Keep load tests in tests/load/ and treat them like production code - review them, version them, maintain them.
Analyze results and identify bottlenecks
After a k6 run, the summary output shows key metrics. Here is how to read it:
scenarios: (100.00%) 1 scenario, 50 max VUs, 6m30s max duration
default: 50 looping VUs for 6m0s (gracefulStop: 30s)
checks.........................: 99.34% 12841 out of 12921
data_received..................: 48 MB 130 kB/s
data_sent......................: 2.4 MB 6.6 kB/s
http_req_blocked...............: avg=1.2ms p(95)=2.1ms p(99)=250ms
http_req_duration..............: avg=142ms p(95)=389ms p(99)=1204ms
http_req_failed................: 0.52% 67 out of 12921
http_reqs......................: 12921  35.89/s

Read the results in this order:
1. Error rate - http_req_failed above 0.1% needs investigation first
2. P99 vs p95 gap - a large gap (e.g., p95=389ms, p99=1204ms) signals high tail latency, often from slow DB queries, GC pauses, or lock contention
3. http_req_blocked - high p99 here means connection pool exhaustion or DNS issues, not application latency
4. Checks passed % - below 100% means business logic failures under load
5. Throughput (req/s) - compare to your expected traffic to confirm headroom
Bottleneck identification checklist:
| Symptom | Likely cause | Next step |
|---|---|---|
| Error rate climbs at X VUs | Thread/connection saturation | Profile CPU and connection pool |
| P99 diverges from p95 at scale | GC pauses or lock contention | Heap profiling, slow query logs |
| http_req_blocked spikes | Connection pool exhausted | Increase pool size or reduce VUs |
| Latency grows linearly with VUs | No caching on hot path | Add caching, check indexes |
| Error rate recovers after ramp-down | Temporary saturation, no leak | System is resilient, note max VUs |
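The checklist can be driven from raw data rather than the terminal summary: k6's --out json flag emits one NDJSON record per measurement, so per-metric percentiles can be recomputed offline. A plain-Node sketch - the record shape below follows k6's documented JSON output, but verify it against your k6 version:

```javascript
// Compute percentiles for one metric from k6's --out json NDJSON output.
// Each measurement line looks roughly like:
//   {"type":"Point","metric":"http_req_duration","data":{"value":142.3,...}}
function metricPercentiles(ndjson, metric, ps) {
  const values = ndjson
    .split('\n')
    .filter(Boolean)
    .map((line) => JSON.parse(line))
    .filter((e) => e.type === 'Point' && e.metric === metric)
    .map((e) => e.data.value)
    .sort((a, b) => a - b);
  const pct = (p) => values[Math.ceil((p / 100) * values.length) - 1];
  return Object.fromEntries(ps.map((p) => [`p${p}`, pct(p)]));
}

// Illustrative two-line sample:
const sample = [
  '{"type":"Point","metric":"http_req_duration","data":{"value":120.5}}',
  '{"type":"Point","metric":"http_req_duration","data":{"value":480.2}}',
].join('\n');

const result = metricPercentiles(sample, 'http_req_duration', [95, 99]);
```

Recomputing percentiles this way also lets you segment by tag (endpoint, status) in ways the built-in summary does not.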
Anti-patterns
| Anti-pattern | Why it's wrong | What to do instead |
|---|---|---|
| Testing against production with no traffic shielding | Unexpected degradation hits real users | Test in a production-like staging environment or use a dark traffic approach |
| Using averages to judge performance | Average hides the worst 5-10% of requests that real users experience | Always track and gate on p95 and p99 |
| No think time between steps | Generates unrealistically high RPS; stresses network, not application logic | Add 1-3s of randomized sleep (e.g. sleep(Math.random() * 2 + 1)) between logical steps |
| Single hardcoded test data record | Hits the same cache key every time; measures cache, not system | Parameterize with a pool of realistic IDs and payloads |
| Treating load tests as one-off checks | Regressions silently reintroduce themselves after each deploy | Automate in CI with defined thresholds; fail the build on violations |
| Running load tests with no resource monitoring | Test results show latency but not why - you cannot fix what you cannot see | Correlate k6 results with CPU, memory, DB slow logs, and APM traces |
Gotchas
- k6's VU-based (closed) model produces misleadingly low RPS at high think times - If your scenario has 5 seconds of think time and you run 50 VUs, your max throughput is 50/5 = 10 RPS. This feels like the system is underloaded when it is actually VU-constrained. Use the ramping-arrival-rate executor to control RPS directly when benchmarking throughput capacity.
- http_req_blocked spikes are invisible in aggregate dashboards - Aggregate p95 latency can look healthy while p99 http_req_blocked (connection wait time) is 2-3 seconds, indicating connection exhaustion. Always check http_req_blocked and http_req_connecting separately from http_req_duration before declaring a test passing.
- Test data loaded with open() causes OOM on large datasets - Calling open('./data/users.json') in the init context runs once per VU, not once per run. Use SharedArray to load data once and share it across all VUs without duplicating memory.
- Threshold failures abort the test before you see the full breakdown curve - During stress tests, setting abortOnFail: true on latency thresholds stops the test the moment it crosses the boundary, preventing you from seeing how the system degrades at higher load. Use abortOnFail: false for stress and spike tests; reserve abort behavior for smoke tests in CI.
- Load testing authenticated endpoints requires token refresh logic - Tokens generated in setup() expire during long soak tests (2-24 hours). VUs that use an expired token receive 401s that inflate error rates without revealing the real cause. Implement token refresh in the VU loop or generate tokens with a lifetime longer than the test duration.
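The token-expiry gotcha can be guarded against in the VU loop by decoding the JWT exp claim and refreshing before it lapses. A plain-Node sketch (Node 15.7+ for base64url; in k6 you would use its encoding module instead, and the refresh call itself is whatever your auth endpoint provides):

```javascript
// Decode a standard three-part JWT's exp claim and decide whether to refresh.
function tokenExpiresWithin(jwt, seconds, nowMs = Date.now()) {
  const payload = JSON.parse(
    Buffer.from(jwt.split('.')[1], 'base64url').toString('utf8')
  );
  return payload.exp * 1000 - nowMs < seconds * 1000;
}

// Build a fake unsigned token expiring 30s from now to exercise the check.
const exp = Math.floor(Date.now() / 1000) + 30;
const fakeJwt = [
  Buffer.from(JSON.stringify({ alg: 'none' })).toString('base64url'),
  Buffer.from(JSON.stringify({ exp })).toString('base64url'),
  '',
].join('.');

// Less than 60s of validity left - time to call your (hypothetical) refresh endpoint.
const shouldRefresh = tokenExpiresWithin(fakeJwt, 60);
```

Checking expiry at the top of each iteration keeps 401s out of your error rate without forcing artificially long-lived test tokens.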
References
For detailed comparisons and implementation patterns, read the relevant file from
the references/ folder:
- references/tool-comparison.md - k6 vs Artillery vs JMeter vs Gatling: when to use each, scripting model, CI integration, and ecosystem
Only load a references file if the current task requires it - they will consume context.
References
tool-comparison.md
Load Testing Tool Comparison
Choosing the right tool shapes how you write tests, integrate with CI, and analyze results. This reference covers the four most common open-source load testing tools. The short answer: k6 for most teams, Artillery for Node.js shops, JMeter for legacy enterprise environments, Gatling for JVM teams needing code-first precision.
Summary table
| Dimension | k6 | Artillery | JMeter | Gatling |
|---|---|---|---|---|
| Language | JavaScript (ES6+) | YAML + JS hooks | GUI / XML | Scala / Java |
| Scripting model | Code-first | Config-first, code optional | GUI-first | Code-first |
| Runtime | Go binary | Node.js | JVM | JVM |
| Protocol support | HTTP, WebSocket, gRPC, Kafka | HTTP, WebSocket, gRPC, Socket.io | HTTP, FTP, JDBC, JMS, LDAP, SMTP | HTTP, WebSocket, gRPC, JMS |
| CI/CD fit | Excellent - single binary, exit code | Good - npm install | Moderate - heavy JVM, headless mode | Good - Maven/Gradle plugin |
| Memory footprint | Low - Go goroutines | Moderate - Node.js event loop | High - one JVM thread per VU | Moderate - Akka actors |
| Cloud execution | k6 Cloud (Grafana) | Artillery Cloud | BlazeMeter, OctoPerf | Gatling Enterprise |
| Reporting | Built-in summary + JSON + Grafana | HTML report + JSON | HTML report + plugins | HTML report built-in |
| License | AGPL-3.0 (OSS) / commercial cloud | MPL-2.0 (OSS) / commercial cloud | Apache 2.0 | Apache 2.0 (OSS) / commercial |
| Learning curve | Low-Medium | Low | High (GUI complexity) | High (Scala required) |
k6
Best for: Teams that want code-first tests, low resource usage, and tight CI integration. The default recommendation for most modern engineering teams.
Strengths
- Single static binary, no runtime dependencies to install in CI
- Tests are JavaScript - familiar to most web engineers
- Built-in thresholds with non-zero exit codes for CI gating
- Extremely low memory per VU (Go goroutines vs JVM threads)
- Native Grafana dashboard integration via k6 output
- Active ecosystem: k6 browser extension, xk6 plugins for Kafka, Redis, SQL
Weaknesses
- JavaScript runtime, not full Node.js - no require('fs'), no npm packages by default
- No built-in HTML report (use k6 Web Dashboard or export to Grafana)
- Distributed execution requires k6 Cloud or manual orchestration with k6 operator
- Stateful protocol testing (e.g., JDBC, LDAP) not natively supported
When to choose k6
- New greenfield project with no legacy tooling
- CI/CD-first workflow (GitHub Actions, GitLab CI, CircleCI)
- Team writes JavaScript or TypeScript
- Microservices with HTTP or gRPC APIs
- You want to version-control tests alongside application code
Quick start
brew install k6 # macOS
choco install k6 # Windows
sudo apt-get install k6 # Debian/Ubuntu
k6 run script.js
k6 run --vus 50 --duration 2m script.js
k6 run --out json=results.json script.js

Artillery
Best for: Node.js teams, YAML-centric workflows, or scenarios requiring rich plugin hooks in JavaScript.
Strengths
- YAML-first: non-engineers can read and contribute to test scenarios
- Built on Node.js - full npm ecosystem available in custom processors
- Strong WebSocket and Socket.io support out of the box
- Artillery Probe for synthetic monitoring reuses the same test format
- HTML reports generated by default with artillery report
Weaknesses
- Node.js runtime: higher memory per VU than k6; struggles at very high concurrency
- YAML can become unwieldy for complex logic - pushed into JS processors
- CI binary is larger (npm install) compared to k6 single binary
- Open-source version has fewer cloud execution options
When to choose Artillery
- Team is heavily Node.js and wants tests in the same ecosystem
- Stakeholders need to read/review test scenarios without learning code
- Testing real-time apps with complex WebSocket or Socket.io flows
- You want YAML-driven scenario composition with JS escape hatches
Example scenario
# artillery-test.yml
config:
  target: "https://api.example.com"
  phases:
    - duration: 60
      arrivalRate: 10
      name: Warm up
    - duration: 120
      arrivalRate: 50
      name: Sustained load
  defaults:
    headers:
      Content-Type: application/json
scenarios:
  - name: Browse products
    flow:
      - get:
          url: /products
          expect:
            - statusCode: 200
      - think: 2
      - get:
          url: /products/{{ productId }}
          expect:
            - statusCode: 200

npm install -g artillery
artillery run artillery-test.yml
artillery report artillery-report.json --output report.html

JMeter
Best for: Enterprise environments with existing JMeter infrastructure, teams testing non-HTTP protocols (JDBC, LDAP, JMS, FTP), or scenarios that require the GUI for test recording.
Strengths
- Broadest protocol support of any tool: HTTP, FTP, JDBC, LDAP, SMTP, JMS, SOAP
- GUI test recorder captures browser traffic with no scripting
- Rich plugin ecosystem (JMeter Plugins Manager)
- Mature tooling around reports, BlazeMeter cloud execution, and enterprise support
- Widely known - easier to hire engineers with JMeter experience in some markets
Weaknesses
- XML-based test plans: poor diff/review experience in version control
- GUI is the primary authoring tool - code-first workflows are cumbersome
- Heavy JVM footprint: each virtual user is a thread; scales poorly past ~1000 VUs per machine (use distributed mode with multiple injectors for high VU counts)
- Steep learning curve for complex scenarios
- Default reports are dated; requires plugins or external dashboards for useful output
When to choose JMeter
- You must test JDBC, LDAP, FTP, or JMS protocols
- Existing enterprise standard mandates JMeter
- Large legacy test suite already exists in JMeter format
- Team needs GUI recording to capture complex flows without scripting
Running headless in CI
# Install JMeter
wget https://archive.apache.org/dist/jmeter/binaries/apache-jmeter-5.6.3.tgz
tar -xzf apache-jmeter-5.6.3.tgz
# Run headless (no GUI)
./apache-jmeter-5.6.3/bin/jmeter \
  -n \
  -t test-plan.jmx \
  -l results.jtl \
  -e \
  -o report-dir/

For CI at scale, use the JMeter Maven Plugin or Taurus (a YAML wrapper over JMeter that improves the CI experience significantly).
Gatling
Best for: JVM teams (Java, Scala, Kotlin) who want highly expressive, code-first load tests with precise scenario modeling and accurate high-concurrency simulation.
Strengths
- Akka-based actor model: very efficient at high concurrency, better than JMeter threads
- DSL is expressive and type-safe (Scala) or Java 17+ friendly
- Best-in-class HTML reports with detailed percentile charts and request breakdowns
- First-class Maven and Gradle integration
- Gatling Enterprise adds distributed execution and CI reporting without extra tooling
Weaknesses
- Scala required for the full DSL - significant learning curve for non-JVM engineers
- Java SDK is available but less expressive than the Scala version
- Slower iteration cycle: JVM compile-test loop vs k6's interpreted JS
- Smaller community than k6 or JMeter
When to choose Gatling
- Team is Java/Scala/Kotlin and already on the JVM
- You need the best HTML reports and percentile breakdown out of the box
- Testing complex stateful scenarios where type safety matters
- Integration with Maven/Gradle build lifecycle is a requirement
Example simulation (Java SDK)
// GatlingSimulation.java
import io.gatling.javaapi.core.*;
import io.gatling.javaapi.http.*;
import static io.gatling.javaapi.core.CoreDsl.*;
import static io.gatling.javaapi.http.HttpDsl.*;

public class GatlingSimulation extends Simulation {
  HttpProtocolBuilder httpProtocol = http
      .baseUrl("https://api.example.com")
      .acceptHeader("application/json");

  ScenarioBuilder scn = scenario("Browse products")
      .exec(http("list products").get("/products").check(status().is(200)))
      .pause(1, 3)
      .exec(http("product detail").get("/products/42").check(status().is(200)));

  {
    setUp(
        scn.injectOpen(
            rampUsers(50).during(60),
            constantUsersPerSec(50).during(300)
        )
    ).protocols(httpProtocol)
     .assertions(
         global().responseTime().percentile(95).lt(500),
         global().failedRequests().percent().lt(1.0)
     );
  }
}

Decision guide
Does the team use Java or Scala?
  YES -> Gatling (best JVM experience and reports)
  NO  -> Do you need non-HTTP protocols (JDBC, LDAP, SMTP)?
    YES -> JMeter
    NO  -> Is the team Node.js-first or do stakeholders need YAML scenarios?
      YES -> Artillery
      NO  -> k6 (recommended default)

Feature comparison by use case
| Use case | Recommended tool | Notes |
|---|---|---|
| Greenfield CI/CD integration | k6 | Single binary, exit codes, fast loop |
| Non-HTTP protocol (JDBC, LDAP) | JMeter | Only mature option for these protocols |
| Complex WebSocket / Socket.io | Artillery | Best Socket.io support |
| JVM monolith or Java team | Gatling | Type-safe DSL, Maven plugin |
| YAML-first for non-engineers | Artillery | Readable scenario files |
| Very high VU count (1000+) single node | k6 or Gatling | Go goroutines / Akka actors |
| Legacy enterprise, existing test suite | JMeter | Migration cost is high |
| Best HTML reports out of the box | Gatling | Most detailed built-in reporting |
| Kafka / gRPC / Redis extensions | k6 + xk6 | Extension ecosystem |