load-testing
Use this skill when load testing services, benchmarking API performance, planning capacity, or identifying bottlenecks under stress. Triggers on k6, Artillery, JMeter, load testing, stress testing, soak testing, spike testing, performance benchmarks, throughput testing, and any task requiring load or performance testing.
load-testing is a production-ready AI agent skill for claude-code, gemini-cli, openai-codex. Load testing services, benchmarking API performance, planning capacity, or identifying bottlenecks under stress.
Quick Facts
| Field | Value |
|---|---|
| Category | engineering |
| Version | 0.1.0 |
| Platforms | claude-code, gemini-cli, openai-codex |
| License | MIT |
How to Install
- Make sure you have Node.js installed on your machine.
- Run the following command in your terminal:
npx skills add AbsolutelySkilled/AbsolutelySkilled --skill load-testing
- The load-testing skill is now available in your AI coding agent (Claude Code, Gemini CLI, OpenAI Codex, etc.).
Overview
A practitioner's guide to load testing production services. This skill covers test design, k6 implementation, CI integration, results analysis, and capacity planning with an emphasis on when each test type is appropriate and what to measure. Designed for engineers who need to validate performance before and after launches.
Tags
load-testing k6 performance benchmarking stress-testing capacity
Platforms
- claude-code
- gemini-cli
- openai-codex
Frequently Asked Questions
What is load-testing?
Use this skill when load testing services, benchmarking API performance, planning capacity, or identifying bottlenecks under stress. Triggers on k6, Artillery, JMeter, load testing, stress testing, soak testing, spike testing, performance benchmarks, throughput testing, and any task requiring load or performance testing.
How do I install load-testing?
Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill load-testing in your terminal. The skill will be immediately available in your AI coding agent.
What AI agents support load-testing?
This skill works with claude-code, gemini-cli, openai-codex. Install it once and use it across any supported AI coding agent.
Maintainers
Generated from AbsolutelySkilled
SKILL.md
Load Testing
A practitioner's guide to load testing production services. This skill covers test design, k6 implementation, CI integration, results analysis, and capacity planning with an emphasis on when each test type is appropriate and what to measure. Designed for engineers who need to validate performance before and after launches.
When to use this skill
Trigger this skill when the user:
- Writes a k6, Artillery, JMeter, or Gatling test script
- Plans a load, stress, soak, or spike test campaign
- Benchmarks API throughput or latency
- Defines performance SLOs or pass/fail thresholds
- Integrates load tests into CI/CD pipelines
- Analyzes load test results to find bottlenecks
- Plans capacity for an upcoming traffic event (launch, sale, campaign)
Do NOT trigger this skill for:
- Unit or integration tests that don't involve concurrent load (use a testing skill)
- Frontend performance (Lighthouse, Core Web Vitals - use a frontend performance skill)
Key principles
Test in production-like environments - A load test against a single-instance staging box with seeded data tells you nothing about your production fleet. Match CPU/memory ratios, replica counts, and dataset sizes. Synthetic data that doesn't reflect production cardinality produces misleading results.
Define pass/fail criteria before testing - Decide what "passing" means before you run the first request. "P95 latency < 300ms, error rate < 0.1%, RPS >= 500" is a pass/fail criterion. "It felt fast" is not. Set thresholds in code so tests fail automatically in CI.
Ramp up gradually - Never go from 0 to peak load instantly. A sudden spike obscures whether failure was caused by the ramp itself or sustained load. Use stages: warm up, ramp to target, hold steady, ramp down. A gradual ramp mirrors real traffic and gives infrastructure time to autoscale.
Test with realistic data and scenarios - A test that hits a single cached endpoint with the same user ID is not a load test; it is a cache benchmark. Use parameterized data (real user IDs, varied payloads), model the full user journey, and include think time between requests to simulate realistic concurrency.
Automate load tests in CI - Load tests only provide value if they run consistently. Gate every deployment with a smoke-level load test. Run full stress and soak tests on a schedule (nightly or pre-release). Fail the build on threshold violations. Trends over time catch regressions earlier than one-off runs.
Core concepts
Test types
| Type | Goal | Duration | VU shape |
|---|---|---|---|
| Smoke | Verify the test script works; baseline sanity | 1-2 min | 1-5 VUs, constant |
| Load | Validate behavior at expected production traffic | 15-30 min | Ramp to target, hold |
| Stress | Find the breaking point; measure degradation curve | 30-60 min | Ramp beyond expected until failure |
| Soak | Detect memory leaks, connection pool exhaustion, drift | 2-24 hours | Hold at 70-80% capacity |
| Spike | Simulate sudden traffic surge (marketing event, viral post) | 10-20 min | Instant jump to 5-10x, then drop |
Choose the test type based on what question you're trying to answer - not habit. Most teams only run load tests and miss soak and spike scenarios where real incidents happen.
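The VU shapes in the table map directly onto k6 `stages` arrays. A minimal sketch (plain JavaScript, runnable outside k6) that builds them; the specific durations and multipliers are illustrative defaults, not prescriptions:

```javascript
// Build k6-style `stages` arrays for the common test types.
// Durations and targets here are illustrative; tune them to your traffic model.
function buildStages(type, target) {
  switch (type) {
    case 'smoke':
      return [{ duration: '1m', target: Math.min(target, 5) }];
    case 'load':
      return [
        { duration: '2m', target },      // ramp to expected traffic
        { duration: '15m', target },     // hold
        { duration: '2m', target: 0 },   // ramp down
      ];
    case 'stress':
      return [
        { duration: '2m', target },
        { duration: '5m', target: target * 2 }, // push beyond expected
        { duration: '5m', target: target * 4 }, // keep pushing toward failure
        { duration: '2m', target: 0 },
      ];
    case 'soak':
      return [
        { duration: '5m', target: Math.round(target * 0.75) }, // ~70-80% capacity
        { duration: '4h', target: Math.round(target * 0.75) },
        { duration: '5m', target: 0 },
      ];
    case 'spike':
      return [
        { duration: '10s', target: target * 5 }, // near-instant jump
        { duration: '3m', target: target * 5 },
        { duration: '10s', target: 0 },          // sharp drop
      ];
    default:
      throw new Error(`unknown test type: ${type}`);
  }
}
```

Generating the shapes from one helper keeps the test-type intent (ramp, hold, drop) explicit instead of hand-copied between scripts.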
Key metrics
| Metric | What it measures | Typical target |
|---|---|---|
| RPS / throughput | Requests per second the system handles | Depends on expected traffic |
| P50 / P95 / P99 latency | Response time distribution | P99 < 2x your SLO |
| Error rate | % of requests returning 4xx/5xx | < 0.1% under load |
| Time to first byte (TTFB) | Server processing latency | Proxy for backend work |
| Checks passed % | Business logic assertions in the test | 100% expected |
Always track percentiles (p95, p99), not averages. An average of 100ms with a p99 of 5000ms means 1 in 100 users waits 5 seconds - that is a bad service.
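The average-vs-percentile trap is easy to demonstrate with arithmetic. A small plain-JS sketch using nearest-rank percentiles and a made-up latency sample:

```javascript
// Nearest-rank percentile over a latency sample (ms).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  return sorted[Math.ceil((p / 100) * sorted.length) - 1];
}

// 98 fast requests plus 2 pathological ones: the average looks healthy,
// but the p99 exposes the tail that real users actually hit.
const latencies = [...Array(98).fill(50), 5000, 5000];
const mean = latencies.reduce((a, b) => a + b, 0) / latencies.length; // 149ms
const p99 = percentile(latencies, 99); // 5000ms
```

A 149ms average would pass most gut checks; a 5000ms p99 would not, and the p99 is what your worst-served users experience.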
Think time
Think time (or "sleep") is the pause between requests that a virtual user takes to simulate a real user reading a page or filling in a form. Without think time, virtual users fire requests as fast as possible, which does not reflect real traffic patterns and saturates the system unrealistically. Add 1-3 seconds of jittered sleep between steps - for example sleep(Math.random() * 2 + 1) in k6, or randomIntBetween(1, 3) from the k6-utils jslib.
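A hand-rolled jitter helper looks like this (plain JS sketch; in a k6 script you would pass the result to `sleep()` - k6's jslib utils also ship a similar `randomIntBetween`):

```javascript
// Jittered think-time helper. Equivalent to the sleep(Math.random() * 2 + 1)
// pattern used in the scripts below, generalized to any range.
function thinkTime(minSec, maxSec) {
  return minSec + Math.random() * (maxSec - minSec);
}

// Sample it many times to see the spread stays inside the requested range.
const pauses = Array.from({ length: 1000 }, () => thinkTime(1, 3));
const allInRange = pauses.every((t) => t >= 1 && t < 3);
```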
Virtual users vs RPS
Virtual users (VUs) model concurrent users - each VU executes the full scenario loop. RPS is a result of VU count, think time, and iteration duration.
Open vs closed workload models:
- Closed (VU-based): Fixed pool of VUs, each completes a request before starting the next. System naturally caps throughput. Best for session-based applications.
- Open (arrival rate): New requests arrive at a fixed rate regardless of system state. Queues build under saturation. Best for stateless APIs and microservices.
k6 supports both: vus/duration (or the ramping-vus executor) for closed, and the constant-arrival-rate/ramping-arrival-rate executors for open.
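The throughput ceiling of the closed model is simple arithmetic: each VU must finish one full iteration (response time plus think time) before starting the next, so max RPS = VUs / iteration duration. A sketch with illustrative numbers:

```javascript
// Throughput ceiling of a closed (VU-based) workload.
function closedModelMaxRps(vus, thinkTimeSec, avgResponseSec) {
  return vus / (thinkTimeSec + avgResponseSec);
}

// 50 VUs with 5s think time and 200ms responses cap out near 9.6 RPS,
// regardless of how much more the system under test could handle.
const ceiling = closedModelMaxRps(50, 5, 0.2);
```

This is why VU-based scripts with long think times can make a healthy system look underloaded - the test, not the system, is the bottleneck.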
Common tasks
Write a basic load test
// k6 basic load test - smoke then load
import http from 'k6/http';
import { sleep, check } from 'k6';

export const options = {
  stages: [
    { duration: '30s', target: 10 }, // ramp up
    { duration: '1m', target: 10 },  // hold
    { duration: '15s', target: 0 },  // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<300'], // 95% of requests under 300ms
    http_req_failed: ['rate<0.01'],   // less than 1% errors
  },
};

export default function () {
  const res = http.get('https://api.example.com/health');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}

Run with k6 run script.js. Add --out json=results.json to export raw data.
Implement ramping scenarios - stages
// k6 staged ramp - warm up, load, stress, cool down
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 20 },  // warm up to expected load
    { duration: '5m', target: 20 },  // hold at expected load
    { duration: '2m', target: 100 }, // ramp to stress level
    { duration: '5m', target: 100 }, // hold under stress
    { duration: '2m', target: 200 }, // push further
    { duration: '3m', target: 200 }, // hold to find saturation point
    { duration: '2m', target: 0 },   // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(99)<1000'],
    http_req_failed: ['rate<0.05'],
  },
};

export default function () {
  http.get('https://api.example.com/products');
  sleep(Math.random() * 2 + 1); // think time: 1-3s
}

Watch metrics during the stress phase. The point where p99 latency inflects upward or error rate climbs is your saturation point.
Test API endpoints with checks and thresholds
// k6 with structured checks and per-endpoint thresholds
import http from 'k6/http';
import { check, group, sleep } from 'k6';

export const options = {
  vus: 50,
  duration: '5m',
  thresholds: {
    'http_req_duration{endpoint:list}': ['p(95)<200'],
    'http_req_duration{endpoint:detail}': ['p(95)<400'],
    'http_req_failed': ['rate<0.01'],
    'checks': ['rate>0.99'],
  },
};

const BASE_URL = 'https://api.example.com';

export default function () {
  group('list products', () => {
    const res = http.get(`${BASE_URL}/products`, {
      tags: { endpoint: 'list' },
    });
    check(res, {
      'list: status 200': (r) => r.status === 200,
      'list: has items': (r) => JSON.parse(r.body).items.length > 0,
    });
  });
  sleep(1);
  group('product detail', () => {
    const res = http.get(`${BASE_URL}/products/42`, {
      tags: { endpoint: 'detail' },
    });
    check(res, {
      'detail: status 200': (r) => r.status === 200,
      'detail: has price': (r) => JSON.parse(r.body).price !== undefined,
    });
  });
  sleep(Math.random() * 2 + 1);
}

Tag requests by endpoint so thresholds and dashboards are segmented - an aggregate p95 across all endpoints hides slow outliers.
Simulate realistic user journeys
// k6 multi-step user journey with shared data
import http from 'k6/http';
import { check, sleep } from 'k6';
import { SharedArray } from 'k6/data';

// Load test data once, shared across VUs
const users = new SharedArray('users', () =>
  JSON.parse(open('./data/users.json'))
);

export const options = {
  stages: [
    { duration: '1m', target: 30 },
    { duration: '3m', target: 30 },
    { duration: '1m', target: 0 },
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const user = users[Math.floor(Math.random() * users.length)];

  // Step 1: Login
  const loginRes = http.post('https://api.example.com/auth/login', JSON.stringify({
    email: user.email,
    password: user.password,
  }), { headers: { 'Content-Type': 'application/json' } });
  check(loginRes, { 'login: status 200': (r) => r.status === 200 });
  const token = JSON.parse(loginRes.body).token;
  const authHeaders = { headers: { Authorization: `Bearer ${token}` } };
  sleep(1);

  // Step 2: Browse catalog
  const listRes = http.get('https://api.example.com/products', authHeaders);
  check(listRes, { 'browse: status 200': (r) => r.status === 200 });
  sleep(Math.random() * 3 + 1); // user reads the list

  // Step 3: Add to cart
  const cartRes = http.post('https://api.example.com/cart', JSON.stringify({
    product_id: 42, quantity: 1,
  }), { headers: { ...authHeaders.headers, 'Content-Type': 'application/json' } });
  check(cartRes, { 'cart: status 201': (r) => r.status === 201 });
  sleep(2);
}

Use SharedArray to avoid loading large data files per-VU. Model real think time between steps - a user takes seconds between actions, not milliseconds.
Stress test to find breaking point
// k6 stress test with open arrival rate model
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  scenarios: {
    stress: {
      executor: 'ramping-arrival-rate',
      startRate: 10, // 10 req/s at start
      timeUnit: '1s',
      preAllocatedVUs: 50,
      maxVUs: 500,
      stages: [
        { duration: '2m', target: 50 },  // ramp to 50 req/s
        { duration: '3m', target: 100 }, // ramp to 100 req/s
        { duration: '3m', target: 200 }, // ramp to 200 req/s - find saturation
        { duration: '2m', target: 50 },  // check recovery
      ],
    },
  },
  thresholds: {
    // Test continues even on failure - we want to observe breakdown
    http_req_duration: [{ threshold: 'p(95)<2000', abortOnFail: false }],
    http_req_failed: [{ threshold: 'rate<0.10', abortOnFail: false }],
  },
};

export default function () {
  const res = http.get('https://api.example.com/search?q=laptop');
  check(res, { 'status 200': (r) => r.status === 200 });
  sleep(0.5);
}

Use abortOnFail: false during stress tests - you want to observe the degradation curve, not abort at the first threshold breach. The breaking point is the RPS where error rate exceeds tolerance or latency becomes unusable.
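Locating that breaking point can be mechanized: collect per-stage observations of load, p99 latency, and error rate, then report the first level that violates your tolerances. A plain-JS sketch with made-up sample numbers:

```javascript
// Return the first load level where p99 latency or error rate breaches
// tolerance, or null if none does. Samples would come from per-stage k6
// summaries; the numbers below are illustrative.
function findSaturationPoint(samples, maxP99Ms, maxErrorRate) {
  return samples.find((s) => s.p99Ms > maxP99Ms || s.errorRate > maxErrorRate) ?? null;
}

const samples = [
  { rps: 50,  p99Ms: 180,  errorRate: 0.000 },
  { rps: 100, p99Ms: 320,  errorRate: 0.001 },
  { rps: 200, p99Ms: 2400, errorRate: 0.08 }, // latency inflects here
];

const breakingPoint = findSaturationPoint(samples, 2000, 0.05);
```

With a 2000ms p99 / 5% error tolerance, the sample above saturates at 200 req/s - the level to report as the system's practical ceiling.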
Set up k6 in CI/CD
# .github/workflows/load-test.yml
name: Load Test
on:
  push:
    branches: [main]
  schedule:
    - cron: '0 2 * * *' # nightly soak test
jobs:
  smoke-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install k6
        run: |
          sudo gpg -k
          sudo gpg --no-default-keyring \
            --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
            --keyserver hkp://keyserver.ubuntu.com:80 \
            --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
          echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] \
            https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
          sudo apt-get update && sudo apt-get install k6
      - name: Run smoke test
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
          K6_CLOUD_TOKEN: ${{ secrets.K6_CLOUD_TOKEN }}
        run: k6 run --env BASE_URL=$BASE_URL tests/smoke.js
      - name: Upload results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: k6-results
          path: results.json

Gate PRs on smoke tests (1-5 VUs, 2 min). Run full load tests on merge to main. Run soak tests nightly. Keep load tests in tests/load/ and treat them like production code - review them, version them, maintain them.
Analyze results and identify bottlenecks
After a k6 run, the summary output shows key metrics. Here is how to read it:
scenarios: (100.00%) 1 scenario, 50 max VUs, 6m30s max duration
default: 50 looping VUs for 6m0s (gracefulStop: 30s)
checks.........................: 99.34% 12841 out of 12921
data_received..................: 48 MB 130 kB/s
data_sent......................: 2.4 MB 6.6 kB/s
http_req_blocked...............: avg=1.2ms p(95)=2.1ms p(99)=250ms
http_req_duration..............: avg=142ms p(95)=389ms p(99)=1204ms
http_req_failed................: 0.52% 67 out of 12921
http_reqs......................: 12921  35.89/s

Read the results in this order:
1. Error rate - http_req_failed above 0.1% needs investigation first
2. P99 vs p95 gap - a large gap (e.g., p95=389ms, p99=1204ms) signals high tail latency, often from slow DB queries, GC pauses, or lock contention
3. http_req_blocked - high p99 here means connection pool exhaustion or DNS issues, not application latency
4. Checks passed % - below 100% means business logic failures under load
5. Throughput (req/s) - compare to your expected traffic to confirm headroom
Bottleneck identification checklist:
| Symptom | Likely cause | Next step |
|---|---|---|
| Error rate climbs at X VUs | Thread/connection saturation | Profile CPU and connection pool |
| P99 diverges from p95 at scale | GC pauses or lock contention | Heap profiling, slow query logs |
| http_req_blocked spikes | Connection pool exhausted | Increase pool size or reduce VUs |
| Latency grows linearly with VUs | No caching on hot path | Add caching, check indexes |
| Error rate recovers after ramp-down | Temporary saturation, no leak | System is resilient, note max VUs |
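The checklist can be driven from raw data rather than the terminal summary: k6's --out json flag emits one NDJSON record per measurement, so per-metric percentiles can be recomputed offline. A plain-Node sketch - the record shape below follows k6's documented JSON output, but verify it against your k6 version:

```javascript
// Compute percentiles for one metric from k6's --out json NDJSON output.
// Each measurement line looks roughly like:
//   {"type":"Point","metric":"http_req_duration","data":{"value":142.3,...}}
function metricPercentiles(ndjson, metric, ps) {
  const values = ndjson
    .split('\n')
    .filter(Boolean)
    .map((line) => JSON.parse(line))
    .filter((e) => e.type === 'Point' && e.metric === metric)
    .map((e) => e.data.value)
    .sort((a, b) => a - b);
  const pct = (p) => values[Math.ceil((p / 100) * values.length) - 1];
  return Object.fromEntries(ps.map((p) => [`p${p}`, pct(p)]));
}

// Illustrative two-line sample:
const sample = [
  '{"type":"Point","metric":"http_req_duration","data":{"value":120.5}}',
  '{"type":"Point","metric":"http_req_duration","data":{"value":480.2}}',
].join('\n');

const result = metricPercentiles(sample, 'http_req_duration', [95, 99]);
```

Recomputing percentiles this way also lets you segment by tag (endpoint, status) in ways the built-in summary does not.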
Anti-patterns
| Anti-pattern | Why it's wrong | What to do instead |
|---|---|---|
| Testing against production with no traffic shielding | Unexpected degradation hits real users | Test in a production-like staging environment or use a dark traffic approach |
| Using averages to judge performance | Average hides the worst 5-10% of requests that real users experience | Always track and gate on p95 and p99 |
| No think time between steps | Generates unrealistically high RPS; stresses network, not application logic | Add 1-3s of randomized sleep (e.g. sleep(Math.random() * 2 + 1)) between logical steps |
| Single hardcoded test data record | Hits the same cache key every time; measures cache, not system | Parameterize with a pool of realistic IDs and payloads |
| Treating load tests as one-off checks | Regressions silently reintroduce themselves after each deploy | Automate in CI with defined thresholds; fail the build on violations |
| Running load tests with no resource monitoring | Test results show latency but not why - you cannot fix what you cannot see | Correlate k6 results with CPU, memory, DB slow logs, and APM traces |
Gotchas
- k6's VU-based (closed) model produces misleadingly low RPS at high think times - If your scenario has 5 seconds of think time and you run 50 VUs, your max throughput is 50/5 = 10 RPS. This feels like the system is underloaded when it is actually VU-constrained. Use the ramping-arrival-rate executor to control RPS directly when benchmarking throughput capacity.
- http_req_blocked spikes are invisible in aggregate dashboards - Aggregate p95 latency can look healthy while p99 http_req_blocked (connection wait time) is 2-3 seconds, indicating connection exhaustion. Always check http_req_blocked and http_req_connecting separately from http_req_duration before declaring a test passing.
- Test data loaded with open() causes OOM on large datasets - Calling open('./data/users.json') in the init context runs once per VU, not once per run. Use SharedArray to load data once and share it across all VUs without duplicating memory.
- Threshold failures abort the test before you see the full breakdown curve - During stress tests, setting abortOnFail: true on latency thresholds stops the test the moment it crosses the boundary, preventing you from seeing how the system degrades at higher load. Use abortOnFail: false for stress and spike tests; reserve abort behavior for smoke tests in CI.
- Load testing authenticated endpoints requires token refresh logic - Tokens generated in setup() expire during long soak tests (2-24 hours). VUs that use an expired token receive 401s that inflate error rates without revealing the real cause. Implement token refresh in the VU loop or generate tokens with a lifetime longer than the test duration.
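The token-expiry gotcha can be guarded against in the VU loop by decoding the JWT exp claim and refreshing before it lapses. A plain-Node sketch (Node 15.7+ for base64url; in k6 you would use its encoding module instead, and the refresh call itself is whatever your auth endpoint provides):

```javascript
// Decode a standard three-part JWT's exp claim and decide whether to refresh.
function tokenExpiresWithin(jwt, seconds, nowMs = Date.now()) {
  const payload = JSON.parse(
    Buffer.from(jwt.split('.')[1], 'base64url').toString('utf8')
  );
  return payload.exp * 1000 - nowMs < seconds * 1000;
}

// Build a fake unsigned token expiring 30s from now to exercise the check.
const exp = Math.floor(Date.now() / 1000) + 30;
const fakeJwt = [
  Buffer.from(JSON.stringify({ alg: 'none' })).toString('base64url'),
  Buffer.from(JSON.stringify({ exp })).toString('base64url'),
  '',
].join('.');

// Less than 60s of validity left - time to call your (hypothetical) refresh endpoint.
const shouldRefresh = tokenExpiresWithin(fakeJwt, 60);
```

Checking expiry at the top of each iteration keeps 401s out of your error rate without forcing artificially long-lived test tokens.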
References
For detailed comparisons and implementation patterns, read the relevant file from
the references/ folder:
- references/tool-comparison.md - k6 vs Artillery vs JMeter vs Gatling: when to use each, scripting model, CI integration, and ecosystem
Only load a references file if the current task requires it - they will consume context.
References
tool-comparison.md
Load Testing Tool Comparison
Choosing the right tool shapes how you write tests, integrate with CI, and analyze results. This reference covers the four most common open-source load testing tools. The short answer: k6 for most teams, Artillery for Node.js shops, JMeter for legacy enterprise environments, Gatling for JVM teams needing code-first precision.
Summary table
| Dimension | k6 | Artillery | JMeter | Gatling |
|---|---|---|---|---|
| Language | JavaScript (ES6+) | YAML + JS hooks | GUI / XML | Scala / Java |
| Scripting model | Code-first | Config-first, code optional | GUI-first | Code-first |
| Runtime | Go binary | Node.js | JVM | JVM |
| Protocol support | HTTP, WebSocket, gRPC, Kafka | HTTP, WebSocket, gRPC, Socket.io | HTTP, FTP, JDBC, JMS, LDAP, SMTP | HTTP, WebSocket, gRPC, JMS |
| CI/CD fit | Excellent - single binary, exit code | Good - npm install | Moderate - heavy JVM, headless mode | Good - Maven/Gradle plugin |
| Memory footprint | Low - Go goroutines | Moderate - Node.js event loop | High - one JVM thread per VU | Moderate - Akka actors |
| Cloud execution | k6 Cloud (Grafana) | Artillery Cloud | BlazeMeter, OctoPerf | Gatling Enterprise |
| Reporting | Built-in summary + JSON + Grafana | HTML report + JSON | HTML report + plugins | HTML report built-in |
| License | AGPL-3.0 (OSS) / commercial cloud | MPL-2.0 (OSS) / commercial cloud | Apache 2.0 | Apache 2.0 (OSS) / commercial |
| Learning curve | Low-Medium | Low | High (GUI complexity) | High (Scala required) |
k6
Best for: Teams that want code-first tests, low resource usage, and tight CI integration. The default recommendation for most modern engineering teams.
Strengths
- Single static binary, no runtime dependencies to install in CI
- Tests are JavaScript - familiar to most web engineers
- Built-in thresholds with non-zero exit codes for CI gating
- Extremely low memory per VU (Go goroutines vs JVM threads)
- Native Grafana dashboard integration via k6 output
- Active ecosystem: k6 browser extension, xk6 plugins for Kafka, Redis, SQL
Weaknesses
- JavaScript runtime, not full Node.js - no require('fs'), no npm packages by default
- No built-in HTML report (use k6 Web Dashboard or export to Grafana)
- Distributed execution requires k6 Cloud or manual orchestration with k6 operator
- Stateful protocol testing (e.g., JDBC, LDAP) not natively supported
When to choose k6
- New greenfield project with no legacy tooling
- CI/CD-first workflow (GitHub Actions, GitLab CI, CircleCI)
- Team writes JavaScript or TypeScript
- Microservices with HTTP or gRPC APIs
- You want to version-control tests alongside application code
Quick start
brew install k6 # macOS
choco install k6 # Windows
sudo apt-get install k6 # Debian/Ubuntu
k6 run script.js
k6 run --vus 50 --duration 2m script.js
k6 run --out json=results.json script.js

Artillery
Best for: Node.js teams, YAML-centric workflows, or scenarios requiring rich plugin hooks in JavaScript.
Strengths
- YAML-first: non-engineers can read and contribute to test scenarios
- Built on Node.js - full npm ecosystem available in custom processors
- Strong WebSocket and Socket.io support out of the box
- Artillery Probe for synthetic monitoring reuses the same test format
- HTML reports generated by default with artillery report
Weaknesses
- Node.js runtime: higher memory per VU than k6; struggles at very high concurrency
- YAML can become unwieldy for complex logic - pushed into JS processors
- CI binary is larger (npm install) compared to k6 single binary
- Open-source version has fewer cloud execution options
When to choose Artillery
- Team is heavily Node.js and wants tests in the same ecosystem
- Stakeholders need to read/review test scenarios without learning code
- Testing real-time apps with complex WebSocket or Socket.io flows
- You want YAML-driven scenario composition with JS escape hatches
Example scenario
# artillery-test.yml
config:
  target: "https://api.example.com"
  phases:
    - duration: 60
      arrivalRate: 10
      name: Warm up
    - duration: 120
      arrivalRate: 50
      name: Sustained load
  defaults:
    headers:
      Content-Type: application/json
scenarios:
  - name: Browse products
    flow:
      - get:
          url: /products
          expect:
            - statusCode: 200
      - think: 2
      - get:
          url: /products/{{ productId }}
          expect:
            - statusCode: 200

npm install -g artillery
artillery run artillery-test.yml
artillery report artillery-report.json --output report.html

JMeter
Best for: Enterprise environments with existing JMeter infrastructure, teams testing non-HTTP protocols (JDBC, LDAP, JMS, FTP), or scenarios that require the GUI for test recording.
Strengths
- Broadest protocol support of any tool: HTTP, FTP, JDBC, LDAP, SMTP, JMS, SOAP
- GUI test recorder captures browser traffic with no scripting
- Rich plugin ecosystem (JMeter Plugins Manager)
- Mature tooling around reports, BlazeMeter cloud execution, and enterprise support
- Widely known - easier to hire engineers with JMeter experience in some markets
Weaknesses
- XML-based test plans: poor diff/review experience in version control
- GUI is the primary authoring tool - code-first workflows are cumbersome
- Heavy JVM footprint: each virtual user is a thread; scales poorly past ~1000 VUs per machine (use distributed mode with multiple injectors for high VU counts)
- Steep learning curve for complex scenarios
- Default reports are dated; requires plugins or external dashboards for useful output
When to choose JMeter
- You must test JDBC, LDAP, FTP, or JMS protocols
- Existing enterprise standard mandates JMeter
- Large legacy test suite already exists in JMeter format
- Team needs GUI recording to capture complex flows without scripting
Running headless in CI
# Install JMeter
wget https://archive.apache.org/dist/jmeter/binaries/apache-jmeter-5.6.3.tgz
tar -xzf apache-jmeter-5.6.3.tgz
# Run headless (no GUI)
./apache-jmeter-5.6.3/bin/jmeter \
  -n \
  -t test-plan.jmx \
  -l results.jtl \
  -e \
  -o report-dir/

For CI at scale, use the JMeter Maven Plugin or Taurus (a YAML wrapper over JMeter that improves the CI experience significantly).
Gatling
Best for: JVM teams (Java, Scala, Kotlin) who want highly expressive, code-first load tests with precise scenario modeling and accurate high-concurrency simulation.
Strengths
- Akka-based actor model: very efficient at high concurrency, better than JMeter threads
- DSL is expressive and type-safe (Scala) or Java 17+ friendly
- Best-in-class HTML reports with detailed percentile charts and request breakdowns
- First-class Maven and Gradle integration
- Gatling Enterprise adds distributed execution and CI reporting without extra tooling
Weaknesses
- Scala required for the full DSL - significant learning curve for non-JVM engineers
- Java SDK is available but less expressive than the Scala version
- Slower iteration cycle: JVM compile-test loop vs k6's interpreted JS
- Smaller community than k6 or JMeter
When to choose Gatling
- Team is Java/Scala/Kotlin and already on the JVM
- You need the best HTML reports and percentile breakdown out of the box
- Testing complex stateful scenarios where type safety matters
- Integration with Maven/Gradle build lifecycle is a requirement
Example simulation (Java SDK)
// GatlingSimulation.java
import io.gatling.javaapi.core.*;
import io.gatling.javaapi.http.*;
import static io.gatling.javaapi.core.CoreDsl.*;
import static io.gatling.javaapi.http.HttpDsl.*;

public class GatlingSimulation extends Simulation {
  HttpProtocolBuilder httpProtocol = http
      .baseUrl("https://api.example.com")
      .acceptHeader("application/json");

  ScenarioBuilder scn = scenario("Browse products")
      .exec(http("list products").get("/products").check(status().is(200)))
      .pause(1, 3)
      .exec(http("product detail").get("/products/42").check(status().is(200)));

  {
    setUp(
        scn.injectOpen(
            rampUsers(50).during(60),
            constantUsersPerSec(50).during(300)
        )
    ).protocols(httpProtocol)
     .assertions(
         global().responseTime().percentile(95).lt(500),
         global().failedRequests().percent().lt(1.0)
     );
  }
}

Decision guide
Does the team use Java or Scala?
  YES -> Gatling (best JVM experience and reports)
  NO  -> Do you need non-HTTP protocols (JDBC, LDAP, SMTP)?
    YES -> JMeter
    NO  -> Is the team Node.js-first or do stakeholders need YAML scenarios?
      YES -> Artillery
      NO  -> k6 (recommended default)

Feature comparison by use case
| Use case | Recommended tool | Notes |
|---|---|---|
| Greenfield CI/CD integration | k6 | Single binary, exit codes, fast loop |
| Non-HTTP protocol (JDBC, LDAP) | JMeter | Only mature option for these protocols |
| Complex WebSocket / Socket.io | Artillery | Best Socket.io support |
| JVM monolith or Java team | Gatling | Type-safe DSL, Maven plugin |
| YAML-first for non-engineers | Artillery | Readable scenario files |
| Very high VU count (1000+) single node | k6 or Gatling | Go goroutines / Akka actors |
| Legacy enterprise, existing test suite | JMeter | Migration cost is high |
| Best HTML reports out of the box | Gatling | Most detailed built-in reporting |
| Kafka / gRPC / Redis extensions | k6 + xk6 | Extension ecosystem |