docker-kubernetes
Use this skill when containerizing applications, writing Dockerfiles, deploying to Kubernetes, creating Helm charts, or configuring service mesh. Triggers on Docker, Kubernetes, k8s, containers, pods, deployments, services, ingress, Helm, Istio, container orchestration, and any task requiring container or cluster management.
What is docker-kubernetes?
docker-kubernetes is a production-ready AI agent skill for claude-code, gemini-cli, and openai-codex. It covers containerizing applications, writing Dockerfiles, deploying to Kubernetes, creating Helm charts, and configuring service meshes.
Quick Facts
| Field | Value |
|---|---|
| Category | infra |
| Version | 0.1.0 |
| Platforms | claude-code, gemini-cli, openai-codex |
| License | MIT |
How to Install
- Make sure you have Node.js installed on your machine.
- Run the following command in your terminal:
npx skills add AbsolutelySkilled/AbsolutelySkilled --skill docker-kubernetes
- The docker-kubernetes skill is now available in your AI coding agent (Claude Code, Gemini CLI, OpenAI Codex, etc.).
Overview
A practical guide to containerizing applications and running them reliably in Kubernetes. This skill covers the full lifecycle from writing a production-ready Dockerfile to deploying with Helm, configuring traffic with Ingress, and debugging cluster issues. The emphasis is on correctness and operability - containers that are small, secure, and observable; Kubernetes workloads that self-heal, scale, and fail gracefully. Designed for engineers who know the basics and need opinionated guidance on production patterns.
Tags
docker kubernetes containers helm orchestration devops
Platforms
- claude-code
- gemini-cli
- openai-codex
Frequently Asked Questions
What is docker-kubernetes?
Use this skill when containerizing applications, writing Dockerfiles, deploying to Kubernetes, creating Helm charts, or configuring service mesh. Triggers on Docker, Kubernetes, k8s, containers, pods, deployments, services, ingress, Helm, Istio, container orchestration, and any task requiring container or cluster management.
How do I install docker-kubernetes?
Run npx skills add AbsolutelySkilled/AbsolutelySkilled --skill docker-kubernetes in your terminal. The skill will be immediately available in your AI coding agent.
What AI agents support docker-kubernetes?
This skill works with claude-code, gemini-cli, openai-codex. Install it once and use it across any supported AI coding agent.
Maintainers
Generated from AbsolutelySkilled
SKILL.md
Docker & Kubernetes
A practical guide to containerizing applications and running them reliably in Kubernetes. This skill covers the full lifecycle from writing a production-ready Dockerfile to deploying with Helm, configuring traffic with Ingress, and debugging cluster issues. The emphasis is on correctness and operability - containers that are small, secure, and observable; Kubernetes workloads that self-heal, scale, and fail gracefully. Designed for engineers who know the basics and need opinionated guidance on production patterns.
When to use this skill
Trigger this skill when the user:
- Writes or reviews a Dockerfile (any language or runtime)
- Deploys or configures a Kubernetes workload (Deployment, StatefulSet, DaemonSet)
- Sets up Kubernetes networking (Services, Ingress, NetworkPolicy)
- Creates or maintains a Helm chart or values file
- Configures health probes, resource limits, or autoscaling (HPA/VPA)
- Debugs a failing pod (CrashLoopBackOff, OOMKilled, ImagePullBackOff)
- Configures a service mesh (Istio, Linkerd) or needs mTLS between services
Do NOT trigger this skill for:
- Cloud-provider infrastructure provisioning (use a Terraform/IaC skill instead)
- CI/CD pipeline authoring (use a CI/CD skill - container builds are a small part)
Key principles
One process per container - A container should do exactly one thing. Sidecar patterns (logging agents, proxies) are valid, but the main container must not run multiple application processes. This preserves independent restartability and clean signal handling.
Immutable infrastructure - Never patch a running container. Update the image tag and redeploy. Mutations to running pods are invisible to version control and create snowflakes. Pin image tags in production; never use latest.
Declarative configuration - All cluster state lives in YAML checked into git. kubectl apply is the only allowed mutation path. kubectl edit on a live cluster is a debugging tool, not a deployment method.
Minimal base images - Use alpine, distroless, or language-specific slim images. Fewer packages = smaller attack surface = faster pulls. Multi-stage builds eliminate build tooling from the final image.
Health checks always - Every Deployment must define liveness and readiness probes. Without them, Kubernetes cannot distinguish a booting pod from a hung one, and will route traffic to pods that cannot serve it.
Core concepts
Docker layers and caching
Each RUN, COPY, and ADD instruction creates a layer. Layers are cached by
content hash. Cache is invalidated at the first changed layer and all layers after
it. Ordering matters: put rarely-changing instructions (installing OS packages) before
frequently-changing ones (copying application source). Copy dependency manifests and
install before copying source code.
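The ordering rule can be sketched for a generic Python service (the file names and commands are illustrative, not from this skill):

```dockerfile
FROM python:3.12-slim
WORKDIR /app
# Rarely changes: this layer (and the install below) stays cached
# until requirements.txt itself changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Changes on every commit: cache is invalidated from here down only
COPY . .
CMD ["python", "app.py"]
```

Reversing the two COPY instructions would re-run the dependency install on every source change.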
Kubernetes object model
Pod -> smallest schedulable unit (one or more containers sharing network/storage)
|
Deployment -> manages ReplicaSets; handles rollouts and rollbacks
|
Service -> stable virtual IP and DNS name that routes to healthy pod IPs
|
Ingress -> HTTP/HTTPS routing rules from outside the cluster into Services

Namespaces provide soft isolation within a cluster. Use them to separate environments (staging, production) or teams. ResourceQuotas and NetworkPolicies scope to namespaces.
ConfigMaps and Secrets
- ConfigMap: non-sensitive configuration (feature flags, URLs, log levels). Mount as env vars or volume files.
- Secret: sensitive values (passwords, tokens, TLS certs). Stored base64-encoded in etcd (encrypt etcd at rest in production). Never bake secrets into images.
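Because Secret values are only base64-encoded rather than encrypted, the data field can be produced and inspected with plain shell tools - a quick sketch (the value is illustrative):

```shell
# Encode a value for a Secret's data field (-n avoids a trailing newline)
echo -n 's3cr3t' | base64        # czNjcjN0
# Decode what the API server stores
echo 'czNjcjN0' | base64 -d      # s3cr3t
```

This is also why base64 encoding alone is not a security boundary - anyone with read access to the Secret can decode it.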
Common tasks
Write a production Dockerfile (multi-stage, Node.js)
# ---- build stage ----
FROM node:20-alpine AS builder
WORKDIR /app
# Copy manifests first - cached until dependencies change
COPY package.json package-lock.json ./
RUN npm ci --ignore-scripts
COPY . .
RUN npm run build
# ---- runtime stage ----
FROM node:20-alpine AS runtime
ENV NODE_ENV=production
WORKDIR /app
# Non-root user for security
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package.json ./
USER appuser
EXPOSE 3000
# Use exec form to receive signals correctly
CMD ["node", "dist/server.js"]

Key decisions: alpine base, non-root user, npm ci (reproducible installs),
multi-stage to exclude dev dependencies, exec-form CMD for proper PID 1 signal
handling.
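A .dockerignore file keeps COPY . . from invalidating the cache (and bloating the image) with files the build doesn't need - a typical sketch for a Node.js project like the one above:

```
node_modules
dist
.git
.env
*.md
Dockerfile
```

Excluding .env also prevents accidentally baking local secrets into an image layer.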
Create a Kubernetes Deployment + Service
apiVersion: apps/v1
kind: Deployment
metadata:
name: api-server
namespace: production
labels:
app: api-server
spec:
replicas: 3
selector:
matchLabels:
app: api-server
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
maxSurge: 1
template:
metadata:
labels:
app: api-server
spec:
containers:
- name: api-server
image: registry.example.com/api-server:1.4.2 # pinned tag, never latest
ports:
- containerPort: 3000
envFrom:
- configMapRef:
name: api-config
- secretRef:
name: api-secrets
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "256Mi"
readinessProbe:
httpGet:
path: /healthz/ready
port: 3000
initialDelaySeconds: 5
periodSeconds: 10
livenessProbe:
httpGet:
path: /healthz/live
port: 3000
initialDelaySeconds: 15
periodSeconds: 20
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
app: api-server
---
apiVersion: v1
kind: Service
metadata:
name: api-server
namespace: production
spec:
selector:
app: api-server
ports:
- port: 80
targetPort: 3000
type: ClusterIP

Configure Ingress with TLS
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: api-ingress
namespace: production
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "10m"
cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
ingressClassName: nginx
tls:
- hosts:
- api.example.com
secretName: api-tls-cert # cert-manager populates this
rules:
- host: api.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: api-server
port:
number: 80

Write a Helm chart
Minimal chart structure and key files:
Chart.yaml
apiVersion: v2
name: api-server
description: API server Helm chart
type: application
version: 0.1.0 # chart version
appVersion: "1.4.2" # application image version

values.yaml
replicaCount: 3
image:
repository: registry.example.com/api-server
tag: "" # defaults to .Chart.AppVersion
pullPolicy: IfNotPresent
service:
type: ClusterIP
port: 80
ingress:
enabled: true
host: api.example.com
tlsSecretName: api-tls-cert
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 256Mi
autoscaling:
enabled: false
minReplicas: 2
maxReplicas: 10
targetCPUUtilizationPercentage: 70

templates/deployment.yaml (excerpt)
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
replicas: {{ .Values.replicaCount }}

Deploy with: helm upgrade --install api-server ./api-server -f values.prod.yaml -n production
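The ingress block in values.yaml would typically be consumed by a guarded template - a hypothetical templates/ingress.yaml sketch, not part of the chart shown above:

```yaml
# templates/ingress.yaml (hypothetical excerpt) - rendered only
# when ingress.enabled is true in values.yaml
{{- if .Values.ingress.enabled }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ .Release.Name }}
spec:
  tls:
    - hosts:
        - {{ .Values.ingress.host }}
      secretName: {{ .Values.ingress.tlsSecretName }}
  rules:
    - host: {{ .Values.ingress.host }}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: {{ .Release.Name }}
                port:
                  number: {{ .Values.service.port }}
{{- end }}
```

Render it locally with helm template to confirm the flag toggles the resource on and off.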
Set up health checks (liveness, readiness, startup probes)
startupProbe:
httpGet:
path: /healthz/startup
port: 3000
failureThreshold: 30 # allow up to 30 * 10s = 5 min for slow starts
periodSeconds: 10
readinessProbe:
httpGet:
path: /healthz/ready
port: 3000
initialDelaySeconds: 5
periodSeconds: 10
failureThreshold: 3 # remove from LB after 3 failures
livenessProbe:
httpGet:
path: /healthz/live
port: 3000
initialDelaySeconds: 15
periodSeconds: 20
failureThreshold: 3 # restart after 3 failures

Rules:
- startup probe - use for slow-starting containers; disables liveness/readiness until it passes
- readiness probe - gates traffic routing; use for dependency checks (DB connected?)
- liveness probe - gates pod restart; only check self (not downstream services)
- Never use the same endpoint for readiness and liveness if they have different semantics
Configure resource limits and HPA
resources:
requests:
cpu: "100m" # scheduler uses this for placement
memory: "128Mi"
limits:
cpu: "500m" # throttled at this ceiling
memory: "256Mi" # OOMKilled if exceeded
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: api-server-hpa
namespace: production
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: api-server
minReplicas: 2
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80

Rule of thumb: set requests based on measured p50 usage, limits at 3-5x requests for CPU (CPU is compressible), 1.5-2x for memory (memory is not compressible).
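Applying that rule to a service measured at roughly 150m CPU / 160Mi memory at p50 (illustrative numbers) would give something like:

```yaml
resources:
  requests:
    cpu: "150m"      # measured p50 - what the scheduler reserves
    memory: "160Mi"  # measured p50
  limits:
    cpu: "600m"      # 4x requests - CPU over the limit is throttled, not killed
    memory: "320Mi"  # 2x requests - exceeding this is an OOMKill
```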
Debug a CrashLoopBackOff pod
Follow this sequence in order:
# 1. Get pod status and events
kubectl get pod <pod-name> -n <namespace>
kubectl describe pod <pod-name> -n <namespace> # read Events section
# 2. Check current logs
kubectl logs <pod-name> -n <namespace>
# 3. Check previous container logs (the one that crashed)
kubectl logs <pod-name> -n <namespace> --previous
# 4. Check resource pressure on the node
kubectl top pod <pod-name> -n <namespace>
kubectl top node
# 5. If image issue, check image pull events in describe output
# 6. Run interactively with a debug shell
kubectl debug -it <pod-name> -n <namespace> --image=busybox --target=<container-name>

Common causes:
- Application crashes on startup - check logs with --previous
- Missing env var or secret - check the describe Events for missing volume mounts
- OOMKilled - increase memory limit or fix memory leak
- Liveness probe too aggressive - increase initialDelaySeconds
Error handling
| Error | Cause | Fix |
|---|---|---|
| CrashLoopBackOff | Container exits repeatedly; k8s backs off restart | Check logs --previous, fix application crash or missing config |
| ImagePullBackOff | kubelet cannot pull the image | Verify image name/tag, registry credentials (imagePullSecrets), network access |
| OOMKilled | Container exceeded memory limit | Increase memory limit or profile and fix memory leak |
| Pending (pod) | No node satisfies scheduling constraints | Check node resources (kubectl top node), taints/tolerations, node selectors |
| 0/N nodes available | Affinity/anti-affinity or resource pressure | Relax topologySpreadConstraints or add nodes |
| CreateContainerConfigError | Referenced Secret or ConfigMap does not exist | Create the missing resource or fix the reference name |
Gotchas
Shell-form CMD (CMD node server.js) doesn't receive signals - Shell form wraps the command in /bin/sh -c, making sh PID 1. When Kubernetes sends SIGTERM during pod shutdown, sh receives it but may not forward it to your process. This causes the pod to hang until the terminationGracePeriodSeconds timeout expires. Always use exec form: CMD ["node", "server.js"].
Liveness probe failure restarts the pod regardless of cause - If the liveness probe checks an endpoint that depends on a downstream service (database, external API), a downstream outage will restart all your pods in a cascade. Liveness probes should only check the process itself, not external dependencies. Use readiness probes for dependency checks.
kubectl apply on a running Deployment with a latest image tag doesn't trigger a rollout - If the image tag hasn't changed, Kubernetes considers the spec unchanged and doesn't pull a new image. Always use a unique tag per build (git SHA or build number). imagePullPolicy: Always is a workaround but masks the root problem.
ConfigMap and Secret updates don't automatically reload running pods - Changing a ConfigMap or Secret that is mounted as an env var has no effect until pods are restarted. Either trigger a rolling restart (kubectl rollout restart deployment/name) or use a file-mounted volume (which does receive live updates, with propagation delay).
Resource limits without requests can cause scheduling failures - Kubernetes uses requests for pod placement decisions. If you set only limits with no requests, the scheduler defaults requests to equal limits. This can cause nodes to appear full when they have spare capacity, leading to Pending pods.
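For Helm users, the ConfigMap-reload gotcha is commonly worked around with a checksum annotation on the pod template, so that a changed ConfigMap changes the template and forces a rollout - a sketch (the template path is an assumption about your chart layout):

```yaml
# templates/deployment.yaml (excerpt)
spec:
  template:
    metadata:
      annotations:
        # Hash of the rendered ConfigMap; any config change alters the
        # pod template, which triggers a rolling update
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
```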
References
For quick kubectl command reference during live debugging, load:
references/kubectl-cheatsheet.md - essential kubectl commands by resource type
Load the cheatsheet when actively running kubectl commands or diagnosing cluster state. It is a quick-reference card, not a tutorial - skip it for conceptual questions.
References
kubectl-cheatsheet.md
kubectl Cheatsheet
Quick reference for essential kubectl commands organized by resource type.
All commands accept -n <namespace> to target a specific namespace and
-A / --all-namespaces to search across all namespaces.
Context and cluster
kubectl config get-contexts # list all contexts
kubectl config use-context <context-name> # switch active context
kubectl config current-context # show active context
kubectl cluster-info # cluster endpoint and DNS
kubectl get nodes # list nodes and status
kubectl get nodes -o wide # include IP, OS, kernel version
kubectl top node # CPU/memory usage per node

Pods
# List
kubectl get pods -n <ns> # all pods in namespace
kubectl get pods -n <ns> -o wide # include node and IP
kubectl get pods -A --field-selector=status.phase=Pending
# Inspect
kubectl describe pod <pod> -n <ns> # full spec + events (read Events section first)
kubectl get pod <pod> -n <ns> -o yaml # full YAML manifest
# Logs
kubectl logs <pod> -n <ns> # current container logs
kubectl logs <pod> -n <ns> --previous # logs from last crashed container
kubectl logs <pod> -n <ns> -c <container> # specific container in multi-container pod
kubectl logs <pod> -n <ns> -f # follow/stream logs
kubectl logs <pod> -n <ns> --tail=100 # last 100 lines
# Execute
kubectl exec -it <pod> -n <ns> -- /bin/sh # interactive shell
kubectl exec <pod> -n <ns> -- env # dump environment variables
# Debug
kubectl debug -it <pod> -n <ns> --image=busybox --target=<container> # ephemeral debug container
kubectl debug node/<node-name> -it --image=ubuntu # debug at node level
# Delete / restart
kubectl delete pod <pod> -n <ns> # pod restarts via Deployment controller
kubectl rollout restart deployment/<name> -n <ns> # graceful rolling restart

Deployments
kubectl get deployments -n <ns>
kubectl describe deployment <name> -n <ns>
# Rollout
kubectl rollout status deployment/<name> -n <ns> # watch rollout progress
kubectl rollout history deployment/<name> -n <ns> # revision history
kubectl rollout undo deployment/<name> -n <ns> # roll back to previous revision
kubectl rollout undo deployment/<name> -n <ns> --to-revision=3
# Scale
kubectl scale deployment/<name> --replicas=5 -n <ns>
# Image update (prefer updating YAML in git; this is a quick override)
kubectl set image deployment/<name> <container>=<image>:<tag> -n <ns>

Services and Endpoints
kubectl get services -n <ns>
kubectl get endpoints <service-name> -n <ns> # verify pods are registered
kubectl describe service <name> -n <ns>
# Port-forward to test a service locally
kubectl port-forward svc/<service-name> 8080:80 -n <ns>
kubectl port-forward pod/<pod-name> 8080:3000 -n <ns>

Ingress
kubectl get ingress -n <ns>
kubectl describe ingress <name> -n <ns> # shows rules and TLS status
kubectl get ingress -n <ns> -o yaml # full manifest with annotations

ConfigMaps and Secrets
kubectl get configmaps -n <ns>
kubectl describe configmap <name> -n <ns>
kubectl get configmap <name> -n <ns> -o yaml
kubectl get secrets -n <ns>
kubectl describe secret <name> -n <ns> # shows keys but not values
kubectl get secret <name> -n <ns> -o jsonpath='{.data.<key>}' | base64 -d # decode a value
# Create from literal (prefer declarative YAML in production)
kubectl create configmap <name> --from-literal=KEY=VALUE -n <ns>
kubectl create secret generic <name> --from-literal=KEY=VALUE -n <ns>

Resource usage and events
kubectl top pods -n <ns> # CPU/memory per pod
kubectl top pods -n <ns> --sort-by=memory
kubectl get events -n <ns> --sort-by='.lastTimestamp' # recent events, newest last
kubectl get events -n <ns> --field-selector=reason=OOMKilling

Namespaces and RBAC
kubectl get namespaces
kubectl create namespace <name>
kubectl get rolebindings -n <ns>
kubectl get clusterrolebindings
kubectl auth can-i create deployments -n <ns> # check your permissions
kubectl auth can-i create deployments -n <ns> --as=<service-account>

Apply, diff, and dry-run
kubectl apply -f <file.yaml> # apply declarative manifest
kubectl apply -f <directory>/ # apply all YAML in a directory
kubectl apply -k <kustomize-dir>/ # apply Kustomize overlay
kubectl diff -f <file.yaml> # show diff vs live cluster state
kubectl apply -f <file.yaml> --dry-run=server # server-side dry run (validates fully)
kubectl apply -f <file.yaml> --dry-run=client # client-side dry run (basic validation)
kubectl delete -f <file.yaml> # delete resources defined in file

Helm
helm list -n <ns> # installed releases
helm status <release> -n <ns> # release status and notes
helm history <release> -n <ns> # revision history
helm upgrade --install <release> <chart> -f values.yaml -n <ns>
helm upgrade --install <release> <chart> -f values.yaml -n <ns> --dry-run
helm rollback <release> <revision> -n <ns>
helm uninstall <release> -n <ns>
helm template <release> <chart> -f values.yaml # render templates locally
helm lint <chart-dir>/ # lint chart for errors

Common diagnostic workflow for a broken pod
# 1. What state is it in?
kubectl get pod <pod> -n <ns>
# 2. Why? Read the Events section at the bottom
kubectl describe pod <pod> -n <ns>
# 3. What did the process print before dying?
kubectl logs <pod> -n <ns> --previous
# 4. Is it a resource issue?
kubectl top pod <pod> -n <ns>
# 5. Can I reproduce it interactively?
kubectl run debug-shell --rm -it --image=alpine -n <ns> -- /bin/sh