Kubernetes Fundamentals for DevOps
The Kubernetes mental model that makes everything else click: desired state and the control loop, the control plane and nodes, and the core objects - pods, deployments, services, config. Plus the kubectl commands and the debugging workflow you actually use.
Kubernetes looks enormous from the outside, and most tutorials make it worse by drowning you in YAML before the idea lands. But the whole system rests on one simple loop: you declare the state you want, and a set of controllers works continuously to make reality match. Once that clicks, pods, deployments, services, and the rest stop being trivia to memorize and become obvious consequences of that one idea. This guide builds that mental model, then gives you the core objects, the kubectl commands, and the debugging workflow that real operations actually need.
The one idea: desired state and the control loop
Forget the architecture diagram for a minute. The thing that makes Kubernetes Kubernetes is declarative desired state reconciled by a control loop.
You do not tell Kubernetes "start a container here, then start another there." You hand it a description: "I want 3 replicas of this image, exposed on port 8080." Kubernetes stores that desired state and then runs controllers that loop forever asking one question: does reality match what was asked for? If a pod dies and the count drops to 2, the controller notices the gap and starts a new one. If you change the desired count to 5, it starts two more. Nobody issued a "start pod" command - the loop closed the gap.
This is why Kubernetes is self-healing, and it is why the same mechanism handles deploys, scaling, and recovery. They are all the same operation: change the desired state (or let reality drift), and let the controllers reconcile. Every object you will learn is just a different shape of desired state with a controller watching it. Hold onto this; everything below is a special case of it.
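To make the loop concrete, here is a minimal sketch in Python. It is a toy in-memory model, not a Kubernetes API: the ToyCluster class, its fields, and the pod names are all made up for illustration. The point is only that healing and scaling fall out of the same reconcile step.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyCluster:
    """Hypothetical in-memory stand-in for a cluster; not a real Kubernetes API."""
    desired_replicas: int
    running: set[str] = field(default_factory=set)
    _ids: count = field(default_factory=count, repr=False)

    def reconcile(self) -> list[str]:
        """One pass of the loop: compare desired with actual and close the gap."""
        actions = []
        while len(self.running) < self.desired_replicas:
            name = f"pod-{next(self._ids)}"
            self.running.add(name)
            actions.append(f"start {name}")
        while len(self.running) > self.desired_replicas:
            name = sorted(self.running)[-1]
            self.running.remove(name)
            actions.append(f"stop {name}")
        return actions


cluster = ToyCluster(desired_replicas=3)
cluster.reconcile()                # starts pod-0, pod-1, pod-2
cluster.running.discard("pod-1")   # a pod dies: reality drifts to 2
healed = cluster.reconcile()       # the loop starts one replacement
cluster.desired_replicas = 5       # you change the desired state
scaled = cluster.reconcile()       # the loop starts two more
```

Notice that nothing in the usage lines ever says "start a pod". The only inputs are a change to reality and a change to desired state; reconcile does the rest.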
The architecture, in two halves
A Kubernetes cluster splits into the control plane (the brain that makes decisions) and the worker nodes (the machines that run your containers). You rarely touch most of these directly, but knowing what each does turns "the cluster is broken" from a mystery into a checklist.
Control plane
- kube-apiserver - the front door. Everything (you, kubectl, every controller, every node) talks to the cluster through the API server. It validates requests and is the only component that reads and writes etcd. If the API server is down, the cluster is effectively unmanageable (though running pods keep running).
- etcd - the database. A consistent key-value store holding the entire cluster state: every object, its desired spec, and its last-known status. etcd is the source of truth. Back it up; lose it and you lose the cluster.
- kube-scheduler - the placer. When a new pod has no node assigned, the scheduler picks one based on resource requests, affinity rules, taints, and what will actually fit. It decides where, not whether.
- kube-controller-manager - the loops. This runs the controllers that do the reconciling - the Deployment controller, the ReplicaSet controller, the node controller, and many more. This is where "desired state" becomes action.
- cloud-controller-manager - the cloud glue (on managed clusters). Talks to the cloud provider to create load balancers, attach volumes, and manage node lifecycle.
Worker nodes
Each node (a VM or physical machine) runs three things:
- kubelet - the node agent. It takes the pod specs assigned to its node and makes sure those containers are actually running and healthy, reporting status back to the API server. The kubelet is the hands.
- kube-proxy - the network plumber. Programs the node's networking (iptables/IPVS rules) so that traffic to a Service gets routed to the right pods, on any node.
- container runtime - the thing that actually runs containers (containerd is the common default; dockershim, the kubelet's built-in Docker Engine integration, was removed in 1.24). The kubelet talks to it through the Container Runtime Interface (CRI).
The flow of a single deploy ties it together: you kubectl apply a Deployment -> API server validates and writes it to etcd -> the Deployment controller sees it and creates a ReplicaSet -> the ReplicaSet controller creates Pod objects -> the scheduler assigns each pod to a node -> that node's kubelet tells containerd to pull the image and start the container -> kube-proxy updates Service routing rules so traffic can reach the new pods. Every arrow is a control loop closing a gap.
Pods: the smallest unit
A pod is the smallest thing Kubernetes schedules. The key surprise for newcomers: a pod is not a container - it is a wrapper around one or more containers that share a network namespace (same IP, same localhost) and can share storage volumes.
Most pods hold exactly one container. The multi-container pattern exists for tightly coupled helpers - a sidecar that ships logs, a proxy, or an init container that runs setup before the main container starts. The rule of thumb: containers go in the same pod only when they genuinely must live and die together and share local resources.
The second thing to internalize: pods are ephemeral and disposable. They get a unique IP, but when a pod dies it is gone - a replacement is a brand-new pod with a new IP, not a restart of the old one. You almost never create bare pods yourself. You declare a higher-level object (a Deployment) and let it manage pods for you, precisely because individual pods are cattle, not pets. This is also why you need Services: pod IPs are not stable, so you need a stable address in front of them.
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web
spec:
  containers:
    - name: web
      image: nginx:1.25
      ports:
        - containerPort: 80
You can run that, but in practice you would never deploy a raw pod - if the node fails, nothing brings it back. That job belongs to a controller.
Deployments: how you actually run workloads
A Deployment is the object you use for nearly every stateless service. It is desired state for "keep N identical pods running, from this image, and manage changes to them safely." Under the hood a Deployment manages a ReplicaSet (which keeps the replica count correct), and the ReplicaSet manages the pods. You interact with the Deployment; the layers below are mostly invisible.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myapp:1.4.0
          ports:
            - containerPort: 8080
The Deployment earns its keep at change time. When you update the image (myapp:1.4.0 -> myapp:1.5.0) and apply, the Deployment does a rolling update: it brings up new pods, waits for them to be ready, and tears down old ones gradually, so the service never fully goes down. If the new version is broken, you roll back to the previous ReplicaSet in one command. Scaling is just changing replicas. Self-healing comes free, because the ReplicaSet controller is always reconciling the count.
kubectl apply -f web.yaml # create or update from desired state
kubectl set image deploy/web web=myapp:1.5.0 # trigger a rolling update
kubectl rollout status deploy/web # watch the rollout progress
kubectl rollout undo deploy/web # roll back to the previous version
kubectl scale deploy/web --replicas=5 # scale up or down
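The rolling update described above can be sketched as a toy loop. This is a simplified model, not the Deployment controller's actual algorithm: it assumes a surge of exactly one pod at a time, and the rolling_update function and is_ready callback are hypothetical names for illustration.

```python
def rolling_update(pods, new_image, is_ready):
    """Return the pod list after each completed step of a surge-by-one rollout."""
    pods = list(pods)
    history = [list(pods)]
    while any(image != new_image for image in pods):
        if not is_ready(new_image):
            break                      # rollout stalls; the old pods keep serving
        pods.append(new_image)         # bring up one new pod and wait for ready
        old = next(image for image in pods if image != new_image)
        pods.remove(old)               # only then retire one old pod
        history.append(list(pods))
    return history


old_pods = ["myapp:1.4.0"] * 3
good = rolling_update(old_pods, "myapp:1.5.0", lambda image: True)
broken = rolling_update(old_pods, "myapp:1.5.0", lambda image: False)
```

In the good case every recorded step still has three pods, which is the "never fully goes down" property. In the broken case the history has a single entry: the new version never became ready, so no old pod was removed.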
For workloads that are not stateless web services, you will meet siblings of the Deployment: StatefulSet (stable network identity and storage per pod - databases, queues), DaemonSet (one pod on every node - log collectors, node agents), and Job/CronJob (run to completion, once or on a schedule). They all follow the same desired-state pattern; Deployment is just the one you reach for most.
Labels and selectors: the glue
Before Services make sense, you need the mechanism that connects objects: labels and selectors. Labels are arbitrary key-value tags on objects (app: web, tier: frontend, env: prod). Selectors are queries over those labels.
This is how Kubernetes wires loosely-coupled objects together without hard references. A Deployment owns the pods whose labels match its selector (through the ReplicaSet it manages). A Service sends traffic to the pods whose labels match its selector. Neither is tied to its pods by name or IP - both are connected by "which pods match this label query." It is the quiet core of how the whole system composes.
kubectl get pods -l app=web # all pods labelled app=web
kubectl get pods -l 'env in (prod,staging)' # set-based selector
kubectl label pod web tier=frontend # add a label
Get the labels and selectors right and the rest connects itself; get them subtly wrong (a Service selector that matches no pods) and you get the classic "the service returns nothing but every pod is healthy" bug.
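A minimal sketch of equality-based matching shows why that bug is so quiet. The matches and select helpers and the pod names below are hypothetical, and this covers only the simple key=value form, not set-based selectors.

```python
def matches(selector, labels):
    """Equality-based match: every selector pair must be present on the object."""
    return all(labels.get(key) == value for key, value in selector.items())


# Made-up pods and labels for illustration.
pods = {
    "web-a": {"app": "web", "tier": "frontend"},
    "web-b": {"app": "web", "tier": "frontend"},
    "db-a": {"app": "db"},
}


def select(selector):
    """Names of the pods whose labels satisfy the selector."""
    return sorted(name for name, labels in pods.items() if matches(selector, labels))


good = select({"app": "web"})   # both web pods
typo = select({"app": "wbe"})   # nothing matches: healthy pods, no backends
```

A mistyped selector is not an error anywhere in this model. It is just a query with an empty result, which is exactly how it looks in a real cluster.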
Services: a stable address for ephemeral pods
Pods come and go with new IPs every time, so you cannot point clients at a pod IP. A Service is a stable virtual IP and DNS name that load-balances across whatever pods currently match its selector. Pods churn underneath; the Service address never changes. This is the indirection that makes the disposable-pod model usable.
There are three Service types you need:
- ClusterIP (default) - a stable internal IP, reachable only inside the cluster. This is how your services talk to each other. Most Services are ClusterIP.
- NodePort - opens a fixed port on every node's IP, forwarding to the Service. Crude external access, mostly for dev or behind something else.
- LoadBalancer - asks the cloud provider for a real external load balancer pointing at the Service. This is how you expose something to the internet on a managed cluster.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web            # routes to pods with this label
  ports:
    - port: 80          # the Service port
      targetPort: 8080  # the container port
Inside the cluster, that Service is reachable by DNS at web (same namespace) or web.default.svc.cluster.local (fully qualified). The selector is everything: it is what makes the Service follow the pods. For HTTP traffic from outside, you usually do not give every service its own LoadBalancer - you put an Ingress (or the newer Gateway API) in front, a single entry point that routes by hostname and path to many backend Services. Think of the Service as L4 (IP/port) and Ingress as L7 (HTTP routing).
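The indirection can be sketched as a toy model: the name and port stay fixed while the backend list is recomputed from whatever pods currently match. ToyService is a hypothetical class whose fields mirror the manifest above, the pod IPs are made up, and the sketch assumes the default cluster.local cluster domain.

```python
from dataclasses import dataclass


@dataclass
class ToyService:
    """Hypothetical stand-in for a Service; not a real Kubernetes client type."""
    name: str
    namespace: str
    selector: dict
    port: int
    target_port: int

    def fqdn(self, cluster_domain="cluster.local"):
        """Fully qualified DNS name, assuming the default cluster domain."""
        return f"{self.name}.{self.namespace}.svc.{cluster_domain}"

    def backends(self, pods):
        """pods maps pod IP -> labels; return ip:targetPort for each matching pod."""
        return sorted(
            f"{ip}:{self.target_port}"
            for ip, labels in pods.items()
            if all(labels.get(k) == v for k, v in self.selector.items())
        )


web = ToyService("web", "default", {"app": "web"}, port=80, target_port=8080)

before = web.backends({"10.0.1.4": {"app": "web"}, "10.0.2.7": {"app": "web"}})
# One pod is replaced and comes back with a new IP.
after = web.backends({"10.0.1.4": {"app": "web"}, "10.0.3.9": {"app": "web"}})
```

Clients keep using web.fqdn() on port 80 throughout; only the backend list changed.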
Configuration: ConfigMaps and Secrets
You do not bake configuration into images - you inject it at runtime, so the same image runs in dev, staging, and prod. Kubernetes gives you two objects for this:
- ConfigMap - non-sensitive config (feature flags, URLs, tuning values), as env vars or mounted files.
- Secret - the same idea for sensitive data (passwords, tokens, keys). Be clear-eyed about one thing: a Secret is only base64-encoded by default, not encrypted. Real protection comes from enabling encryption at rest in etcd and tight RBAC, or an external secrets manager. Treat "it is in a Secret" as "it is separated and access-controlled," not "it is safe by magic."
spec:
  containers:
    - name: web
      image: myapp:1.5.0
      env:
        - name: LOG_LEVEL
          valueFrom:
            configMapKeyRef: { name: web-config, key: log_level }
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef: { name: web-secrets, key: db_password }
The principle is the twelve-factor one: config lives outside the artifact, injected per environment. The image is immutable; the ConfigMap and Secret change what it does.
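To see why base64 is not protection, here is a short sketch using only the Python standard library. The password value is made up; the point is that anyone who can read the stored string can reverse it without any key.

```python
import base64

# A made-up secret value, encoded the way a Secret's data field stores it.
plaintext = "s3cr3t-password"
stored = base64.b64encode(plaintext.encode("utf-8")).decode("ascii")

# Reversing it needs no key at all: base64 is an encoding, not encryption.
recovered = base64.b64decode(stored).decode("utf-8")
```

The stored form looks scrambled, but recovered equals plaintext. That is why read access to Secrets has to be restricted by RBAC rather than trusted to the encoding.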
Health checks: probes
Kubernetes can only keep your app healthy if it knows what "healthy" means, and that is what probes are for. You define them per container:
- livenessProbe - "is this still alive?" If it fails, the kubelet restarts the container. Use it to recover from deadlocks and hung states.
- readinessProbe - "is this ready for traffic?" If it fails, the pod is pulled out of Service load-balancing but not restarted. Use it for warmup, or to shed traffic while a dependency is down.
- startupProbe - "has it finished booting yet?" Protects slow-starting apps so the liveness probe does not kill them before they are up.
livenessProbe:
  httpGet: { path: /healthz, port: 8080 }
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet: { path: /ready, port: 8080 }
The readiness/liveness distinction is a classic interview question and a real source of outages. The common mistake is pointing a liveness probe at a deep health check that also fails when a downstream dependency is down - now Kubernetes restarts a perfectly fine pod in a loop because the database is slow. Keep liveness shallow (is the process itself wedged?) and put dependency checks in readiness.
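The difference between a shallow and a deep liveness check can be sketched as a small decision function. This is a toy model of the outcome, not kubelet code: probe_outcome and its parameters are hypothetical, and it ignores thresholds and timing.

```python
def probe_outcome(process_ok, database_ok, liveness_checks_database):
    """Model what happens to a container given its probe design."""
    # A deep liveness probe fails whenever the dependency is down.
    liveness_ok = process_ok and (database_ok or not liveness_checks_database)
    # Readiness is the right place for the dependency check.
    readiness_ok = process_ok and database_ok
    return {"restart": not liveness_ok, "in_rotation": readiness_ok}


# The database is down, but the process itself is fine.
deep = probe_outcome(True, False, liveness_checks_database=True)
shallow = probe_outcome(True, False, liveness_checks_database=False)
```

With the deep probe the healthy container gets restarted; with the shallow probe it is only taken out of rotation and comes back on its own once the database recovers.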
Resources: requests and limits
Every container should declare what it needs, because that is how the scheduler decides where pods fit and how the node protects itself under pressure.
- requests - the amount the scheduler reserves. A pod is only placed on a node with enough unreserved CPU and memory to cover its requests. This is the number that drives scheduling.
- limits - the hard ceiling. A container over its memory limit is killed (OOMKilled); over its CPU limit it is throttled, not killed (CPU is compressible, memory is not).
resources:
  requests: { cpu: "100m", memory: "128Mi" }
  limits: { cpu: "500m", memory: "256Mi" }
100m is 0.1 of a CPU core; 128Mi is 128 mebibytes. Set requests too high and you waste capacity (pods reserve more than they use and the cluster looks full while idle); set them too low and pods get packed too tightly and fight over resources. Getting requests roughly right is one of the highest-leverage things you can do for both cost and stability. And remember the asymmetry: blowing the memory limit kills the container (look for OOMKilled), blowing the CPU limit just slows it down.
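The unit arithmetic and the scheduler's fit check can be sketched in a few lines. The helper names are hypothetical and the parsers handle only a small subset of Kubernetes quantity syntax (the m suffix for CPU; Ki, Mi, and Gi for memory), so treat this as an illustration rather than a full parser.

```python
def cpu_millicores(quantity):
    """'100m' -> 100, '0.5' -> 500, '2' -> 2000 (subset of the real syntax)."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def memory_bytes(quantity):
    """'128Mi' -> 134217728; plain integers are taken as bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[:-2]) * factor
    return int(quantity)


def fits(free_cpu_m, free_memory, requests):
    """A pod fits only if its requests are covered by what is still unreserved."""
    return (
        cpu_millicores(requests["cpu"]) <= free_cpu_m
        and memory_bytes(requests["memory"]) <= free_memory
    )


requests = {"cpu": "100m", "memory": "128Mi"}
roomy = fits(250, memory_bytes("1Gi"), requests)    # enough of both
cramped = fits(250, memory_bytes("64Mi"), requests)  # CPU fits, memory does not
```

Note that fits looks only at requests and unreserved capacity, never at actual usage. That is why an over-requested cluster can look full while idle.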
Namespaces
Namespaces carve a cluster into virtual sub-clusters - logical scopes for names, resource quotas, and access control. Use them to separate teams or environments (team-a, staging, prod) within one physical cluster. Names only have to be unique within a namespace, RBAC rules and ResourceQuotas apply per namespace, and DNS includes the namespace (web.prod.svc.cluster.local). A handful of system namespaces exist out of the box, most importantly kube-system (the control-plane and add-on pods). Namespaces are an organizational and security boundary, not a hard network one - isolating traffic between them still needs NetworkPolicies.
kubectl: the working set
kubectl is how you talk to the API server, and a small set of commands covers almost everything. The pattern is consistent: kubectl <verb> <resource> <name>.
# See what is there
kubectl get pods # pods in the current namespace
kubectl get pods -A # across ALL namespaces
kubectl get pods -o wide # add node, IP, and more columns
kubectl get deploy,svc # multiple resource types at once
kubectl get pods -w # watch live as state changes
# Understand one object (the most useful debugging command)
kubectl describe pod web # events, status, why it is not scheduling
kubectl logs web # a container's stdout/stderr
kubectl logs web -f --tail=100 # follow, last 100 lines
kubectl logs web --previous # logs from the PREVIOUS crashed container
# Change things
kubectl apply -f manifest.yaml # declarative create/update (prefer this)
kubectl delete -f manifest.yaml # remove what the manifest defined
kubectl scale deploy/web --replicas=4
kubectl rollout restart deploy/web # restart all pods (e.g. to reload config)
# Get inside / reach a pod
kubectl exec -it web -- /bin/sh # a shell in the container
kubectl port-forward svc/web 8080:80 # tunnel a Service to your laptop
# Context (which cluster/namespace am I aiming at?)
kubectl config current-context
kubectl config set-context --current --namespace=prod
Two habits matter. First, prefer kubectl apply -f (declarative - you describe desired state in files you can version) over imperative kubectl create/edit for anything real; it is the whole point of Kubernetes and it keeps your cluster reproducible from git. Second, kubectl describe and kubectl logs --previous are the two commands you will lean on hardest when something is broken - describe shows the events that explain why, and --previous recovers the logs of a container that already crashed.
The debugging workflow
Most "my pod is broken" situations resolve to a handful of states, and reading kubectl get pods plus kubectl describe tells you which. The method: look at the STATUS, then describe for the events, then logs for the application's own story.
kubectl get pods # what STATUS is it stuck in?
kubectl describe pod <name> # read the Events at the bottom - they explain it
kubectl logs <name> --previous # if it crashed, what did it say on the way out?
The common states and what they mean:
- Pending - the pod is not scheduled onto a node yet. Almost always insufficient resources (no node has enough unreserved CPU/memory for its requests) or an unsatisfiable constraint (taint, node selector, affinity). describe says it plainly: "0/3 nodes available: insufficient memory."
- ImagePullBackOff / ErrImagePull - the node cannot pull the image. A typo in the image name or tag, a private registry with no pull secret, or the tag does not exist. describe shows the exact pull error.
- CrashLoopBackOff - the container starts, exits or crashes, and Kubernetes keeps restarting it with growing back-off. The app itself is failing - a missing env var, a bad config, a failed dependency on boot, or a liveness probe killing it. This is where kubectl logs <name> --previous is essential: it shows what the dying container printed before it exited.
- Running but not Ready (0/1) - the container is up but its readiness probe is failing, so it gets no traffic. Check the readiness endpoint and what it depends on.
- Terminating (stuck) - usually a finalizer waiting on cleanup, or a pod that will not honor SIGTERM within its grace period.
That sequence - STATUS, then events, then logs - resolves the overwhelming majority of incidents. The mistake is jumping straight to deleting and recreating the pod; the controller just makes another one in the same broken state, because you changed reality, not the desired state or the actual bug.
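The triage order above can be written down as a small lookup. This is a sketch of the workflow, not a tool: the next_step function is hypothetical, and it only maps the STATUS values discussed here to the kubectl command to run next.

```python
# STATUS -> the command that usually explains it, per the workflow above.
NEXT_STEP = {
    "Pending": "kubectl describe pod {name}",
    "ImagePullBackOff": "kubectl describe pod {name}",
    "ErrImagePull": "kubectl describe pod {name}",
    "CrashLoopBackOff": "kubectl logs {name} --previous",
}


def next_step(status, name):
    """Pick the next command; fall back to describe, which shows the events."""
    template = NEXT_STEP.get(status, "kubectl describe pod {name}")
    return template.format(name=name)


crash = next_step("CrashLoopBackOff", "web")
pending = next_step("Pending", "web")
```

Nothing in the table says "delete the pod", which is the point: every entry is a read, because the fix belongs in the desired state or the application, not in the pod itself.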
The shape of it
Everything here is one idea wearing different hats: you declare desired state, and controllers reconcile reality to match. Pods are the disposable unit; you never manage them directly, you declare a Deployment and let it keep the right number running and roll out changes safely. Services give those churning pods a stable address, wired up by labels and selectors rather than names or IPs. ConfigMaps and Secrets inject configuration so one image runs everywhere. Probes tell Kubernetes what healthy means; requests and limits tell the scheduler what your pods need. And when something breaks, the read is always STATUS -> describe for events -> logs for the app's side of the story. Internalize the control loop and the rest stops being a pile of YAML and becomes a system you can reason about.