Linux Containers (Docker/Podman)
What Are Linux Containers?
Containers package an application with its dependencies into an isolated userspace environment that shares the host kernel. Unlike VMs, they boot no guest kernel, so they typically start in well under a second. Isolation comes from namespaces (PID, network, mount, among others), and resource control comes from cgroups (CPU and memory limits).
Linux container runtime stack:
| Component | Role |
|---|---|
| containerd / CRI-O | High-level runtime (used by Kubernetes) |
| runc | Low-level OCI runtime creating containers |
| Docker / Podman | CLI and daemon (or daemonless) for developers and ops |
# Verify kernel support
uname -r
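# /proc/config.gz is gzip-compressed and the symbols are upper-case, so use zgrep; fall back to /boot
zgrep -E 'CONFIG_(NAMESPACES|CGROUPS)=' /proc/config.gz 2>/dev/null || grep -E 'CONFIG_(NAMESPACES|CGROUPS)=' "/boot/config-$(uname -r)" 2>/dev/null || true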
ls /sys/fs/cgroup/
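The checks above can be wrapped in a small script. This is a minimal sketch, not a complete probe: the function names `cgroup_version` and `list_namespaces` are made up for illustration, and the detection assumes the usual layout where the unified (v2) hierarchy exposes `cgroup.controllers` at its root and the legacy (v1) hierarchy has one directory per controller.

```shell
#!/bin/sh
# Report which cgroup hierarchy is mounted under a given root.
# The root is a parameter only so the function can be pointed at a test directory.
cgroup_version() {
  root=${1:-/sys/fs/cgroup}
  if [ -f "$root/cgroup.controllers" ]; then
    echo v2          # unified hierarchy
  elif [ -d "$root/memory" ]; then
    echo v1          # legacy per-controller hierarchy
  else
    echo unknown
  fi
}

# List the namespaces a process belongs to (defaults to the current shell).
list_namespaces() {
  pid=${1:-$$}
  for ns in /proc/"$pid"/ns/*; do
    [ -L "$ns" ] || continue            # skip if /proc is unavailable
    printf '%s -> %s\n' "${ns##*/}" "$(readlink "$ns")"
  done
}

echo "cgroup hierarchy: $(cgroup_version)"
list_namespaces
```

Two processes that print the same target for a namespace (for example the same `pid:[...]` value) share that namespace; a containerized process normally shows different values from the host shell.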
Docker Essentials
Install Docker Engine on Ubuntu:
sudo apt update
sudo apt install -y docker.io # distro package
# Or official: https://docs.docker.com/engine/install/ubuntu/
sudo systemctl enable --now docker
sudo usermod -aG docker "$USER" # run without sudo (log out/in); docker group membership is root-equivalent
docker version
docker info
Core workflow:
# Pull and run
docker pull nginx:1.25-alpine
docker run -d --name web -p 8080:80 nginx:1.25-alpine
# Inspect
docker ps
docker ps -a
docker logs web
docker logs -f --tail 100 web
docker exec -it web sh
# Stop and remove
docker stop web
docker rm web
Images and Dockerfiles
# Dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
USER nobody
EXPOSE 8000
CMD ["gunicorn", "-b", "0.0.0.0:8000", "app:application"]
Build and tag:
docker build -t myapp:1.0.0 .
docker build --no-cache -t myapp:1.0.0 .
docker images
docker rmi myapp:1.0.0
docker image prune -a # remove unused images
Best practices: pin base image digests, use multi-stage builds, run as non-root, minimize layers.
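As an illustration of the multi-stage advice, the script below writes a two-stage variant of the Dockerfile shown earlier into a scratch directory. Treat it as a sketch under assumptions: the `OUT_DIR` variable, the file name `Dockerfile.multistage`, and the virtualenv location `/opt/venv` are illustrative choices, not part of the original example. For a pure-Python app the size saving is small; the pattern pays off once the build stage needs compilers or other tools the runtime does not.

```shell
#!/bin/sh
# Write a two-stage Dockerfile to a scratch directory (OUT_DIR is illustrative).
OUT_DIR=${OUT_DIR:-$(mktemp -d)}

cat > "$OUT_DIR/Dockerfile.multistage" <<'EOF'
# Stage 1: install dependencies into a virtualenv
FROM python:3.12-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN python -m venv /opt/venv \
 && /opt/venv/bin/pip install --no-cache-dir -r requirements.txt

# Stage 2: the runtime image carries only the virtualenv and the app code.
# To pin by digest, append @sha256:<digest> to the tag on both FROM lines.
FROM python:3.12-slim
COPY --from=build /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /app
COPY . .
USER nobody
EXPOSE 8000
CMD ["gunicorn", "-b", "0.0.0.0:8000", "app:application"]
EOF

echo "wrote $OUT_DIR/Dockerfile.multistage"
```

It could then be built from the application directory with `docker build -f "$OUT_DIR/Dockerfile.multistage" -t myapp:1.0.0 .`.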
Volumes and Persistence
# Named volume
docker volume create pgdata
docker run -d --name db -v pgdata:/var/lib/postgresql/data postgres:16
# Bind mount (best kept to dev: it ties the container to a host path)
docker run -d -v /opt/app/config:/app/config:ro myapp:1.0.0
docker volume ls
docker volume inspect pgdata
Production: use named volumes or external storage (NFS, EBS) — not host bind mounts for databases.
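Named volumes still need backups. One common pattern is to archive the volume from a throwaway container, sketched below. The function name `backup_volume` and the `alpine:3.20` helper image are assumptions for illustration; for a database, stop the container first or use the database's own dump tool, since a tar of live data files may be inconsistent.

```shell
#!/bin/sh
# backup_volume VOLUME DEST_DIR
# Archive a named volume into DEST_DIR/VOLUME.tar.gz using a throwaway container.
backup_volume() {
  vol=$1
  dest=$(cd "$2" && pwd) || return 1   # bind mounts need an absolute host path
  docker run --rm \
    -v "$vol":/data:ro \
    -v "$dest":/backup \
    alpine:3.20 \
    tar czf "/backup/$vol.tar.gz" -C /data .
}
```

For example, `backup_volume pgdata /var/backups` would leave `pgdata.tar.gz` in `/var/backups`. The volume is mounted read-only so the helper container cannot modify it.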
Networking
docker network ls
docker network create app-net
docker run -d --name api --network app-net myapp:1.0.0
docker run -d --name proxy --network app-net -p 443:443 nginx:1.25-alpine
# DNS: containers resolve each other by name on user-defined networks
docker exec api getent hosts proxy # slim images often ship without ping; getent only needs libc
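The default bridge (docker0) and user-defined bridges both NAT outbound traffic. Only user-defined bridges add name-based DNS between containers and isolate them from containers on other networks.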
Docker Compose
Multi-container applications:
# docker-compose.yml
services:
  web:
    image: myapp:1.0.0
    ports:
      - "8080:8000"
    environment:
      DATABASE_URL: postgres://db:5432/app
    depends_on:
      - db
    restart: unless-stopped
  db:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password
volumes:
  pgdata:
secrets:
  db_password:
    file: ./secrets/db_password.txt
docker compose up -d
docker compose ps
docker compose logs -f web
docker compose down
docker compose down -v # remove volumes too
Podman — Daemonless Alternative
Podman is rootless-friendly and Docker CLI-compatible:
sudo apt install podman podman-compose # or dnf install podman
podman run -d --name web -p 8080:80 docker.io/library/nginx:alpine
podman ps
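mkdir -p ~/.config/systemd/user
podman generate systemd --new --name web > ~/.config/systemd/user/web.service # deprecated in recent Podman releases in favor of Quadlet
loginctl enable-linger "$USER" # user services after logout
systemctl --user daemon-reload
systemctl --user enable --now web.service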
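Rootless containers map container UIDs onto the user's subordinate UID range, so files on bind mounts can end up owned by unexpected host UIDs. Watch for permission errors there.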
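The arithmetic behind that mapping is simple enough to sketch. The helper below assumes Podman's default rootless mapping (no `--userns` or `--uidmap` options): container UID 0 is the invoking user, and container UIDs from 1 upward map onto the subordinate range listed for that user in /etc/subuid. The function name `host_uid` and the sample values are hypothetical.

```shell
#!/bin/sh
# host_uid CONTAINER_UID USER_UID SUBUID_START
# Default rootless mapping: container UID 0 is the invoking user;
# container UIDs 1..N map onto the subordinate range starting at SUBUID_START.
host_uid() {
  cuid=$1
  user_uid=$2
  sub_start=$3
  if [ "$cuid" -eq 0 ]; then
    echo "$user_uid"
  else
    echo $((sub_start + cuid - 1))
  fi
}

# Hypothetical values: user UID 1000 with a subuid range starting at 100000.
host_uid 0 1000 100000     # container root
host_uid 999 1000 100000   # a service account with UID 999 inside the image
```

The real values for a given user can be read from `/etc/subuid` or with `podman unshare cat /proc/self/uid_map`. A file written by container UID 999 shows up on the host under the computed UID, which is why a plain `chown` to your own user does not fix bind-mount permissions.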
Resource Limits
docker run -d --name web --memory=512m --cpus=1.5 --pids-limit=100 myapp:1.0.0
# Inspect the configured memory limit (reported in bytes; 0 means unlimited)
docker inspect web --format '{{.HostConfig.Memory}}'
Without limits, one container can exhaust host memory and trigger the OOM killer on unrelated processes.
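Because `docker inspect` reports the memory limit in bytes while `--memory` takes a suffixed value, a small converter helps when checking that a limit was applied. This is a sketch that handles only whole numbers with an optional b, k, m, or g suffix, using the binary multiples Docker applies to these suffixes; the name `to_bytes` is made up for illustration.

```shell
#!/bin/sh
# to_bytes VALUE
# Convert a Docker-style memory value such as 512m or 1g to bytes.
to_bytes() {
  value=$1
  num=${value%[bkmgBKMG]}     # strip a single trailing unit letter, if any
  unit=${value#"$num"}        # whatever was stripped
  case "$unit" in
    ''|b|B) echo "$num" ;;
    k|K)    echo $((num * 1024)) ;;
    m|M)    echo $((num * 1024 * 1024)) ;;
    g|G)    echo $((num * 1024 * 1024 * 1024)) ;;
  esac
}

to_bytes 512m
```

The result for `512m` should match what `docker inspect --format '{{.HostConfig.Memory}}'` prints for a container started with `--memory=512m`.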
Registry and Security
docker login registry.example.com
docker tag myapp:1.0.0 registry.example.com/myapp:1.0.0
docker push registry.example.com/myapp:1.0.0
# Scan for CVEs
docker scout cves myapp:1.0.0 # Docker Scout
trivy image myapp:1.0.0 # Aqua Trivy
Never run :latest in production — pin versions and digests.
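A pipeline can enforce the pinning rule mechanically. The check below is a minimal sketch: the function name `is_pinned` is hypothetical, and it only inspects the reference string (it does not contact a registry). It treats a reference as pinned if it carries a digest or an explicit tag other than `latest`.

```shell
#!/bin/sh
# is_pinned IMAGE_REF
# Succeeds if the reference has a digest or an explicit non-latest tag.
is_pinned() {
  ref=$1
  case "$ref" in
    *@sha256:*) return 0 ;;    # digest-pinned
  esac
  # Drop the registry/path prefix so a registry port (host:5000/img) is not read as a tag.
  name=${ref##*/}
  case "$name" in
    *:latest) return 1 ;;
    *:*)      return 0 ;;
    *)        return 1 ;;      # no tag means :latest
  esac
}

for ref in nginx nginx:1.25-alpine registry.example.com/myapp:1.0.0; do
  if is_pinned "$ref"; then echo "ok:       $ref"; else echo "unpinned: $ref"; fi
done
```

A tag is still mutable, so this catches only the obvious cases; digest pinning is the stricter guarantee.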
Best Practices
| Practice | Reason |
|---|---|
| One process per container | Clean lifecycle, proper signal handling |
| Non-root USER in Dockerfile | Limits container breakout impact |
| Read-only root filesystem | Blocks writes to the image filesystem; use --read-only with a tmpfs for /tmp |
| Health checks | Let the runtime detect a hung app; use HEALTHCHECK or a Compose healthcheck |
| Log to stdout/stderr | Docker captures; ship with log driver |
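Several of these practices can be applied at `docker run` time. The wrapper below is a sketch under assumptions: the function name `run_hardened` is made up, and the health command presumes the app answers HTTP on port 8000 at `/` and that `python` is available in the image (true for the python:3.12-slim example, but not in general).

```shell
#!/bin/sh
# Health probe (assumption: the app serves HTTP on 127.0.0.1:8000 at "/").
HEALTH_CMD="python -c \"import urllib.request as u; u.urlopen('http://127.0.0.1:8000/', timeout=2)\""

# run_hardened NAME IMAGE
# Start a container with a read-only root filesystem, a writable tmpfs at /tmp,
# and a health check defined on the command line.
run_hardened() {
  name=$1
  image=$2
  docker run -d --name "$name" \
    --read-only --tmpfs /tmp \
    --health-cmd "$HEALTH_CMD" \
    --health-interval 10s --health-timeout 3s --health-retries 3 \
    "$image"
}
```

After `run_hardened web myapp:1.0.0`, the health state can be read with `docker inspect --format '{{.State.Health.Status}}' web`.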
Common Mistakes
| Mistake | Consequence |
|---|---|
| Storing data in container layer | Lost on container delete |
| --net=host unnecessarily | Breaks network isolation |
| Secrets in ENV or Dockerfile | Leaked in image layers and inspect |
| Ignoring SIGTERM | docker stop waits (10 s by default), then sends SIGKILL; state may be corrupted |
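A frequent cause of the SIGTERM problem is a shell wrapper that starts the application as a child, so the shell receives the signal and the application never sees it. One way to avoid that, sketched here, is to end the wrapper with `exec`. The path in `ENTRYPOINT_PATH` and the setup placeholder are illustrative, not from the original.

```shell
#!/bin/sh
# Write a wrapper entrypoint that replaces itself with the application,
# so the application (not the shell) receives SIGTERM from `docker stop`.
ENTRYPOINT_PATH=${ENTRYPOINT_PATH:-$(mktemp -d)/entrypoint.sh}

cat > "$ENTRYPOINT_PATH" <<'EOF'
#!/bin/sh
set -e
# One-time setup would go here (hypothetical).
exec "$@"   # no child process: the app takes over this PID and its signals
EOF
chmod +x "$ENTRYPOINT_PATH"

echo "wrote $ENTRYPOINT_PATH"
```

In a Dockerfile this would pair with an exec-form `ENTRYPOINT ["/entrypoint.sh"]` and the exec-form `CMD` shown earlier; a shell-form `CMD` reintroduces an intermediate shell.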
Troubleshooting
Container exits immediately:
docker logs container_name
docker run --rm -it myapp:1.0.0 /bin/sh # interactive debug
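When the logs are empty, the exit code often narrows things down. The helper below is a rough sketch: the name `explain_exit` is hypothetical, and the interpretations follow the usual conventions (125 to 127 from `docker run` and the shell, values above 128 meaning 128 plus a signal number).

```shell
#!/bin/sh
# explain_exit CODE
# Give a rough interpretation of a container exit code.
explain_exit() {
  code=$1
  case "$code" in
    0)   echo "exited normally" ;;
    125) echo "docker run itself failed (bad flag or daemon error)" ;;
    126) echo "command found but not executable" ;;
    127) echo "command not found in the image" ;;
    *)
      if [ "$code" -gt 128 ]; then
        echo "killed by signal $((code - 128))"
      else
        echo "application error (exit $code)"
      fi
      ;;
  esac
}

explain_exit 137
```

It can be fed from `docker inspect --format '{{.State.ExitCode}}' container_name`. For code 137 (signal 9, SIGKILL), `docker inspect --format '{{.State.OOMKilled}}' container_name` shows whether the kernel OOM killer was responsible.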
Permission denied on volume (rootless):
podman unshare chown -R 1000:1000 /path/to/volume
Port already in use:
ss -tlnp | grep :8080
docker ps --filter publish=8080
Disk full from images:
docker system df
docker system prune -a --volumes # destructive — review first
Production Scenario
A CI/CD pipeline builds and deploys containers:
- Build: multi-stage Dockerfile, Trivy scan fails on critical CVEs
- Push: tagged registry.example.com/myapp:${GIT_SHA} to private registry
- Deploy: Ansible pulls image on each node, updates systemd unit wrapping docker run
- Rolling update: drain node from LB, stop old container, start new, health check, rejoin
- Observability: Prometheus cAdvisor metrics, logs to Loki via Docker log driver
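The per-node part of the rolling update (stop old, start new, health check) can be sketched as a shell function. This is a simplification under stated assumptions: it calls `docker run` directly rather than through the systemd unit and Ansible described above, it leaves load-balancer draining to the caller, it omits ports and resource limits, and it expects the image to define a health check. The names `image_ref`, `deploy`, `HEALTH_RETRIES`, and `HEALTH_INTERVAL` are hypothetical.

```shell
#!/bin/sh
REGISTRY=${REGISTRY:-registry.example.com}
HEALTH_RETRIES=${HEALTH_RETRIES:-30}     # polling attempts before giving up
HEALTH_INTERVAL=${HEALTH_INTERVAL:-2}    # seconds between attempts

# image_ref NAME GIT_SHA -> registry/name:sha
image_ref() {
  printf '%s/%s:%s\n' "$REGISTRY" "$1" "$2"
}

# deploy NAME GIT_SHA
# Replace the running container NAME with the image built from GIT_SHA,
# then wait for Docker to report it healthy.
deploy() {
  name=$1
  image=$(image_ref "$name" "$2")

  docker pull "$image" || return 1
  docker stop "$name" >/dev/null 2>&1 || true   # old container may not exist
  docker rm "$name" >/dev/null 2>&1 || true
  docker run -d --name "$name" --restart unless-stopped "$image" || return 1

  tries=0
  while [ "$tries" -lt "$HEALTH_RETRIES" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)
    [ "$status" = healthy ] && return 0
    tries=$((tries + 1))
    sleep "$HEALTH_INTERVAL"
  done
  echo "$name did not become healthy" >&2
  return 1
}
```

A caller would drain the node, run `deploy myapp "$GIT_SHA"`, and return the node to the load balancer only if the function succeeds.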
Kubernetes eventually orchestrates the same containers — but Docker/Podman skills remain essential for local dev, debugging node issues, and understanding what K8s runs underneath.
Containers are Linux processes with extra isolation — master images, volumes, networking, and resource limits before scaling to orchestrators.