Containers package applications with their dependencies for consistent deployment across environments. AWS offers two orchestration platforms: ECS (AWS-native, simpler) and EKS (managed Kubernetes, portable). Both integrate with AWS networking, IAM, and load balancing for production workloads.

ECS vs EKS Decision Guide

| Criteria | ECS | EKS |
|---|---|---|
| Kubernetes required | No | Yes |
| Learning curve | Lower | Higher (K8s concepts) |
| Portability | AWS-only | Multi-cloud K8s |
| Control plane cost | Free | ~$0.10/hour per cluster |
| AWS integration | Native | Via controllers |
| Best for | AWS-centric teams | K8s expertise, multi-cloud |
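
To put the control plane cost row in perspective, the hourly rate can be turned into a rough monthly figure. The sketch below takes the ~$0.10/hour rate from the table and assumes an average month of 730 hours. The function name is illustrative, and worker nodes, Fargate tasks, and data transfer are billed separately.

```python
# Rough EKS control plane cost estimate; the rate comes from the table above.
HOURLY_RATE_USD = 0.10   # approximate, per cluster
HOURS_PER_MONTH = 730    # assumption: average month (8760 hours / 12)


def control_plane_monthly_cost(clusters: int) -> float:
    """Return the approximate monthly control plane cost in USD."""
    return round(clusters * HOURLY_RATE_USD * HOURS_PER_MONTH, 2)


print(control_plane_monthly_cost(1))  # a single cluster
print(control_plane_monthly_cost(3))  # e.g. separate dev, staging, and production clusters
```

Under these assumptions, one cluster costs roughly $73 per month before any compute is added, which matters most for teams that run many small clusters.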

Amazon ECS Fundamentals

  Cluster → Service → Task Definition → Container(s)
                         ↓
                   Task (running instance)
  
| Concept | Description |
|---|---|
| Cluster | Logical grouping of tasks/services |
| Task Definition | Blueprint (image, CPU, memory, env vars) |
| Task | Running instance of a task definition |
| Service | Maintains desired count of tasks, handles deployments |
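
The relationship between these concepts can be sketched as a toy model in Python. This is only an illustration of "a service maintains the desired count of tasks", not how the ECS scheduler is implemented. The class names mirror the table, while `reconcile` and the task naming scheme are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskDefinition:
    """Blueprint: which image to run and how much CPU/memory to reserve."""
    family: str
    image: str
    cpu: int      # CPU units
    memory: int   # MiB


@dataclass
class Service:
    """Keeps `desired_count` tasks running from one task definition."""
    name: str
    task_definition: TaskDefinition
    desired_count: int
    running: List[str] = field(default_factory=list)
    _launched: int = 0  # counter used only to give each task a unique name

    def reconcile(self) -> None:
        # Start tasks until the running count reaches the desired count.
        while len(self.running) < self.desired_count:
            self._launched += 1
            self.running.append(f"{self.task_definition.family}-task-{self._launched}")
        # Stop surplus tasks if the desired count was lowered.
        del self.running[self.desired_count:]


definition = TaskDefinition("myapp", "myapp:v1.2.0", cpu=256, memory=512)
service = Service("myapp-service", definition, desired_count=2)
service.reconcile()
print(service.running)   # two tasks started from the same definition

service.running.pop(0)   # simulate one task crashing
service.reconcile()
print(service.running)   # a replacement task brings the count back to two
```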

ECS with Fargate (Serverless Containers)

No EC2 instances to manage — AWS runs containers on your behalf:

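# Create cluster
aws ecs create-cluster --cluster-name production

# Create the log group used by the task definition (the awslogs driver does not create it by default)
aws logs create-log-group --log-group-name /ecs/myapp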

# Register task definition
aws ecs register-task-definition \
    --family myapp \
    --network-mode awsvpc \
    --requires-compatibilities FARGATE \
    --cpu 256 --memory 512 \
    --execution-role-arn arn:aws:iam::123456789012:role/ecsTaskExecutionRole \
    --task-role-arn arn:aws:iam::123456789012:role/ecsTaskRole \
    --container-definitions '[{
        "name": "myapp",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.2.0",
        "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": "/ecs/myapp",
                "awslogs-region": "us-east-1",
                "awslogs-stream-prefix": "ecs"
            }
        },
        "healthCheck": {
            "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
            "interval": 30,
            "timeout": 5,
            "retries": 3
        }
    }]'

# Create service behind ALB
aws ecs create-service \
    --cluster production \
    --service-name myapp-service \
    --task-definition myapp \
    --desired-count 2 \
    --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[subnet-private-1a,subnet-private-1b],securityGroups=[sg-app],assignPublicIp=DISABLED}" \
    --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/myapp/xxx,containerName=myapp,containerPort=8080"
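
The `--cpu 256 --memory 512` pair in the task definition is not arbitrary: Fargate accepts only certain CPU and memory combinations, and a task definition with an unsupported pair is rejected. A minimal Python sketch of that check follows. The table reflects the combinations AWS documents for sizes up to 4 vCPU at the time of writing (larger sizes exist and are omitted here), and the helper name is hypothetical, so verify against the current AWS documentation before relying on it.

```python
# Fargate task sizes: CPU units -> allowed memory values in MiB.
# Assumption: values as documented by AWS for up to 4 vCPU; larger sizes are omitted.
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu: int, memory: int) -> bool:
    """Return True if (cpu, memory) is a combination listed in the table above."""
    return memory in FARGATE_MEMORY_MIB.get(cpu, [])


print(is_valid_fargate_size(256, 512))    # the size used in the task definition above
print(is_valid_fargate_size(256, 4096))   # too much memory for 0.25 vCPU
```

Running a check like this in CI catches an invalid size before `register-task-definition` does.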
  

Amazon EKS Setup

  # Create EKS cluster (eksctl simplifies this)
eksctl create cluster \
    --name production \
    --region us-east-1 \
    --nodegroup-name standard-workers \
    --node-type m7i.large \
    --nodes 3 \
    --nodes-min 2 \
    --nodes-max 10 \
    --managed

# Configure kubectl
aws eks update-kubeconfig --name production --region us-east-1
kubectl get nodes
  

Deploy Application to EKS

  # deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      serviceAccountName: myapp-sa
      containers:
      - name: myapp
        image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.2.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
---
apiVersion: v1
kind: Service
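metadata:
  name: myapp-service
  namespace: production  # same namespace as the Deployment, otherwise the selector finds no pods
  annotations:
    # These annotations are handled by the AWS Load Balancer Controller, which must be installed in the cluster
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"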
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
  - port: 443
    targetPort: 8080
  
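# Create the namespace first; the manifests reference it.
# The Deployment also expects the myapp-sa service account (see IAM for Containers) to exist.
kubectl create namespace production
kubectl apply -f deployment.yaml
kubectl get pods -n production
kubectl logs -f deployment/myapp -n production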
  

ECR — Container Registry

  # Create repository
aws ecr create-repository --repository-name myapp --image-scanning-configuration scanOnPush=true

# Build and push
aws ecr get-login-password --region us-east-1 | \
    docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

docker build -t myapp .
docker tag myapp:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.2.0
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.2.0
  

Enable scan on push for vulnerability detection. Use lifecycle policies to expire untagged images.
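
As an illustration of the lifecycle policy advice, the following Python sketch generates a policy document that expires untagged images. The JSON keys follow the ECR lifecycle policy format. The 14-day threshold, the function name, and the output file name are assumptions chosen for the example.

```python
import json


def untagged_expiry_policy(days: int) -> str:
    """Build an ECR lifecycle policy that expires untagged images after `days` days."""
    policy = {
        "rules": [
            {
                "rulePriority": 1,
                "description": f"Expire untagged images older than {days} days",
                "selection": {
                    "tagStatus": "untagged",
                    "countType": "sinceImagePushed",
                    "countUnit": "days",
                    "countNumber": days,
                },
                "action": {"type": "expire"},
            }
        ]
    }
    return json.dumps(policy, indent=2)


# Write the policy to a file so the AWS CLI can read it.
with open("lifecycle-policy.json", "w") as handle:
    handle.write(untagged_expiry_policy(14))
```

The generated file can then be applied with `aws ecr put-lifecycle-policy --repository-name myapp --lifecycle-policy-text file://lifecycle-policy.json`.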

Auto Scaling

ECS Service Auto Scaling

  aws application-autoscaling register-scalable-target \
    --service-namespace ecs \
    --resource-id service/production/myapp-service \
    --scalable-dimension ecs:service:DesiredCount \
    --min-capacity 2 --max-capacity 20

aws application-autoscaling put-scaling-policy \
    --service-namespace ecs \
    --resource-id service/production/myapp-service \
    --scalable-dimension ecs:service:DesiredCount \
    --policy-name cpu-scaling \
    --policy-type TargetTrackingScaling \
    --target-tracking-scaling-policy-configuration '{
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60
    }'
  

EKS Cluster Autoscaler

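# Install cluster autoscaler (Helm)
# The autoscaler's service account also needs IAM permissions for the Auto Scaling API (for example via IRSA)
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
    --namespace kube-system \
    --set autoDiscovery.clusterName=production \
    --set awsRegion=us-east-1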
  

IAM for Containers

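| Role | Purpose |
|---|---|
| Task Execution Role | Pull ECR images, write CloudWatch Logs |
| Task Role | Application permissions (S3, DynamoDB, etc.) |
| EKS Pod Identity / IRSA | Kubernetes service account → IAM role mapping |

# EKS IRSA (IAM Roles for Service Accounts)
# IRSA requires an IAM OIDC provider for the cluster; associate one first if it is missing
eksctl utils associate-iam-oidc-provider --cluster production --approve
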
eksctl create iamserviceaccount \
    --cluster production \
    --namespace production \
    --name myapp-sa \
    --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
    --approve
  

Real-World Scenario: Microservices on ECS Fargate

| Service | CPU units / Memory (MiB) | Tasks | ALB Path |
|---|---|---|---|
| api-gateway | 512/1024 | 2 | /api/* |
| user-service | 256/512 | 2 | /api/users/* |
| order-service | 512/1024 | 3 | /api/orders/* |
| worker | 256/512 | 1-5 (SQS scaling) | Internal |

Each service has its own task definition, service, and IAM task role. Shared ALB routes by path prefix.
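
Because `/api/*` also matches `/api/users/...` and `/api/orders/...`, rule order matters: ALB listener rules are evaluated in priority order (lowest number first) and the first match wins, so the more specific paths need the lower priority numbers. The Python sketch below simulates that evaluation. The priority values are hypothetical, and `fnmatchcase` only approximates ALB path patterns (it also accepts `[...]` character sets, which ALB does not).

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple

Rule = Tuple[int, str, str]  # (priority, path pattern, target service)

# Hypothetical priorities: specific prefixes first, the catch-all /api/* last.
RULES: List[Rule] = [
    (10, "/api/users/*", "user-service"),
    (20, "/api/orders/*", "order-service"),
    (30, "/api/*", "api-gateway"),
]


def route(path: str, rules: List[Rule] = RULES) -> Optional[str]:
    """Return the target of the first matching rule, evaluated by ascending priority."""
    for _priority, pattern, target in sorted(rules):
        if fnmatchcase(path, pattern):
            return target
    return None  # a real listener would fall through to its default action


print(route("/api/users/42"))        # user-service
print(route("/api/orders/7/items"))  # order-service
print(route("/api/health"))          # api-gateway
```

If `/api/*` were given the lowest priority number instead, it would shadow the other two rules and every request would reach api-gateway.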

ECS vs EKS vs Lambda vs EC2

| Workload | Best Choice |
|---|---|
| Long-running HTTP API | ECS Fargate or EKS |
| Batch processing | ECS on Spot or EKS Jobs |
| Event-driven (< 15 min) | Lambda |
| Legacy monolith | EC2 or ECS EC2 launch type |
| Multi-cloud requirement | EKS |
| Simplest AWS-native | ECS Fargate |

Common Mistakes

  1. Running EKS without understanding Kubernetes — operational complexity is high
  2. No resource limits — one container can starve others on the node
  3. Latest tag in production — pin to specific image digests or semver tags
  4. Missing health checks — unhealthy containers keep receiving traffic
  5. Single task/service — no redundancy; minimum 2 tasks across AZs
  6. Over-provisioned Fargate tasks — start small, scale based on metrics
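
Mistake 3 is easy to catch automatically. The following Python sketch checks whether an image reference is pinned to a digest or to an explicit tag other than `latest`. The function name is hypothetical, and the parsing is deliberately simple rather than a full implementation of the image reference grammar.

```python
def is_pinned(image: str) -> bool:
    """Return True if the image uses a digest or an explicit tag other than 'latest'."""
    if "@sha256:" in image:
        return True  # digest references are immutable
    # Look only at the last path segment so a registry port (host:5000/...) is not read as a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag at all means the runtime defaults to 'latest'
    tag = last_segment.rsplit(":", 1)[1]
    return tag != "latest"


for ref in (
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:latest",
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.2.0",
):
    print(ref, "pinned" if is_pinned(ref) else "NOT pinned")
```

A check like this could run in a deployment pipeline against task definitions or Kubernetes manifests before they are applied.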

Troubleshooting

| Issue | Diagnosis | Fix |
|---|---|---|
| Task fails to start | Check stopped task reason in ECS console | Verify ECR image, execution role, CPU/memory |
| EKS pods pending | Insufficient node capacity | Scale node group or enable cluster autoscaler |
| Cannot pull image | ECR permissions or network | Execution role needs ECR pull; check VPC endpoints |
| High latency | Under-provisioned tasks/nodes | Scale out; check ALB target response time |
| OOM killed | Memory limit too low | Increase memory in task definition/limits |

Best Practices

  • Use Fargate unless you need GPU, bare-metal, or cost optimization with Spot EC2
  • Deploy minimum 2 tasks across 2 AZs for every production service
  • Pin container images to specific tags, not latest
  • Implement liveness and readiness probes (EKS) or health checks (ECS)
  • Use IRSA (EKS) or task roles (ECS) for AWS API access — no hardcoded keys
  • Enable Container Insights for CloudWatch monitoring
  • Use blue/green deployments via CodeDeploy (ECS) for zero-downtime updates
  • Scan images in ECR on every push
