Amazon ECS & Container Services
A comprehensive deep dive into Amazon ECS and container services — launch types, task definitions, services, networking modes, IAM roles, ECR, Fargate, auto scaling, rolling updates, blue/green deployments, and DVA-C02 exam essentials.
What is Amazon ECS?
Amazon Elastic Container Service (ECS) is a fully managed container orchestrator that runs Docker containers on AWS. It handles scheduling, placement, health checking, and rolling updates — you provide the container images and task definitions.
Core mental model: ECS has two halves. Fargate (serverless) — you define CPU/memory, AWS provisions the compute. EC2 launch type — you manage the EC2 instances, ECS schedules containers onto them. Both use the same task definitions, services, and cluster APIs.
Launch Types
| | EC2 | Fargate |
|---|---|---|
| Instance management | You provision + patch EC2 | AWS manages compute |
| Pricing | EC2 instance hours | Per vCPU + memory per second |
| Control | Full OS access, GPU support | No OS access |
| Startup time | Fast (agent already running) | Slightly slower (cold provision) |
| Spot support | EC2 Spot instances | Fargate Spot (up to 70% discount) |
| Use case | Long-running, cost-optimized at scale | Simplicity, unpredictable load, batch |
Core Concepts
Cluster
A logical grouping of capacity (EC2 instances or Fargate compute) and services.
# Create a cluster
aws ecs create-cluster \
  --cluster-name my-cluster \
  --capacity-providers FARGATE FARGATE_SPOT \
  --default-capacity-provider-strategy capacityProvider=FARGATE,weight=1,base=1 capacityProvider=FARGATE_SPOT,weight=3
Task Definition
A task definition is the blueprint for a container group — immutable and versioned. It specifies images, CPU/memory, ports, environment variables, IAM roles, log config, and volumes.
{
  "family": "myapp",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/myapp-task-role",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:latest",
      "essential": true,
      "portMappings": [{ "containerPort": 8080, "protocol": "tcp" }],
      "environment": [
        { "name": "NODE_ENV", "value": "production" }
      ],
      "secrets": [
        { "name": "DB_PASSWORD", "valueFrom": "arn:aws:ssm:us-east-1:123456789012:parameter/myapp/prod/db-password" },
        { "name": "API_KEY", "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:myapp/api-key" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/myapp",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "app"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      },
      "cpu": 256,
      "memory": 512,
      "memoryReservation": 256
    },
    {
      "name": "nginx",
      "image": "nginx:1.25-alpine",
      "essential": false,
      "portMappings": [{ "containerPort": 80, "protocol": "tcp" }],
      "dependsOn": [{ "containerName": "app", "condition": "HEALTHY" }]
    }
  ]
}
Service
A service maintains a desired number of running tasks, replaces failed tasks, integrates with load balancers, and manages rolling deployments.
# Create a service
aws ecs create-service \
  --cluster my-cluster \
  --service-name myapp \
  --task-definition myapp:5 \
  --desired-count 3 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc,subnet-def],securityGroups=[sg-123],assignPublicIp=DISABLED}" \
  --load-balancers "targetGroupArn=arn:aws:...,containerName=nginx,containerPort=80" \
  --deployment-configuration "minimumHealthyPercent=100,maximumPercent=200" \
  --health-check-grace-period-seconds 60
Networking Modes
| Mode | Available on | Isolation | Use case |
|---|---|---|---|
| `awsvpc` | Fargate + EC2 | Each task gets its own ENI + private IP + security group | Recommended: task-level network isolation |
| `bridge` | EC2 only | Docker bridge network, dynamic port mapping | Legacy EC2 workloads |
| `host` | EC2 only | Task shares the EC2 host network namespace | High performance (no NAT overhead) |
| `none` | EC2 only | No external network access | Batch jobs with no network |
awsvpc Mode (Deep Dive)
Every task has its own IP address and security group — just like an EC2 instance. You can restrict DB access at the security group level per task rather than per instance.
IAM Roles
Each task definition can reference two separate IAM roles, and most production tasks use both:
| Role | Attached to | Purpose |
|---|---|---|
| Task Execution Role | ECS Agent (infrastructure) | Pull ECR images, write CloudWatch logs, fetch SSM/Secrets Manager values at startup |
| Task Role | Running container | Your application's AWS permissions (S3, DynamoDB, SQS, etc.) |
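Both roles are assumed by the ECS tasks service principal (`ecs-tasks.amazonaws.com`); what differs is the permissions policy attached to each. The sketch below builds the shared trust policy and an illustrative task-role policy as plain dictionaries. The function names and the S3 read-only example are assumptions made for illustration, not part of the original guide.

```python
import json


def ecs_task_trust_policy() -> dict:
    """Trust policy that lets ECS tasks assume a role.

    The same trust relationship is used for the task execution role
    and for the task role.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ecs-tasks.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def example_task_role_policy(bucket: str) -> dict:
    """Hypothetical application policy: read objects from one S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


print(json.dumps(ecs_task_trust_policy(), indent=2))
print(json.dumps(example_task_role_policy("myapp-assets"), indent=2))
```

Keeping application permissions on the task role means a compromised container cannot use the execution role's ECR and logging permissions, and vice versa.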
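# Minimum Task Execution Role policy (covers ECR pulls and CloudWatch Logs).
# Reading SSM parameters or Secrets Manager secrets needs extra permissions
# such as ssm:GetParameters and secretsmanager:GetSecretValue.
aws iam attach-role-policy \
  --role-name ecsTaskExecutionRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
Amazon ECR (Elastic Container Registry)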
# Authenticate Docker to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# Create a repository
aws ecr create-repository \
  --repository-name myapp \
  --image-scanning-configuration scanOnPush=true \
  --encryption-configuration encryptionType=KMS,kmsKey=alias/ecr-key

# Build, tag, and push
docker build -t myapp:v1.2.3 .
docker tag myapp:v1.2.3 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.2.3
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.2.3

# Apply lifecycle policy: keep only the last 10 images
aws ecr put-lifecycle-policy --repository-name myapp --lifecycle-policy-text '{
  "rules": [{
    "rulePriority": 1,
    "description": "Keep last 10 images",
    "selection": { "tagStatus": "any", "countType": "imageCountMoreThan", "countNumber": 10 },
    "action": { "type": "expire" }
  }]
}'
ECR Public Gallery: public.ecr.aws. Public images can be pulled without authentication (rate-limit free for AWS accounts).
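To picture what the `imageCountMoreThan` lifecycle rule above does, here is a small simulation, assuming images are ranked by push time and everything older than the newest N is expired. The image list and the `images_to_expire` helper are made up for illustration; ECR evaluates the real rule on its own schedule.

```python
from datetime import datetime, timedelta


def images_to_expire(images, keep=10):
    """Return the tags a 'keep the newest `keep` images' rule would expire."""
    newest_first = sorted(images, key=lambda img: img["pushed_at"], reverse=True)
    return [img["tag"] for img in newest_first[keep:]]


# Hypothetical repository contents: 13 images pushed one day apart.
first_push = datetime(2024, 1, 1)
images = [
    {"tag": f"v1.0.{i}", "pushed_at": first_push + timedelta(days=i)}
    for i in range(13)
]

print(images_to_expire(images))  # ['v1.0.2', 'v1.0.1', 'v1.0.0']
```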
ECS Service Deployments
Rolling Update (default)
Key settings:
- minimumHealthyPercent: Min % of desired tasks that must be running during update (e.g., 100 = no capacity loss)
- maximumPercent: Max % of desired tasks that can be running during update (e.g., 200 = double the fleet temporarily)
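The two percentages translate into concrete task counts during a rolling update. The sketch below assumes the rounding described in the ECS documentation (the minimum is rounded up, the maximum is rounded down); `rolling_update_bounds` is a helper name invented here.

```python
def rolling_update_bounds(desired_count, minimum_healthy_percent, maximum_percent):
    """Return (min_running, max_running) task counts during a rolling update."""
    # Ceiling division: the healthy floor is rounded up.
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    # Floor division: the upper limit is rounded down.
    max_running = desired_count * maximum_percent // 100
    return min_running, max_running


print(rolling_update_bounds(3, 100, 200))  # (3, 6): start new tasks first, no capacity loss
print(rolling_update_bounds(4, 50, 100))   # (2, 4): stop old tasks first, no extra capacity
print(rolling_update_bounds(3, 50, 150))   # (2, 4)
```

With `minimumHealthyPercent=100` and `maximumPercent=100` the two bounds are equal, so the scheduler has no room to start or stop anything and the deployment cannot make progress.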
Blue/Green with CodeDeploy
ECS Blue/Green uses two target groups on the ALB. CodeDeploy shifts production traffic from the blue (current) task set to the green (new) task set. An optional test listener on a separate port lets you send test traffic to the green task set before the shift.
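How quickly traffic moves depends on the CodeDeploy deployment configuration, for example the predefined `CodeDeployDefault.ECSLinear10PercentEvery1Minutes`. The following is a rough sketch of a linear shift, assuming the first increment happens at minute 0; the exact timing is managed by CodeDeploy, and `linear_shift_schedule` is an illustrative helper only.

```python
def linear_shift_schedule(step_percent, interval_minutes):
    """Return [(minute, percent_of_traffic_on_green), ...] for a linear shift."""
    schedule = []
    green_percent = 0
    minute = 0
    while green_percent < 100:
        green_percent = min(100, green_percent + step_percent)
        schedule.append((minute, green_percent))
        minute += interval_minutes
    return schedule


for minute, green in linear_shift_schedule(step_percent=10, interval_minutes=1):
    print(f"t+{minute}min: green={green}% blue={100 - green}%")
```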
Auto Scaling
ECS services support three scaling mechanisms:
# Register scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/myapp \
  --min-capacity 2 \
  --max-capacity 20

# Target tracking: maintain 70% CPU utilization
aws application-autoscaling put-scaling-policy \
  --policy-name cpu-target-tracking \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/myapp \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleOutCooldown": 60,
    "ScaleInCooldown": 300
  }'
| Scaling type | Trigger | Use case |
|---|---|---|
| Target Tracking | Keep metric at target value (CPU, memory, ALB request count) | Most workloads |
| Step Scaling | CloudWatch alarm → add/remove N tasks per breach step | Precise control |
| Scheduled Scaling | Cron expression | Predictable traffic patterns |
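A useful mental model for target tracking is proportional scaling: if the metric is above target, the task count grows by roughly the same ratio, clamped to the registered min and max capacity. The function below is only that simplified model; the real algorithm is internal to Application Auto Scaling and also applies cooldowns.

```python
import math


def target_tracking_estimate(current_tasks, metric_value, target_value,
                             min_capacity, max_capacity):
    """Estimate the desired count that brings the metric back to target."""
    raw = math.ceil(current_tasks * metric_value / target_value)
    return max(min_capacity, min(max_capacity, raw))


# 4 tasks at 90% average CPU with a 70% target: scale out
print(target_tracking_estimate(4, 90.0, 70.0, 2, 20))   # 6
# 10 tasks at 35% average CPU: scale in
print(target_tracking_estimate(10, 35.0, 70.0, 2, 20))  # 5
# Never below the registered minimum
print(target_tracking_estimate(1, 10.0, 70.0, 2, 20))   # 2
```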
Fargate Specifics
CPU and Memory Combinations
| vCPU | Valid Memory |
|---|---|
| 0.25 | 0.5 GB, 1 GB, 2 GB |
| 0.5 | 1 – 4 GB (1 GB increments) |
| 1 | 2 – 8 GB (1 GB increments) |
| 2 | 4 – 16 GB (1 GB increments) |
| 4 | 8 – 30 GB (1 GB increments) |
| 8 | 16 – 60 GB (4 GB increments) |
| 16 | 32 – 120 GB (8 GB increments) |
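Task definitions express these sizes in CPU units (1 vCPU = 1024) and MiB. As a quick pre-registration check, the table above can be encoded as a lookup; `is_valid_fargate_size` is a helper invented for this sketch and simply mirrors the table.

```python
# CPU units -> allowed task memory values in MiB, mirroring the table above.
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
    8192: list(range(16384, 61440 + 1, 4096)),
    16384: list(range(32768, 122880 + 1, 8192)),
}


def is_valid_fargate_size(cpu_units, memory_mib):
    """True if the task-level cpu/memory pair appears in the table."""
    return memory_mib in FARGATE_MEMORY_MIB.get(cpu_units, [])


print(is_valid_fargate_size(512, 1024))    # True: 0.5 vCPU with 1 GB
print(is_valid_fargate_size(256, 4096))    # False: 0.25 vCPU tops out at 2 GB
print(is_valid_fargate_size(8192, 18432))  # False: 18 GB is not on a 4 GB step
```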
Fargate Spot
Fargate Spot uses spare AWS capacity at up to 70% discount. Tasks can be interrupted with a 2-minute warning. Best for:
- Batch jobs that can be retried
- Dev/test workloads
- Fault-tolerant background processing
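Workloads like these should still use the interruption warning to stop cleanly. According to the AWS documentation, the warning reaches the container as a SIGTERM signal, so a worker can finish its current job and exit. The sketch below shows one way to do that; the job list and function names are placeholders.

```python
import signal
import threading

# Set when the container receives SIGTERM (e.g. a Spot interruption).
stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: only record the request, do the cleanup elsewhere."""
    stop_requested.set()


def process_jobs(jobs):
    """Process jobs in order, stopping before the next job once asked to."""
    completed = []
    for job in jobs:
        if stop_requested.is_set():
            break  # leave remaining jobs for a retry on another task
        completed.append(job)  # placeholder for real work
    return completed


if __name__ == "__main__":
    signal.signal(signal.SIGTERM, handle_sigterm)
    print(process_jobs(["job-1", "job-2", "job-3"]))
```

Checking the flag between jobs, rather than doing work inside the handler, keeps each job atomic and makes retries straightforward.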
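# Mix Fargate and Fargate Spot using a capacity provider strategy
# (do not pass --launch-type when a capacity provider strategy is set)
aws ecs create-service \
  --cluster my-cluster \
  --service-name myapp \
  --task-definition myapp:5 \
  --desired-count 10 \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc,subnet-def],securityGroups=[sg-123]}" \
  --capacity-provider-strategy capacityProvider=FARGATE,weight=1,base=2 capacityProvider=FARGATE_SPOT,weight=3
# base=2: the first 2 tasks always run on standard Fargate
# weight ratio 1:3: beyond the base, 1 task on Fargate for every 3 on Fargate Spot
ECS Anywhere (On-Premises)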
Register on-premises servers or VMs as ECS external instances — run ECS tasks on your own hardware managed from the same ECS API.
# Generate an activation ID and code for the on-premises server.
# The role must trust ssm.amazonaws.com and needs the AmazonSSMManagedInstanceCore
# and AmazonEC2ContainerServiceforEC2Role managed policies attached.
aws ssm create-activation --iam-role AmazonEC2RunCommandRoleForManagedInstances --registration-limit 10

# On the on-premises server (the install script needs root):
curl -o ecs-anywhere-install.sh https://amazon-ecs-agent.s3.amazonaws.com/ecs-anywhere-install-latest.sh
sudo bash ecs-anywhere-install.sh --region us-east-1 --cluster my-cluster --activation-id <activation-id> --activation-code <activation-code>
Task Placement (EC2 Launch Type Only)
When ECS places tasks on EC2 instances, it evaluates placement constraints and strategies:
| Strategy | Behavior |
|---|---|
| `binpack` | Pack tasks onto fewest instances (minimize cost) |
| `spread` | Distribute evenly across AZs, instances (maximize availability) |
| `random` | Place randomly |
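The difference between `binpack` and `spread` is easiest to see on a toy cluster. The simulation below is a simplification, assuming three identical instances, binpack on memory, and spread by task count per instance; the instance IDs and sizes are invented, and the real scheduler also evaluates constraints and other attributes.

```python
def place(instances, task_memory, strategy):
    """Place one task and return the chosen instance id (None if nothing fits)."""
    fits = [i for i in instances if i["free_memory"] >= task_memory]
    if not fits:
        return None
    if strategy == "binpack":
        # Leave the least unused memory: pick the fullest instance that fits.
        chosen = min(fits, key=lambda i: i["free_memory"])
    elif strategy == "spread":
        # Balance evenly: pick the instance running the fewest tasks.
        chosen = min(fits, key=lambda i: i["tasks"])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    chosen["free_memory"] -= task_memory
    chosen["tasks"] += 1
    return chosen["id"]


def simulate(strategy, task_count, task_memory=512):
    """Place `task_count` tasks on three empty 2048 MiB instances."""
    instances = [
        {"id": name, "free_memory": 2048, "tasks": 0}
        for name in ("i-a", "i-b", "i-c")
    ]
    return [place(instances, task_memory, strategy) for _ in range(task_count)]


print(simulate("binpack", 4))  # ['i-a', 'i-a', 'i-a', 'i-a']
print(simulate("spread", 4))   # ['i-a', 'i-b', 'i-c', 'i-a']
```

With binpack the other two instances stay empty and could be scaled in; with spread a single instance failure takes out at most two of the four tasks.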
{
  "placementStrategy": [
    { "type": "spread", "field": "attribute:ecs.availability-zone" },
    { "type": "binpack", "field": "memory" }
  ],
  "placementConstraints": [
    { "type": "distinctInstance" },
    { "type": "memberOf", "expression": "attribute:ecs.instance-type =~ t3.*" }
  ]
}
DVA-C02 Quick Reference
| Topic | Key Fact |
|---|---|
| Task Execution Role purpose | Pull ECR images, write CloudWatch logs, fetch SSM/Secrets Manager |
| Task Role purpose | Application AWS permissions (S3, DynamoDB, SQS, etc.) |
| Fargate networking mode | awsvpc only — each task gets its own ENI + SG |
| EC2 networking modes | awsvpc (recommended), bridge, host, none |
| Task definition is | Immutable + versioned — new deploy = new revision |
| ECS Service rolling update | minimumHealthyPercent + maximumPercent control capacity |
| Blue/Green on ECS | CodeDeploy + two ALB target groups |
| ECR auth command | `aws ecr get-login-password` piped to `docker login --username AWS --password-stdin` |
| ECR lifecycle policy | Expire old images automatically |
| Fargate Spot interrupt notice | 2-minute warning before termination |
| ECS Service auto scaling | Target Tracking, Step Scaling, Scheduled |
| ECS Anywhere | Run ECS tasks on on-premises servers |
| Task placement: binpack | Pack onto fewest instances — minimize cost |
| Task placement: spread | Distribute across AZs — maximize availability |
| Secrets in task definition | secrets field — fetched at task start by execution role |
| Container dependency | dependsOn with HEALTHY condition |
| Health check grace period | ECS ignores failing ALB health checks for this period after a task starts, so slow-booting tasks are not replaced prematurely |
| ECS vs EKS | ECS: simpler, AWS-native; EKS: Kubernetes API, portable |
Practice Questions
Q1. A developer runs a containerized application on ECS Fargate. They want to pass the database connection string securely to the container at runtime without embedding it in the Docker image. What is the recommended approach?
Q2. A developer needs to run a one-time data migration job in ECS. The job should start, run to completion, and stop. No long-running service is needed. Which ECS launch type and run mode is appropriate?
Q3. An ECS Fargate task is failing to pull its Docker image from ECR with "CannotPullContainerError: access denied." The task definition references the correct ECR image URI. What is the most likely cause?
Q4. An ECS service is running 10 tasks behind an Application Load Balancer. The developer wants new tasks to receive traffic only after passing a health check, and failed health checks to remove tasks from the ALB target group. Which ECS feature handles this?