Amazon ECS & Container Services

A comprehensive deep dive into Amazon ECS and container services — launch types, task definitions, services, networking modes, IAM roles, ECR, Fargate, auto scaling, rolling updates, blue/green deployments, and DVA-C02 exam essentials.


What is Amazon ECS?

Amazon Elastic Container Service (ECS) is a fully managed container orchestrator that runs Docker containers on AWS. It handles scheduling, placement, health checking, and rolling updates — you provide the container images and task definitions.

Core mental model: ECS has two halves. Fargate (serverless) — you define CPU/memory, AWS provisions the compute. EC2 launch type — you manage the EC2 instances, ECS schedules containers onto them. Both use the same task definitions, services, and cluster APIs.

Launch Types

| | EC2 | Fargate |
|---|---|---|
| Instance management | You provision + patch EC2 | AWS manages compute |
| Pricing | EC2 instance hours | Per vCPU + memory per second |
| Control | Full OS access, GPU support | No OS access |
| Startup time | Fast (agent already running) | Slightly slower (cold provision) |
| Spot support | EC2 Spot instances | Fargate Spot (up to 70% discount) |
| Use case | Long-running, cost-optimized at scale | Simplicity, unpredictable load, batch |

Core Concepts

Cluster

A logical grouping of capacity (EC2 instances or Fargate compute) and services.

bash
# Create a cluster
aws ecs create-cluster \
  --cluster-name my-cluster \
  --capacity-providers FARGATE FARGATE_SPOT \
  --default-capacity-provider-strategy \
    capacityProvider=FARGATE,weight=1,base=1 \
    capacityProvider=FARGATE_SPOT,weight=3

Task Definition

A task definition is the blueprint for a container group — immutable and versioned. It specifies images, CPU/memory, ports, environment variables, IAM roles, log config, and volumes.

json
{
  "family": "myapp",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/myapp-task-role",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:latest",
      "essential": true,
      "portMappings": [{ "containerPort": 8080, "protocol": "tcp" }],
      "environment": [
        { "name": "NODE_ENV", "value": "production" }
      ],
      "secrets": [
        { "name": "DB_PASSWORD", "valueFrom": "arn:aws:ssm:us-east-1:123456789012:parameter/myapp/prod/db-password" },
        { "name": "API_KEY",     "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:myapp/api-key" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/myapp",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "app"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      },
      "cpu": 256,
      "memory": 512,
      "memoryReservation": 256
    },
    {
      "name": "nginx",
      "image": "nginx:1.25-alpine",
      "essential": false,
      "portMappings": [{ "containerPort": 80, "protocol": "tcp" }],
      "dependsOn": [{ "containerName": "app", "condition": "HEALTHY" }]
    }
  ]
}

Service

A service maintains a desired number of running tasks, replaces failed tasks, integrates with load balancers, and manages rolling deployments.

bash
# Create a service
# The awsvpcConfiguration shorthand is kept on one line so the CLI parses it reliably.
aws ecs create-service \
  --cluster my-cluster \
  --service-name myapp \
  --task-definition myapp:5 \
  --desired-count 3 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc,subnet-def],securityGroups=[sg-123],assignPublicIp=DISABLED}" \
  --load-balancers "targetGroupArn=arn:aws:...,containerName=nginx,containerPort=80" \
  --deployment-configuration "minimumHealthyPercent=100,maximumPercent=200" \
  --health-check-grace-period-seconds 60

Networking Modes

| Mode | Available on | Isolation | Use case |
|---|---|---|---|
| awsvpc | Fargate + EC2 | Each task gets its own ENI + private IP + security group | Recommended: task-level network isolation |
| bridge | EC2 only | Docker bridge network, dynamic port mapping | Legacy EC2 workloads |
| host | EC2 only | Task shares the EC2 host network namespace | High performance (no NAT overhead) |
| none | EC2 only | No external network access | Batch jobs with no network |

awsvpc Mode (Deep Dive)

Every task has its own IP address and security group — just like an EC2 instance. You can restrict DB access at the security group level per task rather than per instance.
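
As a rough illustration of per-task isolation, the helper below wraps the `aws ec2 authorize-security-group-ingress` call so that a database security group accepts traffic only from the security group attached to the tasks. The function name `allow_task_sg_to_db`, the group IDs in the usage comment, and the default port 5432 are assumptions made for this sketch, not values from the article.

```shell
# Hypothetical helper: open a DB security group to the ECS tasks' security
# group only (awsvpc mode gives each task its own ENI and security group).
allow_task_sg_to_db() {
  local db_sg="$1" task_sg="$2" port="${3:-5432}"
  aws ec2 authorize-security-group-ingress \
    --group-id "$db_sg" \
    --protocol tcp \
    --port "$port" \
    --source-group "$task_sg"
}

# Usage (placeholder IDs):
#   allow_task_sg_to_db sg-db123 sg-123 5432
```

Because the rule names a source security group instead of a CIDR range, other tasks on the same EC2 host or subnet do not gain access unless they carry that group.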


IAM Roles

A task definition references two separate IAM roles:

| Role | Attached to | Purpose |
|---|---|---|
| Task Execution Role | ECS Agent (infrastructure) | Pull ECR images, write CloudWatch logs, fetch SSM/Secrets Manager values at startup |
| Task Role | Running container | Your application's AWS permissions (S3, DynamoDB, SQS, etc.) |

bash
# Minimum Task Execution Role policy
aws iam attach-role-policy \
  --role-name ecsTaskExecutionRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
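
Both roles must be assumable by the ECS tasks service principal, `ecs-tasks.amazonaws.com`. The sketch below prints a trust policy for that principal; the function name `ecs_tasks_trust_policy` is hypothetical, and the commented `create-role` call assumes the role does not exist yet.

```shell
# Hypothetical helper: emit a trust policy that lets ECS tasks assume a role.
ecs_tasks_trust_policy() {
  cat <<'JSON'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
JSON
}

# Usage (creates the role, after which the managed policy can be attached):
#   aws iam create-role \
#     --role-name ecsTaskExecutionRole \
#     --assume-role-policy-document "$(ecs_tasks_trust_policy)"
```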

Amazon ECR (Elastic Container Registry)

bash
# Authenticate Docker to ECR
aws ecr get-login-password --region us-east-1 |
  docker login --username AWS --password-stdin \
  123456789012.dkr.ecr.us-east-1.amazonaws.com

# Create a repository
aws ecr create-repository \
  --repository-name myapp \
  --image-scanning-configuration scanOnPush=true \
  --encryption-configuration encryptionType=KMS,kmsKey=alias/ecr-key

# Build, tag, and push
docker build -t myapp:v1.2.3 .
docker tag myapp:v1.2.3 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.2.3
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.2.3

# Apply lifecycle policy: keep only the last 10 images
aws ecr put-lifecycle-policy \
  --repository-name myapp \
  --lifecycle-policy-text '{
    "rules": [{
      "rulePriority": 1,
      "description": "Keep last 10 images",
      "selection": { "tagStatus": "any", "countType": "imageCountMoreThan", "countNumber": 10 },
      "action": { "type": "expire" }
    }]
  }'

ECR Public Gallery: public.ecr.aws hosts public images that can be pulled without authentication. Unauthenticated pulls are rate-limited; authenticating with an AWS account raises the limits.


ECS Service Deployments

Rolling Update (default)

Key settings:

  • minimumHealthyPercent: Min % of desired tasks that must be running during update (e.g., 100 = no capacity loss)
  • maximumPercent: Max % of desired tasks that can be running during update (e.g., 200 = double the fleet temporarily)
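
To make the two settings concrete, the sketch below computes the range of running tasks allowed during a rolling update. It assumes the rounding ECS documents for these parameters (minimumHealthyPercent rounds up, maximumPercent rounds down); the function name `rolling_bounds` is made up for this example.

```shell
# Lower and upper bound on running tasks during a rolling update.
# args: desired_count minimumHealthyPercent maximumPercent
rolling_bounds() {
  local desired="$1" min_pct="$2" max_pct="$3"
  local lower=$(( (desired * min_pct + 99) / 100 ))   # round up
  local upper=$(( desired * max_pct / 100 ))          # round down
  echo "$lower $upper"
}

rolling_bounds 10 100 200   # prints "10 20"
rolling_bounds 3 50 150     # prints "2 4"
```

With 100/200, ECS can start up to 10 new tasks before stopping any old ones. With 50/150 on a 3-task service, it may drop to 2 running tasks and never exceeds 4.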

Blue/Green with CodeDeploy

ECS Blue/Green uses two target groups on the ALB. CodeDeploy shifts production traffic from the blue (current) to the green (new) task set. Test traffic is available on a separate port before the shift.


Auto Scaling

ECS services support three scaling mechanisms:

bash
# Register scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/myapp \
  --min-capacity 2 \
  --max-capacity 20

# Target tracking: maintain 70% CPU utilization
aws application-autoscaling put-scaling-policy \
  --policy-name cpu-target-tracking \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/myapp \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleOutCooldown": 60,
    "ScaleInCooldown": 300
  }'

| Scaling type | Trigger | Use case |
|---|---|---|
| Target Tracking | Keep metric at target value (CPU, memory, ALB request count) | Most workloads |
| Step Scaling | CloudWatch alarm → add/remove N tasks per breach step | Precise control |
| Scheduled Scaling | Cron expression | Predictable traffic patterns |
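
Scheduled scaling has no example above, so here is a minimal sketch that wraps `aws application-autoscaling put-scheduled-action` for the same `service/my-cluster/myapp` resource. The function name `schedule_service_capacity`, the action name, and the cron schedule in the usage comment are assumptions for illustration.

```shell
# Hypothetical helper: set min/max capacity for the service on a schedule.
# args: action_name schedule_expression min_capacity max_capacity
schedule_service_capacity() {
  local name="$1" schedule="$2" min="$3" max="$4"
  aws application-autoscaling put-scheduled-action \
    --service-namespace ecs \
    --scalable-dimension ecs:service:DesiredCount \
    --resource-id service/my-cluster/myapp \
    --scheduled-action-name "$name" \
    --schedule "$schedule" \
    --scalable-target-action "MinCapacity=$min,MaxCapacity=$max"
}

# Usage: raise the floor to 5 tasks at 08:00 UTC on weekdays.
#   schedule_service_capacity weekday-morning "cron(0 8 ? * MON-FRI *)" 5 20
```

A scheduled action only moves the capacity bounds; a target tracking policy on the same service still decides the task count within them.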

Fargate Specifics

CPU and Memory Combinations

| vCPU | Valid Memory |
|---|---|
| 0.25 | 0.5 GB, 1 GB, 2 GB |
| 0.5 | 1 – 4 GB (1 GB increments) |
| 1 | 2 – 8 GB (1 GB increments) |
| 2 | 4 – 16 GB (1 GB increments) |
| 4 | 8 – 30 GB (1 GB increments) |
| 8 | 16 – 60 GB (4 GB increments) |
| 16 | 32 – 120 GB (8 GB increments) |
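
The table can be checked mechanically before registering a task definition. The sketch below encodes it using the units a task definition expects (CPU units, where 1 vCPU = 1024, and memory in MiB); the function name `fargate_combo_valid` is hypothetical.

```shell
# Return 0 if the task-level cpu/memory pair is a valid Fargate combination.
# args: cpu_units memory_mib
fargate_combo_valid() {
  local cpu="$1" mem="$2" min max step
  case "$cpu" in
    256)
      # 0.25 vCPU allows three fixed sizes only.
      [ "$mem" -eq 512 ] || [ "$mem" -eq 1024 ] || [ "$mem" -eq 2048 ]
      return $?
      ;;
    512)   min=1024;  max=4096;   step=1024 ;;
    1024)  min=2048;  max=8192;   step=1024 ;;
    2048)  min=4096;  max=16384;  step=1024 ;;
    4096)  min=8192;  max=30720;  step=1024 ;;
    8192)  min=16384; max=61440;  step=4096 ;;
    16384) min=32768; max=122880; step=8192 ;;
    *)     return 1 ;;
  esac
  # Memory must sit inside the range and on an increment boundary.
  [ "$mem" -ge "$min" ] && [ "$mem" -le "$max" ] &&
    [ $(( (mem - min) % step )) -eq 0 ]
}

fargate_combo_valid 512 1024 && echo "512/1024 is valid"
fargate_combo_valid 256 4096 || echo "256/4096 is invalid"
```

The first call matches the `"cpu": "512"`, `"memory": "1024"` pair used in the task definition earlier in this article.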

Fargate Spot

Fargate Spot uses spare AWS capacity at up to 70% discount. Tasks can be interrupted with a 2-minute warning. Best for:

  • Batch jobs that can be retried
  • Dev/test workloads
  • Fault-tolerant background processing
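
bash
# Mix Fargate and Fargate Spot using a capacity provider strategy
# (cluster, task definition, and network values reuse the earlier create-service example)
aws ecs create-service \
  --cluster my-cluster \
  --service-name myapp \
  --task-definition myapp:5 \
  --desired-count 10 \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc,subnet-def],securityGroups=[sg-123]}" \
  --capacity-provider-strategy \
    capacityProvider=FARGATE,weight=1,base=2 \
    capacityProvider=FARGATE_SPOT,weight=3
# base=2: the first 2 tasks always run on standard Fargate
# weight ratio 1:3: beyond the base, 1 Fargate task for every 3 Fargate Spot tasks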
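
The arithmetic behind `base` and `weight` can be sketched as follows for a two-provider strategy. This is a simplified model: it uses integer division for the weighted share, and how ECS rounds uneven splits may differ. The function name `capacity_split` is made up for this example.

```shell
# Estimate how tasks divide between provider A (with a base) and provider B.
# args: desired_count base_a weight_a weight_b
capacity_split() {
  local desired="$1" base_a="$2" weight_a="$3" weight_b="$4"
  local on_a on_b remaining
  if [ "$desired" -le "$base_a" ]; then
    # Everything fits inside the base, so nothing reaches provider B.
    on_a="$desired"
    on_b=0
  else
    remaining=$(( desired - base_a ))
    on_b=$(( remaining * weight_b / (weight_a + weight_b) ))
    on_a=$(( desired - on_b ))
  fi
  echo "$on_a $on_b"
}

capacity_split 10 2 1 3   # prints "4 6"
```

For 10 tasks with base=2 and weights 1:3, two tasks satisfy the base, and the remaining eight split 2 to Fargate and 6 to Fargate Spot.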

ECS Anywhere (On-Premises)

Register on-premises servers or VMs as ECS external instances — run ECS tasks on your own hardware managed from the same ECS API.

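bash
# Generate an activation ID and code for the on-premises server.
# ecsAnywhereRole is a role you create beforehand; it must trust ssm.amazonaws.com and have
# the AmazonSSMManagedInstanceCore and AmazonEC2ContainerServiceforEC2Role policies attached.
aws ssm create-activation \
  --iam-role ecsAnywhereRole \
  --registration-limit 10

# On the on-premises server (the installer needs root):
curl --proto "https" -o ecs-anywhere-install.sh https://amazon-ecs-agent.s3.amazonaws.com/ecs-anywhere-install-latest.sh
sudo bash ecs-anywhere-install.sh \
  --region us-east-1 \
  --cluster my-cluster \
  --activation-id "<activation-id>" \
  --activation-code "<activation-code>"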

Task Placement (EC2 Launch Type Only)

When ECS places tasks on EC2 instances, it evaluates placement constraints and strategies:

| Strategy | Behavior |
|---|---|
| binpack | Pack tasks onto fewest instances (minimize cost) |
| spread | Distribute evenly across AZs, instances (maximize availability) |
| random | Place randomly |

json
{
  "placementStrategy": [
    { "type": "spread", "field": "attribute:ecs.availability-zone" },
    { "type": "binpack", "field": "memory" }
  ],
  "placementConstraints": [
    { "type": "distinctInstance" },
    { "type": "memberOf", "expression": "attribute:ecs.instance-type =~ t3.*" }
  ]
}
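
To show what binpack on memory means in practice, the sketch below picks, from a list of candidate instances, the one that leaves the least memory unused after placing the task. It is a toy model of the strategy, not the ECS scheduler; the function name `binpack_pick` and the `instance-id:free-MiB` input format are assumptions.

```shell
# Pick the instance with the least free memory that still fits the task.
# args: task_memory_mib instance:free_mib [instance:free_mib ...]
binpack_pick() {
  local need="$1"; shift
  local best="" best_free="" entry name free
  for entry in "$@"; do
    name="${entry%%:*}"
    free="${entry##*:}"
    # Skip instances that cannot fit the task.
    [ "$free" -ge "$need" ] || continue
    # Keep the candidate with the least free memory so far.
    if [ -z "$best" ] || [ "$free" -lt "$best_free" ]; then
      best="$name"
      best_free="$free"
    fi
  done
  [ -n "$best" ] && echo "$best"
}

binpack_pick 512 i-aaa:2048 i-bbb:768 i-ccc:256   # prints "i-bbb"
```

Here `i-ccc` is too small, and `i-bbb` wins over `i-aaa` because filling it leaves `i-aaa` free for larger tasks or for scale-in.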

DVA-C02 Quick Reference

| Topic | Key Fact |
|---|---|
| Task Execution Role purpose | Pull ECR images, write CloudWatch logs, fetch SSM/Secrets Manager |
| Task Role purpose | Application AWS permissions (S3, DynamoDB, SQS, etc.) |
| Fargate networking mode | awsvpc only; each task gets its own ENI + SG |
| EC2 networking modes | awsvpc (recommended), bridge, host, none |
| Task definition is | Immutable + versioned; new deploy = new revision |
| ECS Service rolling update | minimumHealthyPercent + maximumPercent control capacity |
| Blue/Green on ECS | CodeDeploy + two ALB target groups |
| ECR auth command | `aws ecr get-login-password` piped into `docker login --username AWS --password-stdin` |
| ECR lifecycle policy | Expire old images automatically |
| Fargate Spot interrupt notice | 2-minute warning before termination |
| ECS Service auto scaling | Target Tracking, Step Scaling, Scheduled |
| ECS Anywhere | Run ECS tasks on on-premises servers |
| Task placement: binpack | Pack onto fewest instances to minimize cost |
| Task placement: spread | Distribute across AZs to maximize availability |
| Secrets in task definition | secrets field; fetched at task start by execution role |
| Container dependency | dependsOn with HEALTHY condition |
| Health check grace period | Stops ECS from replacing new tasks that fail ALB health checks while they are still starting |
| ECS vs EKS | ECS: simpler, AWS-native; EKS: Kubernetes API, portable |

Practice Questions (4)

medium

Q1. A developer runs a containerized application on ECS Fargate. They want to pass the database connection string securely to the container at runtime without embedding it in the Docker image. What is the recommended approach?


easy

Q2. A developer needs to run a one-time data migration job in ECS. The job should start, run to completion, and stop. No long-running service is needed. Which ECS launch type and run mode are appropriate?


medium

Q3. An ECS Fargate task is failing to pull its Docker image from ECR with "CannotPullContainerError: access denied." The task definition references the correct ECR image URI. What is the most likely cause?


medium

Q4. An ECS service is running 10 tasks behind an Application Load Balancer. The developer wants new tasks to receive traffic only after passing a health check, and failed health checks to remove tasks from the ALB target group. Which ECS feature handles this?

