Amazon SQS & Message Queuing

8 min read · SQS · Messaging · Decoupling · DVA-C02

A comprehensive deep dive into Amazon SQS — Standard vs FIFO queues, visibility timeout, polling, DLQs, Lambda integration, access policies, and message queuing patterns for the DVA-C02 exam.


What is Amazon SQS?

Amazon SQS (Simple Queue Service) is a fully managed message queuing service that decouples the components of distributed systems. Producers send messages to a queue; consumers poll and process them independently — neither needs to know about the other.

Core mental model: SQS is a buffer. If your producer is faster than your consumer, messages accumulate safely in the queue instead of being lost. Consumers scale independently to drain the queue at their own pace.

When to use SQS:

  • Decouple a monolith into independently scalable services
  • Absorb traffic spikes (queue as shock absorber)
  • Ensure no messages are lost if a consumer crashes
  • Fan-out with SNS → multiple SQS queues

Standard vs FIFO Queues

| Feature | Standard Queue | FIFO Queue |
| --- | --- | --- |
| Throughput | Nearly unlimited | 300 API calls per second per action by default (3,000 messages per second with batching) |
| Message ordering | Best-effort (not guaranteed) | Strict FIFO per MessageGroupId |
| Delivery guarantee | At-least-once (duplicates possible) | Exactly-once processing (deduplication built in) |
| Deduplication | ❌ None built in | ✅ 5-minute deduplication window |
| URL suffix | .amazonaws.com/account/QueueName | .amazonaws.com/account/QueueName.fifo |
| Use case | High-throughput, order not critical | Financial transactions, inventory updates |

Exam tip: "Exactly-once" and "strict ordering" always point to FIFO. "Unlimited throughput" always points to Standard.
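Because Standard queues deliver at least once, consumers are usually written so that a duplicate delivery is harmless. The following is a minimal sketch of that idea, not an SQS feature: `createIdempotentHandler` is a hypothetical helper, and the in-memory `Set` is an assumption standing in for a durable store.

```javascript
// Hypothetical helper: wraps a handler so each MessageId is processed once.
// Assumption: an in-memory Set stands in for a durable store such as a database.
function createIdempotentHandler(handle) {
  const seen = new Set();
  return function onMessage(message) {
    if (seen.has(message.MessageId)) {
      return false; // duplicate delivery: skip the side effect
    }
    handle(message);
    seen.add(message.MessageId); // record only after the handler succeeded
    return true;
  };
}

// Usage: the same message delivered twice runs the handler once.
const processedBodies = [];
const onMessage = createIdempotentHandler((m) => processedBodies.push(m.Body));
const firstDelivery = onMessage({ MessageId: 'm-1', Body: 'charge order 1' });
const secondDelivery = onMessage({ MessageId: 'm-1', Body: 'charge order 1' });
```

Keying on MessageId only catches redelivery of the same message. If a producer may send the same event twice, a business key such as an order ID is likely a better choice.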

Message Lifecycle

Understanding the exact lifecycle of a message is critical for the exam.

  1. Producer sends a message → message is available in the queue
  2. Consumer calls ReceiveMessage → message becomes in-flight (invisible to other consumers) for the duration of the visibility timeout
  3. Consumer processes successfully → calls DeleteMessage → message is gone
  4. Consumer fails or crashes → visibility timeout expires → message becomes available again for redelivery
  5. If a DLQ is configured, once the message has been received more than maxReceiveCount times without being deleted → message moves to the DLQ
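The lifecycle steps can be traced with a small in-memory model. This is a sketch for building intuition only: `SimulatedQueue` and its methods are hypothetical names, time is passed in explicitly as seconds, and none of this is the SQS API.

```javascript
// In-memory model of the message lifecycle. Not the SQS API: names are illustrative.
class SimulatedQueue {
  constructor({ visibilityTimeoutSec = 30, maxReceiveCount = 3 } = {}) {
    this.visibilityTimeoutSec = visibilityTimeoutSec;
    this.maxReceiveCount = maxReceiveCount;
    this.messages = [];    // { id, body, receiveCount, invisibleUntil }
    this.deadLetters = []; // stands in for the DLQ
    this.nextId = 1;
  }

  send(body) {
    this.messages.push({ id: this.nextId++, body, receiveCount: 0, invisibleUntil: 0 });
  }

  // Returns the first visible message, or null. `now` is a time in seconds.
  receive(now) {
    for (const msg of [...this.messages]) {
      if (msg.invisibleUntil > now) continue;         // still in flight
      if (msg.receiveCount >= this.maxReceiveCount) { // receive limit already used up
        this.messages = this.messages.filter((m) => m !== msg);
        this.deadLetters.push(msg);
        continue;
      }
      msg.receiveCount += 1;
      msg.invisibleUntil = now + this.visibilityTimeoutSec;
      return msg;
    }
    return null;
  }

  delete(id) {
    this.messages = this.messages.filter((m) => m.id !== id);
  }
}

const queue = new SimulatedQueue({ visibilityTimeoutSec: 30, maxReceiveCount: 2 });
queue.send('order-1');
const first = queue.receive(0);   // delivered, in flight until t=30
const hidden = queue.receive(10); // null: the message is invisible
const second = queue.receive(30); // timeout expired, so it is redelivered
const third = queue.receive(60);  // null: limit exceeded, message moved to deadLetters
```

The consumer never calls `delete` here, which is what a crash looks like from the queue's point of view.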

Key Configuration Settings

| Setting | Range | Default | Notes |
| --- | --- | --- | --- |
| Visibility Timeout | 0s – 12 hours | 30 seconds | How long a message is hidden after it is received |
| Message Retention | 1 min – 14 days | 4 days | How long a message that has not been deleted is kept |
| Max Message Size | 1 byte – 256 KB | 256 KB | Use S3 + pointer for larger payloads |
| Delay Seconds | 0 – 900s | 0 | Delay before a new message becomes available |
| Receive Message Wait Time | 0 – 20s | 0 | Long polling wait time (0 means short polling) |
| Max Receive Count | 1 – 1,000 | (none) | Receives allowed before DLQ routing; set in the redrive policy |
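Queue attributes are passed to the API as string values expressed in seconds, so a small validation step can catch out-of-range settings before a call is made. The sketch below assumes a hypothetical `buildQueueAttributes` helper; the attribute names are the ones SQS uses, and the ranges are copied from the table above.

```javascript
// Allowed ranges in seconds, taken from the settings table.
const LIMITS = {
  VisibilityTimeout: [0, 12 * 60 * 60],
  MessageRetentionPeriod: [60, 14 * 24 * 60 * 60],
  DelaySeconds: [0, 900],
  ReceiveMessageWaitTimeSeconds: [0, 20],
};

// Hypothetical helper: validates settings and returns a string-valued
// attribute map of the kind queue-attribute calls expect.
function buildQueueAttributes(settings) {
  const attributes = {};
  for (const [name, value] of Object.entries(settings)) {
    const range = LIMITS[name];
    if (!range) throw new Error(`Unknown attribute: ${name}`);
    const [min, max] = range;
    if (!Number.isInteger(value) || value < min || value > max) {
      throw new RangeError(`${name} must be an integer between ${min} and ${max}`);
    }
    attributes[name] = String(value);
  }
  return attributes;
}

const queueAttributes = buildQueueAttributes({
  VisibilityTimeout: 60,
  MessageRetentionPeriod: 4 * 24 * 60 * 60, // 4 days
  ReceiveMessageWaitTimeSeconds: 20,
});
```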

Visibility Timeout Deep Dive

The visibility timeout is the most commonly misunderstood SQS concept.

```javascript
import {
  SQSClient,
  ReceiveMessageCommand,
  DeleteMessageCommand,
  ChangeMessageVisibilityCommand,
} from '@aws-sdk/client-sqs';

const sqs = new SQSClient({ region: 'us-east-1' });
const QUEUE_URL = process.env.QUEUE_URL;

const { Messages } = await sqs.send(new ReceiveMessageCommand({
  QueueUrl: QUEUE_URL,
  MaxNumberOfMessages: 10,     // up to 10 per call
  WaitTimeSeconds: 20,         // long polling: wait up to 20s for messages
  VisibilityTimeout: 60,       // override queue default for this receive call
}));

for (const message of Messages ?? []) {
  try {
    // If processing may take longer than the visibility timeout,
    // extend it before it expires to prevent duplicate processing
    await sqs.send(new ChangeMessageVisibilityCommand({
      QueueUrl: QUEUE_URL,
      ReceiptHandle: message.ReceiptHandle,
      VisibilityTimeout: 120,  // new timeout: 2 minutes, counted from this call
    }));

    // processMessage is your application's own handler (not defined here)
    await processMessage(JSON.parse(message.Body));

    // Only delete AFTER successful processing
    await sqs.send(new DeleteMessageCommand({
      QueueUrl: QUEUE_URL,
      ReceiptHandle: message.ReceiptHandle,
    }));
  } catch (err) {
    console.error('Processing failed; message will reappear after timeout:', err);
    // Do NOT delete: let it become visible again for retry
  }
}
```

Common mistake: Deleting the message before processing is complete. If your code crashes after delete, the message is gone forever. Always delete after successful processing.
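One detail worth keeping in mind when extending visibility: a ChangeMessageVisibility call sets a new timeout counted from the moment of the call, and the total time a message stays invisible cannot exceed 12 hours from when it was received. The sketch below shows how a consumer might cap a requested extension; `clampVisibilityExtension` is a hypothetical helper, and times are plain seconds supplied by the caller.

```javascript
const MAX_VISIBILITY_SEC = 12 * 60 * 60; // 12-hour ceiling, counted from the receive

// Hypothetical helper: returns the VisibilityTimeout value to request.
// A result of 0 means the 12-hour budget for this receive is used up.
function clampVisibilityExtension(receivedAtSec, nowSec, requestedSec) {
  const elapsed = nowSec - receivedAtSec;
  const remaining = Math.max(0, MAX_VISIBILITY_SEC - elapsed);
  return Math.min(requestedSec, remaining);
}

const earlyExtension = clampVisibilityExtension(0, 45, 120);              // plenty of budget left
const lateExtension = clampVisibilityExtension(0, 11 * 3600 + 3500, 600); // only 100s remain
```

A job that can genuinely run longer than the ceiling is usually better split into smaller steps than kept in flight.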


Short Polling vs Long Polling

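| Mode | Behaviour | Cost | Latency |
| --- | --- | --- | --- |
| Short Polling | Returns immediately, even if no message is found; queries only a subset of servers, so it can miss messages that are in the queue | Higher (more API calls) | Low |
| Long Polling | Waits up to 20s for a message to arrive; queries all servers | Lower (fewer empty responses) | Returns as soon as a message is available; waits only while the queue is empty |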

Long polling is almost always the better choice in production. Enable it by setting WaitTimeSeconds to 1–20 on ReceiveMessage, or set ReceiveMessageWaitTimeSeconds on the queue itself.

```bash
# Enable long polling at the queue level (applies to all consumers)
aws sqs set-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123/MyQueue \
  --attributes ReceiveMessageWaitTimeSeconds=20
```
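The cost difference is easy to estimate with a little arithmetic. The sketch below counts the receive calls one consumer makes per day against an empty queue; the one-second pause between short-polling calls is an assumption chosen for illustration, and `emptyReceivesPerDay` is a hypothetical helper.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Requests made per day by one consumer looping against an empty queue.
// Assumption: short polling sleeps `pauseSec` between calls; long polling
// re-issues the call as soon as the previous wait of `waitSec` ends.
function emptyReceivesPerDay({ waitSec, pauseSec = 0 }) {
  const secondsPerCall = waitSec + pauseSec;
  if (secondsPerCall <= 0) throw new RangeError('a call must take some time');
  return Math.floor(SECONDS_PER_DAY / secondsPerCall);
}

const shortPolling = emptyReceivesPerDay({ waitSec: 0, pauseSec: 1 }); // 86,400 calls
const longPolling = emptyReceivesPerDay({ waitSec: 20 });              // 4,320 calls
```

Under these assumptions long polling makes 20 times fewer billable requests while the queue is idle.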

Dead-Letter Queues (DLQ)

A DLQ is a separate queue that receives messages which could not be processed successfully after maxReceiveCount attempts.

```bash
# Create the DLQ first
aws sqs create-queue --queue-name MyQueue-DLQ

# Get DLQ ARN
DLQ_ARN=$(aws sqs get-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123/MyQueue-DLQ \
  --attribute-names QueueArn \
  --query Attributes.QueueArn --output text)

# Attach DLQ to source queue with maxReceiveCount = 3.
# RedrivePolicy is a JSON string nested inside the attributes JSON,
# so it is kept on one line (JSON strings cannot contain raw newlines).
aws sqs set-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123/MyQueue \
  --attributes '{"RedrivePolicy":"{\"deadLetterTargetArn\":\"'"$DLQ_ARN"'\",\"maxReceiveCount\":\"3\"}"}'
```

DLQ rules:

  • Source queue and DLQ must be the same type (Standard → Standard DLQ, FIFO → FIFO DLQ)
  • DLQ must be in the same AWS account and region
  • Messages in DLQ retain their original MessageId and body
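  • Set DLQ retention longer than the source queue so you have time to investigate; on Standard queues a message's age keeps counting from its original enqueue time after it moves to the DLQ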
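The same-type rule and the nested JSON encoding of the redrive policy can be captured in a small helper. This is a sketch under stated assumptions: `buildRedriveAttributes` is a hypothetical name, queue type is inferred from the `.fifo` suffix of the ARN, and the placeholder ARNs reuse the account ID from the CLI examples.

```javascript
// Hypothetical helper: builds the attribute map for a set-queue-attributes call.
// RedrivePolicy is itself a JSON string, hence the nested JSON.stringify.
function buildRedriveAttributes(sourceQueueArn, dlqArn, maxReceiveCount) {
  const isFifo = (arn) => arn.endsWith('.fifo');
  if (isFifo(sourceQueueArn) !== isFifo(dlqArn)) {
    throw new Error('Source queue and DLQ must be the same type');
  }
  if (!Number.isInteger(maxReceiveCount) || maxReceiveCount < 1 || maxReceiveCount > 1000) {
    throw new RangeError('maxReceiveCount must be between 1 and 1000');
  }
  return {
    RedrivePolicy: JSON.stringify({
      deadLetterTargetArn: dlqArn,
      maxReceiveCount: String(maxReceiveCount),
    }),
  };
}

const redriveAttributes = buildRedriveAttributes(
  'arn:aws:sqs:us-east-1:123:MyQueue',
  'arn:aws:sqs:us-east-1:123:MyQueue-DLQ',
  3,
);
```

With the SDK, the returned object would be passed as the `Attributes` value of a SetQueueAttributes call.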

DLQ Redrive (Message Recovery)

After fixing the consumer bug, move messages from the DLQ back to the source queue:

```bash
aws sqs start-message-move-task \
  --source-arn arn:aws:sqs:us-east-1:123:MyQueue-DLQ \
  --destination-arn arn:aws:sqs:us-east-1:123:MyQueue
```

FIFO Queue Details

FIFO queues guarantee strict ordering and exactly-once processing within a message group.

```javascript
import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs';

const sqs = new SQSClient({ region: 'us-east-1' });

// Sending to a FIFO queue: MessageGroupId is always required;
// MessageDeduplicationId is required unless content-based deduplication is enabled
await sqs.send(new SendMessageCommand({
  QueueUrl: 'https://sqs.us-east-1.amazonaws.com/123/Orders.fifo',
  MessageBody: JSON.stringify({ orderId: 'ORD-001', action: 'PLACE' }),
  MessageGroupId: 'customer-cust_A',        // all messages for cust_A are ordered
  MessageDeduplicationId: 'ORD-001-PLACE',  // deduplication key (5-min window)
}));
```

| Attribute | Purpose |
| --- | --- |
| MessageGroupId | Groups related messages for strict ordering. Different groups process independently (in parallel). |
| MessageDeduplicationId | Prevents duplicate delivery within a 5-minute window. Alternatively, enable content-based deduplication (SHA-256 hash of the body). |
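The 5-minute deduplication window can be pictured with a small model. A duplicate send within the window is still accepted by the API, but the message is not delivered again. The class below is an illustrative sketch, not how SQS is implemented; `DeduplicationWindow` and `accept` are hypothetical names, and timestamps are supplied by the caller in milliseconds.

```javascript
const DEDUP_WINDOW_MS = 5 * 60 * 1000;

// In-memory model of FIFO deduplication; not how SQS is implemented.
class DeduplicationWindow {
  constructor() {
    this.firstSeenAt = new Map(); // MessageDeduplicationId -> timestamp (ms)
  }

  // Returns true if the message would be enqueued, false if it is dropped
  // as a duplicate of one accepted within the last 5 minutes.
  accept(deduplicationId, nowMs) {
    const seenAt = this.firstSeenAt.get(deduplicationId);
    if (seenAt !== undefined && nowMs - seenAt < DEDUP_WINDOW_MS) {
      return false;
    }
    this.firstSeenAt.set(deduplicationId, nowMs);
    return true;
  }
}

const dedupWindow = new DeduplicationWindow();
const original = dedupWindow.accept('ORD-001-PLACE', 0);              // enqueued
const retry = dedupWindow.accept('ORD-001-PLACE', 60 * 1000);         // dropped: 1 minute later
const muchLater = dedupWindow.accept('ORD-001-PLACE', 6 * 60 * 1000); // enqueued: window has passed
```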

Throughput limits:

  • 300 API calls per second per action (send, receive, delete) without batching
  • 3,000 messages per second with batching, e.g. SendMessageBatch (up to 10 messages per batch)
  • There is no quota on the number of distinct MessageGroupId values; the in-flight quota applies to messages, not to groups
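Reaching the batched throughput figure means grouping messages into batches of at most 10. The sketch below splits a list into batch-sized entry arrays; `toBatchEntries` is a hypothetical helper, and the `orderId` and `customerId` fields are assumptions about the payload shape.

```javascript
const MAX_BATCH_ENTRIES = 10;

// Hypothetical helper: splits orders into SendMessageBatch-sized entry lists.
// Assumption: each order has a unique orderId and a customerId.
function toBatchEntries(orders) {
  const batches = [];
  for (let i = 0; i < orders.length; i += MAX_BATCH_ENTRIES) {
    const chunk = orders.slice(i, i + MAX_BATCH_ENTRIES);
    batches.push(chunk.map((order, index) => ({
      Id: String(index), // must be unique within one batch
      MessageBody: JSON.stringify(order),
      MessageGroupId: `customer-${order.customerId}`,
      MessageDeduplicationId: `${order.orderId}-PLACE`,
    })));
  }
  return batches;
}

const orders = Array.from({ length: 25 }, (_, n) => ({ orderId: `ORD-${n}`, customerId: 'cust_A' }));
const batches = toBatchEntries(orders); // 3 batches: 10, 10 and 5 entries
```

Each inner array would be passed as the `Entries` field of one SendMessageBatch call.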

Delay Queues & Per-Message Delays

Delay Queue: Set DelaySeconds (0–900s) on the queue — all new messages are invisible for that duration before becoming available.

Per-message delay: Override the queue default at send time with the DelaySeconds parameter. Only available for Standard queues (FIFO queues do not support per-message delays).

```javascript
// Send a message that won't be visible for 5 minutes
await sqs.send(new SendMessageCommand({
  QueueUrl: QUEUE_URL,
  MessageBody: JSON.stringify({ reminder: 'follow-up email', userId: 'usr_01' }),
  DelaySeconds: 300,
}));
```
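How the queue-level delay and the per-message override interact can be expressed as one small function. This is a sketch of the rules described in this section; `effectiveDelaySeconds` is a hypothetical helper, not an SDK call.

```javascript
// Hypothetical helper: the delay that would apply to one message.
function effectiveDelaySeconds({ queueDelaySec = 0, messageDelaySec, fifo = false }) {
  if (messageDelaySec === undefined) return queueDelaySec; // queue default applies
  if (fifo) throw new Error('FIFO queues do not support per-message delays');
  if (messageDelaySec < 0 || messageDelaySec > 900) {
    throw new RangeError('DelaySeconds must be between 0 and 900');
  }
  return messageDelaySec; // per-message value overrides the queue default
}

const defaultDelay = effectiveDelaySeconds({ queueDelaySec: 90 });                   // queue default
const overridden = effectiveDelaySeconds({ queueDelaySec: 90, messageDelaySec: 0 }); // sent without delay
```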

SQS with Lambda (Event Source Mapping)

Lambda can poll SQS automatically via an event source mapping — no polling code needed in your function.

```javascript
// Lambda receives a batch of SQS messages
exports.handler = async (event) => {
  const failures = [];

  for (const record of event.Records) {
    try {
      const body = JSON.parse(record.body);
      // processOrder is your application's own logic (not defined here)
      await processOrder(body);
    } catch (err) {
      console.error(`Failed to process ${record.messageId}:`, err);
      // Report this message as failed; the others in the batch succeed
      failures.push({ itemIdentifier: record.messageId });
    }
  }

  // Return failed message IDs: Lambda will NOT delete them
  // Requires FunctionResponseTypes: ['ReportBatchItemFailures'] on the ESM
  return { batchItemFailures: failures };
};
```

Event source mapping settings:

| Setting | Range | Notes |
| --- | --- | --- |
| Batch size | 1 – 10,000 | Messages per Lambda invocation (FIFO queues: max 10) |
| Batch window | 0 – 300s | Wait to fill batch before invoking |
| Max concurrency | 2 – 1,000 | Limit simultaneous Lambda invocations |
| ReportBatchItemFailures | on/off | Enable partial batch success |

Without ReportBatchItemFailures: If any message fails, the entire batch is retried — including messages that already succeeded, causing duplicate processing.
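The difference is easiest to see by modelling which messages return to the queue after a single invocation. The function below is a simplified sketch of that behaviour, not Lambda's implementation; `messagesToRetry` and the message IDs are hypothetical.

```javascript
// Simplified model of which messages return to the queue after one invocation.
// Assumption: `failedIds` holds the messageIds the function could not process.
function messagesToRetry(records, failedIds, reportBatchItemFailures) {
  if (failedIds.length === 0) return [];       // success: the whole batch is deleted
  const allIds = records.map((r) => r.messageId);
  if (!reportBatchItemFailures) return allIds; // any failure: the whole batch is retried
  return allIds.filter((id) => failedIds.includes(id));
}

const records = Array.from({ length: 10 }, (_, n) => ({ messageId: `msg-${n}` }));
const failed = ['msg-3', 'msg-7'];

const withoutReporting = messagesToRetry(records, failed, false); // all 10 messages
const withReporting = messagesToRetry(records, failed, true);     // only msg-3 and msg-7
```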


Message Attributes & Metadata

Attach structured metadata to messages without changing the body:

```javascript
await sqs.send(new SendMessageCommand({
  QueueUrl: QUEUE_URL,
  MessageBody: JSON.stringify({ orderId: 'ORD-001' }),
  MessageAttributes: {
    EventType: {
      DataType: 'String',
      StringValue: 'ORDER_PLACED',
    },
    Priority: {
      DataType: 'Number',
      StringValue: '1',
    },
    TraceId: {
      DataType: 'String',
      StringValue: 'trace-abc-123',
    },
  },
}));
```

Up to 10 message attributes per message. Attributes count toward the 256 KB size limit.
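Since attribute names, data types, and values all count toward the size limit, it can be useful to estimate a message's size before sending. The sketch below covers string and number attributes only; `messageSizeBytes` is a hypothetical helper and the result should be treated as an approximation.

```javascript
const MAX_MESSAGE_BYTES = 256 * 1024;
const encoder = new TextEncoder();
const byteLength = (text) => encoder.encode(text).length;

// Hypothetical helper: approximate size of a message with string or number
// attributes (binary attributes are left out of this sketch).
function messageSizeBytes(body, attributes = {}) {
  let total = byteLength(body);
  for (const [name, attr] of Object.entries(attributes)) {
    total += byteLength(name) + byteLength(attr.DataType) + byteLength(attr.StringValue);
  }
  return total;
}

const size = messageSizeBytes('hello', {
  EventType: { DataType: 'String', StringValue: 'ORDER_PLACED' },
}); // 5 + 9 + 6 + 12 = 32 bytes
const fits = size <= MAX_MESSAGE_BYTES;
```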


Access Control — Queue Policies

SQS queues use resource-based policies to control cross-account or cross-service access:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "sns.amazonaws.com" },
      "Action": "sqs:SendMessage",
      "Resource": "arn:aws:sqs:us-east-1:123:MyQueue",
      "Condition": {
        "ArnEquals": {
          "aws:SourceArn": "arn:aws:sns:us-east-1:123:MyTopic"
        }
      }
    }
  ]
}
```
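For SNS fan-out, every subscribed queue needs a policy of this shape. The sketch below generates one per queue; `buildSnsToSqsPolicy` is a hypothetical helper, and the `Billing` and `Shipping` queue names are made up for illustration.

```javascript
// Hypothetical helper: queue policy letting one SNS topic send to one queue.
function buildSnsToSqsPolicy(queueArn, topicArn) {
  return {
    Version: '2012-10-17',
    Statement: [{
      Effect: 'Allow',
      Principal: { Service: 'sns.amazonaws.com' },
      Action: 'sqs:SendMessage',
      Resource: queueArn,
      Condition: { ArnEquals: { 'aws:SourceArn': topicArn } },
    }],
  };
}

// Fan-out: one topic, one policy per subscribed queue (queue names are illustrative).
const topicArn = 'arn:aws:sns:us-east-1:123:MyTopic';
const queueArns = [
  'arn:aws:sqs:us-east-1:123:Billing',
  'arn:aws:sqs:us-east-1:123:Shipping',
];
const policyAttributes = queueArns.map((queueArn) => ({
  Policy: JSON.stringify(buildSnsToSqsPolicy(queueArn, topicArn)),
}));
```

Each `Policy` string would be set as a queue attribute on the matching queue. The `aws:SourceArn` condition keeps other topics from sending to it.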

DVA-C02 Quick Reference

| Topic | Key Fact |
| --- | --- |
| Standard queue delivery | At-least-once (duplicates possible) |
| FIFO queue delivery | Exactly-once processing |
| Standard queue ordering | Best-effort (not guaranteed) |
| FIFO queue ordering | Strict per MessageGroupId |
| FIFO throughput | 300 TPS (3,000 with batching) |
| Default visibility timeout | 30 seconds |
| Max visibility timeout | 12 hours |
| Default message retention | 4 days |
| Max message retention | 14 days |
| Max message size | 256 KB |
| Max delay seconds | 900 seconds (15 min) |
| Long polling max wait | 20 seconds |
| DLQ same type required | Standard → Standard DLQ, FIFO → FIFO DLQ |
| Per-message delay | Standard only (not FIFO) |
| Lambda batch size | 1 – 10,000 (FIFO: max 10) |
| Partial batch failure | ReportBatchItemFailures + return batchItemFailures |
| FIFO deduplication window | 5 minutes |
| Extend visibility in-flight | ChangeMessageVisibility |
| Recover DLQ messages | StartMessageMoveTask (redrive) |

Practice Questions (7)

Q1 (medium). A Lambda function processes messages from an SQS Standard Queue. After processing, some messages reappear in the queue and are processed multiple times. The developer has confirmed the Lambda is completing successfully. What is the most likely cause?

Q2 (medium). A developer is building an order processing system. Each customer's orders must be processed in the exact sequence they were placed. A customer may place multiple orders simultaneously. Which SQS configuration satisfies this requirement?

Q3 (easy). An SQS consumer receives a message, begins processing, but crashes before completing. The message visibility timeout is 30 seconds and the processing typically takes 45 seconds. What happens to the message?

Q4 (medium). A developer wants to delay all messages in an SQS queue by 90 seconds before they become available to consumers. Individual messages should be able to override this delay. Which SQS features should be used?

Q5 (hard). A Lambda function processes SQS messages in batches of 10. Occasionally, 2 messages in a batch fail while the other 8 succeed. Currently, the entire batch is returned to the queue on any failure. How should the developer fix this so that only the failed messages are retried?

Q6 (medium). A team wants to fan out a single SQS message to multiple processing pipelines simultaneously. Each pipeline has its own SQS queue. What is the recommended architecture?

Q7 (hard). A developer notices that some SQS messages are failing repeatedly and piling up, preventing new messages from being processed in a FIFO queue. What configuration prevents this?