Amazon SQS & Message Queuing
A comprehensive deep dive into Amazon SQS — Standard vs FIFO queues, visibility timeout, polling, DLQs, Lambda integration, access policies, and message queuing patterns for the DVA-C02 exam.
What is Amazon SQS?
Amazon SQS (Simple Queue Service) is a fully managed message queuing service that decouples the components of distributed systems. Producers send messages to a queue; consumers poll and process them independently — neither needs to know about the other.
Core mental model: SQS is a buffer. If your producer is faster than your consumer, messages accumulate safely in the queue instead of being lost. Consumers scale independently to drain the queue at their own pace.
When to use SQS:
- Decouple a monolith into independently scalable services
- Absorb traffic spikes (queue as shock absorber)
- Ensure no messages are lost if a consumer crashes
- Fan-out with SNS → multiple SQS queues
Standard vs FIFO Queues
| Feature | Standard Queue | FIFO Queue |
|---|---|---|
| Throughput | Nearly unlimited | 300 API calls per second per action (3,000 messages per second with batching) |
| Message ordering | Best-effort (not guaranteed) | Strict FIFO per MessageGroupId |
| Delivery guarantee | At-least-once (duplicates possible) | Exactly-once (deduplication built-in) |
| Deduplication | ❌ | ✅ 5-minute deduplication window |
| URL suffix | .amazonaws.com/account/QueueName | .amazonaws.com/account/QueueName.fifo |
| Use case | High-throughput, order not critical | Financial transactions, inventory updates |
Exam tip: "Exactly-once" and "strict ordering" point to FIFO. "Unlimited throughput" (in practice, nearly unlimited) points to Standard.
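Because Standard queues deliver at least once, consumers are usually written to tolerate duplicates. The following is a minimal sketch of that idea, not a production pattern: `createIdempotentHandler` is a hypothetical helper, and it assumes an in-memory `Set` is enough for illustration.

```javascript
// Wraps a handler so each MessageId is processed at most once per process.
// Assumption: an in-memory Set is sufficient for illustration; a real consumer
// would need a durable, shared store to survive restarts and scale-out.
function createIdempotentHandler(handle) {
  const seen = new Set();
  return function onMessage(message) {
    if (seen.has(message.MessageId)) {
      return false; // duplicate delivery: skip it
    }
    handle(message);
    seen.add(message.MessageId); // record only after the handler succeeded
    return true;
  };
}

const processed = [];
const onMessage = createIdempotentHandler((m) => processed.push(m.Body));

onMessage({ MessageId: 'a1', Body: 'charge order 1' });
onMessage({ MessageId: 'a1', Body: 'charge order 1' }); // redelivered copy
onMessage({ MessageId: 'b2', Body: 'charge order 2' });
```

Keying on `MessageId` only catches redeliveries of the same message. If a producer sends the same business event twice, a key taken from the body (for example an order ID) would be the safer choice.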
Message Lifecycle
Understanding the exact lifecycle of a message is critical for the exam.
- Producer sends a message → message is available in the queue
- Consumer calls ReceiveMessage → message becomes in-flight (invisible to other consumers) for the duration of the visibility timeout
- Consumer processes successfully → calls DeleteMessage → message is gone
- Consumer fails or crashes → visibility timeout expires → message becomes available again for redelivery
- Message has been received more than maxReceiveCount times without being deleted → message moves to the DLQ (if a redrive policy is configured)
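The lifecycle above can be traced with a toy in-memory model. This is only a sketch for building intuition: `ToyQueue` is a hypothetical class, it is not how SQS is implemented, and time is passed in explicitly (in seconds) so each step is easy to follow.

```javascript
// Toy model of the message lifecycle. NOT an SQS implementation.
class ToyQueue {
  constructor({ visibilityTimeout = 30, maxReceiveCount = 3 } = {}) {
    this.visibilityTimeout = visibilityTimeout;
    this.maxReceiveCount = maxReceiveCount;
    this.messages = [];    // { id, body, visibleAt, receiveCount }
    this.deadLetters = []; // stands in for the DLQ
    this.nextId = 1;
  }

  send(body, now) {
    this.messages.push({ id: this.nextId++, body, visibleAt: now, receiveCount: 0 });
  }

  // Returns the first visible message and hides it, or undefined if none.
  receive(now) {
    for (;;) {
      const msg = this.messages.find((m) => m.visibleAt <= now);
      if (!msg) return undefined;
      if (msg.receiveCount >= this.maxReceiveCount) {
        // Already received maxReceiveCount times without a delete: dead-letter it.
        this.messages = this.messages.filter((m) => m !== msg);
        this.deadLetters.push(msg);
        continue;
      }
      msg.receiveCount += 1;
      msg.visibleAt = now + this.visibilityTimeout; // now in flight
      return msg;
    }
  }

  delete(id) {
    this.messages = this.messages.filter((m) => m.id !== id);
  }
}

const queue = new ToyQueue({ visibilityTimeout: 30, maxReceiveCount: 2 });
queue.send('order-1', 0);

const first = queue.receive(0);   // delivered, now in flight
const hidden = queue.receive(10); // undefined: still invisible
const second = queue.receive(30); // timeout expired, so it is redelivered
const third = queue.receive(60);  // receive count exceeded: moved to deadLetters
```

Because the consumer in this trace never calls `delete`, the message ends up in `deadLetters` on the third receive attempt.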
Key Configuration Settings
| Setting | Range | Default | Notes |
|---|---|---|---|
| Visibility Timeout | 0s – 12 hours | 30 seconds | How long message is hidden after receive |
| Message Retention | 1 min – 14 days | 4 days | How long a message is kept if it is never deleted |
| Max Message Size | 1 byte – 256 KB | 256 KB | Use S3 + pointer for larger payloads |
| Delay Seconds | 0 – 900s | 0 | Delay before message becomes available |
| Receive Message Wait Time | 0 – 20s | 0 | Long polling wait time |
| Max Receive Count | 1 – 1,000 | — | Deliveries before DLQ routing |
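The ranges in the table can be checked before a queue is created or updated. The helper below is a sketch under assumptions: `buildQueueAttributes` and `LIMITS` are hypothetical names, and only the time-based settings from the table are covered. It returns values as strings because queue attributes are passed as string values.

```javascript
// Allowed ranges, in seconds, taken from the table above.
const LIMITS = {
  VisibilityTimeout: [0, 12 * 60 * 60],
  MessageRetentionPeriod: [60, 14 * 24 * 60 * 60],
  DelaySeconds: [0, 900],
  ReceiveMessageWaitTimeSeconds: [0, 20],
};

// Validates settings and returns them as a string-valued attribute map.
// Throws a RangeError when a value is outside its allowed range.
function buildQueueAttributes(settings) {
  const attributes = {};
  for (const [name, value] of Object.entries(settings)) {
    const range = LIMITS[name];
    if (!range) throw new Error(`Unknown attribute: ${name}`);
    const [min, max] = range;
    if (!Number.isInteger(value) || value < min || value > max) {
      throw new RangeError(`${name} must be an integer in ${min}..${max}, got ${value}`);
    }
    attributes[name] = String(value);
  }
  return attributes;
}

const attributes = buildQueueAttributes({
  VisibilityTimeout: 60,
  MessageRetentionPeriod: 4 * 24 * 60 * 60, // 4 days
  ReceiveMessageWaitTimeSeconds: 20,
});
```

The resulting map could be passed as the `Attributes` field of a create-queue or set-queue-attributes request.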
Visibility Timeout Deep Dive
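The visibility timeout is easy to misread: it does not delete or lock a message. It only hides a received message from other consumers until the timer expires or the consumer deletes the message.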
```javascript
import {
  SQSClient,
  ReceiveMessageCommand,
  DeleteMessageCommand,
  ChangeMessageVisibilityCommand,
} from '@aws-sdk/client-sqs';

const sqs = new SQSClient({ region: 'us-east-1' });
const QUEUE_URL = process.env.QUEUE_URL;

const { Messages } = await sqs.send(new ReceiveMessageCommand({
  QueueUrl: QUEUE_URL,
  MaxNumberOfMessages: 10, // up to 10 per call
  WaitTimeSeconds: 20,     // long polling: wait up to 20s for messages
  VisibilityTimeout: 60,   // override queue default for this receive call
}));

for (const message of Messages ?? []) {
  try {
    // If processing may take longer than the visibility timeout,
    // reset it before it expires to prevent duplicate processing
    await sqs.send(new ChangeMessageVisibilityCommand({
      QueueUrl: QUEUE_URL,
      ReceiptHandle: message.ReceiptHandle,
      VisibilityTimeout: 120, // new timeout: 2 minutes, counted from this call
    }));

    // processMessage is your own handler (not shown here)
    await processMessage(JSON.parse(message.Body));

    // Only delete AFTER successful processing
    await sqs.send(new DeleteMessageCommand({
      QueueUrl: QUEUE_URL,
      ReceiptHandle: message.ReceiptHandle,
    }));
  } catch (err) {
    console.error('Processing failed; message will reappear after timeout:', err);
    // Do NOT delete: let it become visible again for retry
  }
}
```
Common mistake: Deleting the message before processing is complete. If your code crashes after the delete, the message is gone forever. Always delete after successful processing.
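One detail worth sketching: ChangeMessageVisibility sets a new timeout counted from the moment of the call rather than adding to the old one, and the total time a message stays hidden cannot exceed the 12-hour maximum measured from the original receive. The helper below is a hypothetical illustration (`cappedVisibilityTimeout` is not an SDK function) of how a consumer might cap the value it requests.

```javascript
const MAX_VISIBILITY_SECONDS = 12 * 60 * 60; // 12-hour ceiling, measured from the receive

// Returns the VisibilityTimeout value to request with ChangeMessageVisibility,
// capped so the message is never hidden beyond 12 hours after it was received.
// Returns 0 when no time is left, which would make the message visible again.
function cappedVisibilityTimeout(secondsSinceReceive, requestedSeconds) {
  const remaining = MAX_VISIBILITY_SECONDS - secondsSinceReceive;
  return Math.max(0, Math.min(requestedSeconds, remaining));
}

const early = cappedVisibilityTimeout(45, 120);   // plenty of headroom left
const late = cappedVisibilityTimeout(43100, 120); // only 100 seconds left
```

A long-running consumer could call a helper like this on a timer (a "heartbeat") and stop processing once it returns 0.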
Short Polling vs Long Polling
| Mode | Behaviour | Cost | Latency |
|---|---|---|---|
| Short Polling | Returns immediately, even if queue is empty | Higher (more API calls) | Low |
| Long Polling | Waits up to 20s for a message to arrive | Lower (fewer empty responses) | Slightly higher |
Long polling is preferred for almost every production consumer. Enable it by setting WaitTimeSeconds to 1–20 on ReceiveMessage, or set ReceiveMessageWaitTimeSeconds on the queue itself.
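To make the cost difference concrete, here is a rough back-of-the-envelope calculation. It is a sketch under stated assumptions: the queue stays empty all day, the consumer loops immediately, and the 100 ms short-polling round trip is a made-up figure.

```javascript
// Upper bound on ReceiveMessage calls one idle consumer makes in a day.
// Assumption: the queue is empty all day and the consumer loops immediately.
function emptyReceivesPerDay(millisPerCall) {
  const MILLIS_PER_DAY = 24 * 60 * 60 * 1000;
  return Math.ceil(MILLIS_PER_DAY / millisPerCall);
}

const shortPolling = emptyReceivesPerDay(100);   // hypothetical 100 ms round trip
const longPolling = emptyReceivesPerDay(20000);  // WaitTimeSeconds = 20

console.log({ shortPolling, longPolling });
```

Under these assumptions a single idle consumer makes 864,000 empty calls per day with short polling and 4,320 with long polling.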
```shell
# Enable long polling at the queue level (applies to all consumers)
aws sqs set-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123/MyQueue \
  --attributes ReceiveMessageWaitTimeSeconds=20
```
Dead-Letter Queues (DLQ)
A DLQ is a separate queue that receives messages which could not be processed successfully after maxReceiveCount attempts.
```shell
# Create the DLQ first
aws sqs create-queue --queue-name MyQueue-DLQ

# Get DLQ ARN
DLQ_ARN=$(aws sqs get-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123/MyQueue-DLQ \
  --attribute-names QueueArn \
  --query Attributes.QueueArn --output text)

# Attach DLQ to source queue with maxReceiveCount = 3.
# RedrivePolicy is a JSON string nested inside the attributes JSON, so it is
# kept on one line (JSON strings cannot contain raw line breaks).
aws sqs set-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123/MyQueue \
  --attributes '{"RedrivePolicy":"{\"deadLetterTargetArn\":\"'"$DLQ_ARN"'\",\"maxReceiveCount\":\"3\"}"}'
```
DLQ rules:
- Source queue and DLQ must be the same type (Standard → Standard DLQ, FIFO → FIFO DLQ)
- DLQ must be in the same AWS account and region
- Messages in DLQ retain their original MessageId and body
- Set DLQ retention longer than the source queue so you have time to investigate
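The same redrive configuration can be assembled in code. This is a minimal sketch, with `buildRedriveAttributes` and `sameQueueType` as hypothetical helpers. It shows that RedrivePolicy is itself a JSON string, and it checks the "same type" rule by looking at the `.fifo` suffix.

```javascript
// Builds the queue attribute map for attaching a DLQ.
// RedrivePolicy must be a JSON *string*, hence the JSON.stringify.
function buildRedriveAttributes(dlqArn, maxReceiveCount) {
  if (!Number.isInteger(maxReceiveCount) || maxReceiveCount < 1 || maxReceiveCount > 1000) {
    throw new RangeError('maxReceiveCount must be an integer between 1 and 1000');
  }
  return {
    RedrivePolicy: JSON.stringify({
      deadLetterTargetArn: dlqArn,
      maxReceiveCount,
    }),
  };
}

// Checks the "same type" rule using the .fifo name suffix.
function sameQueueType(sourceArn, dlqArn) {
  return sourceArn.endsWith('.fifo') === dlqArn.endsWith('.fifo');
}

const redrive = buildRedriveAttributes('arn:aws:sqs:us-east-1:123:MyQueue-DLQ', 3);
const compatible = sameQueueType(
  'arn:aws:sqs:us-east-1:123:MyQueue',
  'arn:aws:sqs:us-east-1:123:MyQueue-DLQ',
);
```

The `redrive` object has the shape expected for the `Attributes` field of a set-queue-attributes request on the source queue.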
DLQ Redrive (Message Recovery)
After fixing the consumer bug, move messages from the DLQ back to the source queue:
```shell
aws sqs start-message-move-task \
  --source-arn arn:aws:sqs:us-east-1:123:MyQueue-DLQ \
  --destination-arn arn:aws:sqs:us-east-1:123:MyQueue
```
FIFO Queue Details
FIFO queues guarantee strict ordering and exactly-once processing within a message group.
```javascript
import { SendMessageCommand } from '@aws-sdk/client-sqs';

// Sending to a FIFO queue (sqs is the SQSClient created earlier).
// MessageGroupId is required; MessageDeduplicationId is required unless
// content-based deduplication is enabled on the queue.
await sqs.send(new SendMessageCommand({
  QueueUrl: 'https://sqs.us-east-1.amazonaws.com/123/Orders.fifo',
  MessageBody: JSON.stringify({ orderId: 'ORD-001', action: 'PLACE' }),
  MessageGroupId: 'customer-cust_A',       // all messages for cust_A are ordered
  MessageDeduplicationId: 'ORD-001-PLACE', // deduplication key (5-min window)
}));
```
| Attribute | Purpose |
|---|---|
| MessageGroupId | Groups related messages for strict ordering. Different groups process independently (parallel). |
| MessageDeduplicationId | Prevents duplicate processing within a 5-minute window. Alternatively, enable content-based deduplication (SHA-256 hash of body). |
Throughput limits:
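- 300 API calls per second per action (send, receive, delete) without batching
- 3,000 messages per second with SendMessageBatch (up to 10 messages per batch)
- There is no quota on the number of distinct MessageGroupId values; messages from different groups can be in-flight simultaneously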
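Reaching the batched throughput figure means grouping messages into batches of at most 10 entries. The sketch below shows one way to build such entries. `toFifoBatches` is a hypothetical helper, and the `orderId`, `customerId`, and `action` fields are assumed for illustration.

```javascript
// Splits orders into SendMessageBatch-sized chunks (at most 10 entries each).
// Assumption: each order has hypothetical fields orderId, customerId, action.
function toFifoBatches(orders, batchSize = 10) {
  const batches = [];
  for (let i = 0; i < orders.length; i += batchSize) {
    const entries = orders.slice(i, i + batchSize).map((order, j) => ({
      Id: String(i + j), // must be unique within the batch
      MessageBody: JSON.stringify(order),
      MessageGroupId: `customer-${order.customerId}`, // ordering scope
      MessageDeduplicationId: `${order.orderId}-${order.action}`,
    }));
    batches.push(entries);
  }
  return batches;
}

const orders = Array.from({ length: 23 }, (_, n) => ({
  orderId: `ORD-${n}`,
  customerId: n % 2 === 0 ? 'cust_A' : 'cust_B',
  action: 'PLACE',
}));
const batches = toFifoBatches(orders);
```

Each inner array could be sent as the `Entries` of one batch request. Ordering is still only guaranteed within a MessageGroupId, so `cust_A` and `cust_B` messages are processed independently.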
Delay Queues & Per-Message Delays
Delay Queue: Set DelaySeconds (0–900s) on the queue — all new messages are invisible for that duration before becoming available.
Per-message delay: Override at send time with DelaySeconds parameter. Only available for Standard queues (FIFO queues do not support per-message delays).
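How the two delay settings interact can be sketched as a small function. `effectiveDelay` is a hypothetical helper written only to illustrate the rule: on a Standard queue a per-message value overrides the queue default, while on a FIFO queue only the queue-level value applies.

```javascript
// Returns the delay (in seconds) that applies to one message.
// Standard queues: a per-message DelaySeconds overrides the queue default.
// FIFO queues: per-message delay is not supported, so the queue value applies.
function effectiveDelay({ queueDelaySeconds = 0, messageDelaySeconds, fifo = false }) {
  if (fifo && messageDelaySeconds !== undefined) {
    throw new Error('FIFO queues do not support per-message DelaySeconds');
  }
  const delay = messageDelaySeconds ?? queueDelaySeconds;
  if (!Number.isInteger(delay) || delay < 0 || delay > 900) {
    throw new RangeError('DelaySeconds must be an integer between 0 and 900');
  }
  return delay;
}

const queueDefault = effectiveDelay({ queueDelaySeconds: 90 });
const overridden = effectiveDelay({ queueDelaySeconds: 90, messageDelaySeconds: 300 });
const immediate = effectiveDelay({ queueDelaySeconds: 90, messageDelaySeconds: 0 });
```

Note that an explicit per-message value of 0 overrides the queue default and makes that message available immediately.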
```javascript
// Send a message that won't be visible for 5 minutes
await sqs.send(new SendMessageCommand({
  QueueUrl: QUEUE_URL,
  MessageBody: JSON.stringify({ reminder: 'follow-up email', userId: 'usr_01' }),
  DelaySeconds: 300,
}));
```
SQS with Lambda (Event Source Mapping)
Lambda can poll SQS automatically via an event source mapping — no polling code needed in your function.
```javascript
// Lambda receives a batch of SQS messages
exports.handler = async (event) => {
  const failures = [];

  for (const record of event.Records) {
    try {
      const body = JSON.parse(record.body);
      await processOrder(body); // processOrder is your own handler (not shown)
    } catch (err) {
      console.error(`Failed to process ${record.messageId}:`, err);
      // Report this message as failed; others in the batch succeed
      failures.push({ itemIdentifier: record.messageId });
    }
  }

  // Return failed message IDs: Lambda will NOT delete them
  // Requires FunctionResponseTypes: ['ReportBatchItemFailures'] on the ESM
  return { batchItemFailures: failures };
};
```
Event source mapping settings:
| Setting | Range | Notes |
|---|---|---|
| Batch size | 1 – 10,000 | Messages per Lambda invocation (FIFO queues: max 10; sizes above 10 need a batch window of at least 1s) |
| Batch window | 0 – 300s | Wait to fill batch before invoking |
| Max concurrency | 2 – 1,000 | Limit simultaneous Lambda invocations |
| ReportBatchItemFailures | on/off | Enable partial batch success |
Without ReportBatchItemFailures: If any message fails, the entire batch is retried — including messages that already succeeded, causing duplicate processing.
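The settings above map onto the parameters of Lambda's CreateEventSourceMapping request. The builder below is a sketch, not an official helper: `buildSqsEventSourceMapping` is a hypothetical function, the ARN and function name are placeholders, and the checks encode the table's ranges plus two rules from the Lambda documentation (FIFO batch size is capped at 10, and a batch size above 10 needs a batch window of at least 1 second).

```javascript
// Builds parameters in the shape of Lambda's CreateEventSourceMapping request.
// Assumption: queueArn and functionName are placeholders supplied by the caller.
function buildSqsEventSourceMapping({
  queueArn,
  functionName,
  batchSize = 10,
  batchWindowSeconds = 0,
  maxConcurrency,
}) {
  const fifo = queueArn.endsWith('.fifo');
  const maxBatch = fifo ? 10 : 10000;
  if (!Number.isInteger(batchSize) || batchSize < 1 || batchSize > maxBatch) {
    throw new RangeError(`batchSize must be between 1 and ${maxBatch}`);
  }
  if (batchWindowSeconds < 0 || batchWindowSeconds > 300) {
    throw new RangeError('batchWindowSeconds must be between 0 and 300');
  }
  if (batchSize > 10 && batchWindowSeconds < 1) {
    throw new RangeError('batchSize above 10 requires a batch window of at least 1 second');
  }
  if (maxConcurrency !== undefined && (maxConcurrency < 2 || maxConcurrency > 1000)) {
    throw new RangeError('maxConcurrency must be between 2 and 1000');
  }

  const params = {
    EventSourceArn: queueArn,
    FunctionName: functionName,
    BatchSize: batchSize,
    FunctionResponseTypes: ['ReportBatchItemFailures'], // partial batch success
  };
  if (batchWindowSeconds > 0) params.MaximumBatchingWindowInSeconds = batchWindowSeconds;
  if (maxConcurrency !== undefined) params.ScalingConfig = { MaximumConcurrency: maxConcurrency };
  return params;
}

const mapping = buildSqsEventSourceMapping({
  queueArn: 'arn:aws:sqs:us-east-1:123:MyQueue',
  functionName: 'process-orders', // hypothetical function name
  batchSize: 100,
  batchWindowSeconds: 5,
  maxConcurrency: 50,
});
```

Validating locally like this only catches configuration mistakes earlier; the service performs its own validation when the mapping is created.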
Message Attributes & Metadata
Attach structured metadata to messages without changing the body:
```javascript
await sqs.send(new SendMessageCommand({
  QueueUrl: QUEUE_URL,
  MessageBody: JSON.stringify({ orderId: 'ORD-001' }),
  MessageAttributes: {
    EventType: {
      DataType: 'String',
      StringValue: 'ORDER_PLACED',
    },
    Priority: {
      DataType: 'Number',
      StringValue: '1',
    },
    TraceId: {
      DataType: 'String',
      StringValue: 'trace-abc-123',
    },
  },
}));
```
Up to 10 message attributes per message. Attributes count toward the 256 KB size limit.
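Writing the attribute map by hand gets repetitive, so a small converter can help. This is a sketch only: `toMessageAttributes` is a hypothetical helper that handles strings and numbers and leaves binary attributes out.

```javascript
// Converts a plain object into the MessageAttributes shape used above.
// Numbers become DataType 'Number' (still sent via StringValue); everything
// else is sent as a String. Binary attributes are left out of this sketch.
function toMessageAttributes(values) {
  const names = Object.keys(values);
  if (names.length > 10) {
    throw new RangeError(`At most 10 message attributes allowed, got ${names.length}`);
  }
  const attributes = {};
  for (const name of names) {
    const value = values[name];
    attributes[name] = {
      DataType: typeof value === 'number' ? 'Number' : 'String',
      StringValue: String(value),
    };
  }
  return attributes;
}

const messageAttributes = toMessageAttributes({
  EventType: 'ORDER_PLACED',
  Priority: 1,
  TraceId: 'trace-abc-123',
});
```

The result matches the hand-written `MessageAttributes` object in the example above and could be passed in its place.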
Access Control — Queue Policies
SQS queues use resource-based policies to control cross-account or cross-service access:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "sns.amazonaws.com" },
      "Action": "sqs:SendMessage",
      "Resource": "arn:aws:sqs:us-east-1:123:MyQueue",
      "Condition": {
        "ArnEquals": {
          "aws:SourceArn": "arn:aws:sns:us-east-1:123:MyTopic"
        }
      }
    }
  ]
}
```
DVA-C02 Quick Reference
| Topic | Key Fact |
|---|---|
| Standard queue delivery | At-least-once (duplicates possible) |
| FIFO queue delivery | Exactly-once |
| Standard queue ordering | Best-effort (not guaranteed) |
| FIFO queue ordering | Strict per MessageGroupId |
| FIFO throughput | 300 TPS (3,000 with batching) |
| Default visibility timeout | 30 seconds |
| Max visibility timeout | 12 hours |
| Default message retention | 4 days |
| Max message retention | 14 days |
| Max message size | 256 KB |
| Max delay seconds | 900 seconds (15 min) |
| Long polling max wait | 20 seconds |
| DLQ same type required | Standard → Standard DLQ, FIFO → FIFO DLQ |
| Per-message delay | Standard only (not FIFO) |
| Lambda batch size | 1 – 10,000 (FIFO: max 10) |
| Partial batch failure | ReportBatchItemFailures + return batchItemFailures |
| FIFO deduplication window | 5 minutes |
| Extend visibility in-flight | ChangeMessageVisibility |
| Recover DLQ messages | StartMessageMoveTask (redrive) |
Practice Questions
Q1. A Lambda function processes messages from an SQS Standard Queue. After processing, some messages reappear in the queue and are processed multiple times. The developer has confirmed the Lambda is completing successfully. What is the most likely cause?
Q2. A developer is building an order processing system. Each customer's orders must be processed in the exact sequence they were placed. A customer may place multiple orders simultaneously. Which SQS configuration satisfies this requirement?
Q3. An SQS consumer receives a message, begins processing, but crashes before completing. The message visibility timeout is 30 seconds and the processing typically takes 45 seconds. What happens to the message?
Q4. A developer wants to delay all messages in an SQS queue by 90 seconds before they become available to consumers. Individual messages should be able to override this delay. Which SQS features should be used?
Q5. A Lambda function processes SQS messages in batches of 10. Occasionally, 2 messages in a batch fail while the other 8 succeed. Currently, the entire batch is returned to the queue on any failure. How should the developer fix this so that only the failed messages are retried?
Q6. A team wants to fan out a single SQS message to multiple processing pipelines simultaneously. Each pipeline has its own SQS queue. What is the recommended architecture?
Q7. A developer notices that some SQS messages are failing repeatedly and piling up, preventing new messages from being processed in a FIFO queue. What configuration prevents this?