Amazon Kinesis Data Streams
A comprehensive deep dive into Amazon Kinesis Data Streams — shards, producers, consumers, enhanced fan-out, Lambda integration, KCL, resharding, Kinesis Firehose, and real-time streaming patterns for the DVA-C02 exam.
What is Amazon Kinesis Data Streams?
Amazon Kinesis Data Streams (KDS) is a fully managed, real-time data streaming service capable of capturing gigabytes of data per second from thousands of producers. Unlike SQS (which is a queue), Kinesis is a log — records are retained and can be replayed by multiple independent consumers simultaneously.
Core mental model: Kinesis is a highway with multiple lanes (shards). Producers put data onto the highway. Multiple consumers can read the same lanes independently, at their own speed, and replay history. SQS is a one-way door — once a consumer takes a message, it's gone.
When to use Kinesis:
- Real-time analytics on continuous data streams (clickstreams, IoT sensors, logs)
- Multiple independent consumers reading the same data stream
- Need to replay data (re-process historical records)
- Ordered processing per partition key
Core Architecture
Shards — The Unit of Capacity
A shard is the fundamental throughput unit of a Kinesis stream. Every stream is composed of one or more shards.
| Metric | Value per Shard |
|---|---|
| Write throughput | 1 MB/s or 1,000 records/sec (whichever is hit first) |
| Read throughput (shared) | 2 MB/s total, shared across ALL consumers |
| Read throughput (enhanced fan-out) | 2 MB/s per registered consumer |
| Max record size | 1 MB |
| Default retention | 24 hours |
| Max retention | 365 days |
Capacity math:
- 10 producers each writing 100 KB/s = 1 MB/s → need at least 1 shard
- 100 producers each writing 200 KB/s = 20 MB/s → need at least 20 shards
On-Demand vs Provisioned Mode
| Mode | Capacity | Scaling | Cost |
|---|---|---|---|
| Provisioned | You set shard count | Manual (split/merge) | Per shard-hour |
| On-Demand | Auto-scales up to 200 MB/s | Automatic | Per GB ingested + retrieved |
Records
Every record written to Kinesis has three components:
| Component | Description | Size |
|---|---|---|
| Partition Key | String that determines which shard receives the record. Same key → same shard → ordered | Up to 256 bytes |
| Data Blob | Your payload (any bytes — JSON, Avro, raw binary) | Up to 1 MB |
| Sequence Number | Assigned by Kinesis. Unique per shard. Monotonically increasing. | Auto-assigned |
1import { KinesisClient, PutRecordCommand, PutRecordsCommand } from '@aws-sdk/client-kinesis';
2
3const kinesis = new KinesisClient({ region: 'us-east-1' });
4
5// Single record
6await kinesis.send(new PutRecordCommand({
7 StreamName: 'clickstream',
8 PartitionKey: 'user-usr_01J8X', // same user → same shard → ordered
9 Data: Buffer.from(JSON.stringify({
10 userId: 'usr_01J8X',
11 event: 'page_view',
12 url: '/products/laptop',
13 ts: Date.now(),
14 })),
15}));
16
17// Batch — up to 500 records or 5 MB per PutRecords call
18const records = events.map(event => ({
19 PartitionKey: `user-${event.userId}`,
20 Data: Buffer.from(JSON.stringify(event)),
21}));
22
23const result = await kinesis.send(new PutRecordsCommand({
24 StreamName: 'clickstream',
25 Records: records,
26}));
27
28// PutRecords is partial-success — check for failed records
29if (result.FailedRecordCount > 0) {
30 const failed = result.Records
31 .map((r, i) => ({ ...r, original: records[i] }))
32 .filter(r => r.ErrorCode);
33 // Retry failed records with exponential backoff
34}Producers
1. AWS SDK (PutRecord / PutRecords)
Direct API calls. Simple but no built-in batching or retry beyond what you implement.
2. Kinesis Producer Library (KPL)
The KPL is a high-performance producer library that dramatically increases throughput through:
- Aggregation: Combines multiple small records into a single Kinesis record (up to 1 MB)
- Collection: Batches multiple
PutRecordscalls together - Automatic retry with exponential backoff
- Async, non-blocking writes
Trade-off: KPL adds up to 1 second of latency (RecordMaxBufferedTime). If your use case requires sub-second delivery, use the SDK directly.
KPL aggregation: 1,000 records × 100 bytes each = 100 KB → 1 Kinesis record
Effective throughput: 10,000+ records/sec on a single shard
Consumers must use KCL or de-aggregate manually to unpack KPL-aggregated records. Lambda and Firehose handle de-aggregation automatically.
3. Kinesis Agent
A standalone Java application installed on servers (EC2, on-premise). Monitors log files, converts to JSON, and streams to Kinesis or Firehose with automatic rotation and retry.
Consumers
1. Shared (Classic) Throughput
All consumers using GetRecords share the 2 MB/s read throughput per shard.
- Each consumer polls independently using
GetShardIterator+GetRecords - Max 5
GetRecordscalls/sec per shard - Latency: ~200ms average
1import { GetShardIteratorCommand, GetRecordsCommand } from '@aws-sdk/client-kinesis';
2
3// Get iterator starting from the latest position
4const { ShardIterator } = await kinesis.send(new GetShardIteratorCommand({
5 StreamName: 'clickstream',
6 ShardId: 'shardId-000000000000',
7 ShardIteratorType: 'LATEST', // TRIM_HORIZON (oldest), AT_SEQUENCE_NUMBER, AFTER_SEQUENCE_NUMBER, AT_TIMESTAMP
8}));
9
10// Poll for records
11let iterator = ShardIterator;
12while (true) {
13 const { Records, NextShardIterator, MillisBehindLatest } = await kinesis.send(
14 new GetRecordsCommand({ ShardIterator: iterator, Limit: 100 })
15 );
16
17 if (Records.length > 0) {
18 for (const record of Records) {
19 const data = JSON.parse(Buffer.from(record.Data).toString());
20 console.log('Processing:', data, 'Seq:', record.SequenceNumber);
21 }
22 }
23
24 console.log(`${MillisBehindLatest}ms behind latest`);
25 iterator = NextShardIterator;
26 await new Promise(r => setTimeout(r, 1000)); // poll every second
27}2. Enhanced Fan-Out (EFO)
Each registered consumer gets a dedicated 2 MB/s pipe per shard — completely independent of other consumers. Uses HTTP/2 push (SubscribeToShard) instead of polling.
| Feature | Shared | Enhanced Fan-Out |
|---|---|---|
| Throughput | 2 MB/s shared | 2 MB/s per consumer per shard |
| Latency | ~200ms | ~70ms |
| Cost | Included | Extra per shard-consumer-hour + data retrieval |
| Model | Pull (polling) | Push (HTTP/2 push) |
| Max consumers | 5 calls/sec/shard | 20 registered consumers per stream |
1# Register an enhanced fan-out consumer
2aws kinesis register-stream-consumer \
3 --stream-arn arn:aws:kinesis:us-east-1:123:stream/clickstream \
4 --consumer-name analytics-service3. Lambda (Event Source Mapping)
Lambda polls Kinesis on your behalf. Configure key settings:
1// Lambda receives batches per shard
2exports.handler = async (event) => {
3 const failures = [];
4
5 for (const record of event.Records) {
6 try {
7 const data = JSON.parse(Buffer.from(record.kinesis.data, 'base64').toString());
8 console.log('Shard:', record.eventSourceARN.split('/').pop());
9 console.log('Sequence:', record.kinesis.sequenceNumber);
10 console.log('Partition key:', record.kinesis.partitionKey);
11 await processEvent(data);
12 } catch (err) {
13 failures.push({ itemIdentifier: record.kinesis.sequenceNumber });
14 }
15 }
16
17 return { batchItemFailures: failures };
18};Lambda ESM settings for Kinesis:
| Setting | Range | Notes |
|---|---|---|
| Batch size | 1 – 10,000 | Records per invocation |
| Batch window | 0 – 300s | Wait to fill batch |
| Starting position | TRIM_HORIZON / LATEST / AT_TIMESTAMP | Where to start reading |
| Parallelization factor | 1 – 10 | Concurrent batches per shard |
| Bisect on error | true/false | Split failed batch in half to isolate bad record |
| Destination on failure | SQS / SNS | Route unprocessable records |
| Tumbling window | 0 – 900s | Aggregation window for stateful processing |
4. Kinesis Client Library (KCL)
The KCL is a Java library (with connectors for Python, Ruby, Node.js) that manages:
- Shard assignment: Distributes shards across multiple worker instances
- Checkpointing: Stores consumer position in DynamoDB (one table per application)
- Lease management: Workers take leases on shards; dead workers' leases are taken over
- Automatic scaling: Add workers → shards redistributed automatically
One shard = at most one KCL worker at a time. You can have more workers than shards, but only one processes each shard.
Ordering and Partition Keys
Records with the same partition key always go to the same shard — ensuring ordered delivery for that key.
Partition key "user-123" → always Shard 1 → records arrive in order
Partition key "user-456" → always Shard 2 → ordered within Shard 2
Hot shard problem: If one partition key generates disproportionate traffic (e.g., all records from a single high-traffic user), that shard hits its 1 MB/s write limit.
Solution: Use high-cardinality partition keys, or append a random suffix for non-ordered writes:
1// High cardinality — spread across shards
2const partitionKey = `user-${userId}-${Date.now() % 100}`;Resharding
Split a shard when throughput approaches its limit. Merge shards when traffic drops to reduce cost.
1# Split shard — specify the hash key to split at
2aws kinesis split-shard \
3 --stream-name clickstream \
4 --shard-to-split shardId-000000000000 \
5 --new-starting-hash-key 170141183460469231731687303715884105728
6
7# Merge two adjacent shards
8aws kinesis merge-shards \
9 --stream-name clickstream \
10 --shard-to-merge shardId-000000000000 \
11 --adjacent-shard-to-merge shardId-000000000001Resharding is sequential — only one split or merge operation at a time. After resharding, the parent shard is closed but still readable until all records expire. Consumers must finish reading the parent before moving to children.
Data Retention and Replay
| Retention | Cost | Use case |
|---|---|---|
| 24 hours (default) | Included in shard-hour | Short-term processing |
| Up to 7 days | Standard rate | Extended processing window |
| Up to 365 days | Long-term retention rate | Compliance, audit, replay |
Replay: Restart a consumer from TRIM_HORIZON (oldest available record) or any AT_TIMESTAMP to reprocess historical data — impossible with SQS.
Kinesis Data Firehose
Firehose is a fully managed delivery stream — no consumer code needed. It buffers records and delivers batches to destinations.
| Feature | Kinesis Data Streams | Kinesis Firehose |
|---|---|---|
| Consumer management | You manage consumers | Fully managed — no code |
| Delivery latency | Real-time (milliseconds) | 60s minimum (buffer) |
| Destinations | Custom (Lambda, KCL, etc.) | S3, Redshift, OpenSearch, Splunk, HTTP |
| Data transformation | You implement | Lambda transform built-in |
| Replay | ✅ | ❌ |
| Scaling | Manual (shards) | Automatic |
| Pricing | Per shard-hour | Per GB ingested |
Firehose buffer settings:
- Buffer size: 1 MB – 128 MB (deliver when buffer is full)
- Buffer interval: 60s – 900s (deliver when timer expires)
- Whichever condition is met first triggers delivery
Kinesis vs SQS vs SNS
| Feature | Kinesis | SQS | SNS |
|---|---|---|---|
| Message retention | Up to 365 days | Up to 14 days | None (push only) |
| Replay | ✅ | ❌ | ❌ |
| Ordering | Per shard/partition key | FIFO queue only | FIFO topic only |
| Multiple consumers | ✅ simultaneously | ❌ one at a time | ✅ (fan-out) |
| Throughput | Shard-based, scalable | Unlimited (Standard) | Unlimited |
| Consumer model | Pull (or EFO push) | Pull | Push |
| Real-time | ✅ | Near real-time | ✅ |
| Use case | Streaming analytics, event sourcing | Job queues, decoupling | Notifications, fan-out |
DVA-C02 Quick Reference
| Topic | Key Fact |
|---|---|
| Shard write limit | 1 MB/s or 1,000 records/sec |
| Shard read limit (shared) | 2 MB/s total per shard |
| Shard read limit (EFO) | 2 MB/s per registered consumer per shard |
| Max record size | 1 MB |
| Default retention | 24 hours |
| Max retention | 365 days |
| Max EFO consumers per stream | 20 |
| KPL adds | Aggregation + batching + ~1s latency |
| KCL checkpointing | Stored in DynamoDB |
| Same partition key | Same shard → ordered delivery |
| Lambda bisect on error | Splits failed batch in half to find bad record |
| Firehose min latency | 60 seconds (not real-time) |
| Firehose replay | Not supported |
| On-demand mode | Auto-scales up to 200 MB/s |
| Shard iterator types | TRIM_HORIZON, LATEST, AT_SEQUENCE_NUMBER, AT_TIMESTAMP |
| Resharding | Sequential — one operation at a time |
| Hot shard fix | High-cardinality partition keys |
Practice Questions5
Q1. A Kinesis Data Stream has 4 shards. A producer is writing 6 MB/s of data. Consumers are falling behind. What is the correct action?
Select one answer before revealing.
Q2. A Kinesis stream has multiple Lambda consumers. The developer notices that all consumers are reading the same data, and there is noticeable read latency because they share the 2 MB/s per shard read limit. What is the recommended solution?
Select one answer before revealing.
Q3. A developer uses a partition key of "productCategory" for Kinesis records. The stream has 10 shards but one shard is handling 90% of all traffic. What is the root cause and solution?
Select one answer before revealing.
Q4. A developer needs to archive all Kinesis stream records to S3 in Parquet format, partitioned by date, without writing any custom code. Which service should be used?
Select one answer before revealing.
Q5. A developer uses a Lambda function as a Kinesis consumer. The function processes a batch of 100 records but 3 records fail. With default settings, what happens?
Select one answer before revealing.