Concept

Hard

Amazon Kinesis Data Streams

10 min read · Kinesis · Streaming · Real-time · DVA-C02

A comprehensive deep dive into Amazon Kinesis Data Streams — shards, producers, consumers, enhanced fan-out, Lambda integration, KCL, resharding, Kinesis Firehose, and real-time streaming patterns for the DVA-C02 exam.

What is Amazon Kinesis Data Streams?

Amazon Kinesis Data Streams (KDS) is a fully managed, real-time data streaming service capable of capturing gigabytes of data per second from thousands of producers. Unlike SQS (which is a queue), Kinesis is a log — records are retained and can be replayed by multiple independent consumers simultaneously.

Core mental model: Kinesis is a highway with multiple lanes (shards). Producers put data onto the highway. Multiple consumers can read the same lanes independently, at their own speed, and replay history. SQS is a one-way door — once a consumer takes a message, it's gone.

When to use Kinesis:

Real-time analytics on continuous data streams (clickstreams, IoT sensors, logs)
Multiple independent consumers reading the same data stream
Need to replay data (re-process historical records)
Ordered processing per partition key

Core Architecture


Shards — The Unit of Capacity

A shard is the fundamental throughput unit of a Kinesis stream. Every stream is composed of one or more shards.

Metric	Value per Shard
Write throughput	1 MB/s or 1,000 records/sec (whichever is hit first)
Read throughput (shared)	2 MB/s total, shared across ALL consumers
Read throughput (enhanced fan-out)	2 MB/s per registered consumer
Max record size	1 MB
Default retention	24 hours
Max retention	365 days

Capacity math:

10 producers each writing 100 KB/s = 1 MB/s → need at least 1 shard
100 producers each writing 200 KB/s = 20 MB/s → need at least 20 shards
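The same arithmetic can be wrapped in a small helper. This is a minimal sketch: the function name `shardsNeeded` and its parameters are hypothetical, and it treats 1 MB as 1,000 KB to match the figures above.

```javascript
// Per-shard write limits taken from the table above.
const SHARD_WRITE_KB_PER_SEC = 1000;      // 1 MB/s, treating 1 MB as 1,000 KB (assumption)
const SHARD_WRITE_RECORDS_PER_SEC = 1000; // 1,000 records/s

// Hypothetical helper: minimum shard count for a given aggregate write load.
function shardsNeeded(producers, kbPerSecEach, recordsPerSecEach = 0) {
  const byBytes = Math.ceil((producers * kbPerSecEach) / SHARD_WRITE_KB_PER_SEC);
  const byRecords = Math.ceil((producers * recordsPerSecEach) / SHARD_WRITE_RECORDS_PER_SEC);
  // Whichever limit is hit first decides the shard count; a stream has at least one shard.
  return Math.max(1, byBytes, byRecords);
}

console.log(shardsNeeded(10, 100));     // 1
console.log(shardsNeeded(100, 200));    // 20
console.log(shardsNeeded(10, 10, 500)); // 5 (record rate, not bytes, is the bottleneck)
```

The third call shows why both limits matter: many tiny records can exhaust the 1,000 records/sec limit long before the 1 MB/s limit.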

On-Demand vs Provisioned Mode

Mode	Capacity	Scaling	Cost
Provisioned	You set shard count	Manual (split/merge)	Per shard-hour
On-Demand	Auto-scales up to 200 MB/s of writes by default	Automatic	Per GB ingested + retrieved

Records

Every record written to Kinesis has three components:

Component	Description	Size
Partition Key	Unicode string that determines which shard receives the record (via an MD5 hash of the key). Same key → same shard → ordered	Up to 256 characters
Data Blob	Your payload (any bytes — JSON, Avro, raw binary)	Up to 1 MB
Sequence Number	Assigned by Kinesis. Unique per shard. Monotonically increasing.	Auto-assigned

javascript

import { KinesisClient, PutRecordCommand, PutRecordsCommand } from '@aws-sdk/client-kinesis';

const kinesis = new KinesisClient({ region: 'us-east-1' });

// Single record
await kinesis.send(new PutRecordCommand({
  StreamName: 'clickstream',
  PartitionKey: 'user-usr_01J8X',   // same user → same shard → ordered
  Data: Buffer.from(JSON.stringify({
    userId: 'usr_01J8X',
    event: 'page_view',
    url: '/products/laptop',
    ts: Date.now(),
  })),
}));

// Batch: up to 500 records or 5 MB per PutRecords call.
// `events` is assumed to be an array of objects that each have a `userId` field.
const records = events.map(event => ({
  PartitionKey: `user-${event.userId}`,
  Data: Buffer.from(JSON.stringify(event)),
}));

const result = await kinesis.send(new PutRecordsCommand({
  StreamName: 'clickstream',
  Records: records,
}));

// PutRecords can partially succeed: the response has one entry per input record, in order
if (result.FailedRecordCount > 0) {
  // Keep only the original entries whose response entry carries an ErrorCode
  const failed = records.filter((_, i) => result.Records[i].ErrorCode);
  // Retry `failed` with exponential backoff
}
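One way to fill in the retry step is sketched below. The helper `putWithRetry`, its `sendBatch` parameter, and the backoff constants are all hypothetical; `sendBatch` is assumed to wrap a PutRecords call and resolve to its response.

```javascript
// Hypothetical helper: resend only the failed entries of a PutRecords call.
// `sendBatch(records)` is assumed to resolve to a PutRecords-style response:
// { Records: [{ SequenceNumber } or { ErrorCode }, ...] } in the same order as the input.
async function putWithRetry(sendBatch, records, maxAttempts = 5, baseDelayMs = 100) {
  let pending = records;
  for (let attempt = 0; attempt < maxAttempts && pending.length > 0; attempt++) {
    if (attempt > 0) {
      // Exponential backoff with jitter: up to 100 ms, then 200 ms, then 400 ms, ...
      const delay = Math.random() * baseDelayMs * 2 ** (attempt - 1);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
    const result = await sendBatch(pending);
    // Response entries line up with request entries by index.
    pending = pending.filter((_, i) => result.Records[i].ErrorCode);
  }
  return pending; // entries that still failed after all attempts
}
```

With the client from the snippet above, `sendBatch` could be `batch => kinesis.send(new PutRecordsCommand({ StreamName: 'clickstream', Records: batch }))`. Injecting it as a parameter keeps the retry logic testable without AWS credentials. Note that retried records may land after newer records for the same key, so strict ordering is not guaranteed across retries.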

Producers

1. AWS SDK (PutRecord / PutRecords)

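Direct API calls. Simple, but batching records and resending the failed entries of a partially successful PutRecords call are up to you (the SDK only retries whole requests that fail or are throttled).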

2. Kinesis Producer Library (KPL)

The KPL is a high-performance producer library that dramatically increases throughput through:

Aggregation: Combines multiple small user records into a single Kinesis record (up to 1 MB)
Collection: Batches multiple Kinesis records into a single PutRecords call
Automatic retry with exponential backoff
Async, non-blocking writes

Trade-off: KPL buffers records for up to RecordMaxBufferedTime before sending them (configurable; the default is 100 ms), which adds latency. If your use case cannot tolerate that buffering delay, use the SDK directly.

KPL aggregation: 1,000 records × 100 bytes each = 100 KB → 1 Kinesis record
Effective throughput: 10,000+ records/sec on a single shard
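The effect of aggregation on per-shard throughput can be checked with a few lines of arithmetic. This is a rough sketch only: `userRecordsPerSec` is a hypothetical helper, 1 MB is taken as 1 MiB, and the overhead of the aggregation format itself is ignored.

```javascript
const SHARD_BYTES_PER_SEC = 1024 * 1024; // 1 MB/s write limit (taken as 1 MiB, an assumption)
const SHARD_RECORDS_PER_SEC = 1000;      // limit on Kinesis records, not on user records

// Hypothetical helper: user records per second that one shard can absorb.
function userRecordsPerSec(userRecordBytes, aggregated) {
  const byBytes = Math.floor(SHARD_BYTES_PER_SEC / userRecordBytes);
  // Without aggregation each user record is its own Kinesis record,
  // so the 1,000 records/s limit applies as well.
  return aggregated ? byBytes : Math.min(byBytes, SHARD_RECORDS_PER_SEC);
}

console.log(userRecordsPerSec(100, false)); // 1000
console.log(userRecordsPerSec(100, true));  // 10485
```

For 100-byte records the record-count limit is the bottleneck without aggregation; with aggregation only the byte limit remains.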

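Consumers must use KCL or de-aggregate manually to unpack KPL-aggregated records. Firehose de-aggregates automatically when it reads from a data stream. A Lambda consumer receives the aggregated record as-is and has to de-aggregate it in code (AWS publishes de-aggregation modules for this).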

3. Kinesis Agent

A standalone Java application installed on servers (EC2, on-premises). It monitors log files and streams new lines to Kinesis or Firehose, handling file rotation, checkpointing, and retries. It can optionally pre-process records, for example by converting them to JSON.

Consumers

1. Shared (Classic) Throughput

All consumers using GetRecords share the 2 MB/s read throughput per shard.

Each consumer polls independently using GetShardIterator + GetRecords
Max 5 GetRecords calls/sec per shard
Latency: ~200ms average

javascript

import { KinesisClient, GetShardIteratorCommand, GetRecordsCommand } from '@aws-sdk/client-kinesis';

const kinesis = new KinesisClient({ region: 'us-east-1' });

// Get iterator starting from the latest position
const { ShardIterator } = await kinesis.send(new GetShardIteratorCommand({
  StreamName: 'clickstream',
  ShardId: 'shardId-000000000000',
  ShardIteratorType: 'LATEST',       // or TRIM_HORIZON (oldest), AT_SEQUENCE_NUMBER, AFTER_SEQUENCE_NUMBER, AT_TIMESTAMP
}));

// Poll for records; NextShardIterator is null once a closed shard has been fully read
let iterator = ShardIterator;
while (iterator) {
  const { Records, NextShardIterator, MillisBehindLatest } = await kinesis.send(
    new GetRecordsCommand({ ShardIterator: iterator, Limit: 100 })
  );

  for (const record of Records) {
    const data = JSON.parse(Buffer.from(record.Data).toString());
    console.log('Processing:', data, 'Seq:', record.SequenceNumber);
  }

  console.log(`${MillisBehindLatest}ms behind latest`);
  iterator = NextShardIterator;
  await new Promise(r => setTimeout(r, 1000)); // poll every second
}

2. Enhanced Fan-Out (EFO)

Each registered consumer gets a dedicated 2 MB/s pipe per shard — completely independent of other consumers. Uses HTTP/2 push (SubscribeToShard) instead of polling.

Feature	Shared	Enhanced Fan-Out
Throughput	2 MB/s shared	2 MB/s per consumer per shard
Latency	~200ms	~70ms
Cost	Included	Extra per shard-consumer-hour + data retrieval
Model	Pull (polling)	Push (HTTP/2 push)
Consumer limit	No fixed cap, but all consumers share 5 GetRecords calls/sec per shard	20 registered consumers per stream
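To make the difference concrete, the sketch below estimates what each consumer gets from a single shard in both modes. `perConsumerRead` is a hypothetical helper, and it assumes the shared limits are divided evenly among identical consumers, which real workloads will not do exactly.

```javascript
const SHARD_READ_MB_PER_SEC = 2;     // per-shard read throughput
const GET_RECORDS_CALLS_PER_SEC = 5; // per-shard GetRecords call limit (shared mode)

// Hypothetical helper: per-consumer share of one shard for `consumers` identical consumers.
function perConsumerRead(consumers, enhancedFanOut) {
  if (enhancedFanOut) {
    // Each registered consumer has its own 2 MB/s pipe and is pushed data, so no polling.
    return { mbPerSec: SHARD_READ_MB_PER_SEC, minPollIntervalMs: null };
  }
  return {
    mbPerSec: SHARD_READ_MB_PER_SEC / consumers,
    // 5 calls/s shared by N consumers: each can poll at most once every N * 200 ms.
    minPollIntervalMs: (1000 / GET_RECORDS_CALLS_PER_SEC) * consumers,
  };
}

console.log(perConsumerRead(4, false)); // { mbPerSec: 0.5, minPollIntervalMs: 800 }
console.log(perConsumerRead(4, true));  // { mbPerSec: 2, minPollIntervalMs: null }
```

With four shared consumers, each sees a quarter of the throughput and can poll only every 800 ms, which is why read latency grows as shared consumers are added.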

bash

# Register an enhanced fan-out consumer (123456789012 is a placeholder account ID)
aws kinesis register-stream-consumer \
  --stream-arn arn:aws:kinesis:us-east-1:123456789012:stream/clickstream \
  --consumer-name analytics-service

3. Lambda (Event Source Mapping)

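Lambda polls Kinesis on your behalf and invokes your function with batches of records, one shard per batch. The handler below reports partial batch failures, which takes effect only when ReportBatchItemFailures is enabled on the event source mapping: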

javascript

// Lambda receives batches per shard.
// `processEvent` is assumed to be your own async business-logic function.
exports.handler = async (event) => {
  for (const record of event.Records) {
    try {
      const data = JSON.parse(Buffer.from(record.kinesis.data, 'base64').toString());
      // eventID has the form "<shardId>:<sequenceNumber>"
      console.log('Shard:', record.eventID.split(':')[0]);
      console.log('Sequence:', record.kinesis.sequenceNumber);
      console.log('Partition key:', record.kinesis.partitionKey);
      await processEvent(data);
    } catch (err) {
      // Stop at the first failure: Lambda retries from this sequence number onward,
      // so processing later records now would only duplicate work and break ordering.
      return { batchItemFailures: [{ itemIdentifier: record.kinesis.sequenceNumber }] };
    }
  }

  return { batchItemFailures: [] };
};

Lambda ESM settings for Kinesis:

Setting	Range	Notes
Batch size	1 – 10,000	Records per invocation
Batch window	0 – 300s	Wait to fill batch
Starting position	`TRIM_HORIZON` / `LATEST` / `AT_TIMESTAMP`	Where to start reading
Parallelization factor	1 – 10	Concurrent batches per shard
Bisect on error	true/false	Split failed batch in half to isolate bad record
Destination on failure	SQS / SNS	Route unprocessable records
Tumbling window	0 – 900s	Aggregation window for stateful processing
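The ranges in the table can be checked before creating a mapping. The sketch below is illustrative: `validateEsmSettings` is a hypothetical helper, the ARN and function name are placeholders, and the field names follow the Lambda CreateEventSourceMapping request as I understand it.

```javascript
// Allowed ranges, copied from the table above.
const ESM_RANGES = {
  BatchSize: [1, 10000],
  MaximumBatchingWindowInSeconds: [0, 300],
  ParallelizationFactor: [1, 10],
  TumblingWindowInSeconds: [0, 900],
};

// Hypothetical helper: returns a list of out-of-range settings (empty means valid).
function validateEsmSettings(settings) {
  const errors = [];
  for (const [name, [min, max]] of Object.entries(ESM_RANGES)) {
    const value = settings[name];
    if (value !== undefined && (value < min || value > max)) {
      errors.push(`${name} must be between ${min} and ${max}, got ${value}`);
    }
  }
  return errors;
}

// Example settings; the ARN and function name are placeholders.
const settings = {
  EventSourceArn: 'arn:aws:kinesis:us-east-1:123456789012:stream/clickstream',
  FunctionName: 'process-clickstream',
  StartingPosition: 'TRIM_HORIZON',
  BatchSize: 500,
  MaximumBatchingWindowInSeconds: 5,
  ParallelizationFactor: 2,
  BisectBatchOnFunctionError: true,
  FunctionResponseTypes: ['ReportBatchItemFailures'],
};

console.log(validateEsmSettings(settings));                          // []
console.log(validateEsmSettings({ ...settings, BatchSize: 20000 })); // one BatchSize error
```

With a parallelization factor of 2, Lambda can run up to two concurrent batches per shard, so a 10-shard stream could drive up to 20 concurrent invocations.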

4. Kinesis Client Library (KCL)

The KCL is a Java library (with connectors for Python, Ruby, Node.js) that manages:

Shard assignment: Distributes shards across multiple worker instances
Checkpointing: Stores consumer position in DynamoDB (one table per application)
Lease management: Workers take leases on shards; dead workers' leases are taken over
Automatic scaling: Add workers → shards redistributed automatically

One shard = at most one KCL worker at a time. You can have more workers than shards, but the extra workers sit idle because only one worker processes each shard.
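The one-worker-per-shard rule can be illustrated with a toy model. This is not how the KCL actually balances leases (it coordinates through its DynamoDB lease table); `assignLeases` is a hypothetical round-robin stand-in that only shows the resulting shape.

```javascript
// Toy model: hand out shard leases round-robin. Each shard gets exactly one owner.
function assignLeases(shardIds, workerIds) {
  const leases = Object.fromEntries(workerIds.map(w => [w, []]));
  shardIds.forEach((shard, i) => {
    leases[workerIds[i % workerIds.length]].push(shard);
  });
  return leases;
}

console.log(assignLeases(['shard-0', 'shard-1', 'shard-2', 'shard-3'], ['w1', 'w2']));
// { w1: [ 'shard-0', 'shard-2' ], w2: [ 'shard-1', 'shard-3' ] }

console.log(assignLeases(['shard-0', 'shard-1'], ['w1', 'w2', 'w3']));
// { w1: [ 'shard-0' ], w2: [ 'shard-1' ], w3: [] }
```

In the second call `w3` holds no lease: adding workers beyond the shard count buys standby capacity, not extra parallelism.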

Ordering and Partition Keys

Records with the same partition key always go to the same shard — ensuring ordered delivery for that key.

Partition key "user-123" → always Shard 1 → records arrive in order
Partition key "user-456" → always Shard 2 → ordered within Shard 2
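Under the hood, Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains it. The sketch below reproduces that mapping under a simplifying assumption: the shards split the hash space into equal, contiguous ranges, which may not hold after resharding. `hashKey` and `shardIndexFor` are hypothetical helper names.

```javascript
const crypto = require('node:crypto');

const HASH_SPACE = 2n ** 128n; // partition keys hash into 0 .. 2^128 - 1

// MD5 of the partition key, read as a 128-bit unsigned integer.
function hashKey(partitionKey) {
  const hex = crypto.createHash('md5').update(partitionKey, 'utf8').digest('hex');
  return BigInt('0x' + hex);
}

// Hypothetical helper: index of the owning shard, assuming `shardCount` shards
// that divide the hash space into equal, contiguous ranges.
function shardIndexFor(partitionKey, shardCount) {
  const count = BigInt(shardCount);
  const rangeSize = HASH_SPACE / count;
  const index = hashKey(partitionKey) / rangeSize;
  // The last shard absorbs any remainder when the space does not divide evenly.
  return Number(index >= count ? count - 1n : index);
}

// The mapping is deterministic: the same key always lands on the same shard.
console.log(shardIndexFor('user-123', 4) === shardIndexFor('user-123', 4)); // true
```

Because the mapping depends only on the key, ordering holds per key but not across keys that happen to share a shard.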

Hot shard problem: If one partition key generates disproportionate traffic (e.g., all records from a single high-traffic user), that shard hits its 1 MB/s write limit.

Solution: Use high-cardinality partition keys, or append a random suffix for non-ordered writes:

javascript

// Spread one hot key across up to 100 sub-keys (this gives up per-user ordering)
const partitionKey = `user-${userId}-${Math.floor(Math.random() * 100)}`;

Resharding

Split a shard when throughput approaches its limit. Merge shards when traffic drops to reduce cost.

bash

# Split shard: specify the hash key to split at
aws kinesis split-shard \
  --stream-name clickstream \
  --shard-to-split shardId-000000000000 \
  --new-starting-hash-key 170141183460469231731687303715884105728

# Merge two adjacent shards
aws kinesis merge-shards \
  --stream-name clickstream \
  --shard-to-merge shardId-000000000000 \
  --adjacent-shard-to-merge shardId-000000000001

Resharding is sequential — only one split or merge operation at a time. After resharding, the parent shard is closed but still readable until all records expire. Consumers must finish reading the parent before moving to children.
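The hash key used in the split command above is the midpoint of the parent shard's hash key range. A minimal sketch of that calculation follows; `midpointHashKey` is a hypothetical helper, and the range bounds are assumed to come from the shard's `StartingHashKey` and `EndingHashKey` as reported by `aws kinesis describe-stream`.

```javascript
// Hypothetical helper: hash key that splits a shard's range into two equal halves.
// Takes and returns decimal strings, matching how the CLI prints hash keys.
function midpointHashKey(startingHashKey, endingHashKey) {
  const start = BigInt(startingHashKey);
  const end = BigInt(endingHashKey);
  // BigInt is required: these values far exceed Number.MAX_SAFE_INTEGER.
  return (start + (end - start + 1n) / 2n).toString();
}

// A single-shard stream covers the whole range 0 .. 2^128 - 1.
console.log(midpointHashKey('0', '340282366920938463463374607431768211455'));
// 170141183460469231731687303715884105728
```

An even split is only a starting point. If traffic is skewed toward one end of the range, a different split point may balance the children better.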

Data Retention and Replay

Retention	Cost	Use case
24 hours (default)	Included in shard-hour	Short-term processing
Up to 7 days	Extended retention rate (extra per shard-hour)	Extended processing window
Up to 365 days	Long-term retention rate (per GB-month stored)	Compliance, audit, replay

Replay: Restart a consumer from TRIM_HORIZON (oldest available record) or from a point in time with AT_TIMESTAMP to reprocess historical data. This is impossible with SQS once messages have been deleted.
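Choosing between the two iterator types can be sketched as a small helper that builds GetShardIterator parameters. `replayIteratorParams` is a hypothetical name, and the fallback to TRIM_HORIZON for timestamps older than the retention window is a design choice of this sketch, not service behavior.

```javascript
// Hypothetical helper: GetShardIterator parameters for replaying from `since`.
// Falls back to TRIM_HORIZON when `since` is older than the retention window.
function replayIteratorParams(streamName, shardId, since, retentionHours = 24, now = new Date()) {
  const oldest = new Date(now.getTime() - retentionHours * 60 * 60 * 1000);
  const base = { StreamName: streamName, ShardId: shardId };
  if (since <= oldest) {
    return { ...base, ShardIteratorType: 'TRIM_HORIZON' };
  }
  return { ...base, ShardIteratorType: 'AT_TIMESTAMP', Timestamp: since };
}

const now = new Date('2024-01-02T00:00:00Z');
console.log(replayIteratorParams('clickstream', 'shardId-000000000000',
  new Date('2024-01-01T12:00:00Z'), 24, now).ShardIteratorType); // AT_TIMESTAMP
console.log(replayIteratorParams('clickstream', 'shardId-000000000000',
  new Date('2023-12-30T00:00:00Z'), 24, now).ShardIteratorType); // TRIM_HORIZON
```

The returned object can be passed to `GetShardIteratorCommand`. Each shard needs its own iterator, so a full replay repeats this per shard.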

Kinesis Data Firehose

Firehose (renamed Amazon Data Firehose in 2024) is a fully managed delivery stream, so no consumer code is needed. It buffers records and delivers batches to destinations.


Feature	Kinesis Data Streams	Kinesis Firehose
Consumer management	You manage consumers	Fully managed — no code
Delivery latency	Real-time (milliseconds)	Near real-time (buffered; 60s minimum unless zero buffering is configured)
Destinations	Custom (Lambda, KCL, etc.)	S3, Redshift, OpenSearch, Splunk, HTTP
Data transformation	You implement	Lambda transform built-in
Replay	✅	❌
Scaling	Manual in provisioned mode, automatic in on-demand mode	Automatic
Pricing	Per shard-hour (provisioned) or per GB (on-demand)	Per GB ingested

Firehose buffer settings:

Buffer size: 1 MB – 128 MB (deliver when buffer is full)
Buffer interval: 60s – 900s (deliver when timer expires); newer Firehose configurations also accept 0s for zero buffering
Whichever condition is met first triggers delivery
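The "whichever comes first" rule amounts to a logical OR of the two thresholds. The sketch below models it with a hypothetical `shouldFlush` function; the `hints` object and its field names are illustrative and are not the Firehose API.

```javascript
// Hypothetical model of the buffering rule: flush when either threshold is reached.
function shouldFlush(bufferedBytes, secondsSinceLastFlush, hints) {
  const sizeReached = bufferedBytes >= hints.sizeMB * 1024 * 1024;
  const intervalReached = secondsSinceLastFlush >= hints.intervalSec;
  return sizeReached || intervalReached;
}

const hints = { sizeMB: 5, intervalSec: 300 };
console.log(shouldFlush(6 * 1024 * 1024, 10, hints)); // true  (size reached first)
console.log(shouldFlush(1024, 300, hints));           // true  (interval reached first)
console.log(shouldFlush(1024, 10, hints));            // false (neither reached)
```

On a low-traffic stream the interval usually fires first, so delivery latency is set by the interval. On a busy stream the size threshold dominates.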

Kinesis vs SQS vs SNS

Feature	Kinesis	SQS	SNS
Message retention	Up to 365 days	Up to 14 days	None (push only)
Replay	✅	❌	❌
Ordering	Per shard/partition key	FIFO queue only	FIFO topic only
Multiple consumers	✅ simultaneously	❌ one at a time	✅ (fan-out)
Throughput	Shard-based, scalable	Unlimited (Standard)	Unlimited
Consumer model	Pull (or EFO push)	Pull	Push
Real-time	✅	Near real-time	✅
Use case	Streaming analytics, event sourcing	Job queues, decoupling	Notifications, fan-out

DVA-C02 Quick Reference

Topic	Key Fact
Shard write limit	1 MB/s or 1,000 records/sec
Shard read limit (shared)	2 MB/s total per shard
Shard read limit (EFO)	2 MB/s per registered consumer per shard
Max record size	1 MB
Default retention	24 hours
Max retention	365 days
Max EFO consumers per stream	20
KPL adds	Aggregation + batching + buffering latency (RecordMaxBufferedTime, default 100 ms)
KCL checkpointing	Stored in DynamoDB
Same partition key	Same shard → ordered delivery
Lambda bisect on error	Splits failed batch in half to find bad record
Firehose min latency	60 seconds unless zero buffering is configured (near real-time, not real-time)
Firehose replay	Not supported
On-demand mode	Auto-scales up to 200 MB/s
Shard iterator types	`TRIM_HORIZON`, `LATEST`, `AT_SEQUENCE_NUMBER`, `AFTER_SEQUENCE_NUMBER`, `AT_TIMESTAMP`
Resharding	Sequential — one operation at a time
Hot shard fix	High-cardinality partition keys

Practice Questions (7)

easy

Q1. A developer must ingest clickstream events from thousands of web clients in real time. Multiple independent consumer applications need to read the same stream of records, and records must be replayable for 24 hours. Which service fits these requirements?