
Amazon Kinesis Data Streams

10 min read · Kinesis · Streaming · Real-time · DVA-C02

A comprehensive deep dive into Amazon Kinesis Data Streams — shards, producers, consumers, enhanced fan-out, Lambda integration, KCL, resharding, Kinesis Firehose, and real-time streaming patterns for the DVA-C02 exam.


What is Amazon Kinesis Data Streams?

Amazon Kinesis Data Streams (KDS) is a fully managed, real-time data streaming service capable of capturing gigabytes of data per second from thousands of producers. Unlike SQS (which is a queue), Kinesis is a log — records are retained and can be replayed by multiple independent consumers simultaneously.

Core mental model: Kinesis is a highway with multiple lanes (shards). Producers put data onto the highway. Multiple consumers can read the same lanes independently, at their own speed, and replay history. SQS is a one-way door — once a consumer takes a message, it's gone.

When to use Kinesis:

  • Real-time analytics on continuous data streams (clickstreams, IoT sensors, logs)
  • Multiple independent consumers reading the same data stream
  • Need to replay data (re-process historical records)
  • Ordered processing per partition key

Core Architecture


Shards — The Unit of Capacity

A shard is the fundamental throughput unit of a Kinesis stream. Every stream is composed of one or more shards.

| Metric | Value per Shard |
| --- | --- |
| Write throughput | 1 MB/s or 1,000 records/sec (whichever is hit first) |
| Read throughput (shared) | 2 MB/s total, shared across ALL consumers |
| Read throughput (enhanced fan-out) | 2 MB/s per registered consumer |
| Max record size | 1 MB |
| Default retention | 24 hours |
| Max retention | 365 days |

Capacity math:

  • 10 producers each writing 100 KB/s = 1 MB/s → need at least 1 shard
  • 100 producers each writing 200 KB/s = 20 MB/s → need at least 20 shards
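
The capacity math above can be sketched as a small helper. This is an illustrative calculation, not an AWS API: `requiredShards` and its parameter names are hypothetical, and it assumes the per-shard write limits from the table (1 MB/s, treated as 1,000 KB/s as in the bullets, and 1,000 records/sec).

```javascript
// Per-shard write limits from the table above (1 MB treated as 1,000 KB).
const WRITE_KB_PER_SEC_PER_SHARD = 1000;
const WRITE_RECORDS_PER_SEC_PER_SHARD = 1000;

// Hypothetical helper: shards needed to absorb a given write load.
// The stricter of the two limits (bytes or record count) decides.
function requiredShards({ writeKBPerSec, recordsPerSec }) {
  const byBytes = Math.ceil(writeKBPerSec / WRITE_KB_PER_SEC_PER_SHARD);
  const byRecords = Math.ceil(recordsPerSec / WRITE_RECORDS_PER_SEC_PER_SHARD);
  return Math.max(1, byBytes, byRecords);
}

// 10 producers x 100 KB/s, one record per second each
console.log(requiredShards({ writeKBPerSec: 10 * 100, recordsPerSec: 10 }));   // 1
// 100 producers x 200 KB/s, one record per second each
console.log(requiredShards({ writeKBPerSec: 100 * 200, recordsPerSec: 100 })); // 20
```

Note that many small records can make the records/sec limit the binding one even when the byte rate is low.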

On-Demand vs Provisioned Mode

| Mode | Capacity | Scaling | Cost |
| --- | --- | --- | --- |
| Provisioned | You set shard count | Manual (split/merge) | Per shard-hour |
| On-Demand | Auto-scales up to 200 MB/s of writes (default quota) | Automatic | Per GB ingested + retrieved |

Records

Every record written to Kinesis has three components:

| Component | Description | Size |
| --- | --- | --- |
| Partition Key | String that determines which shard receives the record. Same key → same shard → ordered | Up to 256 characters |
| Data Blob | Your payload (any bytes: JSON, Avro, raw binary) | Up to 1 MB |
| Sequence Number | Assigned by Kinesis. Unique per shard. Increases over time. | Auto-assigned |
```javascript
// Top-level await requires an ES module (.mjs or "type": "module")
import { KinesisClient, PutRecordCommand, PutRecordsCommand } from '@aws-sdk/client-kinesis';

const kinesis = new KinesisClient({ region: 'us-east-1' });

// Single record
await kinesis.send(new PutRecordCommand({
  StreamName: 'clickstream',
  PartitionKey: 'user-usr_01J8X',   // same user → same shard → ordered
  Data: Buffer.from(JSON.stringify({
    userId: 'usr_01J8X',
    event: 'page_view',
    url: '/products/laptop',
    ts: Date.now(),
  })),
}));

// Sample input for the batch call (illustrative values)
const events = [
  { userId: 'usr_01J8X', event: 'page_view', url: '/products/laptop', ts: Date.now() },
  { userId: 'usr_01J8X', event: 'page_view', url: '/products/laptop', ts: Date.now() },
];

// Batch: up to 500 records or 5 MB per PutRecords call
const records = events.map(event => ({
  PartitionKey: `user-${event.userId}`,
  Data: Buffer.from(JSON.stringify(event)),
}));

const result = await kinesis.send(new PutRecordsCommand({
  StreamName: 'clickstream',
  Records: records,
}));

// PutRecords can partially succeed, so check for failed records.
// Response entries are in the same order as the request entries.
if (result.FailedRecordCount > 0) {
  const failed = result.Records
    .map((r, i) => ({ ...r, original: records[i] }))
    .filter(r => r.ErrorCode);
  // Retry failed[i].original with exponential backoff
}
```
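
The retry step that the snippet leaves as a comment could look roughly like the sketch below. `putRecordsWithRetry`, `sendBatch`, and the option names are hypothetical; the only assumption about Kinesis is the PutRecords response shape (`FailedRecordCount`, plus a `Records` array that is index-aligned with the request and carries `ErrorCode` on failed entries).

```javascript
// Default sleep; callers or tests can inject their own.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper. `sendBatch(records)` is assumed to resolve to a
// PutRecords-shaped response: { FailedRecordCount, Records: [{ ErrorCode? }] }.
async function putRecordsWithRetry(sendBatch, records, options = {}) {
  const { maxAttempts = 5, baseDelayMs = 100, sleep = defaultSleep } = options;
  let pending = records;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await sendBatch(pending);
    if (!result.FailedRecordCount) return []; // everything was accepted

    // Keep only the inputs whose response entry carries an ErrorCode.
    pending = pending.filter((_, i) => result.Records[i].ErrorCode);

    if (attempt < maxAttempts) {
      // Exponential backoff: 100 ms, 200 ms, 400 ms, ...
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
  return pending; // records that still failed after the last attempt
}
```

With the SDK client from the snippet above, `sendBatch` could be `(batch) => kinesis.send(new PutRecordsCommand({ StreamName: 'clickstream', Records: batch }))`. Retrying only the failed entries avoids writing duplicates of records that were already accepted.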

Producers

1. AWS SDK (PutRecord / PutRecords)

Direct API calls. Simple, but there is no built-in batching, and records rejected inside a partially successful PutRecords call are not retried for you.

2. Kinesis Producer Library (KPL)

The KPL is a high-performance producer library that dramatically increases throughput through:

  • Aggregation: Combines multiple small records into a single Kinesis record (up to 1 MB)
  • Collection: Batches multiple Kinesis records into a single PutRecords call
  • Automatic retry with exponential backoff
  • Async, non-blocking writes

Trade-off: KPL buffers records before sending, which adds latency of up to RecordMaxBufferedTime (configurable; the library default is 100 ms). If your use case cannot tolerate that buffering delay, use the SDK directly.

KPL aggregation: 1,000 records × 100 bytes each = 100 KB → 1 Kinesis record
Effective throughput: 10,000+ records/sec on a single shard

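Consumers must use KCL or de-aggregate manually to unpack KPL-aggregated records. Firehose de-aggregates automatically when it reads from a data stream; a Lambda consumer receives the aggregated record and needs a de-aggregation library in the function code.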
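
The aggregation arithmetic above can be checked with a short calculation. This is a rough sketch: `maxUserRecordsPerSec` is a hypothetical helper, 1 MB is treated as 1,000,000 bytes, and the overhead of partition keys and the aggregation format is ignored.

```javascript
const SHARD_WRITE_BYTES_PER_SEC = 1_000_000; // 1 MB/s, decimal for simplicity
const SHARD_WRITE_RECORDS_PER_SEC = 1000;

// Hypothetical helper: user records per second that fit on one shard.
function maxUserRecordsPerSec(userRecordBytes, { aggregated }) {
  const byBytes = Math.floor(SHARD_WRITE_BYTES_PER_SEC / userRecordBytes);
  // Without aggregation every user record is its own Kinesis record and
  // counts against the 1,000 records/sec limit. With aggregation many user
  // records share one Kinesis record, so only the byte limit binds here.
  return aggregated ? byBytes : Math.min(byBytes, SHARD_WRITE_RECORDS_PER_SEC);
}

console.log(maxUserRecordsPerSec(100, { aggregated: false })); // 1000
console.log(maxUserRecordsPerSec(100, { aggregated: true }));  // 10000
```

For 100-byte records the record-count limit, not the byte limit, is what caps an unaggregated producer, which is why aggregation helps most with small records.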

3. Kinesis Agent

A standalone Java application installed on servers (EC2 or on-premises). It monitors log files, can convert records to JSON, and streams them to Kinesis or Firehose, handling file rotation and retries automatically.


Consumers

1. Shared (Classic) Throughput

All consumers using GetRecords share the 2 MB/s read throughput per shard.

  • Each consumer polls independently using GetShardIterator + GetRecords
  • Max 5 GetRecords calls/sec per shard
  • Latency: ~200ms average
```javascript
import { KinesisClient, GetShardIteratorCommand, GetRecordsCommand } from '@aws-sdk/client-kinesis';

const kinesis = new KinesisClient({ region: 'us-east-1' });

// Get an iterator starting from the latest position
const { ShardIterator } = await kinesis.send(new GetShardIteratorCommand({
  StreamName: 'clickstream',
  ShardId: 'shardId-000000000000',
  ShardIteratorType: 'LATEST',       // or TRIM_HORIZON (oldest), AT_SEQUENCE_NUMBER, AFTER_SEQUENCE_NUMBER, AT_TIMESTAMP
}));

// Poll for records. NextShardIterator is absent once a closed shard
// has been read to the end, which ends the loop.
let iterator = ShardIterator;
while (iterator) {
  const { Records, NextShardIterator, MillisBehindLatest } = await kinesis.send(
    new GetRecordsCommand({ ShardIterator: iterator, Limit: 100 })
  );

  for (const record of Records) {
    // record.Data is a byte array; this example assumes JSON payloads
    const data = JSON.parse(Buffer.from(record.Data).toString());
    console.log('Processing:', data, 'Seq:', record.SequenceNumber);
  }

  console.log(`${MillisBehindLatest}ms behind latest`);
  iterator = NextShardIterator;
  await new Promise(r => setTimeout(r, 1000)); // poll every second
}
```

2. Enhanced Fan-Out (EFO)

Each registered consumer gets a dedicated 2 MB/s pipe per shard — completely independent of other consumers. Uses HTTP/2 push (SubscribeToShard) instead of polling.

| Feature | Shared | Enhanced Fan-Out |
| --- | --- | --- |
| Throughput | 2 MB/s shared | 2 MB/s per consumer per shard |
| Latency | ~200ms | ~70ms |
| Cost | Included | Extra per shard-consumer-hour + data retrieval |
| Model | Pull (polling) | Push (HTTP/2 push) |
| Limits | 5 GetRecords calls/sec per shard, shared by all consumers | 20 registered consumers per stream |
```bash
# Register an enhanced fan-out consumer
aws kinesis register-stream-consumer \
  --stream-arn arn:aws:kinesis:us-east-1:123:stream/clickstream \
  --consumer-name analytics-service
```
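
To see why the consumer count matters, the per-shard limits above can be turned into a per-consumer budget. This is a simplified model: `perConsumerReadBudget` is a hypothetical helper, and it assumes shared consumers split the shard's read limits evenly, which real workloads only approximate.

```javascript
const SHARD_READ_MB_PER_SEC = 2;
const GET_RECORDS_CALLS_PER_SEC_PER_SHARD = 5;

// Hypothetical helper: what each consumer can expect from a single shard.
function perConsumerReadBudget(consumerCount, mode) {
  if (mode === 'enhanced-fan-out') {
    // Each registered consumer has its own 2 MB/s pipe; data is pushed, not polled.
    return { mbPerSec: SHARD_READ_MB_PER_SEC, minPollIntervalMs: null };
  }
  // Shared mode: all consumers draw from the same 2 MB/s and 5 calls/sec.
  return {
    mbPerSec: SHARD_READ_MB_PER_SEC / consumerCount,
    minPollIntervalMs: (1000 * consumerCount) / GET_RECORDS_CALLS_PER_SEC_PER_SHARD,
  };
}

console.log(perConsumerReadBudget(4, 'shared'));
// { mbPerSec: 0.5, minPollIntervalMs: 800 }
console.log(perConsumerReadBudget(4, 'enhanced-fan-out'));
// { mbPerSec: 2, minPollIntervalMs: null }
```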

3. Lambda (Event Source Mapping)

Lambda polls Kinesis on your behalf. Configure key settings:

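```javascript
// Lambda receives batches per shard.
// Returning batchItemFailures only takes effect when ReportBatchItemFailures
// is enabled on the event source mapping.
exports.handler = async (event) => {
  for (const record of event.Records) {
    try {
      const data = JSON.parse(Buffer.from(record.kinesis.data, 'base64').toString());
      // eventID has the form "<shardId>:<sequenceNumber>"
      console.log('Shard:', record.eventID.split(':')[0]);
      console.log('Sequence:', record.kinesis.sequenceNumber);
      console.log('Partition key:', record.kinesis.partitionKey);
      await processEvent(data); // processEvent is your own handler (not defined here)
    } catch (err) {
      console.error('Failed record:', record.kinesis.sequenceNumber, err);
      // Lambda retries from the lowest failed sequence number, so stop here
      // instead of processing later records out of order.
      return { batchItemFailures: [{ itemIdentifier: record.kinesis.sequenceNumber }] };
    }
  }

  return { batchItemFailures: [] };
};
```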

Lambda ESM settings for Kinesis:

| Setting | Range | Notes |
| --- | --- | --- |
| Batch size | 1 – 10,000 | Records per invocation |
| Batch window | 0 – 300s | Wait to fill batch |
| Starting position | TRIM_HORIZON / LATEST / AT_TIMESTAMP | Where to start reading |
| Parallelization factor | 1 – 10 | Concurrent batches per shard |
| Bisect on error | true/false | Split failed batch in half to isolate bad record |
| Destination on failure | SQS / SNS | Receives metadata about batches that could not be processed |
| Tumbling window | 0 – 900s | Aggregation window for stateful processing |
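
The settings in the table map onto the parameters of Lambda's CreateEventSourceMapping API. The sketch below builds such a parameter object and checks the ranges from the table; `buildKinesisEsmParams` and its argument names are hypothetical, and the chosen defaults are illustrative rather than recommendations.

```javascript
// Hypothetical helper: validates the ranges from the table and returns an
// object in the shape expected by CreateEventSourceMapping.
function buildKinesisEsmParams({
  functionName,
  streamArn,
  batchSize = 100,
  batchWindowSec = 0,
  parallelizationFactor = 1,
  onFailureArn, // optional SQS queue or SNS topic ARN
}) {
  const inRange = (v, lo, hi) => Number.isInteger(v) && v >= lo && v <= hi;
  if (!inRange(batchSize, 1, 10000)) throw new RangeError('batchSize must be 1-10000');
  if (!inRange(batchWindowSec, 0, 300)) throw new RangeError('batchWindowSec must be 0-300');
  if (!inRange(parallelizationFactor, 1, 10)) {
    throw new RangeError('parallelizationFactor must be 1-10');
  }

  const params = {
    FunctionName: functionName,
    EventSourceArn: streamArn,
    StartingPosition: 'TRIM_HORIZON',
    BatchSize: batchSize,
    MaximumBatchingWindowInSeconds: batchWindowSec,
    ParallelizationFactor: parallelizationFactor,
    BisectBatchOnFunctionError: true,
    // Needed for the handler's batchItemFailures response to be honored.
    FunctionResponseTypes: ['ReportBatchItemFailures'],
  };
  if (onFailureArn) {
    params.DestinationConfig = { OnFailure: { Destination: onFailureArn } };
  }
  return params;
}
```

The result could be passed to `CreateEventSourceMappingCommand` from `@aws-sdk/client-lambda`, or mirrored in an infrastructure template.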

4. Kinesis Client Library (KCL)

The KCL is a Java library (with connectors for Python, Ruby, Node.js) that manages:

  • Shard assignment: Distributes shards across multiple worker instances
  • Checkpointing: Stores consumer position in DynamoDB (one table per application)
  • Lease management: Workers take leases on shards; dead workers' leases are taken over
  • Automatic scaling: Add workers → shards redistributed automatically

One shard = at most one KCL worker at a time. You can have more workers than shards, but only one processes each shard.
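
The one-worker-per-shard rule can be illustrated with a toy assignment function. This is an illustration only: the real KCL coordinates ownership through leases stored in DynamoDB, whereas the hypothetical `assignShards` below simply spreads shards round-robin.

```javascript
// Illustration only: gives every shard exactly one owner by assigning
// shards to workers round-robin. Not how KCL balances leases internally.
function assignShards(shardIds, workerIds) {
  const assignment = new Map(workerIds.map((workerId) => [workerId, []]));
  shardIds.forEach((shardId, i) => {
    assignment.get(workerIds[i % workerIds.length]).push(shardId);
  });
  return assignment;
}

// 4 shards, 2 workers: each worker owns 2 shards
console.log(assignShards(['s0', 's1', 's2', 's3'], ['w0', 'w1']));
// 2 shards, 3 workers: the third worker has nothing to process
console.log(assignShards(['s0', 's1'], ['w0', 'w1', 'w2']));
```

The second call shows why adding workers beyond the shard count does not increase throughput: the extra worker sits idle until a shard becomes available.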


Ordering and Partition Keys

Records with the same partition key always go to the same shard — ensuring ordered delivery for that key.

Partition key "user-123" → always Shard 1 → records arrive in order
Partition key "user-456" → always Shard 2 → ordered within Shard 2
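
The reason the same key always lands on the same shard is that Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains it. A minimal sketch of that mapping follows; `hashKeyFor` and `shardIndexFor` are hypothetical names, and the sketch assumes the shards split the hash space into equal contiguous ranges, which is not guaranteed after resharding.

```javascript
const { createHash } = require('node:crypto');

const HASH_SPACE = 2n ** 128n; // partition keys hash into 0 .. 2^128 - 1

// MD5 of the UTF-8 partition key, read as a 128-bit unsigned integer.
function hashKeyFor(partitionKey) {
  const hex = createHash('md5').update(partitionKey, 'utf8').digest('hex');
  return BigInt('0x' + hex);
}

// Assumption: `shardCount` shards with evenly split, contiguous hash ranges.
function shardIndexFor(partitionKey, shardCount) {
  const count = BigInt(shardCount);
  const rangeSize = HASH_SPACE / count;
  const index = hashKeyFor(partitionKey) / rangeSize;
  // The last shard absorbs any remainder of the hash space.
  return Number(index >= count ? count - 1n : index);
}

// The same key always maps to the same index
console.log(shardIndexFor('user-123', 4) === shardIndexFor('user-123', 4)); // true
```

Because the mapping depends only on the key, a single very busy key cannot be spread across shards without changing the key itself.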

Hot shard problem: If one partition key generates disproportionate traffic (e.g., all records from a single high-traffic user), that shard hits its 1 MB/s write limit.

Solution: Use high-cardinality partition keys, or append a random suffix for non-ordered writes:

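```javascript
// Random suffix spreads one hot key across up to 100 different hash values.
// Trade-off: records for the same user are no longer ordered relative to each other.
const partitionKey = `user-${userId}-${Math.floor(Math.random() * 100)}`;
```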

Resharding

Split a shard when throughput approaches its limit. Merge shards when traffic drops to reduce cost.

```bash
# Split shard: specify the hash key to split at
# (the value below is 2^127, the midpoint of the full hash key range)
aws kinesis split-shard \
  --stream-name clickstream \
  --shard-to-split shardId-000000000000 \
  --new-starting-hash-key 170141183460469231731687303715884105728

# Merge two adjacent shards
aws kinesis merge-shards \
  --stream-name clickstream \
  --shard-to-merge shardId-000000000000 \
  --adjacent-shard-to-merge shardId-000000000001
```

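Resharding is sequential: each SplitShard or MergeShards call acts on a single shard (or one adjacent pair), and the stream must return to ACTIVE before the next call. After resharding, the parent shard is closed to new writes but stays readable until its records expire. Consumers must finish reading the parent before moving to its children to preserve ordering.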
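
The hash key passed to a split is usually the midpoint of the parent shard's hash key range, so that both children receive roughly equal traffic. A small sketch of that calculation, using BigInt because the values exceed JavaScript's safe integer range; `midpointHashKey` is a hypothetical helper, and its inputs are assumed to be the decimal strings reported in a shard's hash key range.

```javascript
// Hypothetical helper: hash key that splits a shard's range into two equal
// halves. Inputs and output are decimal strings.
function midpointHashKey(startingHashKey, endingHashKey) {
  const start = BigInt(startingHashKey);
  const end = BigInt(endingHashKey);
  return (start + (end - start + 1n) / 2n).toString();
}

// A single-shard stream covers the full range 0 .. 2^128 - 1
const fullRangeEnd = (2n ** 128n - 1n).toString();
console.log(midpointHashKey('0', fullRangeEnd));
// 170141183460469231731687303715884105728
```

An uneven split point is also valid and can be useful when traffic is concentrated in one part of the range.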


Data Retention and Replay

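| Retention | Cost | Use case |
| --- | --- | --- |
| 24 hours (default) | Included in shard-hour | Short-term processing |
| Up to 7 days | Extended retention rate (additional per shard-hour) | Extended processing window |
| Up to 365 days | Long-term retention rate (per GB-month stored) | Compliance, audit, replay |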

Replay: Restart a consumer from TRIM_HORIZON (oldest available record) or any AT_TIMESTAMP to reprocess historical data — impossible with SQS.
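
In SDK terms, a replay is just a matter of choosing the iterator type when calling GetShardIterator. The sketch below builds the request parameters; `replayIteratorParams` is a hypothetical helper, and the stream and shard names in the example are the placeholder values used earlier on this page.

```javascript
// Hypothetical helper: GetShardIterator parameters for a replay.
// With a Date, reading starts at that point in time; without one,
// it starts at the oldest record still within the retention period.
function replayIteratorParams(streamName, shardId, since) {
  const base = { StreamName: streamName, ShardId: shardId };
  return since instanceof Date
    ? { ...base, ShardIteratorType: 'AT_TIMESTAMP', Timestamp: since }
    : { ...base, ShardIteratorType: 'TRIM_HORIZON' };
}

// Reprocess everything from one hour ago
const oneHourAgo = new Date(Date.now() - 60 * 60 * 1000);
console.log(replayIteratorParams('clickstream', 'shardId-000000000000', oneHourAgo));
```

The returned object could be passed to `GetShardIteratorCommand`, after which the polling loop is the same as for a live consumer.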


Kinesis Data Firehose

Firehose is a fully managed delivery stream — no consumer code needed. It buffers records and delivers batches to destinations.

| Feature | Kinesis Data Streams | Kinesis Firehose |
| --- | --- | --- |
| Consumer management | You manage consumers | Fully managed, no code |
| Delivery latency | Real-time (milliseconds) | 60s minimum (buffer) |
| Destinations | Custom (Lambda, KCL, etc.) | S3, Redshift, OpenSearch, Splunk, HTTP |
| Data transformation | You implement | Lambda transform built-in |
| Replay | ✅ (within the retention period) | ❌ |
| Scaling | Manual (shards) in provisioned mode | Automatic |
| Pricing | Per shard-hour (provisioned) | Per GB ingested |

Firehose buffer settings:

  • Buffer size: 1 MB – 128 MB (deliver when buffer is full)
  • Buffer interval: 60s – 900s (deliver when timer expires)
  • Whichever condition is met first triggers delivery
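
The "whichever comes first" rule can be written as a simple predicate. This is a model of the buffering behavior described above, not Firehose's implementation; `shouldFlush` and its parameter names are hypothetical.

```javascript
// Hypothetical model of the buffering rule: deliver when either the size
// threshold or the time threshold has been reached.
function shouldFlush(bufferedBytes, elapsedSec, { bufferSizeMB, bufferIntervalSec }) {
  const sizeReached = bufferedBytes >= bufferSizeMB * 1024 * 1024;
  const timeReached = elapsedSec >= bufferIntervalSec;
  return sizeReached || timeReached;
}

const settings = { bufferSizeMB: 5, bufferIntervalSec: 300 };
console.log(shouldFlush(5 * 1024 * 1024, 10, settings)); // true  (buffer full)
console.log(shouldFlush(1024, 300, settings));           // true  (timer expired)
console.log(shouldFlush(1024, 10, settings));            // false (keep buffering)
```

On a low-volume stream the interval is usually what triggers delivery, so it effectively sets the end-to-end latency.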

Kinesis vs SQS vs SNS

| Feature | Kinesis | SQS | SNS |
| --- | --- | --- | --- |
| Message retention | Up to 365 days | Up to 14 days | None (push only) |
| Replay | ✅ | ❌ | ❌ |
| Ordering | Per shard/partition key | FIFO queue only | FIFO topic only |
| Multiple consumers | ✅ simultaneously | ❌ one at a time | ✅ (fan-out) |
| Throughput | Shard-based, scalable | Unlimited (Standard) | Unlimited |
| Consumer model | Pull (or EFO push) | Pull | Push |
| Real-time | ✅ | Near real-time | ✅ |
| Use case | Streaming analytics, event sourcing | Job queues, decoupling | Notifications, fan-out |

DVA-C02 Quick Reference

| Topic | Key Fact |
| --- | --- |
| Shard write limit | 1 MB/s or 1,000 records/sec |
| Shard read limit (shared) | 2 MB/s total per shard |
| Shard read limit (EFO) | 2 MB/s per registered consumer per shard |
| Max record size | 1 MB |
| Default retention | 24 hours |
| Max retention | 365 days |
| Max EFO consumers per stream | 20 |
| KPL adds | Aggregation + batching + buffering latency (RecordMaxBufferedTime) |
| KCL checkpointing | Stored in DynamoDB |
| Same partition key | Same shard → ordered delivery |
| Lambda bisect on error | Splits failed batch in half to find bad record |
| Firehose min latency | 60 seconds (not real-time) |
| Firehose replay | Not supported |
| On-demand mode | Auto-scales up to 200 MB/s |
| Shard iterator types | TRIM_HORIZON, LATEST, AT_SEQUENCE_NUMBER, AFTER_SEQUENCE_NUMBER, AT_TIMESTAMP |
| Resharding | Sequential, one operation at a time |
| Hot shard fix | High-cardinality partition keys |

Practice Questions (5)

medium

Q1. A Kinesis Data Stream has 4 shards. A producer is writing 6 MB/s of data. Consumers are falling behind. What is the correct action?



hard

Q2. A Kinesis stream has multiple Lambda consumers. The developer notices that all consumers are reading the same data, and there is noticeable read latency because they share the 2 MB/s per shard read limit. What is the recommended solution?



hard

Q3. A developer uses a partition key of "productCategory" for Kinesis records. The stream has 10 shards but one shard is handling 90% of all traffic. What is the root cause and solution?



easy

Q4. A developer needs to archive all Kinesis stream records to S3 in Parquet format, partitioned by date, without writing any custom code. Which service should be used?



medium

Q5. A developer uses a Lambda function as a Kinesis consumer. The function processes a batch of 100 records but 3 records fail. With default settings, what happens?

