DynamoDB Data Modeling & Operations

A comprehensive deep dive into Amazon DynamoDB — primary keys, capacity modes, expressions, secondary indexes, streams, DAX, transactions, and data modeling patterns for the DVA-C02 exam.

What is Amazon DynamoDB?

Amazon DynamoDB is a fully managed, serverless, key-value and document NoSQL database that delivers single-digit millisecond performance at any scale. AWS handles all hardware provisioning, patching, replication, and backups — you only interact with tables and items.

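Core mental model: DynamoDB is not a relational database. There are no joins and no fixed schema beyond the primary key, and you work through a key-based API rather than general-purpose SQL (PartiQL offers SQL-like syntax over the same operations). Design your access patterns first, then model your data around them.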

When to choose DynamoDB:

Need predictable low latency at massive scale
Traffic is spiky or unpredictable (on-demand mode)
Can define access patterns upfront
Need a fully managed, serverless data store

Tables, Items, and Attributes

DynamoDB term	Comparable to	RDBMS equivalent
Table	Table	Table
Item	Row / document	Row
Attribute	Field	Column
Primary key	Primary key	Primary key

A table is a collection of items. No fixed schema — each item can have different attributes.
An item is a single data record. Maximum size: 400 KB.
An attribute is a name-value pair. Supported types: String (S), Number (N), Binary (B), Boolean (BOOL), Null (NULL), List (L), Map (M), StringSet (SS), NumberSet (NS), BinarySet (BS).

json

{
  "userId": "usr_01J8X",
  "email": "alice@example.com",
  "createdAt": "2024-01-15T10:30:00Z",
  "profile": {
    "name": "Alice Johnson",
    "tier": "premium"
  },
  "tags": ["admin", "beta-tester"]
}
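
On the wire, the low-level DynamoDB API tags every value with one of the type descriptors listed above (S, N, BOOL, L, M, and so on). The sketch below shows roughly how the sample item would be encoded. `toAttributeValue` is a hypothetical, simplified helper written for this illustration, not the SDK's own marshaller, and it deliberately ignores sets and binary values.

```javascript
// Simplified illustration of DynamoDB's typed JSON encoding.
// toAttributeValue is a hypothetical helper, not part of the AWS SDK;
// it skips sets (SS/NS/BS) and binary (B) for brevity.
function toAttributeValue(value) {
  if (value === null) return { NULL: true };
  if (typeof value === 'string') return { S: value };
  if (typeof value === 'number') return { N: String(value) }; // numbers travel as strings
  if (typeof value === 'boolean') return { BOOL: value };
  if (Array.isArray(value)) return { L: value.map(toAttributeValue) };
  if (typeof value === 'object') {
    const entries = Object.entries(value).map(([k, v]) => [k, toAttributeValue(v)]);
    return { M: Object.fromEntries(entries) };
  }
  throw new TypeError(`Unsupported value type: ${typeof value}`);
}

const encoded = toAttributeValue({
  userId: 'usr_01J8X',
  profile: { name: 'Alice Johnson', tier: 'premium' },
  tags: ['admin', 'beta-tester'],
});
console.log(JSON.stringify(encoded.M, null, 2));
```

The document client used in the JavaScript examples performs this conversion for you, which is why those examples pass plain objects.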

Primary Key Design

The primary key is the single most important design decision in DynamoDB. Get it wrong and you will either hit hot partitions or be unable to query your data efficiently.

Option 1 — Partition Key Only (Simple Key)

Every item must have a unique partition key. DynamoDB hashes the PK to determine which physical partition stores the item.

Table: Users — PK: userId (must be unique)

userId	email	name
usr_01J8X	alice@example.com	Alice
usr_02K9Y	bob@example.com	Bob

Option 2 — Partition Key + Sort Key (Composite Key)

The PK + SK combination must be unique. Multiple items can share the same PK — they are stored together and sorted by SK. This enables efficient range queries.

Table: Orders, PK: customerId, SK: orderId (stored as ORDER#<ISO 8601 timestamp>)

customerId	orderId (SK)	amount	status
cust_A	ORDER#2024-01-01T10:00:00	49.99	PAID
cust_A	ORDER#2024-01-15T14:30:00	129.00	SHIPPED
cust_A	ORDER#2024-02-01T09:00:00	19.99	PENDING
cust_B	ORDER#2024-01-10T11:00:00	75.00	PAID

Query pattern enabled: "Give me all orders for customer A, sorted by date" — a single efficient Query operation.
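
Date ordering works here because String sort keys are compared byte by byte, so fixed-width ISO 8601 timestamps sort chronologically. A small sketch of that property, with plain array sorting standing in for DynamoDB's ordering (the `buildOrderSk` helper is hypothetical):

```javascript
// buildOrderSk is a hypothetical helper that produces the ORDER#<timestamp> format.
function buildOrderSk(date) {
  // toISOString() is fixed-width UTC, e.g. 2024-01-15T14:30:00.000Z
  return `ORDER#${date.toISOString()}`;
}

const sortKeys = [
  buildOrderSk(new Date('2024-02-01T09:00:00Z')),
  buildOrderSk(new Date('2024-01-01T10:00:00Z')),
  buildOrderSk(new Date('2024-01-15T14:30:00Z')),
];

// Default string sorting compares UTF-16 code units, which matches byte
// order for these ASCII-only keys.
const ordered = [...sortKeys].sort();

// A range condition on the sort key is then a plain string comparison.
const january = ordered.filter(
  (sk) => sk >= 'ORDER#2024-01-01' && sk < 'ORDER#2024-02-01'
);

console.log(ordered);
console.log(january.length); // 2
```

Timestamps with varying width or a local time zone offset would break this ordering, which is one reason to normalise to UTC before writing.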

Capacity Modes

Provisioned Capacity

You specify Read Capacity Units (RCU) and Write Capacity Units (WCU) in advance. AWS guarantees that throughput. Use Auto Scaling to adjust capacity based on CloudWatch metrics.

Unit	What it covers
1 RCU	1 strongly consistent read of ≤ 4 KB, OR 2 eventually consistent reads of ≤ 4 KB
1 WCU	1 write of ≤ 1 KB
Transactional reads	2 RCU per 4 KB
Transactional writes	2 WCU per 1 KB

Capacity math example: Read 10 KB item with strong consistency = 10 KB ÷ 4 KB = 2.5 → rounded up = 3 RCU
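
The same arithmetic can be written as two small functions. This is a sketch of the rules in the table above, not an AWS API; the function names and option names are made up for illustration.

```javascript
// Capacity-unit arithmetic from the table above, as small pure functions.
const KB = 1024;

function readCapacityUnits(itemBytes, { consistency = 'strong' } = {}) {
  // Reads are billed in 4 KB blocks, with a minimum of one block.
  const blocks = Math.max(1, Math.ceil(itemBytes / (4 * KB)));
  if (consistency === 'eventual') return blocks / 2;
  if (consistency === 'transactional') return blocks * 2;
  return blocks; // strongly consistent
}

function writeCapacityUnits(itemBytes, { transactional = false } = {}) {
  // Writes are billed in 1 KB blocks, with a minimum of one block.
  const blocks = Math.max(1, Math.ceil(itemBytes / KB));
  return transactional ? blocks * 2 : blocks;
}

console.log(readCapacityUnits(10 * KB));                                   // 3
console.log(readCapacityUnits(10 * KB, { consistency: 'eventual' }));      // 1.5
console.log(readCapacityUnits(10 * KB, { consistency: 'transactional' })); // 6
console.log(writeCapacityUnits(2.5 * KB));                                 // 3
```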

bash

aws dynamodb create-table \
  --table-name Orders \
  --attribute-definitions \
      AttributeName=customerId,AttributeType=S \
      AttributeName=orderId,AttributeType=S \
  --key-schema \
      AttributeName=customerId,KeyType=HASH \
      AttributeName=orderId,KeyType=RANGE \
  --billing-mode PROVISIONED \
  --provisioned-throughput ReadCapacityUnits=10,WriteCapacityUnits=5

On-Demand Capacity

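No capacity planning: DynamoDB adapts to traffic automatically and you pay per request. A table can still throttle if traffic jumps far above its previous peak within a short window.

Best for: Unpredictable traffic, new tables where usage is unknown, development/test
Cost: More expensive per request than fully utilized provisioned capacity at sustained load (often quoted as ~6–7×; the exact ratio depends on current pricing)
You can switch between modes once per 24 hours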

Core Read Operations

GetItem — Single Item Lookup

Retrieves exactly one item by its full primary key. Most efficient operation.

javascript

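import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, GetCommand } from '@aws-sdk/lib-dynamodb';

// Document client: converts between plain JavaScript values and DynamoDB types
const docClient = DynamoDBDocumentClient.from(new DynamoDBClient({}));

const result = await docClient.send(new GetCommand({
  TableName: 'Orders',
  Key: {
    customerId: 'cust_A',
    orderId: 'ORDER#2024-01-01T10:00:00',
  },
  ConsistentRead: true,  // strongly consistent: 1 RCU instead of 0.5 for an item ≤ 4 KB
}));

console.log(result.Item);  // undefined if no item has this key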

Query

Query returns all items with the same PK, optionally narrowed by a condition on the SK. Results are always sorted by SK (ascending by default). It is the most common operation for tables with a composite key.

javascript

import { QueryCommand } from '@aws-sdk/lib-dynamodb';

// Get all orders for cust_A in January 2024
// (docClient is a DynamoDBDocumentClient instance)
const result = await docClient.send(new QueryCommand({
  TableName: 'Orders',
  KeyConditionExpression: 'customerId = :cid AND orderId BETWEEN :start AND :end',
  ExpressionAttributeValues: {
    ':cid':   'cust_A',
    ':start': 'ORDER#2024-01-01',
    ':end':   'ORDER#2024-01-31T23:59:59',  // BETWEEN is inclusive and compares strings
  },
  ScanIndexForward: false,  // descending order (newest first)
  Limit: 20,
}));

Scan — Full Table Read

Reads every item in the table (or index). Extremely expensive on large tables. Avoid in production hot paths.

javascript

import { ScanCommand } from '@aws-sdk/lib-dynamodb';

// Parallel scan: splits the table into 4 segments for faster reads
const segment = 0; // Run this concurrently with segments 1, 2, 3
const result = await docClient.send(new ScanCommand({
  TableName: 'Orders',
  TotalSegments: 4,
  Segment: segment,
  FilterExpression: 'amount > :min',
  ExpressionAttributeValues: { ':min': 100 },
}));

Filter vs KeyCondition: FilterExpression is applied after items are read — you are still charged RCUs for all scanned items, even those filtered out. Only KeyConditionExpression reduces RCU consumption.
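
To make the cost difference concrete, here is a rough sketch of how read cost follows the data evaluated rather than the data returned. `queryReadCost` is a hypothetical helper, and the partition contents are invented numbers for illustration.

```javascript
// Read cost of a Query or Scan depends on the data read, not the data returned.
function queryReadCost(itemSizesBytes, { consistentRead = false } = {}) {
  const totalBytes = itemSizesBytes.reduce((sum, size) => sum + size, 0);
  // Item sizes are summed first, then rounded up to the next 4 KB block.
  const blocks = Math.ceil(totalBytes / 4096);
  return consistentRead ? blocks : blocks / 2;
}

// Hypothetical partition: 100 orders of 1 KB each. Even if a FilterExpression
// keeps only 5 of them, all 100 are evaluated and billed.
const evaluated = Array.from({ length: 100 }, () => 1024);

console.log(queryReadCost(evaluated, { consistentRead: true })); // 25
console.log(queryReadCost(evaluated));                           // 12.5
```

Narrowing the key condition so that fewer items are evaluated is what lowers the bill; tightening the filter does not.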

Core Write Operations

javascript

import { PutCommand, UpdateCommand, DeleteCommand } from '@aws-sdk/lib-dynamodb';

// PutItem: creates or completely replaces an item
await docClient.send(new PutCommand({
  TableName: 'Orders',
  Item: {
    customerId: 'cust_A',
    orderId: 'ORDER#2024-03-01T12:00:00',
    amount: 59.99,
    status: 'PENDING',
  },
  ConditionExpression: 'attribute_not_exists(customerId)',  // fail if the item already exists
}));

// UpdateItem: modify specific attributes without replacing the whole item
await docClient.send(new UpdateCommand({
  TableName: 'Orders',
  Key: { customerId: 'cust_A', orderId: 'ORDER#2024-03-01T12:00:00' },
  UpdateExpression: 'SET #s = :status, updatedAt = :ts ADD version :one',
  ConditionExpression: '#s = :pending',  // optimistic check: only update while still PENDING
  ExpressionAttributeNames: { '#s': 'status' },  // status is a reserved word
  // One map holds the values for both the update and the condition
  ExpressionAttributeValues: {
    ':status':  'SHIPPED',
    ':ts':      new Date().toISOString(),
    ':one':     1,
    ':pending': 'PENDING',
  },
}));

// DeleteItem: remove an item
await docClient.send(new DeleteCommand({
  TableName: 'Orders',
  Key: { customerId: 'cust_A', orderId: 'ORDER#2024-03-01T12:00:00' },
  ConditionExpression: 'attribute_exists(customerId)',  // fail if the item does not exist
}));

Expressions Cheat Sheet

DynamoDB uses expression syntax instead of SQL. These are the five expression types:

Expression	Purpose	Used in
KeyConditionExpression	Filter by PK and SK	Query
FilterExpression	Post-read filter on any attribute	Query, Scan
ConditionExpression	Abort write if condition fails	Put, Update, Delete, Transact
UpdateExpression	Specify attribute changes	UpdateItem
ProjectionExpression	Return only specific attributes	All reads

UpdateExpression verbs:

Verb	Example	Purpose
`SET`	`SET #n = :val, profile.tier = :tier`	Set or overwrite an attribute (`#n` aliases `name`, which is a reserved word)
`REMOVE`	`REMOVE tags[0], optionalField`	Delete an attribute or list element
`ADD`	`ADD viewCount :one`	Atomic increment (Number) or union (Set)
`DELETE`	`DELETE tags :removeSet`	Remove elements from a Set attribute

ConditionExpression functions:

Function	Purpose
`attribute_exists(path)`	Attribute must exist
`attribute_not_exists(path)`	Attribute must NOT exist — use for safe inserts
`attribute_type(path, type)`	Check attribute data type
`begins_with(path, substr)`	String prefix check
`contains(path, operand)`	String contains or Set membership
`size(path)`	Length of string, list, map, or set
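
Because reserved words must be aliased, it can help to generate placeholders mechanically instead of writing them by hand. The following is a minimal sketch of that idea; `buildUpdate` is a hypothetical helper (not an SDK function) and it only handles top-level attributes with `SET`.

```javascript
// buildUpdate is a hypothetical helper: it turns { attr: value } into a SET
// expression, aliasing every name so reserved words (status, name, ...) are safe.
function buildUpdate(changes) {
  const names = {};
  const values = {};
  const assignments = Object.entries(changes).map(([attr, value], i) => {
    names[`#a${i}`] = attr;
    values[`:v${i}`] = value;
    return `#a${i} = :v${i}`;
  });
  if (assignments.length === 0) throw new Error('No attributes to update');
  return {
    UpdateExpression: `SET ${assignments.join(', ')}`,
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values,
  };
}

const params = buildUpdate({ status: 'SHIPPED', name: 'Alice' });
console.log(params.UpdateExpression);         // SET #a0 = :v0, #a1 = :v1
console.log(params.ExpressionAttributeNames); // { '#a0': 'status', '#a1': 'name' }
```

The returned object can be spread into the parameters of an update call alongside `TableName` and `Key`.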

Secondary Indexes

When you need to query by attributes other than the primary key, use indexes.

Global Secondary Index (GSI)

Different PK and SK from the base table — enables completely new access patterns
Has its own provisioned throughput (or inherits on-demand)
Only supports eventually consistent reads
Can be added or deleted at any time
Up to 20 GSIs per table
Items appear in the GSI only if they have the GSI's key attributes
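
The last point is what makes sparse indexes possible. As an in-memory sketch only (DynamoDB maintains the index itself; `itemsInIndex` and the sample items are invented for illustration), this is which items would be present in an index keyed on `status` and `createdAt`:

```javascript
// Which items would show up in an index keyed on (status, createdAt)?
// Simulated in memory; DynamoDB performs this selection for you.
function itemsInIndex(items, keyAttributes) {
  return items.filter((item) =>
    keyAttributes.every((attr) => item[attr] !== undefined)
  );
}

const orders = [
  { customerId: 'cust_A', orderId: 'ORDER#1', status: 'PAID', createdAt: '2024-01-01' },
  { customerId: 'cust_A', orderId: 'ORDER#2', status: 'PENDING' },       // no createdAt
  { customerId: 'cust_B', orderId: 'ORDER#3', createdAt: '2024-01-10' }, // no status
];

const indexed = itemsInIndex(orders, ['status', 'createdAt']);
console.log(indexed.map((o) => o.orderId)); // [ 'ORDER#1' ]
```

Writing an index key attribute only on the items you want to find keeps the index small and cheap to query.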

bash

aws dynamodb update-table --table-name Orders \
  --attribute-definitions \
      AttributeName=status,AttributeType=S \
      AttributeName=createdAt,AttributeType=S \
  --global-secondary-index-updates '[{
    "Create": {
      "IndexName": "status-createdAt-index",
      "KeySchema": [
        {"AttributeName":"status","KeyType":"HASH"},
        {"AttributeName":"createdAt","KeyType":"RANGE"}
      ],
      "Projection": {"ProjectionType":"ALL"},
      "ProvisionedThroughput": {"ReadCapacityUnits":5,"WriteCapacityUnits":5}
    }
  }]'

Local Secondary Index (LSI)

Same PK as the base table, different SK — enables range queries on a different sort dimension
Must be created at table creation time — cannot add or delete later
Shares the base table's throughput (no separate capacity)
Supports strongly consistent reads (unlike GSI)
Up to 5 LSIs per table
Items with the same PK across base table + LSI share a 10 GB item collection limit

Feature	GSI	LSI
PK	Any attribute	Must match base table PK
SK	Any attribute	Any attribute (different from base)
Consistency	Eventually consistent only	Strongly or eventually consistent
Throughput	Separate (provisioned/on-demand)	Shared with base table
Creation	Any time	Table creation only
Deletion	Supported	Not supported
Limit	20 per table	5 per table
Item collection limit	None	10 GB per PK value

Transactions

DynamoDB transactions allow ACID operations across multiple items and tables in a single all-or-nothing request.

javascript

import { TransactWriteCommand } from '@aws-sdk/lib-dynamodb';

// Transfer funds: debit one account, credit another atomically
await docClient.send(new TransactWriteCommand({
  TransactItems: [
    {
      Update: {
        TableName: 'Accounts',
        Key: { accountId: 'ACC-001' },
        UpdateExpression: 'ADD balance :debit',
        ConditionExpression: 'balance >= :amount',  // abort if funds are insufficient
        ExpressionAttributeValues: { ':debit': -100, ':amount': 100 },
      },
    },
    {
      Update: {
        TableName: 'Accounts',
        Key: { accountId: 'ACC-002' },
        UpdateExpression: 'ADD balance :credit',
        ExpressionAttributeValues: { ':credit': 100 },
      },
    },
    {
      Put: {
        TableName: 'Transactions',
        Item: {
          txId: 'TX-' + Date.now(),
          from: 'ACC-001', to: 'ACC-002', amount: 100,
          createdAt: new Date().toISOString(),
        },
        ConditionExpression: 'attribute_not_exists(txId)',  // idempotency check
      },
    },
  ],
}));

Transaction limits:

Up to 100 unique items per TransactWriteItems or TransactGetItems
Spans multiple tables (within the same region and account)
Costs 2× the normal RCU/WCU (transaction overhead)
Not supported on GSIs directly (write to base table, GSI updates automatically)
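
A request that breaks these limits is rejected as a whole, so a client-side pre-flight check can give clearer errors. The sketch below checks the 100-action limit and also that no item is targeted twice, which a single transaction does not allow. `validateTransactItems` and the `keySchemas` argument are hypothetical; the caller is assumed to know each table's key attribute names.

```javascript
// validateTransactItems is a hypothetical pre-flight check, not an SDK function.
// keySchemas maps each table name to its key attribute names (an assumption:
// the caller supplies these).
function validateTransactItems(transactItems, keySchemas) {
  if (transactItems.length < 1 || transactItems.length > 100) {
    throw new Error(`Expected 1 to 100 actions, got ${transactItems.length}`);
  }
  const seen = new Set();
  for (const action of transactItems) {
    // Each entry wraps one operation: Put, Update, Delete, or ConditionCheck.
    const [type, op] = Object.entries(action)[0];
    const keyNames = keySchemas[op.TableName];
    if (!keyNames) throw new Error(`Unknown table: ${op.TableName}`);
    const source = type === 'Put' ? op.Item : op.Key;
    const id = [op.TableName, ...keyNames.map((name) => String(source[name]))].join('|');
    if (seen.has(id)) throw new Error(`Item targeted more than once: ${id}`);
    seen.add(id);
  }
  return true;
}

const schemas = { Accounts: ['accountId'], Transactions: ['txId'] };
const transfer = [
  { Update: { TableName: 'Accounts', Key: { accountId: 'ACC-001' } } },
  { Update: { TableName: 'Accounts', Key: { accountId: 'ACC-002' } } },
  { Put: { TableName: 'Transactions', Item: { txId: 'TX-1', amount: 100 } } },
];
console.log(validateTransactItems(transfer, schemas)); // true
```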

DynamoDB Streams

Streams capture a time-ordered log of every item change (insert, update, delete) in the table.

Stream view types (what data each record contains):

View Type	Contents
KEYS_ONLY	Only the key attributes of the modified item
NEW_IMAGE	The entire item after the change
OLD_IMAGE	The entire item before the change
NEW_AND_OLD_IMAGES	Both before and after — most useful for auditing

bash

aws dynamodb update-table --table-name Orders \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES

Common use cases: Cross-region replication, search index maintenance (Elasticsearch/OpenSearch), audit logging, cache invalidation, event-driven triggers.
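
A consumer typically branches on the record's `eventName`. The sketch below shows one way to route a single record, assuming the `NEW_AND_OLD_IMAGES` view type and the record shape that Lambda receives from a stream. `routeRecord` and the action labels it returns are hypothetical.

```javascript
// Sketch of routing logic for one stream record. The record shape follows the
// Lambda event format for DynamoDB Streams; routeRecord itself is hypothetical.
function routeRecord(record) {
  const { eventName, dynamodb, userIdentity } = record;

  // Deletions performed by the TTL process carry a service identity.
  const ttlDelete =
    eventName === 'REMOVE' &&
    userIdentity?.type === 'Service' &&
    userIdentity?.principalId === 'dynamodb.amazonaws.com';

  if (ttlDelete) return { action: 'expired', keys: dynamodb.Keys };
  if (eventName === 'INSERT') return { action: 'created', item: dynamodb.NewImage };
  if (eventName === 'MODIFY') {
    return { action: 'updated', before: dynamodb.OldImage, after: dynamodb.NewImage };
  }
  return { action: 'deleted', keys: dynamodb.Keys };
}

// Images and keys arrive in DynamoDB's typed JSON, not as plain objects.
const sample = {
  eventName: 'REMOVE',
  userIdentity: { type: 'Service', principalId: 'dynamodb.amazonaws.com' },
  dynamodb: { Keys: { sessionId: { S: 'sess_xyz' } } },
};
console.log(routeRecord(sample).action); // expired
```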

DynamoDB Accelerator (DAX)

DAX is an in-memory cache specifically built for DynamoDB. It provides microsecond read latency for eventually consistent reads.

Feature	Detail
Read latency	Microseconds (vs. single-digit ms for DynamoDB)
Write behavior	Write-through — writes go to DynamoDB first, then cache
API compatibility	Same DynamoDB API — minimal code change
Consistency	Eventually consistent reads only (strongly consistent reads bypass DAX)
Item cache TTL	Default 5 minutes
Query cache TTL	Default 1 minute
VPC	Deployed inside your VPC

javascript

import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DaxDocument } from 'amazon-dax-client';

// Swap DynamoDB client for DAX client; all other code stays unchanged
const daxClient = new DaxDocument({
  endpoints: ['my-dax-cluster.abc123.dax-clusters.us-east-1.amazonaws.com:8111'],
  region: 'us-east-1',
});

When NOT to use DAX: Write-heavy workloads, strongly-consistent reads required, Lambda functions with very infrequent invocations (cold start cost of maintaining DAX connection), or applications that are already fast enough without caching.

Time to Live (TTL)

TTL automatically deletes items past a specified expiry timestamp — free of charge, no WCU consumed.

javascript

import { PutCommand } from '@aws-sdk/lib-dynamodb';

// Store a session that expires in 24 hours
const expiresAt = Math.floor(Date.now() / 1000) + 86400; // Unix epoch seconds

await docClient.send(new PutCommand({
  TableName: 'Sessions',
  Item: {
    sessionId: 'sess_xyz',
    userId: 'usr_01J8X',
    data: { theme: 'dark' },
    expiresAt,  // DynamoDB reads this attribute for TTL
  },
}));

bash

# Enable TTL on the "expiresAt" attribute
aws dynamodb update-time-to-live \
  --table-name Sessions \
  --time-to-live-specification Enabled=true,AttributeName=expiresAt

Important TTL facts:

Deletion is asynchronous — items may linger for up to 48 hours past expiry
Expired-but-not-yet-deleted items still appear in reads, queries, and scans; add a filter expression on the TTL attribute to exclude them
TTL deletions appear in DynamoDB Streams with userIdentity.type = "Service"
TTL attribute must be a Number storing Unix epoch seconds
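
Because deletion is asynchronous, a read can still return an item whose expiry time has passed. A common safeguard is to filter on the TTL attribute at read time. The sketch below adds such a filter to existing Query or Scan parameters; `excludeExpired` is a hypothetical helper and it assumes the TTL attribute is named `expiresAt`, as in the example above.

```javascript
// excludeExpired is a hypothetical helper that adds a "not yet expired" filter
// to Query or Scan parameters. It assumes the TTL attribute is named expiresAt.
function excludeExpired(params, nowMs = Date.now()) {
  // Items without a TTL attribute never expire, so keep them too.
  const ttlClause = '(attribute_not_exists(#ttl) OR #ttl > :now)';
  return {
    ...params,
    FilterExpression: params.FilterExpression
      ? `(${params.FilterExpression}) AND ${ttlClause}`
      : ttlClause,
    ExpressionAttributeNames: { ...params.ExpressionAttributeNames, '#ttl': 'expiresAt' },
    ExpressionAttributeValues: {
      ...params.ExpressionAttributeValues,
      ':now': Math.floor(nowMs / 1000), // TTL values are epoch seconds
    },
  };
}

const scanParams = excludeExpired({ TableName: 'Sessions' }, 1700000000000);
console.log(scanParams.FilterExpression);
console.log(scanParams.ExpressionAttributeValues[':now']); // 1700000000
```

As with any filter expression, the expired items are still read and billed; the filter only keeps them out of the result.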

Hot Partition Problem & Solutions

DynamoDB distributes data across physical partitions by hashing the partition key. If many requests hit the same PK value, you get a hot partition — a bottleneck.

Symptoms: ProvisionedThroughputExceededException, high ThrottledRequests metric, uneven latency.

Write Sharding

Distribute writes across multiple partitions by appending a random suffix to the PK:

javascript

// ❌ HOT PARTITION: All writes to the same partition
await docClient.send(new PutCommand({
  TableName: 'Events',
  Item: { pk: 'EVENT_LOG', sk: Date.now(), data: event },
}));

// ✅ SHARDED: Spread across 10 shards
const SHARD_COUNT = 10;
const shard = Math.floor(Math.random() * SHARD_COUNT);
await docClient.send(new PutCommand({
  TableName: 'Events',
  Item: { pk: `EVENT_LOG#${shard}`, sk: Date.now(), data: event },
}));

// To read all shards, query each shard in parallel
const queries = Array.from({ length: SHARD_COUNT }, (_, i) =>
  docClient.send(new QueryCommand({
    TableName: 'Events',
    KeyConditionExpression: 'pk = :pk',
    ExpressionAttributeValues: { ':pk': `EVENT_LOG#${i}` },
  }))
);
const results = await Promise.all(queries);
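
Each shard comes back sorted on its own, so the combined result needs a merge step if global ordering matters. A minimal sketch, assuming each response has the usual `Items` array and that `sk` is the numeric timestamp used above (`mergeShardResults` is a hypothetical helper, and pagination via `LastEvaluatedKey` is ignored here):

```javascript
// mergeShardResults is a hypothetical helper: it flattens per-shard Query
// outputs and restores global ordering by sort key.
function mergeShardResults(results) {
  return results
    .flatMap((result) => result.Items ?? [])
    .sort((a, b) => a.sk - b.sk); // sk is a numeric timestamp in this example
}

// Stand-in responses for two shards.
const shardResponses = [
  { Items: [{ pk: 'EVENT_LOG#0', sk: 100 }, { pk: 'EVENT_LOG#0', sk: 300 }] },
  { Items: [{ pk: 'EVENT_LOG#1', sk: 200 }] },
  { Items: [] },
];

const merged = mergeShardResults(shardResponses);
console.log(merged.map((item) => item.sk)); // [ 100, 200, 300 ]
```

This read-side fan-out is the price of sharding: writes spread out, but every read of the full log costs one Query per shard.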

Single-Table Design Pattern

Store multiple entity types in a single table using generic PK/SK names and a type attribute. Related items share a partition key, so one Query can fetch what would need a join in a relational database, and there are fewer tables to operate.

PK	SK	type	attributes
USER#usr_01	PROFILE	user	name, email
USER#usr_01	ORDER#2024-01-01	order	amount, status
USER#usr_01	ORDER#2024-01-15	order	amount, status
PRODUCT#prod_A	METADATA	product	name, price
PRODUCT#prod_A	REVIEW#rev_001	review	rating, text
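
In application code, this layout usually sits behind a few key-building functions so that the prefixes are defined in one place. A sketch under that assumption (the `keys` object, `userOrdersQuery`, and the table name `AppTable` are all hypothetical):

```javascript
// Hypothetical key builders for the layout in the table above.
const keys = {
  userProfile: (userId) => ({ PK: `USER#${userId}`, SK: 'PROFILE' }),
  userOrder: (userId, isoDate) => ({ PK: `USER#${userId}`, SK: `ORDER#${isoDate}` }),
  productReview: (productId, reviewId) => ({
    PK: `PRODUCT#${productId}`,
    SK: `REVIEW#${reviewId}`,
  }),
};

// Parameters for "all orders of one user": one partition, SK prefix ORDER#.
function userOrdersQuery(tableName, userId) {
  return {
    TableName: tableName,
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :prefix)',
    ExpressionAttributeValues: { ':pk': `USER#${userId}`, ':prefix': 'ORDER#' },
  };
}

console.log(keys.userOrder('usr_01', '2024-01-15'));
console.log(userOrdersQuery('AppTable', 'usr_01').KeyConditionExpression);
```

Dropping the `begins_with` clause would return the profile and all orders for the user in a single Query, which is the main payoff of the pattern.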

Batch Operations

javascript

import { BatchGetCommand, BatchWriteCommand } from '@aws-sdk/lib-dynamodb';

// BatchGetItem: up to 100 items across multiple tables
const result = await docClient.send(new BatchGetCommand({
  RequestItems: {
    'Users':  { Keys: [{ userId: 'usr_01' }, { userId: 'usr_02' }] },
    'Orders': { Keys: [{ customerId: 'cust_A', orderId: 'ORDER#001' }] },
  },
}));
// result.UnprocessedKeys holds keys that were not processed; retry them with exponential backoff

// BatchWriteItem: up to 25 put/delete operations across tables
await docClient.send(new BatchWriteCommand({
  RequestItems: {
    'Orders': [
      { PutRequest: { Item: { customerId: 'cust_A', orderId: 'ORDER#002', amount: 25 } } },
      { DeleteRequest: { Key: { customerId: 'cust_A', orderId: 'ORDER#001' } } },
    ],
  },
}));

BatchWriteItem does NOT support UpdateItem — only Put and Delete. For updates, use TransactWriteItems or individual UpdateItem calls.
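
Two small utilities tend to accompany batch calls: splitting work into request-sized groups, and computing a retry delay for whatever comes back unprocessed. The sketch below shows both as pure functions. The names, the default base and cap, and the choice of "full jitter" backoff are assumptions made for illustration, not SDK behaviour.

```javascript
// Split write requests into groups that fit BatchWriteItem's 25-request limit.
function chunk(items, size = 25) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Exponential backoff with full jitter: wait a random time in
// [0, min(capMs, baseMs * 2^attempt)). The random source is injectable for tests.
function backoffDelayMs(attempt, { baseMs = 50, capMs = 5000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

const requests = Array.from({ length: 60 }, (_, i) => ({ id: i }));
console.log(chunk(requests).map((batch) => batch.length));    // [ 25, 25, 10 ]
console.log(backoffDelayMs(3, { random: () => 0.5 }));        // 200
```

A retry loop would send one chunk, collect the unprocessed entries from the response, sleep for `backoffDelayMs(attempt)`, and resend only those entries.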

DVA-C02 Quick Reference

Topic	Key Fact
Max item size	400 KB
Partition key only	Each PK value must be unique
Composite key	PK + SK combination must be unique
1 RCU strong	1 read of ≤ 4 KB
1 WCU	1 write of ≤ 1 KB
Transactional cost	2× RCU/WCU
Switch capacity modes	Once per 24 hours
GSI consistency	Eventually consistent only
LSI consistency	Strongly or eventually consistent
LSI creation	Table creation time only
Max GSIs per table	20
Max LSIs per table	5
Max items (batch get)	100 items
Max items (transact write)	100 items
Streams retention	24 hours
TTL deletion delay	Up to 48 hours after expiry
DAX write behavior	Write-through
DAX consistency	Eventually consistent reads only
Hot partition fix	Write sharding (random suffix)
Safe insert pattern	`attribute_not_exists(pk)` condition

Practice Questions (13)

easy

Q1. An application requires a fully managed NoSQL database that provides single-digit millisecond latency for key-value lookups and scales automatically as traffic grows, with no instance management. Which service should the developer choose?