DynamoDB Data Modeling & Operations
A comprehensive deep dive into Amazon DynamoDB — primary keys, capacity modes, expressions, secondary indexes, streams, DAX, transactions, and data modeling patterns for the DVA-C02 exam.
What is Amazon DynamoDB?
Amazon DynamoDB is a fully managed, serverless, key-value and document NoSQL database that delivers single-digit millisecond performance at any scale. AWS handles all hardware provisioning, patching, replication, and backups — you only interact with tables and items.
Core mental model: DynamoDB is not a relational database. There are no joins and no fixed schema beyond the key attributes, and requests use DynamoDB's own expression syntax rather than SQL. You design your access patterns first, then model your data around them.
When to choose DynamoDB:
- Need predictable low latency at massive scale
- Traffic is spiky or unpredictable (on-demand mode)
- Can define access patterns upfront
- Need a fully managed, serverless data store
Tables, Items, and Attributes
| Concept | DynamoDB equivalent | RDBMS equivalent |
|---|---|---|
| Table | Table | Table |
| Item | Row / document | Row |
| Attribute | Field | Column |
| Primary key | Primary key | Primary key |
- A table is a collection of items. No fixed schema — each item can have different attributes.
- An item is a single data record. Maximum size: 400 KB.
- An attribute is a name-value pair. Supported types: String (S), Number (N), Binary (B), Boolean (BOOL), Null (NULL), List (L), Map (M), StringSet (SS), NumberSet (NS), BinarySet (BS).
{
  "userId": "usr_01J8X",
  "email": "alice@example.com",
  "createdAt": "2024-01-15T10:30:00Z",
  "profile": {
    "name": "Alice Johnson",
    "tier": "premium"
  },
  "tags": ["admin", "beta-tester"]
}

Primary Key Design
The primary key is the single most important design decision in DynamoDB. Get it wrong and you will either hit hot partitions or be unable to query your data efficiently.
Option 1 — Partition Key Only (Simple Key)
Every item must have a unique partition key. DynamoDB hashes the PK to determine which physical partition stores the item.
Table: Users — PK: userId (must be unique)
| userId | email | name |
|---|---|---|
| usr_01J8X | alice@example.com | Alice |
| usr_02K9Y | bob@example.com | Bob |
Option 2 — Partition Key + Sort Key (Composite Key)
The PK + SK combination must be unique. Multiple items can share the same PK — they are stored together and sorted by SK. This enables efficient range queries.
Table: Orders (PK: customerId, SK: orderId, formatted as ORDER#<timestamp>)
| customerId | orderId (SK) | amount | status |
|---|---|---|---|
| cust_A | ORDER#2024-01-01T10:00:00 | 49.99 | PAID |
| cust_A | ORDER#2024-01-15T14:30:00 | 129.00 | SHIPPED |
| cust_A | ORDER#2024-02-01T09:00:00 | 19.99 | PENDING |
| cust_B | ORDER#2024-01-10T11:00:00 | 75.00 | PAID |
Query pattern enabled: "Give me all orders for customer A, sorted by date" — a single efficient Query operation.
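The ordering in this pattern comes from sort keys being compared as strings, so ISO 8601 timestamps sort chronologically. The sketch below mimics that behaviour in memory; `makeOrderSk` and `queryByPartition` are hypothetical helpers written for illustration, not SDK functions, and the `sk` field name is an assumption.

```javascript
// Hypothetical helper: builds a sort key shaped like the SK column above.
function makeOrderSk(isoTimestamp) {
  return `ORDER#${isoTimestamp}`;
}

// In-memory stand-in for a Query: match one partition key, order by sort key.
// Plain string comparison is used; for ASCII keys this matches byte order.
function queryByPartition(items, customerId, { newestFirst = false } = {}) {
  const matches = items
    .filter((item) => item.customerId === customerId)
    .sort((a, b) => (a.sk < b.sk ? -1 : a.sk > b.sk ? 1 : 0));
  return newestFirst ? matches.reverse() : matches;
}

const orders = [
  { customerId: 'cust_A', sk: makeOrderSk('2024-02-01T09:00:00'), amount: 19.99 },
  { customerId: 'cust_B', sk: makeOrderSk('2024-01-10T11:00:00'), amount: 75.0 },
  { customerId: 'cust_A', sk: makeOrderSk('2024-01-01T10:00:00'), amount: 49.99 },
  { customerId: 'cust_A', sk: makeOrderSk('2024-01-15T14:30:00'), amount: 129.0 },
];

const custAOrders = queryByPartition(orders, 'cust_A');
console.log(custAOrders.map((order) => order.sk));
```

Passing `{ newestFirst: true }` plays the role that `ScanIndexForward: false` plays in a real Query.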
Capacity Modes
Provisioned Capacity
You specify Read Capacity Units (RCU) and Write Capacity Units (WCU) in advance. AWS guarantees that throughput. Use Auto Scaling to adjust capacity based on CloudWatch metrics.
| Unit | What it covers |
|---|---|
| 1 RCU | 1 strongly consistent read of ≤ 4 KB, OR 2 eventually consistent reads of ≤ 4 KB |
| 1 WCU | 1 write of ≤ 1 KB |
| Transactional reads | 2 RCU per 4 KB |
| Transactional writes | 2 WCU per 1 KB |
Capacity math example: Read 10 KB item with strong consistency = 10 KB ÷ 4 KB = 2.5 → rounded up = 3 RCU
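The same arithmetic can be written as two small functions. This is a sketch of the rules in the table above, not an AWS API; the function names and the `mode` values are made up for illustration.

```javascript
// Capacity units for reading one item.
// mode: 'strong' | 'eventual' | 'transactional' (labels chosen for this sketch)
function readCapacityUnits(itemSizeKb, mode = 'strong') {
  const blocks = Math.max(1, Math.ceil(itemSizeKb / 4)); // reads are counted in 4 KB blocks
  if (mode === 'eventual') return blocks / 2;            // half the cost of a strong read
  if (mode === 'transactional') return blocks * 2;       // double the cost of a strong read
  return blocks;
}

// Capacity units for writing one item.
function writeCapacityUnits(itemSizeKb, { transactional = false } = {}) {
  const blocks = Math.max(1, Math.ceil(itemSizeKb)); // writes are counted in 1 KB blocks
  return transactional ? blocks * 2 : blocks;
}

console.log(readCapacityUnits(10, 'strong'));        // 3
console.log(readCapacityUnits(10, 'eventual'));      // 1.5
console.log(readCapacityUnits(10, 'transactional')); // 6
console.log(writeCapacityUnits(2.5));                // 3
```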
aws dynamodb create-table \
  --table-name Orders \
  --attribute-definitions \
    AttributeName=customerId,AttributeType=S \
    AttributeName=orderId,AttributeType=S \
  --key-schema \
    AttributeName=customerId,KeyType=HASH \
    AttributeName=orderId,KeyType=RANGE \
  --billing-mode PROVISIONED \
  --provisioned-throughput ReadCapacityUnits=10,WriteCapacityUnits=5

On-Demand Capacity
No capacity planning: DynamoDB scales automatically as traffic changes, and you pay per request.
- Best for: Unpredictable traffic, new tables where usage is unknown, development/test
- Cost: typically several times more expensive per request than well-utilized provisioned capacity at sustained load (the exact ratio depends on current pricing)
- Switch between modes once per 24 hours
Core Read Operations
GetItem — Single Item Lookup
Retrieves exactly one item by its full primary key. Most efficient operation.
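import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, GetCommand } from '@aws-sdk/lib-dynamodb';

// Document client reused by the examples in this guide
const docClient = DynamoDBDocumentClient.from(new DynamoDBClient({}));

const result = await docClient.send(new GetCommand({
  TableName: 'Orders',
  Key: {
    customerId: 'cust_A',
    orderId: 'ORDER#2024-01-01T10:00:00',
  },
  ConsistentRead: true, // strongly consistent: 1 RCU instead of 0.5 for an item of up to 4 KB
}));

console.log(result.Item); // undefined when no item has this key

Query — Items Sharing a Partition Key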
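Returns the items that share a PK, optionally narrowed by a condition on the SK. Results are always sorted by SK, and a single call returns at most 1 MB of data (continue from LastEvaluatedKey to read more). This is the most common operation for composite key tables.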
import { QueryCommand } from '@aws-sdk/lib-dynamodb';

// Get all orders for cust_A in January 2024
const result = await docClient.send(new QueryCommand({
  TableName: 'Orders',
  KeyConditionExpression: 'customerId = :cid AND orderId BETWEEN :start AND :end',
  ExpressionAttributeValues: {
    ':cid': 'cust_A',
    ':start': 'ORDER#2024-01-01',
    ':end': 'ORDER#2024-01-31T99:99:99', // sentinel that sorts after any real time on Jan 31
  },
  ScanIndexForward: false, // descending order (newest first)
  Limit: 20,
}));

Scan — Full Table Read
Reads every item in the table (or index). Extremely expensive on large tables. Avoid in production hot paths.
import { ScanCommand } from '@aws-sdk/lib-dynamodb';

// Parallel scan: splits the table into 4 segments for faster reads
const segment = 0; // run this concurrently with segments 1, 2, 3
const result = await docClient.send(new ScanCommand({
  TableName: 'Orders',
  TotalSegments: 4,
  Segment: segment,
  FilterExpression: 'amount > :min',
  ExpressionAttributeValues: { ':min': 100 },
}));

Filter vs KeyCondition: FilterExpression is applied after items are read, so you are still charged RCUs for every item scanned, including those filtered out. Only KeyConditionExpression reduces RCU consumption.
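To make the cost difference concrete, here is a small in-memory simulation. It is only a sketch: `simulateQuery`, the `pk` field, and the per-item `sizeKb` field are assumptions for illustration, and it models a strongly consistent Query whose consumed capacity is the total size of the items read, rounded up to 4 KB blocks.

```javascript
// In-memory stand-in for one Query call.
function simulateQuery(items, partitionKey, filter = () => true) {
  // Key condition: only items in this partition are read at all.
  const read = items.filter((item) => item.pk === partitionKey);
  // Capacity is computed from everything that was read.
  const totalKb = read.reduce((sum, item) => sum + item.sizeKb, 0);
  const consumedRcu = Math.ceil(totalKb / 4);
  // The filter runs after the read, so it only shrinks the result set.
  const returned = read.filter(filter);
  return { returned, consumedRcu };
}

const sampleItems = [
  { pk: 'cust_A', amount: 50, sizeKb: 3 },
  { pk: 'cust_A', amount: 150, sizeKb: 3 },
  { pk: 'cust_A', amount: 200, sizeKb: 3 },
  { pk: 'cust_B', amount: 500, sizeKb: 3 },
];

const unfiltered = simulateQuery(sampleItems, 'cust_A');
const filtered = simulateQuery(sampleItems, 'cust_A', (item) => item.amount > 100);
console.log(unfiltered.returned.length, unfiltered.consumedRcu); // 3 3
console.log(filtered.returned.length, filtered.consumedRcu);     // 2 3
```

The filtered call returns fewer items but consumes the same capacity; the item in the other partition never counts because the key condition excludes it before any read happens.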
Core Write Operations
import { PutCommand, UpdateCommand, DeleteCommand } from '@aws-sdk/lib-dynamodb';

// PutItem: creates or completely replaces an item
await docClient.send(new PutCommand({
  TableName: 'Orders',
  Item: {
    customerId: 'cust_A',
    orderId: 'ORDER#2024-03-01T12:00:00',
    amount: 59.99,
    status: 'PENDING',
  },
  ConditionExpression: 'attribute_not_exists(customerId)', // fail if already exists
}));

// UpdateItem: modify specific attributes without replacing the whole item
await docClient.send(new UpdateCommand({
  TableName: 'Orders',
  Key: { customerId: 'cust_A', orderId: 'ORDER#2024-03-01T12:00:00' },
  UpdateExpression: 'SET #s = :status, updatedAt = :ts ADD version :one',
  ExpressionAttributeNames: { '#s': 'status' }, // status is a reserved word
  ConditionExpression: '#s = :pending', // only update while the order is still PENDING
  // A single values map must cover both the update and the condition expression
  ExpressionAttributeValues: {
    ':status': 'SHIPPED',
    ':ts': new Date().toISOString(),
    ':one': 1,
    ':pending': 'PENDING',
  },
}));

// DeleteItem: remove an item
await docClient.send(new DeleteCommand({
  TableName: 'Orders',
  Key: { customerId: 'cust_A', orderId: 'ORDER#2024-03-01T12:00:00' },
  ConditionExpression: 'attribute_exists(customerId)',
}));

Expressions Cheat Sheet
DynamoDB uses expression syntax instead of SQL. These are the five expression types:
| Expression | Purpose | Used in |
|---|---|---|
| KeyConditionExpression | Filter by PK and SK | Query |
| FilterExpression | Post-read filter on any attribute | Query, Scan |
| ConditionExpression | Abort write if condition fails | Put, Update, Delete, Transact |
| UpdateExpression | Specify attribute changes | UpdateItem |
| ProjectionExpression | Return only specific attributes | All reads |
UpdateExpression verbs:
| Verb | Example | Purpose |
|---|---|---|
| SET | SET #n = :val, profile.tier = :tier | Set or overwrite an attribute (#n aliases name, which is a reserved word) |
| REMOVE | REMOVE tags[0], optionalField | Delete an attribute or list element |
| ADD | ADD viewCount :one | Atomic increment (Number) or union (Set) |
| DELETE | DELETE tags :removeSet | Remove elements from a Set attribute |
ConditionExpression functions:
| Function | Purpose |
|---|---|
| attribute_exists(path) | Attribute must exist |
| attribute_not_exists(path) | Attribute must NOT exist (use for safe inserts) |
| attribute_type(path, type) | Check attribute data type |
| begins_with(path, substr) | String prefix check |
| contains(path, operand) | String contains or Set membership |
| size(path) | Length of string, list, map, or set |
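Attribute names that clash with reserved words (such as `status`, aliased as `#s` in the UpdateItem example) need `ExpressionAttributeNames` placeholders. One way to avoid hand-writing them is to generate the placeholders. The helper below is a hypothetical sketch, not part of the SDK, and it handles top-level attributes only.

```javascript
// Hypothetical helper: turns { status: 'SHIPPED', ... } into the three
// parameters that a SET-only UpdateItem request needs.
function buildSetExpression(changes) {
  const names = {};
  const values = {};
  const assignments = Object.keys(changes).map((attr, i) => {
    names[`#a${i}`] = attr;           // placeholder for the attribute name
    values[`:v${i}`] = changes[attr]; // placeholder for the new value
    return `#a${i} = :v${i}`;
  });
  if (assignments.length === 0) {
    throw new Error('At least one attribute is required');
  }
  return {
    UpdateExpression: `SET ${assignments.join(', ')}`,
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values,
  };
}

const params = buildSetExpression({ status: 'SHIPPED', name: 'Alice' });
console.log(params.UpdateExpression); // SET #a0 = :v0, #a1 = :v1
```

The returned object can be spread into an UpdateCommand input alongside `TableName` and `Key`. Aliasing every name, reserved or not, keeps the helper simple at the cost of slightly noisier expressions.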
Secondary Indexes
When you need to query by attributes other than the primary key, use indexes.
Global Secondary Index (GSI)
- Different PK and SK from the base table — enables completely new access patterns
- Has its own provisioned throughput (or inherits on-demand)
- Only supports eventually consistent reads
- Can be added or deleted at any time
- Up to 20 GSIs per table
- Items appear in the GSI only if they have the GSI's key attributes
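The last point means a GSI can hold fewer items than its base table, which is often called a sparse index. A minimal in-memory sketch of that rule follows; `projectIntoGsi` is a hypothetical helper and the sample items are invented.

```javascript
// Returns the items that would appear in a GSI keyed on the given attributes.
// An item is indexed only when every index key attribute is present on it.
function projectIntoGsi(items, gsiPartitionKey, gsiSortKey) {
  return items.filter(
    (item) =>
      item[gsiPartitionKey] !== undefined &&
      (gsiSortKey === undefined || item[gsiSortKey] !== undefined)
  );
}

const tableItems = [
  { customerId: 'cust_A', orderId: 'ORDER#001', status: 'PAID', createdAt: '2024-01-01T10:00:00Z' },
  { customerId: 'cust_A', orderId: 'ORDER#002', status: 'PENDING' }, // no createdAt
  { customerId: 'cust_B', orderId: 'ORDER#003' },                    // neither key attribute
];

const indexed = projectIntoGsi(tableItems, 'status', 'createdAt');
console.log(indexed.map((item) => item.orderId)); // [ 'ORDER#001' ]
```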
aws dynamodb update-table --table-name Orders \
  --attribute-definitions \
    AttributeName=status,AttributeType=S \
    AttributeName=createdAt,AttributeType=S \
  --global-secondary-index-updates '[{
    "Create": {
      "IndexName": "status-createdAt-index",
      "KeySchema": [
        {"AttributeName":"status","KeyType":"HASH"},
        {"AttributeName":"createdAt","KeyType":"RANGE"}
      ],
      "Projection": {"ProjectionType":"ALL"},
      "ProvisionedThroughput": {"ReadCapacityUnits":5,"WriteCapacityUnits":5}
    }
  }]'

Local Secondary Index (LSI)
- Same PK as the base table, different SK — enables range queries on a different sort dimension
- Must be created at table creation time — cannot add or delete later
- Shares the base table's throughput (no separate capacity)
- Supports strongly consistent reads (unlike GSI)
- Up to 5 LSIs per table
- Items with the same PK across base table + LSI share a 10 GB item collection limit
| Feature | GSI | LSI |
|---|---|---|
| PK | Any attribute | Must match base table PK |
| SK | Any attribute | Any attribute (different from base) |
| Consistency | Eventually consistent only | Strongly or eventually consistent |
| Throughput | Separate (provisioned/on-demand) | Shared with base table |
| Creation | Any time | Table creation only |
| Deletion | Supported | Not supported |
| Limit | 20 per table | 5 per table |
| Item collection limit | None | 10 GB per PK value |
Transactions
DynamoDB transactions allow ACID operations across multiple items and tables in a single all-or-nothing request.
import { TransactWriteCommand } from '@aws-sdk/lib-dynamodb';

// Transfer funds: debit one account and credit another atomically
await docClient.send(new TransactWriteCommand({
  TransactItems: [
    {
      Update: {
        TableName: 'Accounts',
        Key: { accountId: 'ACC-001' },
        UpdateExpression: 'ADD balance :debit',
        ConditionExpression: 'balance >= :amount', // abort on insufficient funds
        ExpressionAttributeValues: { ':debit': -100, ':amount': 100 },
      },
    },
    {
      Update: {
        TableName: 'Accounts',
        Key: { accountId: 'ACC-002' },
        UpdateExpression: 'ADD balance :credit',
        ExpressionAttributeValues: { ':credit': 100 },
      },
    },
    {
      Put: {
        TableName: 'Transactions',
        Item: {
          txId: 'TX-' + Date.now(),
          from: 'ACC-001', to: 'ACC-002', amount: 100,
          createdAt: new Date().toISOString(),
        },
        ConditionExpression: 'attribute_not_exists(txId)', // idempotency check
      },
    },
  ],
}));

Transaction limits:
- Up to 100 unique items per TransactWriteItems or TransactGetItems
- Spans multiple tables (within the same region and account)
- Costs 2× the normal RCU/WCU (transaction overhead)
- Not supported on GSIs directly (write to base table, GSI updates automatically)
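A request that breaks the item-count or uniqueness rules above is rejected as a whole, so it can be worth checking them before sending. The function below is a hypothetical pre-flight check, not an SDK feature; the `{ tableName, key }` shape is an assumption made for this sketch.

```javascript
const MAX_TRANSACT_ITEMS = 100;

// Each target is { tableName, key }, where key is a plain object holding the
// item's key attributes.
function validateTransactTargets(targets) {
  if (targets.length === 0) return { ok: false, reason: 'empty' };
  if (targets.length > MAX_TRANSACT_ITEMS) return { ok: false, reason: 'too-many-items' };
  const seen = new Set();
  for (const { tableName, key } of targets) {
    // Sort the key entries so { a, b } and { b, a } produce the same id.
    const id = tableName + '|' + JSON.stringify(Object.entries(key).sort());
    if (seen.has(id)) return { ok: false, reason: 'duplicate-item' };
    seen.add(id);
  }
  return { ok: true };
}

const transfer = [
  { tableName: 'Accounts', key: { accountId: 'ACC-001' } },
  { tableName: 'Accounts', key: { accountId: 'ACC-002' } },
  { tableName: 'Transactions', key: { txId: 'TX-1' } },
];
console.log(validateTransactTargets(transfer)); // { ok: true }
```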
DynamoDB Streams
Streams capture a time-ordered log of every item change (insert, update, delete) in the table.
Stream view types (what data each record contains):
| View Type | Contents |
|---|---|
| KEYS_ONLY | Only the key attributes of the modified item |
| NEW_IMAGE | The entire item after the change |
| OLD_IMAGE | The entire item before the change |
| NEW_AND_OLD_IMAGES | Both before and after — most useful for auditing |
aws dynamodb update-table --table-name Orders \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES

Common use cases: Cross-region replication, search index maintenance (Elasticsearch/OpenSearch), audit logging, cache invalidation, event-driven triggers.
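For the audit logging case, a consumer typically compares the old and new images in each record. The sketch below assumes the NEW_AND_OLD_IMAGES view type and the record shape Lambda receives (`eventName`, `dynamodb.OldImage`, `dynamodb.NewImage`, with values in DynamoDB's typed JSON form). `changedAttributes` and `summarize` are hypothetical names, and the sample event is invented.

```javascript
// Lists the attributes whose value differs between the old and new image.
function changedAttributes(record) {
  const oldImage = record.dynamodb.OldImage || {}; // absent on INSERT
  const newImage = record.dynamodb.NewImage || {}; // absent on REMOVE
  const names = new Set([...Object.keys(oldImage), ...Object.keys(newImage)]);
  return [...names]
    .filter((name) => JSON.stringify(oldImage[name]) !== JSON.stringify(newImage[name]))
    .sort();
}

// Core of a stream handler: one audit line per record.
function summarize(event) {
  return event.Records.map(
    (record) => `${record.eventName}: ${changedAttributes(record).join(', ')}`
  );
}

const sampleEvent = {
  Records: [
    {
      eventName: 'MODIFY',
      dynamodb: {
        Keys: { customerId: { S: 'cust_A' }, orderId: { S: 'ORDER#001' } },
        OldImage: { customerId: { S: 'cust_A' }, orderId: { S: 'ORDER#001' }, status: { S: 'PENDING' } },
        NewImage: {
          customerId: { S: 'cust_A' },
          orderId: { S: 'ORDER#001' },
          status: { S: 'SHIPPED' },
          updatedAt: { S: '2024-01-02T08:00:00Z' },
        },
      },
    },
  ],
};

console.log(summarize(sampleEvent)); // [ 'MODIFY: status, updatedAt' ]
```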
DynamoDB Accelerator (DAX)
DAX is an in-memory cache specifically built for DynamoDB. It provides microsecond read latency for eventually consistent reads.
| Feature | Detail |
|---|---|
| Read latency | Microseconds (vs. single-digit ms for DynamoDB) |
| Write behavior | Write-through — writes go to DynamoDB first, then cache |
| API compatibility | Same DynamoDB API — minimal code change |
| Consistency | Serves eventually consistent reads from cache; strongly consistent reads are passed through to DynamoDB and are not cached |
| Item cache TTL | Default 5 minutes |
| Query cache TTL | Default 1 minute |
| VPC | Deployed inside your VPC |
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DaxDocument } from 'amazon-dax-client';

// Swap the DynamoDB client for the DAX client; all other code is unchanged
const daxClient = new DaxDocument({
  endpoints: ['my-dax-cluster.abc123.dax-clusters.us-east-1.amazonaws.com:8111'],
  region: 'us-east-1',
});

When NOT to use DAX: Write-heavy workloads, workloads that require strongly consistent reads, Lambda functions with very infrequent invocations (cold start cost of maintaining a DAX connection), or applications that are already fast enough without caching.
Time to Live (TTL)
TTL automatically deletes items past a specified expiry timestamp — free of charge, no WCU consumed.
// Store a session that expires in 24 hours
const expiresAt = Math.floor(Date.now() / 1000) + 86400; // Unix epoch seconds

await docClient.send(new PutCommand({
  TableName: 'Sessions',
  Item: {
    sessionId: 'sess_xyz',
    userId: 'usr_01J8X',
    data: { theme: 'dark' },
    expiresAt, // DynamoDB reads this attribute for TTL
  },
}));

# Enable TTL on the "expiresAt" attribute
aws dynamodb update-time-to-live \
  --table-name Sessions \
  --time-to-live-specification Enabled=true,AttributeName=expiresAt

Important TTL facts:
- Deletion is asynchronous — items may linger for up to 48 hours past expiry
- Expired items that have not yet been deleted still appear in reads, queries, and scans; use a filter expression on the TTL attribute to exclude them
- TTL deletions appear in DynamoDB Streams with userIdentity.type = "Service"
- TTL attribute must be a Number storing Unix epoch seconds
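Because deletion is asynchronous, code that must never act on a stale session can check the expiry itself. A minimal sketch, assuming the `expiresAt` attribute from the example above; `isExpired`, `dropExpired`, and `ttlFilterParams` are hypothetical helpers.

```javascript
// TTL values are Unix epoch seconds, so compare against seconds, not milliseconds.
function isExpired(item, nowMs = Date.now(), ttlAttribute = 'expiresAt') {
  const expiry = item[ttlAttribute];
  if (typeof expiry !== 'number') return false; // no TTL value: the item never expires
  return expiry <= Math.floor(nowMs / 1000);
}

// Client-side guard for items that were already fetched.
function dropExpired(items, nowMs = Date.now()) {
  return items.filter((item) => !isExpired(item, nowMs));
}

// Request parameters for filtering on the server during a Query or Scan.
// Items without the attribute are kept, matching isExpired above.
function ttlFilterParams(nowMs = Date.now()) {
  return {
    FilterExpression: 'attribute_not_exists(#exp) OR #exp > :now',
    ExpressionAttributeNames: { '#exp': 'expiresAt' },
    ExpressionAttributeValues: { ':now': Math.floor(nowMs / 1000) },
  };
}

const fixedNowMs = 1700000000000; // fixed clock so the example is repeatable
const sessions = [
  { sessionId: 'old', expiresAt: 1699999999 },
  { sessionId: 'live', expiresAt: 1700000100 },
  { sessionId: 'no-ttl' },
];
console.log(dropExpired(sessions, fixedNowMs).map((s) => s.sessionId)); // [ 'live', 'no-ttl' ]
```

A server-side filter still consumes read capacity for the expired items it discards, so it keeps results clean rather than saving cost.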
Hot Partition Problem & Solutions
DynamoDB distributes data across physical partitions by hashing the partition key. If many requests hit the same PK value, you get a hot partition — a bottleneck.
Symptoms: ProvisionedThroughputExceededException, high ThrottledRequests metric, uneven latency.
Write Sharding
Distribute writes across multiple partitions by appending a random suffix to the PK:
import { PutCommand, QueryCommand } from '@aws-sdk/lib-dynamodb';

// `event` stands for the payload being logged
// ❌ HOT PARTITION: All writes to the same partition
await docClient.send(new PutCommand({
  TableName: 'Events',
  Item: { pk: 'EVENT_LOG', sk: Date.now(), data: event },
}));

// ✅ SHARDED: Spread across 10 shards
const SHARD_COUNT = 10;
const shard = Math.floor(Math.random() * SHARD_COUNT);
await docClient.send(new PutCommand({
  TableName: 'Events',
  Item: { pk: `EVENT_LOG#${shard}`, sk: Date.now(), data: event },
}));

// To read all shards, query each shard in parallel
const queries = Array.from({ length: SHARD_COUNT }, (_, i) =>
  docClient.send(new QueryCommand({
    TableName: 'Events',
    KeyConditionExpression: 'pk = :pk',
    ExpressionAttributeValues: { ':pk': `EVENT_LOG#${i}` },
  }))
);
const results = await Promise.all(queries);

Single-Table Design Pattern
Store multiple entity types in a single table using generic PK/SK names and a type attribute. Eliminates cross-table joins and reduces operational overhead.
| PK | SK | type | attributes |
|---|---|---|---|
| USER#usr_01 | PROFILE | user | name, email |
| USER#usr_01 | ORDER#2024-01-01 | order | amount, status |
| USER#usr_01 | ORDER#2024-01-15 | order | amount, status |
| PRODUCT#prod_A | METADATA | product | name, price |
| PRODUCT#prod_A | REVIEW#rev_001 | review | rating, text |
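In practice the key strings are usually produced by small builder functions so every writer formats them identically. The sketch below pairs such builders with an in-memory stand-in for a Query that uses `begins_with` on the sort key. The `keys` object and `queryItemCollection` are hypothetical names for illustration.

```javascript
// Hypothetical key builders matching the table above.
const keys = {
  userProfile: (userId) => ({ PK: `USER#${userId}`, SK: 'PROFILE' }),
  userOrder: (userId, date) => ({ PK: `USER#${userId}`, SK: `ORDER#${date}` }),
  productMetadata: (productId) => ({ PK: `PRODUCT#${productId}`, SK: 'METADATA' }),
  productReview: (productId, reviewId) => ({ PK: `PRODUCT#${productId}`, SK: `REVIEW#${reviewId}` }),
};

// In-memory equivalent of: PK = :pk AND begins_with(SK, :prefix)
function queryItemCollection(items, pk, skPrefix = '') {
  return items
    .filter((item) => item.PK === pk && item.SK.startsWith(skPrefix))
    .sort((a, b) => (a.SK < b.SK ? -1 : a.SK > b.SK ? 1 : 0));
}

const table = [
  { ...keys.userProfile('usr_01'), type: 'user' },
  { ...keys.userOrder('usr_01', '2024-01-15'), type: 'order' },
  { ...keys.userOrder('usr_01', '2024-01-01'), type: 'order' },
  { ...keys.productMetadata('prod_A'), type: 'product' },
  { ...keys.productReview('prod_A', 'rev_001'), type: 'review' },
];

const userOrders = queryItemCollection(table, 'USER#usr_01', 'ORDER#');
const userEverything = queryItemCollection(table, 'USER#usr_01');
console.log(userOrders.map((item) => item.SK));     // [ 'ORDER#2024-01-01', 'ORDER#2024-01-15' ]
console.log(userEverything.map((item) => item.SK)); // [ 'ORDER#2024-01-01', 'ORDER#2024-01-15', 'PROFILE' ]
```

Omitting the prefix fetches a user's profile and orders in one request, which is the access pattern that replaces a join.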
Batch Operations
import { BatchGetCommand, BatchWriteCommand } from '@aws-sdk/lib-dynamodb';

// BatchGetItem: up to 100 items across multiple tables
const result = await docClient.send(new BatchGetCommand({
  RequestItems: {
    'Users': { Keys: [{ userId: 'usr_01' }, { userId: 'usr_02' }] },
    'Orders': { Keys: [{ customerId: 'cust_A', orderId: 'ORDER#001' }] },
  },
}));
// result.UnprocessedKeys contains keys that were not processed; retry with exponential backoff

// BatchWriteItem: up to 25 put/delete operations across tables
await docClient.send(new BatchWriteCommand({
  RequestItems: {
    'Orders': [
      { PutRequest: { Item: { customerId: 'cust_A', orderId: 'ORDER#002', amount: 25 } } },
      { DeleteRequest: { Key: { customerId: 'cust_A', orderId: 'ORDER#001' } } },
    ],
  },
}));

BatchWriteItem does NOT support UpdateItem, only Put and Delete. For updates, use TransactWriteItems or individual UpdateItem calls.
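Two helpers tend to accompany batch calls: one to split a large list into request-sized chunks, and one to space out retries of unprocessed items. Both are sketched below with hypothetical names and default values; real retry loops usually add random jitter to the delay as well.

```javascript
// Splits a list into chunks no larger than one BatchWriteItem request allows.
function chunk(items, size = 25) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Exponential backoff: the delay doubles with each attempt, up to a cap.
// baseMs and capMs are illustrative defaults, not values mandated by AWS.
function backoffDelayMs(attempt, baseMs = 100, capMs = 5000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const writeRequests = Array.from({ length: 60 }, (_, i) => ({
  PutRequest: { Item: { customerId: 'cust_A', orderId: `ORDER#${i}` } },
}));

console.log(chunk(writeRequests).map((c) => c.length));      // [ 25, 25, 10 ]
console.log([0, 1, 2, 3].map((n) => backoffDelayMs(n)));     // [ 100, 200, 400, 800 ]
```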
DVA-C02 Quick Reference
| Topic | Key Fact |
|---|---|
| Max item size | 400 KB |
| Partition key only | Each PK value must be unique |
| Composite key | PK + SK combination must be unique |
| 1 RCU strong | 1 read of ≤ 4 KB |
| 1 WCU | 1 write of ≤ 1 KB |
| Transactional cost | 2× RCU/WCU |
| Switch capacity modes | Once per 24 hours |
| GSI consistency | Eventually consistent only |
| LSI consistency | Strongly or eventually consistent |
| LSI creation | Table creation time only |
| Max GSIs per table | 20 |
| Max LSIs per table | 5 |
| Max items (batch get) | 100 items |
| Max items (transact write) | 100 items |
| Streams retention | 24 hours |
| TTL deletion delay | Up to 48 hours after expiry |
| DAX write behavior | Write-through |
| DAX consistency | Eventually consistent reads only |
| Hot partition fix | Write sharding (random suffix) |
| Safe insert pattern | attribute_not_exists(pk) condition |
Practice Questions
Q1. What is the difference between a Partition Key and a Sort Key in DynamoDB?
Q2. A developer runs a DynamoDB Scan on a table with 100 GB of data. What is the expected behavior and cost concern?
Q3. A developer needs to write an item to DynamoDB only if an item with that partition key does not already exist. What is the correct approach?
Q4. A DynamoDB table stores IoT sensor readings with sensorId as the partition key. The table is experiencing hot partitions because only 10 sensors generate 90% of writes. Which strategy best distributes the write load?
Q5. Which TWO DynamoDB features are best suited for improving read performance on a read-heavy application? (Choose 2)
Q6. A developer needs to update a DynamoDB item's "views" counter atomically, incrementing it by 1. What is the correct approach?
Q7. A developer enables DynamoDB Streams on a table and creates a Lambda event source mapping. Which stream view type includes both the old and new item images in every stream record?
Q8. A developer creates a DynamoDB GSI with the same attributes as the base table's key but in reverse (table: PK=userId, SK=orderId; GSI: PK=orderId, SK=userId). What behavior should the developer expect when querying the GSI?
Q9. A developer performs a DynamoDB TransactWriteItems operation with 3 PutItem actions. One of the items fails a condition check. What happens to the other two items?
Q10. A DynamoDB table has TTL enabled on the "expiresAt" attribute. An item has expiresAt set to a Unix timestamp that passed 2 hours ago. What is the state of this item?