DynamoDB Data Modeling & Operations

14 min read · DynamoDB · NoSQL · Data Modeling · DVA-C02

A comprehensive deep dive into Amazon DynamoDB — primary keys, capacity modes, expressions, secondary indexes, streams, DAX, transactions, and data modeling patterns for the DVA-C02 exam.


What is Amazon DynamoDB?

Amazon DynamoDB is a fully managed, serverless, key-value and document NoSQL database that delivers single-digit millisecond performance at any scale. AWS handles all hardware provisioning, patching, replication, and backups — you only interact with tables and items.

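Core mental model: DynamoDB is not a relational database. There are no joins, no fixed schema beyond the primary key, and no general-purpose SQL engine (PartiQL offers SQL-compatible syntax, but it maps onto the same key-based operations). You design your access patterns first, then model your data around them.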

When to choose DynamoDB:

  • Need predictable low latency at massive scale
  • Traffic is spiky or unpredictable (on-demand mode)
  • Can define access patterns upfront
  • Need a fully managed, serverless data store

Tables, Items, and Attributes

| Concept | DynamoDB equivalent | RDBMS equivalent |
|---|---|---|
| Table | Table | Table |
| Item | Row / document | Row |
| Attribute | Field | Column |
| Primary key | Primary key | Primary key |

  • A table is a collection of items. No fixed schema — each item can have different attributes.
  • An item is a single data record. Maximum size: 400 KB.
  • An attribute is a name-value pair. Supported types: String (S), Number (N), Binary (B), Boolean (BOOL), Null (NULL), List (L), Map (M), StringSet (SS), NumberSet (NS), BinarySet (BS).

```json
{
  "userId": "usr_01J8X",
  "email": "alice@example.com",
  "createdAt": "2024-01-15T10:30:00Z",
  "profile": {
    "name": "Alice Johnson",
    "tier": "premium"
  },
  "tags": ["admin", "beta-tester"]
}
```

Primary Key Design

The primary key is the single most important design decision in DynamoDB. Get it wrong and you will either hit hot partitions or be unable to query your data efficiently.

Option 1 — Partition Key Only (Simple Key)

Every item must have a unique partition key. DynamoDB hashes the PK to determine which physical partition stores the item.

Table: Users (PK: userId, must be unique)

| userId | email | name |
|---|---|---|
| usr_01J8X | alice@example.com | Alice |
| usr_02K9Y | bob@example.com | Bob |

Option 2 — Partition Key + Sort Key (Composite Key)

The PK + SK combination must be unique. Multiple items can share the same PK — they are stored together and sorted by SK. This enables efficient range queries.

Table: Orders (PK: customerId, SK: orderId, formatted as ORDER#<ISO timestamp>)

| customerId | SK (orderId) | amount | status |
|---|---|---|---|
| cust_A | ORDER#2024-01-01T10:00:00 | 49.99 | PAID |
| cust_A | ORDER#2024-01-15T14:30:00 | 129.00 | SHIPPED |
| cust_A | ORDER#2024-02-01T09:00:00 | 19.99 | PENDING |
| cust_B | ORDER#2024-01-10T11:00:00 | 75.00 | PAID |

Query pattern enabled: "Give me all orders for customer A, sorted by date" — a single efficient Query operation.
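
As a rough illustration of why this key shape supports date-ordered queries: ISO 8601 timestamps sort lexicographically in chronological order, so a sort key built from them keeps a customer's orders in date order. The sketch below uses a hypothetical helper, `orderSortKey` (not part of any SDK), and plain string sorting to mimic that ordering.

```javascript
// Hypothetical helper: builds the sort key format used in the Orders table above.
function orderSortKey(date) {
  // toISOString() yields e.g. "2024-01-15T14:30:00.000Z"; keep seconds precision.
  return `ORDER#${date.toISOString().slice(0, 19)}`;
}

const keys = [
  orderSortKey(new Date('2024-02-01T09:00:00Z')),
  orderSortKey(new Date('2024-01-01T10:00:00Z')),
  orderSortKey(new Date('2024-01-15T14:30:00Z')),
];

// Plain string sort. DynamoDB orders String sort keys by their UTF-8 bytes,
// which gives the same result for ASCII keys like these.
const sorted = [...keys].sort();
console.log(sorted);
```

Because the ordering lives in the key itself, `ScanIndexForward: false` on a Query is enough to return the newest orders first.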

Capacity Modes

Provisioned Capacity

You specify Read Capacity Units (RCU) and Write Capacity Units (WCU) in advance. AWS guarantees that throughput. Use Auto Scaling to adjust capacity based on CloudWatch metrics.

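| Unit | What it covers |
|---|---|
| 1 RCU | 1 strongly consistent read per second of an item ≤ 4 KB, or 2 eventually consistent reads per second of items ≤ 4 KB |
| 1 WCU | 1 write per second of an item ≤ 1 KB |
| Transactional reads | 2 RCU per 4 KB |
| Transactional writes | 2 WCU per 1 KB |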

Capacity math example: Read 10 KB item with strong consistency = 10 KB ÷ 4 KB = 2.5 → rounded up = 3 RCU
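
The rounding rule in the example above can be sketched as a small calculator. The function names `readCapacityUnits` and `writeCapacityUnits` are illustrative, not part of any AWS SDK, and the figures cover a single request for one item rather than a full Query or Scan.

```javascript
// Capacity units for a single-item request, following the table above.
// mode: 'strong' | 'eventual' | 'transactional'
function readCapacityUnits(itemSizeKB, mode = 'strong') {
  const blocks = Math.ceil(itemSizeKB / 4); // reads are billed in 4 KB steps
  if (mode === 'eventual') return blocks / 2;
  if (mode === 'transactional') return blocks * 2;
  return blocks;
}

function writeCapacityUnits(itemSizeKB, transactional = false) {
  const blocks = Math.ceil(itemSizeKB); // writes are billed in 1 KB steps
  return transactional ? blocks * 2 : blocks;
}

console.log(readCapacityUnits(10, 'strong'));        // 3
console.log(readCapacityUnits(10, 'eventual'));      // 1.5
console.log(readCapacityUnits(10, 'transactional')); // 6
console.log(writeCapacityUnits(2.5));                // 3
```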

```bash
aws dynamodb create-table \
  --table-name Orders \
  --attribute-definitions \
      AttributeName=customerId,AttributeType=S \
      AttributeName=orderId,AttributeType=S \
  --key-schema \
      AttributeName=customerId,KeyType=HASH \
      AttributeName=orderId,KeyType=RANGE \
  --billing-mode PROVISIONED \
  --provisioned-throughput ReadCapacityUnits=10,WriteCapacityUnits=5
```

On-Demand Capacity

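No capacity planning: DynamoDB scales automatically with traffic and you pay per request. Scaling is not unbounded in the short term, though: a table that receives more than double its previous peak within 30 minutes can still be throttled.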

  • Best for: Unpredictable traffic, new tables where usage is unknown, development/test
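  • Cost: noticeably more expensive per request than fully utilized provisioned capacity at sustained load (a multiple of roughly 6–7× is often quoted; the exact ratio depends on current pricing and utilization)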
  • Switch between modes once per 24 hours

Core Read Operations

GetItem — Single Item Lookup

Retrieves exactly one item by its full primary key. Most efficient operation.

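```javascript
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, GetCommand } from '@aws-sdk/lib-dynamodb';

// Document client reused by the other examples in this guide
const docClient = DynamoDBDocumentClient.from(new DynamoDBClient({}));

const result = await docClient.send(new GetCommand({
  TableName: 'Orders',
  Key: {
    customerId: 'cust_A',
    orderId: 'ORDER#2024-01-01T10:00:00',
  },
  ConsistentRead: true,  // strongly consistent: 1 RCU instead of 0.5 for an item ≤ 4 KB
}));

console.log(result.Item);  // undefined if no item matches the key
```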

Query — Items Sharing a Partition Key

Returns all items with the same PK, optionally filtered by SK. Results are always sorted by SK. Most common operation for composite key tables.

```javascript
import { QueryCommand } from '@aws-sdk/lib-dynamodb';

// Get all orders for cust_A in January 2024
// (assumes docClient is an initialized DynamoDBDocumentClient)
const result = await docClient.send(new QueryCommand({
  TableName: 'Orders',
  KeyConditionExpression: 'customerId = :cid AND orderId BETWEEN :start AND :end',
  ExpressionAttributeValues: {
    ':cid':   'cust_A',
    ':start': 'ORDER#2024-01-01',
    ':end':   'ORDER#2024-01-31T23:59:59',  // BETWEEN is inclusive; keys compare as strings
  },
  ScanIndexForward: false,  // descending order (newest first)
  Limit: 20,
}));
```

Scan — Full Table Read

Reads every item in the table (or index). Extremely expensive on large tables. Avoid in production hot paths.

```javascript
import { ScanCommand } from '@aws-sdk/lib-dynamodb';

// Parallel scan: splits the table into 4 segments for faster reads
const segment = 0; // Run this concurrently with segments 1, 2, 3
const result = await docClient.send(new ScanCommand({
  TableName: 'Orders',
  TotalSegments: 4,
  Segment: segment,
  FilterExpression: 'amount > :min',
  ExpressionAttributeValues: { ':min': 100 },
}));
```
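
The snippet above scans a single segment and reads only its first page. A minimal sketch of running every segment concurrently and following `LastEvaluatedKey` until each one is exhausted might look like the following. `scanPage` is a hypothetical injected function standing in for `docClient.send(new ScanCommand(params))`, which keeps the sketch runnable without AWS access.

```javascript
// scanPage(params) is assumed to resolve to { Items, LastEvaluatedKey },
// e.g. (params) => docClient.send(new ScanCommand(params)).
async function scanSegment(scanPage, baseParams, segment, totalSegments) {
  const items = [];
  let startKey;
  do {
    const page = await scanPage({
      ...baseParams,
      Segment: segment,
      TotalSegments: totalSegments,
      ExclusiveStartKey: startKey,
    });
    items.push(...(page.Items ?? []));
    startKey = page.LastEvaluatedKey; // undefined once the segment is exhausted
  } while (startKey);
  return items;
}

async function parallelScan(scanPage, baseParams, totalSegments = 4) {
  const segments = Array.from({ length: totalSegments }, (_, i) =>
    scanSegment(scanPage, baseParams, i, totalSegments));
  return (await Promise.all(segments)).flat();
}
```

A parallel scan finishes sooner but consumes the same total read capacity in less time, so it can throttle other traffic on a provisioned table.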

Filter vs KeyCondition: FilterExpression is applied after items are read — you are still charged RCUs for all scanned items, even those filtered out. Only KeyConditionExpression reduces RCU consumption.


Core Write Operations

```javascript
import { PutCommand, UpdateCommand, DeleteCommand } from '@aws-sdk/lib-dynamodb';

// PutItem: creates or completely replaces an item
await docClient.send(new PutCommand({
  TableName: 'Orders',
  Item: {
    customerId: 'cust_A',
    orderId: 'ORDER#2024-03-01T12:00:00',
    amount: 59.99,
    status: 'PENDING',
  },
  ConditionExpression: 'attribute_not_exists(customerId)',  // fail if already exists
}));

// UpdateItem: modify specific attributes without replacing the whole item
await docClient.send(new UpdateCommand({
  TableName: 'Orders',
  Key: { customerId: 'cust_A', orderId: 'ORDER#2024-03-01T12:00:00' },
  UpdateExpression: 'SET #s = :status, updatedAt = :ts ADD version :one',
  ConditionExpression: '#s = :pending',  // only ship an order that is still pending
  ExpressionAttributeNames: { '#s': 'status' },  // "status" is a reserved word
  // One map holds the values for both the update and the condition expression
  ExpressionAttributeValues: {
    ':status':  'SHIPPED',
    ':ts':      new Date().toISOString(),
    ':one':     1,
    ':pending': 'PENDING',
  },
}));

// DeleteItem: remove an item
await docClient.send(new DeleteCommand({
  TableName: 'Orders',
  Key: { customerId: 'cust_A', orderId: 'ORDER#2024-03-01T12:00:00' },
  ConditionExpression: 'attribute_exists(customerId)',  // fail if the item is already gone
}));
```

Expressions Cheat Sheet

The native DynamoDB API uses expression strings rather than SQL. These are the five expression types:

| Expression | Purpose | Used in |
|---|---|---|
| KeyConditionExpression | Select items by PK (equality) and optionally SK (range or prefix) | Query |
| FilterExpression | Post-read filter on any attribute | Query, Scan |
| ConditionExpression | Abort write if condition fails | Put, Update, Delete, Transact |
| UpdateExpression | Specify attribute changes | UpdateItem |
| ProjectionExpression | Return only specific attributes | All reads |

UpdateExpression verbs:

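| Verb | Example | Purpose |
|---|---|---|
| SET | SET #n = :val, profile.tier = :tier | Set or overwrite an attribute |
| REMOVE | REMOVE tags[0], optionalField | Delete an attribute or list element |
| ADD | ADD viewCount :one | Atomic increment (Number) or union (Set) |
| DELETE | DELETE tags :removeSet | Remove elements from a Set attribute |

In the SET example, #n is a placeholder mapped to the attribute name "name" through ExpressionAttributeNames, because "name" is a DynamoDB reserved word and cannot appear literally in an expression.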
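
To illustrate how a SET clause and its placeholders fit together, here is a minimal sketch of a builder that aliases every attribute name, which sidesteps reserved words such as "status" and "name". `buildSetExpression` is a hypothetical helper, not an SDK function, and it handles top-level attributes only.

```javascript
// Hypothetical helper: turns { status: 'SHIPPED', name: 'Alice' } into a SET
// expression whose attribute names and values are all placeholders.
function buildSetExpression(changes) {
  const names = {};
  const values = {};
  const clauses = Object.entries(changes).map(([attr, value], i) => {
    names[`#a${i}`] = attr;
    values[`:v${i}`] = value;
    return `#a${i} = :v${i}`;
  });
  return {
    UpdateExpression: `SET ${clauses.join(', ')}`,
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values,
  };
}

console.log(buildSetExpression({ status: 'SHIPPED', name: 'Alice' }));
```

The returned object can be spread into an UpdateCommand input next to TableName and Key.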

ConditionExpression functions:

| Function | Purpose |
|---|---|
| attribute_exists(path) | Attribute must exist |
| attribute_not_exists(path) | Attribute must NOT exist; use for safe inserts |
| attribute_type(path, type) | Check attribute data type |
| begins_with(path, substr) | String prefix check |
| contains(path, operand) | String contains or Set membership |
| size(path) | Length of string, list, map, or set |

Secondary Indexes

When you need to query by attributes other than the primary key, use indexes.

Global Secondary Index (GSI)

  • Different PK and SK from the base table — enables completely new access patterns
  • Has its own provisioned throughput (or inherits on-demand)
  • Only supports eventually consistent reads
  • Can be added or deleted at any time
  • Up to 20 GSIs per table
  • Items appear in the GSI only if they have the GSI's key attributes

```bash
aws dynamodb update-table --table-name Orders \
  --attribute-definitions \
      AttributeName=status,AttributeType=S \
      AttributeName=createdAt,AttributeType=S \
  --global-secondary-index-updates '[{
    "Create": {
      "IndexName": "status-createdAt-index",
      "KeySchema": [
        {"AttributeName":"status","KeyType":"HASH"},
        {"AttributeName":"createdAt","KeyType":"RANGE"}
      ],
      "Projection": {"ProjectionType":"ALL"},
      "ProvisionedThroughput": {"ReadCapacityUnits":5,"WriteCapacityUnits":5}
    }
  }]'
```
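
Once the index exists, querying it differs from a base-table Query mainly by the `IndexName` parameter. The sketch below only builds the request input, so it runs without AWS access. `buildOrdersByStatusQuery` is a hypothetical helper, and it assumes items carry a `createdAt` ISO string as defined in the index above.

```javascript
// Builds the input for a Query against the status-createdAt-index GSI.
function buildOrdersByStatusQuery(status, createdSince) {
  return {
    TableName: 'Orders',
    IndexName: 'status-createdAt-index',
    // "status" is a DynamoDB reserved word, so it is aliased as #s.
    KeyConditionExpression: '#s = :status AND createdAt >= :since',
    ExpressionAttributeNames: { '#s': 'status' },
    ExpressionAttributeValues: { ':status': status, ':since': createdSince },
    ScanIndexForward: false, // newest first
    // ConsistentRead is left unset: GSIs reject strongly consistent reads.
  };
}

const params = buildOrdersByStatusQuery('PENDING', '2024-01-01');
console.log(params.KeyConditionExpression);
// With the document client: await docClient.send(new QueryCommand(params));
```

A low-cardinality partition key such as status concentrates traffic on a few index partitions, so this index shape suits modest volumes better than very hot workloads.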

Local Secondary Index (LSI)

  • Same PK as the base table, different SK — enables range queries on a different sort dimension
  • Must be created at table creation time — cannot add or delete later
  • Shares the base table's throughput (no separate capacity)
  • Supports strongly consistent reads (unlike GSI)
  • Up to 5 LSIs per table
  • Items with the same PK across base table + LSI share a 10 GB item collection limit

| Feature | GSI | LSI |
|---|---|---|
| PK | Any attribute | Must match base table PK |
| SK | Any attribute | Any attribute (different from base) |
| Consistency | Eventually consistent only | Strongly or eventually consistent |
| Throughput | Separate (provisioned/on-demand) | Shared with base table |
| Creation | Any time | Table creation only |
| Deletion | Supported | Not supported |
| Limit | 20 per table | 5 per table |
| Item collection limit | None | 10 GB per PK value |

Transactions

DynamoDB transactions allow ACID operations across multiple items and tables in a single all-or-nothing request.

```javascript
import { TransactWriteCommand, TransactGetCommand } from '@aws-sdk/lib-dynamodb';

// Transfer funds: debit one account, credit another atomically
await docClient.send(new TransactWriteCommand({
  TransactItems: [
    {
      Update: {
        TableName: 'Accounts',
        Key: { accountId: 'ACC-001' },
        UpdateExpression: 'ADD balance :debit',
        ConditionExpression: 'balance >= :amount',
        ExpressionAttributeValues: { ':debit': -100, ':amount': 100 },
      },
    },
    {
      Update: {
        TableName: 'Accounts',
        Key: { accountId: 'ACC-002' },
        UpdateExpression: 'ADD balance :credit',
        ExpressionAttributeValues: { ':credit': 100 },
      },
    },
    {
      Put: {
        TableName: 'Transactions',
        Item: {
          txId: 'TX-' + Date.now(),
          from: 'ACC-001', to: 'ACC-002', amount: 100,
          createdAt: new Date().toISOString(),
        },
        ConditionExpression: 'attribute_not_exists(txId)',  // idempotency check
      },
    },
  ],
}));
```

Transaction limits:

  • Up to 100 unique items per TransactWriteItems or TransactGetItems
  • Spans multiple tables (within the same region and account)
  • Costs 2× the normal RCU/WCU (transaction overhead)
  • Not supported on GSIs directly (write to base table, GSI updates automatically)
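
When any condition in a transaction fails, the whole request is rejected with a TransactionCanceledException whose CancellationReasons list lines up with the TransactItems array. A minimal sketch of reading it follows. `failedTransactItems` is a hypothetical helper and the error object is simulated for illustration, not captured from a real call.

```javascript
// Returns the positions (and reason codes) of the TransactItems entries that
// caused a TransactWriteItems call to be cancelled.
function failedTransactItems(err) {
  if (err?.name !== 'TransactionCanceledException') return [];
  return (err.CancellationReasons ?? [])
    .map((reason, index) => ({ index, code: reason.Code }))
    .filter(({ code }) => code && code !== 'None');
}

// Simulated error shape, for illustration only.
const simulated = {
  name: 'TransactionCanceledException',
  CancellationReasons: [
    { Code: 'ConditionalCheckFailed' }, // e.g. the balance check on ACC-001
    { Code: 'None' },
    { Code: 'None' },
  ],
};
console.log(failedTransactItems(simulated));
```

In a real handler this would sit in the catch block around `docClient.send(new TransactWriteCommand(...))`. None of the three writes is applied when the list contains a failure.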

DynamoDB Streams

Streams capture a time-ordered log of every item change (insert, update, delete) in the table.

Stream view types (what data each record contains):

| View Type | Contents |
|---|---|
| KEYS_ONLY | Only the key attributes of the modified item |
| NEW_IMAGE | The entire item after the change |
| OLD_IMAGE | The entire item before the change |
| NEW_AND_OLD_IMAGES | Both before and after; most useful for auditing |

```bash
aws dynamodb update-table --table-name Orders \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES
```

Common use cases: Cross-region replication, search index maintenance (Elasticsearch/OpenSearch), audit logging, cache invalidation, event-driven triggers.
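
As a sketch of what a consumer sees, the handler below summarizes each stream record from a table using NEW_AND_OLD_IMAGES. It assumes items have a `status` string attribute, as in the Orders examples. Images arrive in DynamoDB JSON (for example `{ status: { S: 'PAID' } }`), and `summarizeRecord` is a hypothetical helper rather than part of any SDK.

```javascript
// Summarizes one DynamoDB Streams record as delivered to Lambda.
function summarizeRecord(record) {
  const { eventName, dynamodb, userIdentity } = record;
  const oldStatus = dynamodb.OldImage?.status?.S;
  const newStatus = dynamodb.NewImage?.status?.S;
  // TTL deletions are REMOVE events issued by the DynamoDB service itself.
  const expiredByTtl =
    eventName === 'REMOVE' &&
    userIdentity?.type === 'Service' &&
    userIdentity?.principalId === 'dynamodb.amazonaws.com';
  return { eventName, keys: dynamodb.Keys, oldStatus, newStatus, expiredByTtl };
}

// Lambda entry point: export it in whatever module style the function uses.
async function handler(event) {
  const summaries = event.Records.map(summarizeRecord);
  for (const summary of summaries) console.log(JSON.stringify(summary));
  return summaries;
}
```

Comparing `oldStatus` with `newStatus` is the kind of check an audit log or cache invalidation consumer would build on.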


DynamoDB Accelerator (DAX)

DAX is an in-memory cache specifically built for DynamoDB. It provides microsecond read latency for eventually consistent reads.

| Feature | Detail |
|---|---|
| Read latency | Microseconds (vs. single-digit ms for DynamoDB) |
| Write behavior | Write-through: writes go to DynamoDB first, then the cache |
| API compatibility | Same DynamoDB API; minimal code change |
| Consistency | Eventually consistent reads only (strongly consistent reads are passed through to DynamoDB and not cached) |
| Item cache TTL | Default 5 minutes |
| Query cache TTL | Default 5 minutes |
| VPC | Deployed inside your VPC |

```javascript
// DAX client for AWS SDK for JavaScript v3
import { DaxDocument } from '@amazon-dax-sdk/lib-dax';

// Point the document client at the DAX cluster endpoint; request parameters stay the same
const daxClient = new DaxDocument({
  endpoints: ['my-dax-cluster.abc123.dax-clusters.us-east-1.amazonaws.com:8111'],
  region: 'us-east-1',
});
```

When NOT to use DAX: Write-heavy workloads, strongly-consistent reads required, Lambda functions with very infrequent invocations (cold start cost of maintaining DAX connection), or applications that are already fast enough without caching.


Time to Live (TTL)

TTL automatically deletes items past a specified expiry timestamp — free of charge, no WCU consumed.

```javascript
import { PutCommand } from '@aws-sdk/lib-dynamodb';

// Store a session that expires in 24 hours
const expiresAt = Math.floor(Date.now() / 1000) + 86400; // Unix epoch seconds

await docClient.send(new PutCommand({
  TableName: 'Sessions',
  Item: {
    sessionId: 'sess_xyz',
    userId: 'usr_01J8X',
    data: { theme: 'dark' },
    expiresAt,  // DynamoDB reads this attribute for TTL
  },
}));
```

```bash
# Enable TTL on the "expiresAt" attribute
aws dynamodb update-time-to-live \
  --table-name Sessions \
  --time-to-live-specification Enabled=true,AttributeName=expiresAt
```

Important TTL facts:

  • Deletion is asynchronous — items may linger for up to 48 hours past expiry
  • Expired-but-not-yet-deleted items still appear in reads, queries, and scans; use a filter expression (or an application-side check) to exclude them
  • TTL deletions appear in DynamoDB Streams with userIdentity.type = "Service"
  • TTL attribute must be a Number storing Unix epoch seconds
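
Since deletion is asynchronous, a reader that must never see stale sessions can filter on the TTL attribute itself. The sketch below wraps any Query or Scan input with such a filter. `withTtlFilter` is a hypothetical helper, and items without the TTL attribute are deliberately kept.

```javascript
// Adds a filter that hides items whose TTL attribute is already in the past.
// ttlAttribute defaults to the "expiresAt" attribute used above.
function withTtlFilter(params, ttlAttribute = 'expiresAt', nowMs = Date.now()) {
  const ttlFilter = '(attribute_not_exists(#ttl) OR #ttl > :now)';
  return {
    ...params,
    FilterExpression: params.FilterExpression
      ? `(${params.FilterExpression}) AND ${ttlFilter}`
      : ttlFilter,
    ExpressionAttributeNames: { ...params.ExpressionAttributeNames, '#ttl': ttlAttribute },
    ExpressionAttributeValues: {
      ...params.ExpressionAttributeValues,
      ':now': Math.floor(nowMs / 1000), // TTL values are epoch seconds
    },
  };
}

const sessionQuery = withTtlFilter({
  TableName: 'Sessions',
  KeyConditionExpression: 'sessionId = :sid',
  ExpressionAttributeValues: { ':sid': 'sess_xyz' },
});
console.log(sessionQuery.FilterExpression);
```

GetItem has no FilterExpression, so a single-item read needs the same comparison in application code.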

Hot Partition Problem & Solutions

DynamoDB distributes data across physical partitions by hashing the partition key. If many requests hit the same PK value, you get a hot partition — a bottleneck.

Symptoms: ProvisionedThroughputExceededException, high ThrottledRequests metric, uneven latency.

Write Sharding

Distribute writes across multiple partitions by appending a random suffix to the PK:

```javascript
import { PutCommand, QueryCommand } from '@aws-sdk/lib-dynamodb';
// Assumes docClient is an initialized DynamoDBDocumentClient and `event` is the payload to store

// ❌ HOT PARTITION: all writes go to the same partition key
await docClient.send(new PutCommand({
  TableName: 'Events',
  Item: { pk: 'EVENT_LOG', sk: Date.now(), data: event },
}));

// ✅ SHARDED: spread across 10 shards
const SHARD_COUNT = 10;
const shard = Math.floor(Math.random() * SHARD_COUNT);
await docClient.send(new PutCommand({
  TableName: 'Events',
  Item: { pk: `EVENT_LOG#${shard}`, sk: Date.now(), data: event },
}));

// To read all shards, query each shard in parallel
const queries = Array.from({ length: SHARD_COUNT }, (_, i) =>
  docClient.send(new QueryCommand({
    TableName: 'Events',
    KeyConditionExpression: 'pk = :pk',
    ExpressionAttributeValues: { ':pk': `EVENT_LOG#${i}` },
  }))
);
const results = await Promise.all(queries);  // one Query result per shard
```

Single-Table Design Pattern

Store multiple entity types in a single table using generic PK/SK names and a type attribute. Because DynamoDB has no joins, this lets a single Query fetch related entities together, and it reduces operational overhead.

| PK | SK | type | attributes |
|---|---|---|---|
| USER#usr_01 | PROFILE | user | name, email |
| USER#usr_01 | ORDER#2024-01-01 | order | amount, status |
| USER#usr_01 | ORDER#2024-01-15 | order | amount, status |
| PRODUCT#prod_A | METADATA | product | name, price |
| PRODUCT#prod_A | REVIEW#rev_001 | review | rating, text |
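
To make the layout concrete, here is a minimal sketch of key builders and the two Query inputs this table shape supports. The helper names and the table name `AppTable` are hypothetical. The code only builds request objects, so it runs without AWS access.

```javascript
// Hypothetical key builders for the layout in the table above.
const userPk = (userId) => `USER#${userId}`;
const orderSk = (isoDate) => `ORDER#${isoDate}`;

// One Query returns only the user's orders: same PK, SK starting with "ORDER#".
function buildUserOrdersQuery(tableName, userId) {
  return {
    TableName: tableName,
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :prefix)',
    ExpressionAttributeValues: { ':pk': userPk(userId), ':prefix': 'ORDER#' },
  };
}

// Dropping the begins_with clause returns the profile and the orders together.
function buildUserItemCollectionQuery(tableName, userId) {
  return {
    TableName: tableName,
    KeyConditionExpression: 'PK = :pk',
    ExpressionAttributeValues: { ':pk': userPk(userId) },
  };
}

const sampleOrderKey = { PK: userPk('usr_01'), SK: orderSk('2024-01-01') };
console.log(sampleOrderKey, buildUserOrdersQuery('AppTable', 'usr_01'));
```

The second query is the single-table substitute for a join: the profile and its orders come back in one request, distinguished by the type attribute.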

Batch Operations

```javascript
import { BatchGetCommand, BatchWriteCommand } from '@aws-sdk/lib-dynamodb';

// BatchGetItem: up to 100 items across multiple tables
const result = await docClient.send(new BatchGetCommand({
  RequestItems: {
    'Users':  { Keys: [{ userId: 'usr_01' }, { userId: 'usr_02' }] },
    'Orders': { Keys: [{ customerId: 'cust_A', orderId: 'ORDER#001' }] },
  },
}));
// result.Responses holds the items found, grouped by table name.
// result.UnprocessedKeys holds keys that were not processed (for example, due to
// throttling); retry them with exponential backoff.

// BatchWriteItem: up to 25 put/delete operations across tables
await docClient.send(new BatchWriteCommand({
  RequestItems: {
    'Orders': [
      { PutRequest: { Item: { customerId: 'cust_A', orderId: 'ORDER#002', amount: 25 } } },
      { DeleteRequest: { Key: { customerId: 'cust_A', orderId: 'ORDER#001' } } },
    ],
  },
}));
```

BatchWriteItem does NOT support UpdateItem — only Put and Delete. For updates, use TransactWriteItems or individual UpdateItem calls.
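
A batch write can succeed as a call while leaving some requests unprocessed, so callers usually loop until UnprocessedItems comes back empty. The sketch below chunks requests into groups of 25 and retries with exponential backoff. `sendBatch` is a hypothetical injected function standing in for `docClient.send(new BatchWriteCommand({ RequestItems }))`, and the delays are illustrative rather than tuned.

```javascript
// Splits write requests into groups of at most 25 (the BatchWriteItem limit).
function chunk(requests, size = 25) {
  const groups = [];
  for (let i = 0; i < requests.length; i += size) {
    groups.push(requests.slice(i, i + size));
  }
  return groups;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// sendBatch(requestItems) is assumed to resolve to { UnprocessedItems }.
async function batchWriteAll(sendBatch, tableName, requests, maxAttempts = 5) {
  for (const group of chunk(requests)) {
    let pending = { [tableName]: group };
    for (let attempt = 0; Object.keys(pending).length > 0; attempt++) {
      if (attempt === maxAttempts) throw new Error('Unprocessed items remain');
      if (attempt > 0) await sleep(2 ** attempt * 50); // 100 ms, 200 ms, 400 ms, ...
      const { UnprocessedItems = {} } = await sendBatch(pending);
      pending = UnprocessedItems; // empty object once everything was written
    }
  }
}
```

Adding random jitter to each delay is a common refinement when many writers retry at the same time.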


DVA-C02 Quick Reference

| Topic | Key Fact |
|---|---|
| Max item size | 400 KB |
| Partition key only | Each PK value must be unique |
| Composite key | PK + SK combination must be unique |
| 1 RCU (strongly consistent) | 1 read of ≤ 4 KB |
| 1 WCU | 1 write of ≤ 1 KB |
| Transactional cost | 2× RCU/WCU |
| Switch capacity modes | Once per 24 hours |
| GSI consistency | Eventually consistent only |
| LSI consistency | Strongly or eventually consistent |
| LSI creation | Table creation time only |
| Max GSIs per table | 20 |
| Max LSIs per table | 5 |
| Max items (batch get) | 100 items |
| Max items (transact write) | 100 items |
| Streams retention | 24 hours |
| TTL deletion delay | Up to 48 hours after expiry |
| DAX write behavior | Write-through |
| DAX consistency | Eventually consistent reads only |
| Hot partition fix | Write sharding (random suffix) |
| Safe insert pattern | attribute_not_exists(pk) condition |

Practice Questions (10)

Q1 (easy). What is the difference between a Partition Key and a Sort Key in DynamoDB?

Q2 (easy). A developer runs a DynamoDB Scan on a table with 100 GB of data. What is the expected behavior and cost concern?

Q3 (medium). A developer needs to write an item to DynamoDB only if an item with that partition key does not already exist. What is the correct approach?

Q4 (hard). A DynamoDB table stores IoT sensor readings with sensorId as the partition key. The table is experiencing hot partitions because only 10 sensors generate 90% of writes. Which strategy best distributes the write load?

Q5 (medium). Which TWO DynamoDB features are best suited for improving read performance on a read-heavy application? (Choose 2)

Q6 (medium). A developer needs to update a DynamoDB item's "views" counter atomically, incrementing it by 1. What is the correct approach?

Q7 (medium). A developer enables DynamoDB Streams on a table and creates a Lambda event source mapping. Which stream view type includes both the old and new item images in every stream record?

Q8 (hard). A developer creates a DynamoDB GSI with the same attributes as the base table's key but in reverse (table: PK=userId, SK=orderId; GSI: PK=orderId, SK=userId). What behavior should the developer expect when querying the GSI?

Q9 (hard). A developer performs a DynamoDB TransactWriteItems operation with 3 PutItem actions. One of the items fails a condition check. What happens to the other two items?

Q10 (medium). A DynamoDB table has TTL enabled on the "expiresAt" attribute. An item has expiresAt set to a Unix timestamp that passed 2 hours ago. What is the state of this item?