# Preventing Race Conditions in AWS DynamoDB with Lambda and Kinesis Pipelines
## Summary
A pipeline processes DynamoDB updates via Kinesis and Lambda functions:
- **Lambda A** reads items and publishes versioned events (`{key, version, ...}`) to Kinesis
- Kinesis partitions events by key
- **Lambda B** consumes events and writes updates to DynamoDB

**Problem:** Retries and concurrent processing occasionally let stale data win (e.g., `version=41` overwriting `version=42`) despite partitioning.
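
To make the event shape concrete, here is a minimal sketch of how Lambda A might build a versioned event and partition it by key. The stream name `item-updates`, the `payload` field, and the helper name are assumptions for illustration; only `key` and `version` come from the pipeline described above.

```python
import json


def build_put_record_kwargs(stream_name: str, item: dict) -> dict:
    """Build keyword arguments for a Kinesis PutRecord call for one versioned event."""
    event = {"key": item["key"], "version": item["version"], "payload": item.get("payload")}
    return {
        "StreamName": stream_name,
        # Using the item key as the partition key routes every event for that key to one shard.
        "PartitionKey": str(item["key"]),
        "Data": json.dumps(event).encode("utf-8"),
    }


# Hypothetical item and stream name, for illustration only.
kwargs = build_put_record_kwargs(
    "item-updates", {"key": "user#1", "version": 42, "payload": {"name": "Ada"}}
)
# With boto3 this would be sent as: boto3.client("kinesis").put_record(**kwargs)
```

Partitioning this way gives per-key ordering inside the stream, but it says nothing about how many times or how concurrently a consumer applies each event.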
## Root Cause
- **Non-atomic read-modify-write:** Lambda B checks the current version with `get_item` before writing. Because of Kinesis consumer scaling and Lambda retries, concurrent executions may:
  1. Read the same stale item state
  2. Perform conflicting writes
- **Kinesis parallelism behavior:** Per-key ordering doesn’t prevent overlapping Lambda invocations from multiple shards or batches.
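
The interleaving behind the first cause can be reproduced deterministically without AWS. The sketch below uses a plain dictionary as a stand-in for the table and hypothetical helper names; it models only the check-then-write ordering, not DynamoDB itself.

```python
# In-memory stand-in for the DynamoDB table (an assumption for illustration).
store = {"user#1": {"key": "user#1", "version": 40}}


def read_version(key):
    """Step 1 of check-then-write: read the current version (like get_item)."""
    item = store.get(key)
    return item["version"] if item else None


def write_if_check_passed(event, version_seen):
    """Step 2: write only if the event beat the version read earlier (like put_item)."""
    if version_seen is None or event["version"] > version_seen:
        store[event["key"]] = event
        return True
    return False


stale = {"key": "user#1", "version": 41}
fresh = {"key": "user#1", "version": 42}

# Both invocations read before either one writes, so both see version 40.
seen_by_stale = read_version("user#1")
seen_by_fresh = read_version("user#1")

# The fresh write lands first, then the stale write lands on top of it.
write_if_check_passed(fresh, seen_by_fresh)
write_if_check_passed(stale, seen_by_stale)

final_version = store["user#1"]["version"]
print(final_version)  # 41: the stale event overwrote version 42
```

Each invocation's check was correct for the state it read. The failure comes from the gap between the read and the write.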
## Why This Happens in Real Systems
- Distributed systems **retry operations automatically** (e.g., Lambda/Kinesis retries on failure).
- Concurrency arises from:
  - Multiple Kinesis shards processing distinct partitions
  - Multiple batches processing simultaneously
- Developers assume partitioning **guarantees serial execution** (it doesn’t).
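
The retry behavior is easy to underestimate. By default, when a Lambda invocation fails on a Kinesis batch, the whole batch is delivered again, including records that were already handled. The sketch below simulates that with a hypothetical handler that fails once on its third record; the retry loop is a simplified model, not the Lambda service's actual implementation.

```python
processed = []  # every record version the handler touched, in order
attempts = {"count": 0}


def handle_batch(batch):
    """Hypothetical handler that fails on the third record during the first attempt only."""
    attempts["count"] += 1
    for index, record in enumerate(batch):
        if attempts["count"] == 1 and index == 2:
            raise RuntimeError("transient failure")
        processed.append(record["version"])


def deliver_with_retry(batch, max_attempts=3):
    """Redeliver the entire batch until the handler succeeds, as a whole-batch retry would."""
    for _ in range(max_attempts):
        try:
            handle_batch(batch)
            return True
        except RuntimeError:
            continue
    return False


batch = [{"key": "user#1", "version": v} for v in (40, 41, 42)]
deliver_with_retry(batch)
print(processed)  # [40, 41, 40, 41, 42]
```

Versions 40 and 41 are applied twice, so every write in the handler has to be safe to repeat.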
## Real-World Impact
- **Data corruption:** Loss of newer data when a stale write "wins."
- **Inconsistent state downstream:** Consumers of corrupted DynamoDB state make incorrect decisions.
- **Hard to debug:** Silent data loss without domain-specific logging.
## Example Code
**Problematic Implementation**:
```python
# Lambda B (potential race condition)
# `dynamodb` is assumed to be a boto3 DynamoDB Table resource.
current = dynamodb.get_item(Key={'key': event['key']}).get('Item')
if not current or event['version'] > current['version']:
    # Another invocation can write between the read above and this write.
    dynamodb.put_item(Item=event)  # Stale data can overwrite newer version
```
**Fixed Implementation**:
```python
# Lambda B (atomic conditional write)
try:
    dynamodb.put_item(
        Item=event,
        # DynamoDB evaluates the condition and performs the write as one atomic operation.
        ConditionExpression='attribute_not_exists(#key) OR version < :new_version',
        ExpressionAttributeNames={'#key': 'key'},  # `key` is a DynamoDB reserved word
        ExpressionAttributeValues={':new_version': event['version']}
    )
except dynamodb.meta.client.exceptions.ConditionalCheckFailedException:
    pass  # Same or newer version exists → safely skip stale write
```
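
As a rough sketch of how the conditional write might sit inside a full batch handler, the function below decodes each Kinesis record and applies it with the same condition expression. `apply_batch`, `InMemoryTable`, `FakeStaleWrite`, and `as_record` are hypothetical names; the in-memory table is a test double that mimics only the version check so the example runs without AWS.

```python
import base64
import json
from types import SimpleNamespace


def apply_batch(table, kinesis_event):
    """Apply each record with a conditional put; return (applied, skipped_stale) counts."""
    stale_error = table.meta.client.exceptions.ConditionalCheckFailedException
    applied = skipped = 0
    for record in kinesis_event["Records"]:
        # Lambda delivers Kinesis payloads base64-encoded under record["kinesis"]["data"].
        item = json.loads(base64.b64decode(record["kinesis"]["data"]))
        try:
            table.put_item(
                Item=item,
                ConditionExpression="attribute_not_exists(#key) OR version < :new_version",
                ExpressionAttributeNames={"#key": "key"},
                ExpressionAttributeValues={":new_version": item["version"]},
            )
            applied += 1
        except stale_error:
            skipped += 1  # an equal or newer version is already stored
    return applied, skipped


class FakeStaleWrite(Exception):
    """Stand-in for ConditionalCheckFailedException in the in-memory table below."""


class InMemoryTable:
    """Hypothetical test double that mimics only the version condition, not DynamoDB."""

    def __init__(self):
        self.items = {}
        self.meta = SimpleNamespace(client=SimpleNamespace(
            exceptions=SimpleNamespace(ConditionalCheckFailedException=FakeStaleWrite)))

    def put_item(self, Item, **_condition):
        current = self.items.get(Item["key"])
        if current is not None and not current["version"] < Item["version"]:
            raise FakeStaleWrite()
        self.items[Item["key"]] = Item


def as_record(item):
    """Wrap an item the way Lambda presents a Kinesis record payload."""
    return {"kinesis": {"data": base64.b64encode(json.dumps(item).encode()).decode()}}


table = InMemoryTable()
event = {"Records": [as_record({"key": "user#1", "version": v}) for v in (42, 41, 42)]}
result = apply_batch(table, event)
print(result, table.items["user#1"]["version"])  # (1, 2) 42
```

Passing the table in as an argument keeps the handler testable: a real boto3 Table resource exposes the same `meta.client.exceptions` path that the double imitates.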
## How Senior Engineers Fix It
- Replace `get_item` → `put_item` logic with **atomic conditional writes** using DynamoDB’s condition expressions (compare-and-set).
- **Handle conflicts gracefully:**
  - Swallow `ConditionalCheckFailedException` for stale writes (it’s not an error).
- **De-duplication:** Include event IDs/sequence numbers upstream to deduplicate retries.
- **Augment monitoring:** Track the `ConditionalCheckFailedRequests` metric via CloudWatch.
- **Limit batch size:** Keep the Kinesis batch size at 1 for ultra-high throughput keys.
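
The de-duplication step can be sketched with the sequence number that Lambda attaches to each Kinesis record. The helper names and the in-memory `done` set are assumptions for illustration: a set like this only survives within one warm execution environment, so a production setup would likely need a durable store, and sequence numbers only catch consumer-side redelivery, not a producer that publishes the same event twice.

```python
def sequence_number(record):
    """Read the per-shard sequence number from a Lambda-delivered Kinesis record."""
    return record["kinesis"]["sequenceNumber"]


def unseen(records, done_ids):
    """Return only the records whose sequence number has not been marked done."""
    return [r for r in records if sequence_number(r) not in done_ids]


done = set()
first_delivery = [
    {"kinesis": {"sequenceNumber": "100"}},
    {"kinesis": {"sequenceNumber": "101"}},
]

for record in unseen(first_delivery, done):
    # The write for this record would happen here; mark it done only after it succeeds.
    done.add(sequence_number(record))

# A retry redelivers the earlier records along with one new record.
redelivery = first_delivery + [{"kinesis": {"sequenceNumber": "102"}}]
remaining = [sequence_number(r) for r in unseen(redelivery, done)]
print(remaining)  # ['102']
```

De-duplication reduces wasted work, but the conditional write remains the safety net, since it rejects a stale event even when the duplicate check misses it.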
## Why Juniors Miss It
- **Concurrency complexity:** Misunderstanding distributed system guarantees: partitioning ≠ serialization.
- **Local testing gaps:** Race conditions rarely surface in single-threaded tests.
- **Assumption