# Production Incident: Worker Timeouts After Migrating Django App From runserver to Gunicorn

## Summary
A Django application migrated from `runserver` to Gunicorn in a Dockerized production environment experienced intermittent worker timeouts (`WORKER TIMEOUT`), which left TMDB API imports only partially processed. The same code ran without issues under the development server. The quick fix was to revert to `runserver`, a stopgap rather than a production-grade solution.

## Root Cause
The timeouts were caused by:
- Long-running synchronous requests exceeding Gunicorn's **default 30-second timeout**
- TMDB API operations blocking Gunicorn's synchronous workers for the full duration of each request
- No explicit `--timeout` configuration in Gunicorn
- The development server (`runserver`) enforcing no request timeout, which masked the problem

## Why This Happens in Real Systems
- Production servers enforce worker timeouts to prevent resource starvation
- Externally-dependent operations are vulnerable to network latency
- Synchronous API calls and unoptimized database queries prolong the request cycle
- Local/test environments rarely simulate production traffic volumes
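
One way to limit exposure to upstream latency is to put an explicit timeout on every outbound call, so a stalled TMDB request fails fast instead of holding a worker until Gunicorn kills it. The sketch below uses only the standard library; `fetch_with_timeout`, its `opener` parameter, and the 10-second default are illustrative assumptions, not part of the original application.

```python
import socket
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_with_timeout(
    url: str,
    timeout_seconds: float = 10.0,
    opener: Callable = urllib.request.urlopen,  # injectable so it can be faked in tests
) -> Optional[bytes]:
    """Return the response body, or None if the upstream call fails or stalls.

    The timeout applies to each blocking socket operation (connect, read),
    not to the request as a whole.
    """
    try:
        with opener(url, timeout=timeout_seconds) as response:
            return response.read()
    except (urllib.error.URLError, socket.timeout) as exc:
        # URLError covers connection failures and HTTP error statuses;
        # socket.timeout covers a read that stalls after connecting.
        print(f"upstream call failed: {exc!r}")
        return None
```

Keeping the outbound timeout well below the Gunicorn worker timeout means the application gets a chance to handle the failure itself.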

## Real-World Impact
- **Data corruption**: Incomplete API processing → partial database records  
- **Reduced availability**: Workers killed → degraded capacity → client errors  
- **Operational overhead**: Manual recovery of failed imports required  

## Example or Code
Gunicorn configuration without timeout safeguards:
```dockerfile
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "--workers", "3", "localmovies.wsgi:application"]
```

Blocking API operation pattern:
```python
def update_movies():
    movies = tmdb_api.fetch_all()  # Synchronous call taking >30s
    for movie in movies:
        Movie.objects.update_or_create(...)  # Expensive DB ops
```

## How Senior Engineers Fix It

**Configure appropriate timeouts:** `gunicorn --timeout 120 ...` (set the value above the worst-case request duration).

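**Use asynchronous processing:** move the import into a background task so it runs outside the HTTP request cycle. The view enqueues it with `update_movies_async.delay()` and returns immediately.

```python
from celery import shared_task

@shared_task
def update_movies_async():
    # Runs in a Celery worker, so Gunicorn's worker timeout
    # no longer applies to the import.
    movies = tmdb_api.fetch_all()
    ...
```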

**Optimize imports:**
- Fetch from the API page by page instead of loading everything in one call
- Batch database writes with `bulk_create`
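
The two optimizations above can be combined in one loop: read the API one page at a time and hand the results to the database in fixed-size batches. The following is a minimal sketch in plain Python; `iter_pages`, `batched`, and the in-memory `fake_fetch_page` are hypothetical stand-ins, since the real TMDB client and the `Movie` model are not shown in this write-up.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_pages(fetch_page: Callable[[int], List[T]], start: int = 1) -> Iterator[T]:
    """Yield items page by page until the source returns an empty page."""
    page = start
    while True:
        items = fetch_page(page)
        if not items:
            return
        yield from items
        page += 1


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Group an iterable into lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical in-memory stand-in for a paginated API endpoint.
FAKE_PAGES = {1: ["a", "b", "c"], 2: ["d", "e"]}


def fake_fetch_page(page: int) -> List[str]:
    return FAKE_PAGES.get(page, [])


for batch in batched(iter_pages(fake_fetch_page), size=2):
    # In the Django app, this is where a call such as
    # Movie.objects.bulk_create(...) would receive the batch.
    print(batch)
```

Note that `bulk_create` only inserts rows by default, so it is not a drop-in replacement for `update_or_create` when existing records must be updated.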

**Tune related worker settings together:**

```bash
gunicorn --workers=4 \
         --timeout=300 \
         --keep-alive=15 \
         --graceful-timeout=90 \
         ...
```
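
The same settings can also be kept in a Gunicorn configuration file, which is a plain Python module, so they are versioned with the code rather than buried in a command line. This is a sketch that mirrors the flag values above; the file name `gunicorn.conf.py` and the bind address (taken from the Dockerfile `CMD` earlier) are assumptions about how the project is laid out.

```python
# gunicorn.conf.py: Gunicorn settings expressed as module-level variables.

bind = "0.0.0.0:8000"   # address and port to listen on
workers = 4             # number of worker processes
timeout = 300           # seconds of silence before a worker is killed and restarted
keepalive = 15          # seconds to wait for the next request on a keep-alive connection
graceful_timeout = 90   # seconds workers get to finish requests after a restart signal
```

It would be loaded with something like `gunicorn -c gunicorn.conf.py localmovies.wsgi:application`.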

**Staging validation:** run a smoke test with production-scale data before deployment.
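
A smoke test of this kind can be as simple as timing the slow operation against the configured worker timeout. The helper below is a hedged sketch: `fits_timeout` and the 50% headroom factor are illustrative choices, and `operation` stands in for whatever import routine is being validated.

```python
import time
from typing import Callable


def fits_timeout(
    operation: Callable[[], object],
    timeout_seconds: float,
    headroom: float = 0.5,
) -> bool:
    """Run `operation` once and check that it used at most
    `headroom` of the worker timeout."""
    started = time.monotonic()
    operation()
    elapsed = time.monotonic() - started
    budget = timeout_seconds * headroom
    print(f"elapsed={elapsed:.2f}s budget={budget:.2f}s")
    return elapsed <= budget
```

Running it against staging data gives an early signal that a request is drifting toward the timeout before production traffic finds out.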

## Why Juniors Miss It

- Assuming development server behavior carries over to production
- Being unaware of the production server's default configuration
- Underestimating how long network-bound operations can take
- Reading logs for application errors rather than infrastructure limits
- Not benchmarking data-heavy operations