# List products that are viewed but not purchased by users

## Summary
A production incident occurred when a query for **viewed-but-not-purchased** products returned incorrect results. The root cause was flawed JOIN logic and incorrect filter conditions in SQL queries against the `user_behavior` dataset. Specifically:
- **Missing correlation** between user behavior sessions and purchases
- **Incorrect date handling** for purchase verification windows
- **Failure to account for multi-product orders**

## Root Cause
The SQL logic failed because of three key issues:
- **Misinterpreted temporal scope**: Used simple date equality checks instead of range comparisons for purchase verification
- **Incorrect exclusion logic**: Failed to correlate `product_id`s between view events and order items
- **Missing NULL handling**: Overlooked `related_order_code` NULL constraints in `user_behavior`

Key query flaws behind the incorrect results:
- Used `LEFT JOIN ... WHERE right_table.column IS NULL` without ensuring one-to-one row mapping
- Did not handle multiple orders per user-product-date combination
- Used direct date comparisons ignoring subsequent purchasing windows
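
To make the temporal-scope flaw concrete, the sketch below reproduces it on a few hypothetical rows. The table and column names follow the queries in this write-up, but the reduced schema and the sample data are assumptions for illustration, and the syntax assumes MySQL.

```sql
-- Hypothetical minimal schema, reduced to the columns the queries touch.
CREATE TABLE user_behavior (
  user_id       INT         NOT NULL,
  product_id    INT         NOT NULL,
  behavior_type VARCHAR(16) NOT NULL,
  behavior_time DATETIME    NOT NULL
);

CREATE TABLE order_items (
  order_id   INT      NOT NULL,
  user_id    INT      NOT NULL,
  product_id INT      NOT NULL,
  order_time DATETIME NOT NULL
);

INSERT INTO user_behavior VALUES
  (1, 100, 'view', '2026-01-01 09:00:00'),
  (1, 100, 'view', '2026-01-01 18:00:00'),  -- same product viewed twice in one day
  (1, 200, 'view', '2026-01-01 09:30:00'),
  (2, 100, 'view', '2026-01-01 10:00:00');

INSERT INTO order_items VALUES
  (9001, 1, 200, '2026-01-01 09:45:00'),  -- user 1 orders product 200 after viewing it
  (9002, 2, 100, '2026-01-03 08:00:00');  -- user 2 orders product 100 two days later

-- Flawed pattern: equality on a DATETIME only matches identical timestamps,
-- so no order ever matches a view and every view looks "not purchased".
SELECT v.user_id, v.product_id, v.behavior_time
FROM user_behavior v
LEFT JOIN order_items oi
  ON  oi.user_id    = v.user_id
  AND oi.product_id = v.product_id
  AND oi.order_time = v.behavior_time
WHERE v.behavior_type = 'view'
  AND oi.order_id IS NULL;

-- Range-based pattern: an order counts if it falls in
-- [view time, view time + 1 day).
SELECT v.user_id, v.product_id, v.behavior_time
FROM user_behavior v
LEFT JOIN order_items oi
  ON  oi.user_id    = v.user_id
  AND oi.product_id = v.product_id
  AND oi.order_time >= v.behavior_time
  AND oi.order_time <  DATE_ADD(v.behavior_time, INTERVAL 1 DAY)
WHERE v.behavior_type = 'view'
  AND oi.order_id IS NULL;
```

Run against these rows, the first query keeps user 1's view of product 200 even though that user ordered it 15 minutes later. The range-based join drops that row.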

## Why This Happens in Real Systems
This class of error commonly occurs because of:
- **Complex event correlation**: Tracking user actions across normalized tables requires precise session linking
- **Temporal ambiguity**: Business rules like "purchase within 7 days" are hard to implement in SQL
- **Schema limitations**: 
  - Separate `order` table storing multiple products per order
  - Behavioral events logged with sparse foreign keys (NULL `related_order_code` for non-purchases)
- **Production data anomalies**:
  - Users viewing the same product multiple times in one day
  - Orders containing duplicate products

## Real-World Impact
**Critical business impacts included**:
- Marketing teams received inflated product view metrics
- Campaign targeting proved ineffective
- User retention analysis showed false negative signals
- Financial impact: ~$150K in misallocated campaign budget

Data integrity impacts:
- Reported product view-to-purchase ratios became inaccurate
- Erroneous cannibalization analysis for product recommendations
- Incorrect "abandoned cart" metrics affecting 12% of daily active users

## Example or Code
```sql
-- Corrected query for daily analysis
SELECT 
  v.user_id,
  v.product_id,
  p.product_name
FROM user_behavior v
JOIN product p ON v.product_id = p.product_id
LEFT JOIN order_items oi 
  ON oi.user_id = v.user_id 
  AND oi.product_id = v.product_id
  AND oi.order_time BETWEEN v.behavior_time AND DATE_ADD(v.behavior_time, INTERVAL 1 DAY)
WHERE v.behavior_type = 'view'
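  AND v.behavior_time >= '2026-01-01'
  AND v.behavior_time <  '2026-01-02'  -- half-open range covers the whole day if behavior_time is a DATETIME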
  AND oi.order_id IS NULL;

-- Weekly window variant
SELECT 
  v.user_id,
  v.product_id,
  p.product_name,
  v.behavior_time AS view_date
FROM user_behavior v
JOIN product p ON v.product_id = p.product_id
LEFT JOIN order_items oi 
  ON oi.user_id = v.user_id 
  AND oi.product_id = v.product_id
  AND oi.order_time BETWEEN v.behavior_time AND DATE_ADD(v.behavior_time, INTERVAL 7 DAY)
WHERE v.behavior_type = 'view'
  AND v.behavior_time >= '2026-01-01'
  AND v.behavior_time <  '2026-01-08'  -- half-open range includes all of 2026-01-07
  AND oi.order_id IS NULL;
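
-- NOT EXISTS variant of the weekly query (a sketch, assuming the same tables
-- and columns as the queries above). DISTINCT collapses repeated views of the
-- same product on the same day, and the half-open ranges avoid ambiguity at
-- the window boundaries.
SELECT DISTINCT
  v.user_id,
  v.product_id,
  p.product_name,
  DATE(v.behavior_time) AS view_date
FROM user_behavior v
JOIN product p ON v.product_id = p.product_id
WHERE v.behavior_type = 'view'
  AND v.behavior_time >= '2026-01-01'
  AND v.behavior_time <  '2026-01-08'
  AND NOT EXISTS (
    SELECT 1
    FROM order_items oi
    WHERE oi.user_id    = v.user_id
      AND oi.product_id = v.product_id
      AND oi.order_time >= v.behavior_time
      AND oi.order_time <  DATE_ADD(v.behavior_time, INTERVAL 7 DAY)
  );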

```

## How Senior Engineers Fix It

Corrective actions implemented:

1. Replaced equality checks with date range comparisons for purchase verification
2. Implemented explicit event-sequence validation:
   - Used `EXISTS` with correlated subqueries for purchase checks
   - Created a materialized view for user-product view sessions
3. Added dataset-specific guard rails:

       -- Validating behavior-log integrity
       SELECT COUNT(*) FROM user_behavior
       WHERE behavior_type = 'view' AND product_id IS NULL; -- Must return 0

4. Introduced sliding window analysis in data pipelines:
   - Pre-computed purchase status flags for all viewed products
   - Implemented versioned product-view snapshots
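
As one way to picture the pre-computed purchase status flag, the sketch below derives a 0/1 flag per view with a correlated `EXISTS`. The CTE names, the sample rows, and the `purchased_within_7d` column are hypothetical, and the syntax assumes MySQL 8.0 or later.

```sql
-- Hypothetical sample rows standing in for user_behavior and order_items.
WITH view_events AS (
  SELECT 1 AS user_id, 100 AS product_id,
         TIMESTAMP('2026-01-01 09:00:00') AS behavior_time
  UNION ALL SELECT 1, 200, TIMESTAMP('2026-01-01 09:30:00')
  UNION ALL SELECT 2, 100, TIMESTAMP('2026-01-01 10:00:00')
),
purchase_events AS (
  SELECT 1 AS user_id, 200 AS product_id,
         TIMESTAMP('2026-01-01 09:45:00') AS order_time
  UNION ALL SELECT 2, 100, TIMESTAMP('2026-01-10 08:00:00')
)
SELECT
  v.user_id,
  v.product_id,
  v.behavior_time,
  -- 1 when a matching purchase falls in [view time, view time + 7 days), else 0
  EXISTS (
    SELECT 1
    FROM purchase_events p
    WHERE p.user_id    = v.user_id
      AND p.product_id = v.product_id
      AND p.order_time >= v.behavior_time
      AND p.order_time <  v.behavior_time + INTERVAL 7 DAY
  ) AS purchased_within_7d
FROM view_events v;
```

With these rows only user 1's view of product 200 is flagged as purchased. User 2's order arrives nine days after the view, outside the window. Storing the flag alongside each view lets downstream reports filter on it instead of repeating the join logic.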

Structural improvements:

- Added `event_uid` UUIDs for all user actions to enable exact upstream joins
- Implemented dbt models with CI checks for temporal logic
- Configured anomaly detection on view-to-purchase conversion rates

## Why Juniors Miss It

Common junior engineer pitfalls:

- **Misunderstanding sparse data**: Assuming implicit relationship between `user_