Flask API pagination and filtering work, but the totalRecords count seems incorrect

Summary

This incident centers on a Flask REST API implementing filtering + pagination, where the totalRecords value appears unreliable. Although the API returns correct filtered results, the logic used to compute the total count introduces subtle correctness and maintainability issues. This postmortem explains why this happens, how real systems typically solve it, and why junior engineers often overlook the deeper pitfalls.

Root Cause

The root cause is the use of:

  • String replacement (query.replace("SELECT *", "SELECT COUNT(*)")) to generate the count query
  • Reusing the same SQL string for both data retrieval and counting
  • LIMIT/OFFSET or other clauses surviving into the count query, capping or shifting the reported total
  • Implicit assumptions that the SELECT structure is always stable

The logic works in simple cases but is fragile and can break when:

  • Additional columns are selected
  • Joins are added
  • The SELECT clause changes shape
  • Query optimizations reorder or rewrite SQL internally
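The fragility above is easy to demonstrate. In this minimal sketch (the query string is hypothetical), the SELECT clause has been changed from `SELECT *` to explicit columns, so the string replacement silently does nothing and the "count query" still carries the column list and the LIMIT/OFFSET:

```python
# Hypothetical query after a routine refactor from SELECT * to explicit columns
query = "SELECT id, baseScore FROM cves WHERE baseScore >= ? LIMIT ? OFFSET ?"

# The replacement finds no "SELECT *" to replace, so the query is unchanged:
# it still selects columns and still includes LIMIT/OFFSET.
count_query = query.replace("SELECT *", "SELECT COUNT(*)")
print(count_query == query)  # True: no count query was produced at all
```

Nothing fails loudly here; the endpoint keeps returning a "count" that is really a paginated data query.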

Why This Happens in Real Systems

Real-world systems frequently run into this because:

  • Pagination requires two separate queries:
    • One for the filtered count
    • One for the paginated results
  • Engineers often try to reuse the same SQL string to avoid duplication
  • String manipulation of SQL is error‑prone, especially as queries evolve
  • SQLite and other databases may optimize COUNT queries differently, causing mismatches

Common triggers include:

  • Adding JOINs or GROUP BY
  • Adding computed columns
  • Adding ORDER BY clauses
  • Changing SELECT * to explicit columns
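The JOIN trigger in particular changes the meaning of COUNT(*), not just the string. A small sketch with an assumed two-table schema (table and column names are illustrative) shows the joined row count diverging from the number of distinct CVEs:

```python
import sqlite3

# Assumed schema: one CVE with two reference rows. Joining duplicates the
# CVE row, so COUNT(*) over the join no longer counts CVEs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cves (id TEXT PRIMARY KEY);
    CREATE TABLE refs (cve_id TEXT, url TEXT);
    INSERT INTO cves VALUES ('CVE-2024-0001');
    INSERT INTO refs VALUES ('CVE-2024-0001', 'a'), ('CVE-2024-0001', 'b');
""")

joined = conn.execute(
    "SELECT COUNT(*) FROM cves JOIN refs ON refs.cve_id = cves.id"
).fetchone()[0]
distinct = conn.execute(
    "SELECT COUNT(DISTINCT cves.id) FROM cves JOIN refs ON refs.cve_id = cves.id"
).fetchone()[0]
print(joined, distinct)  # 2 1
```

A count query built by naive string surgery would report 2 here, while the UI is paginating over 1 logical record.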

Real-World Impact

Incorrect total record counts cause:

  • Broken pagination UI
  • Infinite scrolling loops
  • Incorrect “last page” calculations
  • Frontend bugs that appear nondeterministic
  • Misleading analytics or reporting

In production, this often leads to:

  • Users seeing empty pages
  • API clients retrying unnecessarily
  • Confusion during debugging because the data looks correct but metadata does not
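The "last page" failure mode follows directly from the arithmetic. With hypothetical numbers, an inflated total makes the client compute pages that contain no data:

```python
import math

per_page = 20
real_total = 95        # actual number of filtered rows (hypothetical)
reported_total = 140   # inflated total from a broken count query (hypothetical)

# The client computes the last page from the reported total ...
last_page = math.ceil(reported_total / per_page)       # 7
# ... but the data runs out two pages earlier.
real_last_page = math.ceil(real_total / per_page)      # 5
print(last_page, real_last_page)  # pages 6 and 7 render empty
```

This is why the bug looks nondeterministic from the frontend: every page of data is individually correct, and only the pages past the real end expose the bad metadata.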

Example

Below is the correct pattern: build the WHERE clause once, then use it in two separate queries.

from datetime import datetime, timedelta

# Build the shared WHERE clause once, with parameterized values
base_query = "FROM cves WHERE 1=1"
params = []

if cve_id_filter:
    base_query += " AND id = ?"
    params.append(cve_id_filter)

if year_filter:
    base_query += " AND id LIKE ?"
    params.append(f"CVE-{year_filter}-%")

if score_filter:
    base_query += " AND baseScore >= ?"
    params.append(float(score_filter))

if days_filter:
    date_n_days_ago = (datetime.now() - timedelta(days=int(days_filter))).isoformat()
    base_query += " AND lastModified >= ?"
    params.append(date_n_days_ago)

# Count query: same WHERE logic, no LIMIT/OFFSET
count_query = f"SELECT COUNT(*) {base_query}"

# Paginated query: bind params + [per_page, offset] at execution time
data_query = f"SELECT * {base_query} LIMIT ? OFFSET ?"

This avoids string replacement and ensures both queries share the same WHERE logic.
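A runnable sketch of the same two-query pattern, using an in-memory SQLite table (the sample rows and the single score filter are assumptions for illustration):

```python
import sqlite3

# Assumed minimal cves table with sample data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cves (id TEXT, baseScore REAL, lastModified TEXT)")
conn.executemany("INSERT INTO cves VALUES (?, ?, ?)", [
    ("CVE-2024-0001", 9.8, "2024-01-01"),
    ("CVE-2024-0002", 5.0, "2024-01-02"),
    ("CVE-2024-0003", 7.5, "2024-01-03"),
])

# Shared WHERE fragment, built once
base_query = "FROM cves WHERE 1=1 AND baseScore >= ?"
params = [7.0]
page, per_page = 1, 2

# Two independent queries over the same WHERE logic
total = conn.execute(f"SELECT COUNT(*) {base_query}", params).fetchone()[0]
rows = conn.execute(
    f"SELECT * {base_query} LIMIT ? OFFSET ?",
    params + [per_page, (page - 1) * per_page],
).fetchall()
print(total, len(rows))  # 2 2
```

Note that only the data query receives the pagination parameters; the count query sees exactly the filter parameters and nothing else.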

How Senior Engineers Fix It

Senior engineers typically apply these patterns:

  • Separate the WHERE clause from the SELECT clause
  • Build SQL using composable fragments, not string replacement
  • Use parameterized queries consistently
  • Run two independent queries:
    • SELECT COUNT(*) FROM ... WHERE ...
    • SELECT columns FROM ... WHERE ... LIMIT ... OFFSET ...
  • Avoid SELECT * in APIs, because it makes queries brittle
  • Introduce query builders or ORM layers when complexity grows

They also ensure:

  • COUNT queries never include LIMIT/OFFSET
  • COUNT queries never include ORDER BY unless required
  • Query logic is unit-tested with multiple filter combinations
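The unit-testing advice can be sketched with a hypothetical helper that mirrors the filter-building logic (the function name and signature are illustrative, not from the original code):

```python
# Hypothetical filter builder, factored out so it can be tested directly
def build_filters(cve_id=None, min_score=None):
    clause, params = "FROM cves WHERE 1=1", []
    if cve_id:
        clause += " AND id = ?"
        params.append(cve_id)
    if min_score is not None:
        clause += " AND baseScore >= ?"
        params.append(min_score)
    return clause, params

# Exercise no filters, a single filter, and a combination
assert build_filters() == ("FROM cves WHERE 1=1", [])
assert build_filters(cve_id="CVE-2024-0001")[1] == ["CVE-2024-0001"]
clause, params = build_filters(cve_id="CVE-2024-0001", min_score=7.0)
assert clause.count("AND") == 2 and params == ["CVE-2024-0001", 7.0]
```

Because the WHERE fragment is a pure function of the filters, both the count query and the data query can be tested against the same expected clause and parameter list.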

Why Juniors Miss It

Junior engineers often miss this because:

  • The code “works” at first glance, so deeper issues go unnoticed
  • They assume SELECT * and SELECT COUNT(*) are interchangeable
  • They underestimate how quickly SQL queries evolve in real systems
  • They rely on string manipulation instead of structured query building
  • They don’t yet recognize how pagination interacts with query planning
  • They rarely test edge cases like:
    • Empty result sets
    • Large datasets
    • Multiple filters combined
    • Filters that change query shape

The result is a solution that appears correct but is not robust.
