“Aggregate functions are not allowed in recursive common table expression” – but actually, they are?

Summary

Recursive Common Table Expressions (CTEs) in SQL Server prohibit aggregate functions (MAX, SUM), TOP, or GROUP BY in the recursive term (the part referencing the CTE). However, a workaround using UNION ALL with an empty set (e.g., WHERE 1=0) unexpectedly bypasses this restriction. This post explores why the restriction exists, why the workaround works, and how to address this safely.
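
For reference, here is a minimal sketch of the form that SQL Server rejects. The table variable @Structure, its column types, and its rows are hypothetical stand-ins for the Structure table used later in this post. Compiling this batch should fail with the error quoted in the title, because GROUP BY and MAX sit directly in the recursive member.

```sql
-- Hypothetical sample data; column types are assumptions.
DECLARE @Structure TABLE (
    Parent  varchar(10) NOT NULL,
    Child   varchar(10) NOT NULL,
    Version int         NOT NULL
);

INSERT INTO @Structure (Parent, Child, Version)
VALUES ('Q', 'A', 1), ('A', 'B', 1), ('A', 'C', 2);

WITH Walk (Parent, Child, Version) AS (
    -- Anchor member: a plain filter, always allowed
    SELECT Parent, Child, Version
    FROM @Structure
    WHERE Parent = 'Q'

    UNION ALL

    -- Recursive member: GROUP BY / MAX here is what the engine rejects
    SELECT s.Parent, s.Child, MAX(s.Version)
    FROM @Structure AS s
    INNER JOIN Walk AS w ON w.Child = s.Parent
    GROUP BY s.Parent, s.Child
)
SELECT Parent, Child, Version FROM Walk;
```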

Root Cause

SQL Server’s query parser uses syntactic validation to enforce recursion rules. The recursive term must adhere to deterministic recursion patterns:

  • Syntax validation is conservative: The parser rejects subqueries with aggregates in recursive terms without deep semantic analysis.
  • The UNION ALL workaround splits the recursive term into two logical parts:
    1. A query with aggregates (invalid if standalone).
    2. An empty set query (WHERE 1=0).
  • The parser validates only the combined structure of the UNION ALL branch. Since one branch is valid (empty set), the entire UNION ALL is accepted—even though the aggregate branch would fail syntax checks alone.
  • Execution is unaffected by the extra branch: At runtime, the empty set branch yields no rows, so the result comes entirely from the aggregate branch.

Why This Happens in Real Systems

Aggregates in recursion violate SQL Server’s guarantee of step-wise recursion safety:

  1. Recursive queries process row-by-row; aggregates require full-set analysis (contradicting incremental recursion).
  2. Parallel execution nuances might cause non-deterministic results if aggregates appear in recursive terms.
  3. Database engines optimize CTEs using deterministic patterns; arbitrary aggregates prevent reliable optimizations.
  4. The UNION ALL workaround exploits parser leniency toward set operations.
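
One way to observe the step-wise behaviour from point 1 is a window function, which SQL Server does permit in the recursive member. Microsoft's documentation notes that analytic and aggregate functions in the recursive part apply to the current recursion level rather than to the whole CTE. The sketch below uses a hypothetical table variable and hypothetical rows. Compare the RowNo values with what a single ROW_NUMBER over the full result would give.

```sql
-- Hypothetical sample data; column types are assumptions.
DECLARE @Structure TABLE (
    Parent varchar(10) NOT NULL,
    Child  varchar(10) NOT NULL
);

INSERT INTO @Structure (Parent, Child)
VALUES ('Q', 'A'), ('Q', 'B'), ('A', 'C'), ('B', 'D'), ('B', 'E');

WITH Walk (Article, Depth, RowNo) AS (
    -- Anchor member: the root article; bigint matches ROW_NUMBER's return type
    SELECT DISTINCT s.Parent, 0, CAST(1 AS bigint)
    FROM @Structure AS s
    WHERE s.Parent = 'Q'

    UNION ALL

    -- Recursive member: ROW_NUMBER only sees the rows of the current step,
    -- not the full set accumulated by the CTE so far
    SELECT s.Child,
           w.Depth + 1,
           ROW_NUMBER() OVER (ORDER BY s.Child)
    FROM @Structure AS s
    INNER JOIN Walk AS w ON w.Article = s.Parent
)
SELECT Article, Depth, RowNo
FROM Walk
ORDER BY Depth, Article;
```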

Real-World Impact

Failure to resolve this causes:

  • Query errors: The statement is rejected at compile time, blocking valid traversal logic for hierarchies (e.g., org charts).
  • Incorrect data: Without the workaround, forced restructuring might return wrong versions.
  • Performance debt: Engineers might resort to cursor-based solutions, crippling scalability.

Example or Code

WITH my_structure(root, level, path, article) AS (
    -- Anchor member
    SELECT 
        Parent AS root,
        0 AS level,
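        CAST('root' AS varchar(4000)) AS path,  -- explicit type so anchor and recursive member match (length is an assumption)
        Parent AS article
    FROM Structure
    WHERE Parent = 'Q' AND Version = 1

    UNION ALL

    -- Recursive member using UNION ALL workaround
    SELECT 
        o.root,
        o.level + 1,
        CAST(CONCAT(o.path, '/', e.Child) AS varchar(4000)),  -- same type as the anchor's path column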
        e.Child
    FROM (
        -- Original logic with MAX (normally invalid)
        SELECT Parent, Version, Child
        FROM Structure a
        WHERE a.Version = (
            SELECT MAX(aa.Version) 
            FROM Structure aa 
            WHERE aa.Parent = a.Parent
        )

        UNION ALL  

        -- Empty set branch bypasses validation
        SELECT Parent, Version, Child 
        FROM Structure a 
        WHERE 1 = 0
    ) e
    INNER JOIN my_structure o ON o.article = e.Parent
)
SELECT * FROM my_structure;
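
To try the query, a table along the following lines is enough. The schema and rows are hypothetical: the post does not show the real definition of Structure, so the column types here are assumptions.

```sql
-- Hypothetical schema and rows for experimenting with the query; types are assumptions.
CREATE TABLE Structure (
    Parent  varchar(10) NOT NULL,
    Child   varchar(10) NOT NULL,
    Version int         NOT NULL
);

INSERT INTO Structure (Parent, Child, Version)
VALUES ('Q', 'A', 1),   -- anchor row: Q at version 1
       ('A', 'B', 1),   -- superseded: A has a newer version
       ('A', 'C', 2),   -- latest version of A
       ('C', 'D', 1);   -- only version of C
```

With these rows, the MAX(Version) filter keeps ('A', 'C', 2) and drops ('A', 'B', 1), so the traversal should follow Q, A, C, D and never reach B.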

How Senior Engineers Fix It

Prioritize correctness and transparency:

  • Decouple recursion and aggregation: Pre-filter versioned data in a non-recursive CTE:
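    WITH LatestVersions AS (
          -- Keep only the rows that belong to each parent's latest version
          SELECT s.Parent, s.Child, s.Version
          FROM Structure s
          WHERE s.Version = (
              SELECT MAX(x.Version)  -- Aggregates allowed here
              FROM Structure x
              WHERE x.Parent = s.Parent
          )
      ),
      RecursiveCTE AS (...)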
  • Use staging tables: Persist latest versions before recursion.
  • Avoid undocumented workarounds: The UNION ALL trick may break in future updates.
  • Validate edge cases: Test multi-level versions rigorously.
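
The staging-table option can be sketched as follows. This is one possible shape, not a definitive implementation. The temp table names #Structure and #LatestStructure, the column types, and the sample rows are all hypothetical. The anchor starts from 'Q' at its latest version, which is an assumption that differs slightly from the Version = 1 filter in the example above.

```sql
-- Hypothetical sample data; in practice the source table already exists.
CREATE TABLE #Structure (
    Parent  varchar(10) NOT NULL,
    Child   varchar(10) NOT NULL,
    Version int         NOT NULL
);

INSERT INTO #Structure (Parent, Child, Version)
VALUES ('Q', 'A', 1), ('A', 'B', 1), ('A', 'C', 2), ('C', 'D', 1);

-- Step 1: persist only the rows of each parent's latest version.
-- The aggregate runs here, outside any recursive CTE.
SELECT s.Parent, s.Child, s.Version
INTO #LatestStructure
FROM #Structure AS s
WHERE s.Version = (
    SELECT MAX(x.Version)
    FROM #Structure AS x
    WHERE x.Parent = s.Parent
);

-- Step 2: plain recursion over the staged rows, with no aggregate in the recursive member.
WITH Walk (Root, Depth, Article) AS (
    SELECT DISTINCT l.Parent, 0, l.Parent
    FROM #LatestStructure AS l
    WHERE l.Parent = 'Q'

    UNION ALL

    SELECT w.Root, w.Depth + 1, l.Child
    FROM #LatestStructure AS l
    INNER JOIN Walk AS w ON w.Article = l.Parent
)
SELECT Root, Depth, Article
FROM Walk
ORDER BY Depth, Article;

DROP TABLE #LatestStructure;
DROP TABLE #Structure;
```

Staging costs an extra write, but it keeps the recursive member within the documented rules and lets the staged table be indexed on Parent if the hierarchy is large.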

Why Juniors Miss It

  1. Syntax-first mindset: Juniors focus on “make it compile,” not database execution models.
  2. Underestimating recursion rules: Inn