Multi-Tenant application with Postgres row level security using NestJs and TypeOrm

Summary

This incident analyzes a misinterpretation of how multi‑tenant Postgres RLS and connection management work in a NestJS + TypeORM SaaS architecture. The core confusion: believing that each user session requires its own database connection, leading to fears of millions of connections. The real issue is not RLS itself, but incorrect scoping of the Postgres session state and misuse of AsyncLocalStorage.

Root Cause

The underlying problem stems from treating a Postgres connection as a per‑user resource instead of a per‑tenant or per‑request resource.

Key root causes:

Misreading AsyncLocalStorage as a connection pool rather than a request‑scoped context store.
Assuming RLS requires one connection per user, which is false.
Confusing “data source instance” with “database connection” — TypeORM DataSource ≠ a single TCP connection.
Not understanding that Postgres session variables (e.g., SET app.current_tenant = ...) are what drive RLS, not the connection identity.

Why This Happens in Real Systems

This misunderstanding is extremely common because:

Articles often simplify the explanation and developers assume “one DataSource = one connection”.
Node.js developers are used to stateless HTTP, so the idea of session‑bound RLS policies feels foreign.
AsyncLocalStorage looks like a per‑user store, but in reality it is per‑request, not per‑user.
Multi‑tenant SaaS patterns vary widely, and examples online often mix concepts.

Real-World Impact

If implemented incorrectly, the system can suffer from:

Connection exhaustion
- Postgres cannot handle millions of concurrent connections.
- Even thousands of connections require a connection pooler like PgBouncer.
Memory pressure
- Each connection consumes backend memory.
Latency spikes
- Excessive connection churn leads to slow queries and timeouts.
Broken RLS enforcement
- If tenant context is not set correctly per request, users may see the wrong tenant’s data.
Operational instability
- Connection storms during traffic spikes can crash the DB.

Example or Code (if necessary and relevant)

Below is a correct pattern: one shared DataSource, but tenant context is set per request using Postgres session variables.

// Set tenant for this request
await dataSource.query(`SET app.current_tenant = $1`, [tenantId]);

// All queries now obey RLS for this tenant
return dataSource.getRepository(User).find();

This uses one connection from the pool, not one per user.

How Senior Engineers Fix It

Experienced engineers avoid per‑user connections entirely. They use:

One global DataSource per service instance
A connection pool (10–50 connections), not millions
Per‑request tenant scoping
- SET app.current_tenant = ...
- Or SET LOCAL role = ...
Postgres Row Level Security policies referencing the session variable
PgBouncer in transaction pooling mode for large-scale SaaS
AsyncLocalStorage only to store tenantId, not the DataSource

Senior engineers know that:

RLS is evaluated per query, not per connection owner.
The connection pool is shared across all tenants and users.
The tenant context is lightweight and safe to set per request.

Why Juniors Miss It

Common reasons junior engineers struggle:

They assume DataSource = connection, which is incorrect.
They think RLS is tied to the user identity of the connection, but it’s tied to session variables.
They misunderstand AsyncLocalStorage, believing it stores long‑lived user state.
They try to solve multi‑tenancy at the connection layer, instead of the session layer.
They overestimate the cost of setting session variables, which is negligible.

Juniors often focus on “how do I isolate tenants?” while seniors focus on “how do I scale Postgres safely?”