Summary
A common concurrency issue arises when processing inventory allocations where rows must be selected, accumulated until a threshold is met, and then updated. In the provided approach, SELECT ... FOR UPDATE locks all rows matching a condition (e.g., specific stock code) within a transaction. This forces Hibernate to load entire datasets into memory, leading to excessive locking, high heap usage, and poor throughput. The optimal pattern combines PostgreSQL’s SKIP LOCKED and LIMIT with Hibernate Bulk Operations or Partial State Management to lock only strictly necessary rows and avoid loading the full context.
Root Cause
The immediate cause is the combination of naive locking strategies and ORM hydration behaviors:
- Global Locking: Using `SELECT * FROM stock WHERE code = :code FOR UPDATE` without a restrictive `ORDER BY` or `LIMIT` acquires exclusive locks on every row returned. This blocks other transactions attempting to allocate the same stock items, serializing the operation unnecessarily.
- Full Context Loading: When the repository method returns a `List<Stock>`, Hibernate attempts to manage the lifecycle of every entity instance. Even if only 10 rows are needed to satisfy the amount, the persistence context holds references to all 1,000 loaded rows, causing GC pressure and `OutOfMemoryError` in high-volume scenarios.
- Dirty Checking Overhead: If the transaction modifies only a few rows but the context contains thousands, Hibernate's dirty-checking mechanism still scans the entire set during flush, wasting CPU cycles.
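In Spring Data JPA terms, the anti-pattern typically looks like the sketch below. This is illustrative only (not runnable standalone); `StockRepository`, `AllocationService`, `Stock`, and its fields are hypothetical names, not taken from the original code.

```java
import jakarta.persistence.LockModeType;
import java.util.List;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Lock;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

public interface StockRepository extends JpaRepository<Stock, Long> {

    // Problem: pessimistic lock with no LIMIT -- locks AND hydrates every matching row.
    @Lock(LockModeType.PESSIMISTIC_WRITE)
    List<Stock> findByCode(String code);
}

@Service
class AllocationService {

    private final StockRepository stockRepository;

    AllocationService(StockRepository stockRepository) {
        this.stockRepository = stockRepository;
    }

    @Transactional
    public void allocate(String code, int needed) {
        List<Stock> all = stockRepository.findByCode(code); // e.g. 1,000 managed entities
        int accumulated = 0;
        for (Stock s : all) {                // Java-side accumulation
            if (accumulated >= needed) break;
            accumulated += s.getQty();
            s.setQty(0);                     // only a handful of rows actually change
        }
        // At flush, dirty checking scans all 1,000 entities;
        // the ~997 untouched rows stay locked until commit.
    }
}
```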
Why This Happens in Real Systems
This anti-pattern usually evolves from simplified logic during the MVP phase:
- Simplistic Transaction Safety: Developers assume they need to lock the entire “bucket” (all rows with same code) to ensure consistency, overlooking that row-level locking is sufficient if processed correctly.
- “It works on small data”: With small inventory counts, loading all rows is fast enough. The bottleneck appears only when the “cumulative condition” requires scanning many rows to find enough stock.
- Over-reliance on ORM convenience: Developers prefer iterating over a managed `List<Stock>` to writing raw SQL or managing native queries, unaware of the massive performance penalty in "read-heavy, write-light" allocation loops.
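The "cumulative condition" itself is trivial application logic, which is why the naive version feels safe on small data. A minimal, hypothetical sketch (independent of any ORM; `StockRow` and `selectUntilSatisfied` are illustrative names) shows that satisfying the threshold only ever needs the first few rows in id order:

```java
import java.util.ArrayList;
import java.util.List;

public class Allocation {

    /** Minimal row model: id plus available quantity. Hypothetical. */
    record StockRow(long id, int qty) {}

    /**
     * Walks rows in id order, accumulating qty until `needed` is met.
     * Returns the ids of the rows that must be locked/updated,
     * or an empty list if the rows cannot satisfy the demand.
     */
    static List<Long> selectUntilSatisfied(List<StockRow> rows, int needed) {
        List<Long> picked = new ArrayList<>();
        int accumulated = 0;
        for (StockRow row : rows) {
            if (accumulated >= needed) break;
            picked.add(row.id());
            accumulated += row.qty();
        }
        return accumulated >= needed ? picked : List.of();
    }
}
```

Because only the first rows in sort order matter, fetching (and locking) the whole bucket buys nothing: the database can apply exactly this cut-off with `ORDER BY` and `LIMIT`.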
Real-World Impact
- Transaction Timeouts: Long-running scans to accumulate quantities cause `lock_timeout` or `statement_timeout` errors.
- Deadlocks: Without a consistent `ORDER BY`, concurrent transactions scanning the same set of rows can deadlock (A locks row 1, B locks row 2, A waits for row 2, B waits for row 1).
- Throughput Bottleneck: The database becomes contended; every allocation request waits for the previous one to release its locks on the `stock` table, effectively serializing the application.
- Resource Exhaustion: The Java heap fills up with hundreds of thousands of managed entities, triggering frequent garbage-collection pauses.
Example or Code
Here is the problematic approach versus the optimized solution. Note that for the optimized solution, we rely on SQL to handle the accumulation logic, avoiding the need to load every row into Java memory.
```sql
-- Problematic Approach (Locks everything, loads everything)
BEGIN;
SELECT * FROM stock WHERE code = 'ABC' FOR UPDATE; -- Locks 1000 rows
-- App calculates in memory, updates 3 rows
-- 997 rows remain locked unnecessarily until commit
COMMIT;
```
```sql
-- Optimized Approach (Database-side accumulation)
-- A CTE (Common Table Expression) finds the specific rows required.
-- Locks are acquired ONLY on the rows we intend to update, via SKIP LOCKED.
WITH batch AS (
    SELECT id, qty
    FROM stock
    WHERE code = 'ABC'
      AND qty > 0
    ORDER BY id               -- CRITICAL for consistent lock ordering
    LIMIT 5                   -- Fetch only as many rows as needed
    FOR UPDATE SKIP LOCKED    -- Lock only available rows; skip already-locked ones
),
accumulation AS (
    SELECT
        id,
        qty,
        SUM(qty) OVER (ORDER BY id)       AS running_total,   -- cumulative qty in id order
        (SELECT SUM(qty) FROM batch)      AS total_available
    FROM batch
)
SELECT * FROM accumulation;
```
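To actually consume the stock, the locked batch can be updated in the same statement. The sketch below assumes each picked row is drained completely (`qty = 0`); real allocation logic would typically decrement the last row only partially, which is an application decision, not something the source prescribes.

```sql
WITH batch AS (
    SELECT id
    FROM stock
    WHERE code = 'ABC'
      AND qty > 0
    ORDER BY id
    LIMIT 5
    FOR UPDATE SKIP LOCKED
)
UPDATE stock s
SET qty = 0                  -- simplification: drains each picked row completely
FROM batch b
WHERE s.id = b.id
RETURNING s.id;              -- report which rows were actually allocated
```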
How Senior Engineers Fix It
Senior engineers decouple locking from hydration and push logic to the database.
- Adopt `LIMIT` and `SKIP LOCKED`:
  - `LIMIT`: Ensures we only fetch and lock the minimum number of rows required to satisfy the cumulative condition.
  - `SKIP LOCKED`: Allows concurrent transactions to "skip" rows currently locked by another transaction and process the remaining available stock, maximizing concurrency.
- Use JPQL/Criteria Bulk Updates or Native SQL:
  - Instead of `SELECT` then `UPDATE`, use a native query to perform the selection and locking in one go, returning only the IDs of the locked rows.
  - If the calculation logic is complex, fetch only the IDs and quantities (POJO projection) rather than full managed entities.
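In Spring Data JPA terms, the "select and lock in one go" query can be sketched as follows. This is illustrative only (not runnable standalone); the repository, projection, and parameter names are hypothetical.

```java
import java.util.List;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;

// Lightweight interface projection: Spring Data maps columns to getters by name,
// so no Stock entities ever enter the persistence context.
interface StockSlice {
    Long getId();
    Integer getQty();
}

public interface StockRepository extends JpaRepository<Stock, Long> {

    // Locks and returns only the batch, in one round trip.
    @Query(value = """
            SELECT id, qty
            FROM stock
            WHERE code = :code AND qty > 0
            ORDER BY id
            LIMIT :batchSize
            FOR UPDATE SKIP LOCKED
            """, nativeQuery = true)
    List<StockSlice> lockNextBatch(@Param("code") String code,
                                   @Param("batchSize") int batchSize);
}
```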
- Partial Loading Strategy:
  - If Java-side logic is unavoidable, fetch the IDs first using a lightweight query.
  - Use `em.find()` or a specific `WHERE id IN (...)` query to load only the specific rows that will be updated.
  - Detach other entities immediately if they were loaded by mistake, to prevent dirty checking.
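A sketch of the partial loading strategy with plain JPA. Illustrative only: it assumes an injected `EntityManager` (`em`) and a mapped `Stock` entity, and would sit inside a transactional method.

```java
import jakarta.persistence.EntityManager;
import jakarta.persistence.LockModeType;
import java.util.List;

// Step 1: lightweight id query -- nothing becomes managed yet.
List<Long> ids = em.createQuery(
        "SELECT s.id FROM Stock s WHERE s.code = :code AND s.qty > 0 ORDER BY s.id",
        Long.class)
    .setParameter("code", "ABC")
    .setMaxResults(5)
    .getResultList();

// Step 2: load ONLY the rows we will update, with a pessimistic lock.
List<Stock> toUpdate = em.createQuery(
        "SELECT s FROM Stock s WHERE s.id IN :ids", Stock.class)
    .setParameter("ids", ids)
    .setLockMode(LockModeType.PESSIMISTIC_WRITE)
    .getResultList();

// Step 3: if anything else slipped into the persistence context,
// detach it so flush-time dirty checking scans only the rows we change:
// em.detach(accidentallyLoadedEntity);
```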
Why Juniors Miss It
- Conceptual Gap: Juniors often view the database as a “bag of objects” and Java as the processing engine. They fail to realize that databases are set-based processors that are much faster at accumulation and filtering than Java loops.
- "Magic" of ORMs: They trust Hibernate to "do the right thing" with `saveAll()`, not realizing that `saveAll()` on 1,000 entities implies 1,000 `UPDATE` statements (or a bulk update only if batching is configured specifically) and heavy memory usage.
- Locking Granularity: They understand the need for locking (safety) but not the cost of locking (contention), missing the nuance of `SKIP LOCKED`, which is specifically designed for queue-like or resource-allocation problems.