LMDB nested RW transaction

Summary

The failure came from mishandling the parent transaction handle. The code tried to start a read-write transaction (wtxn) as a child of a read-only parent: either th, passed in by the caller, was read-only, or th was NULL and the function opened its own read-only root (rtxn) to use as the parent. In both cases mdb_txn_begin returned EINVAL, because LMDB does not accept a read-only transaction as the parent of a nested transaction.

Key Findings:

  • Read-only transactions cannot have children. The code passed a read-only transaction as the parent of a read-write transaction, which LMDB rejects.
  • Ownership semantics: If th is NULL, the function creates the root transaction itself and is responsible for committing or aborting it. If th is non-NULL, it must be a valid read-write transaction.

Root Cause

mdb_txn_begin returns EINVAL when the transaction passed as the parent argument is read-only.

The provided code reaches that error through two distinct paths:

  1. Invalid Parent Type: The logic allows th to be a read-only transaction (MDB_RDONLY). When th is passed as the parent argument to the second mdb_txn_begin (which requests flags 0, i.e., read-write), LMDB returns EINVAL because read-only transactions cannot have children. Nested transactions in LMDB must all be read-write; you cannot mix read-only and read-write modes in a nested hierarchy.
  2. Handle Misalignment: If th is NULL, the code creates its own root (rtxn), but opens it with MDB_RDONLY. It then tries to start a read-write child (wtxn) inside this read-only root, which fails with the same EINVAL.

Why This Happens in Real Systems

This is a common pitfall in C APIs dealing with scoped resources and inheritance.

  • Ambiguous APIs: Developers often write wrapper functions that accept a void* parent or Txn* parent for flexibility (e.g., “start a transaction inside this one, or start a new one”). In C, read-only and read-write transactions share the same MDB_txn* type, so the compiler cannot catch an MDB_RDONLY handle being passed into a function that expects to create a child.
  • Parent-Child Constraints: Many transactional systems (LMDB, databases, graphics renderers) require parent and child to share specific attributes. In LMDB, every transaction in a nested chain must be read-write, starting from the root. The user expected polymorphic behavior (the nested operation works in both RO and RW contexts), but LMDB enforces the stricter rule.

Real-World Impact

  • Immediate Error: mdb_txn_begin fails with EINVAL at transaction start. The buggy code ignores the return value, so wtxn is left uninitialized and any later use of it is undefined behavior, typically a crash.
  • Deadlocks (Potential): If the user sidesteps EINVAL by passing NULL as the parent while th is still active, they risk a deadlock. When th is a read-write transaction it holds the environment's single writer lock, so a new root write transaction started on the same thread blocks waiting for a lock that thread already holds.
  • Data Inconsistency: If the logic is patched incorrectly (e.g., forcing a new independent transaction instead of a child), the operations inside wtxn will not see uncommitted changes made in th, and they commit or abort separately from th, breaking the “atomicity” expectation of the nested operation.

Example or Code

The user’s original code contained the logic error. Here is the incorrect implementation and the corrected version.

The Bug (Original Logic)

int nested_op_buggy(MDB_env *env, MDB_txn *th) {
    MDB_txn *rtxn, *wtxn;

    // Error 1: If th is RO, rtxn becomes RO.
    if (th) rtxn = th; 
    else mdb_txn_begin(env, NULL, MDB_RDONLY, &rtxn);

    // Error 2: Attempting to create a RW child (0) inside a RO parent (rtxn).
    // Result: EINVAL
    mdb_txn_begin(env, rtxn, 0, &wtxn); 
    return 0;
}
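
The two paths through nested_op_buggy can be traced with a small stand-alone model. This is only a sketch: parent_kind, model_begin_rw and model_nested_op_buggy are hypothetical names invented for illustration, the model deliberately avoids calling LMDB so that it compiles on its own, and it encodes just the one rule discussed here (a read-only parent yields EINVAL), not the rest of LMDB's checks.

```c
#include <errno.h>

/* Hypothetical model types; they stand in for real MDB_txn handles. */
enum parent_kind { PARENT_NONE, PARENT_RO, PARENT_RW };

/* Models only the rule discussed here: beginning a read-write transaction
 * fails with EINVAL when its parent is read-only. */
int model_begin_rw(enum parent_kind parent)
{
    return parent == PARENT_RO ? EINVAL : 0;
}

/* Mirrors the control flow of nested_op_buggy. th_kind describes the handle
 * the caller passed in; PARENT_NONE means th == NULL. */
int model_nested_op_buggy(enum parent_kind th_kind)
{
    /* if (th) rtxn = th; else open an MDB_RDONLY root */
    enum parent_kind rtxn_kind =
        (th_kind == PARENT_NONE) ? PARENT_RO : th_kind;

    /* mdb_txn_begin(env, rtxn, 0, &wtxn) */
    return model_begin_rw(rtxn_kind);
}
```

Under this model the function succeeds only when the caller happens to pass a read-write th; both the NULL path and the read-only path end in EINVAL.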

The Fix (Correct Implementation)

The logic must check if a parent exists and if it is capable of having children. If the existing th is read-only, it cannot be used as a parent for a write operation; a new root must be created, or the function must reject the call.

int nested_op_fixed(MDB_env *env, MDB_txn *th) {
    MDB_txn *wtxn;
    int rc;

    if (th) {
        // LMDB's public API has no call such as 'mdb_txn_flags()' to read a
        // transaction's flags, so this function cannot verify that 'th' is
        // read-write. The caller must guarantee it.
        //
        // NOTE: The API design is flawed if 'th' can be RO. A read-only
        // 'th' cannot be nested into; mdb_txn_begin returns EINVAL, which
        // is propagated to the caller here.

        // Start nested transaction
        rc = mdb_txn_begin(env, th, 0, &wtxn);
        if (rc) return rc;
    } else {
        // No parent: Start a fresh RW root transaction
        rc = mdb_txn_begin(env, NULL, 0, &wtxn);
        if (rc) return rc;
    }

    // Perform operations with wtxn...
    // On failure, call mdb_txn_abort(wtxn) and return the error code.

    // Committing a nested transaction merges its changes into 'th'; they
    // become durable only when 'th' itself commits.
    return mdb_txn_commit(wtxn);
}

How Senior Engineers Fix It

Senior engineers solve this by enforcing transaction capability checks before attempting nesting.

  1. Explicit Flags: Do not rely on implicit defaults. If a function needs to write, it must validate that the parent transaction is capable of writing.
  2. Wrapper Abstraction: Create a wrapper txn_begin_nested that hides the logic. It accepts a generic MDB_txn*.
  3. Defensive Logic:
    • If parent is NULL: Start a new RW transaction.
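    • If parent is provided: Determine the parent’s mode. LMDB’s public API has no call that returns a transaction’s flags, so the wrapper has to record the mode when the transaction is created. If the parent is RO, either fail the request or create a new independent transaction (depending on business requirements), but never attempt to pass the RO handle to mdb_txn_begin with 0 flags.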

The Fix Logic:
“Is th valid and RW? Yes -> nest. No -> Fail or Start New.”
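
One way to sketch that decision is to keep the mode next to the handle and decide before LMDB is called at all. Everything named here (txn_mode, tracked_txn, nest_plan, plan_write_txn, plan_to_errno) is hypothetical and not part of LMDB; the handle is kept as a void * so the sketch compiles without lmdb.h, and the comments indicate which mdb_txn_begin call each outcome would map to in real code. This variant chooses to reject a read-only parent; starting an independent transaction instead is the other option mentioned above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical wrapper types; none of these names come from LMDB. */
enum txn_mode { TXN_MODE_RO, TXN_MODE_RW };

struct tracked_txn {
    void *handle;        /* would be an MDB_txn * in real code */
    enum txn_mode mode;  /* recorded when the transaction was begun */
};

enum nest_plan { NEST_NEW_ROOT, NEST_CHILD, NEST_REJECT };

/* Decide how a write operation should obtain its transaction. */
enum nest_plan plan_write_txn(const struct tracked_txn *parent)
{
    if (parent == NULL || parent->handle == NULL)
        return NEST_NEW_ROOT;  /* mdb_txn_begin(env, NULL, 0, &txn) */
    if (parent->mode == TXN_MODE_RO)
        return NEST_REJECT;    /* never hand a read-only parent to LMDB */
    return NEST_CHILD;         /* mdb_txn_begin(env, parent->handle, 0, &txn) */
}

/* Collapse the plan to an errno-style result for callers that only need
 * to know whether to proceed. */
int plan_to_errno(enum nest_plan plan)
{
    return plan == NEST_REJECT ? EINVAL : 0;
}
```

Rejecting up front keeps the failure in the wrapper, where the error message can say that the parent is read-only, instead of surfacing later as a bare EINVAL from the library.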

Why Juniors Miss It

  • Misinterpretation of “Nested”: Juniors often read “read-write transactions may be nested” but miss the implicit requirement that the root must be read-write. They assume any transaction can be a parent.
  • Ignoring the EINVAL Spec: EINVAL is a generic error. Juniors often look for memory corruption or NULL pointers first, rather than checking the semantics of the arguments (e.g., “Is the parent actually allowed to have children?”).
  • Lazy State Checking: They assume the MDB_txn* handle passed in is “ready to go” without verifying its read/write status, leading to invalid API calls that violate the library’s internal constraints.