DatabaseError(1020, "1020 (HY000): Record has changed since last read in table 'db_control'")

Summary

A Python script using MariaDB began failing with DatabaseError(1020, "Record has changed since last read in table 'db_control'") after an upgrade to MariaDB 11.8.3 on Raspberry Pi OS Trixie. The root cause was identified as the default enabling of innodb_snapshot_isolation in this MariaDB version. This change alters how transaction isolation handles concurrent modifications, causing a failure in a workflow that previously worked. The error specifically triggers during a “read-modify-write” sequence within a transaction, where another session modifies the target record after the initial read but before the write.

Root Cause

The primary driver is the configuration change in MariaDB 11.8.3, where innodb_snapshot_isolation is now ON by default.

  • Snapshot Isolation Behavior: With snapshot isolation active, a transaction operates on a consistent “snapshot” of the database taken at the start of the transaction. Under plain REPEATABLE READ, an UPDATE silently operates on the latest committed row version, even if it differs from the snapshot. With innodb_snapshot_isolation=ON, MariaDB instead rejects a write to a row that has changed since the snapshot, surfacing the conflict as an error rather than allowing anomalies such as “write skew” to slip through.
  • Optimistic Locking Failure: The specific error 1020 (HY000): Record has changed since last read indicates an optimistic locking check performed by the storage engine. When the script performs a write to the specific record it previously read, the engine checks if that record has been modified by another transaction in the meantime (based on the state seen at the transaction start).
  • Transaction Scope: The issue arises because the transaction remains open for the duration of the multi-read process, holding the snapshot. The single write at the end fails because the record version in the live database has diverged from the snapshot.
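The engine-level check can be illustrated with a small pure-Python model that runs without a database. The VersionedRow class and RecordChangedError are hypothetical stand-ins for InnoDB’s internal row versioning, not actual MariaDB APIs:

```python
class RecordChangedError(Exception):
    """Stand-in for MariaDB error 1020."""

class VersionedRow:
    """Toy model of a row whose version increments on every committed write."""
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        # A transaction remembers the version it saw at snapshot time.
        return self.value, self.version

    def write(self, new_value, snapshot_version):
        # Optimistic check: refuse the write if the row has moved on.
        if self.version != snapshot_version:
            raise RecordChangedError(
                "Record has changed since last read (saw v%d, now v%d)"
                % (snapshot_version, self.version))
        self.value = new_value
        self.version += 1

row = VersionedRow(100)
value, seen = row.read()        # Tx A reads under its snapshot (v0)
row.write(row.value + 10, seen) # Tx B commits first: version 0 -> 1

try:
    row.write(value + 50, seen) # Tx A writes against its stale snapshot
except RecordChangedError as e:
    print("Tx A aborted:", e)
```

The last write fails exactly the way the script does: not because the data is unreadable, but because the version observed at snapshot time no longer matches the live row.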

Why This Happens in Real Systems

This scenario is a classic trap in concurrent database programming.

  • Read-Modify-Write Cycle: The script likely follows this pattern: BEGIN -> SELECT (Read record X) -> Process -> UPDATE (Write record X) -> COMMIT.
  • Concurrent Updates: If another process or thread reads and updates record X (even just a timestamp or a flag) between the SELECT and UPDATE of the first transaction, the version chain changes.
  • Isolation Level Nuances: While REPEATABLE READ usually prevents reading dirty data, snapshot isolation optimizes for read performance. When the UPDATE executes, it attempts to modify the row based on the version it knows. If the version has incremented, the engine aborts the transaction to preserve consistency, throwing error 1020. This often happens in polling scripts or batch processing jobs that take longer than expected.

Real-World Impact

  • Intermittent Failures: The error appears “every few days” rather than immediately, making it hard to reproduce. It depends on the precise timing of concurrent access to the specific record.
  • Data Integrity Risks: If the script aborts without handling the error, the “Process” logic is lost, potentially leaving the system in an inconsistent state.
  • Log Bloat: Catching this error generically often leads to tight retry loops or rapidly growing log files if the specific condition isn’t handled gracefully.

Example or Code

To reproduce the logic causing the error, we simulate two concurrent transactions. This Python script uses mysql-connector-python (or similar) to demonstrate the collision.

import threading
import time
import mysql.connector

# Configuration
DB_CONFIG = {
    'user': 'root',
    'password': 'password',
    'host': 'localhost',
    'database': 'test_db'
}

def setup_db():
    conn = mysql.connector.connect(**DB_CONFIG)
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS db_control")
    cur.execute("""
        CREATE TABLE db_control (
            id INT PRIMARY KEY,
            value INT,
            version INT DEFAULT 0
        )
    """)
    cur.execute("INSERT INTO db_control (id, value, version) VALUES (1, 100, 0)")
    conn.commit()
    conn.close()

def transaction_a():
    """
    The problematic transaction:
    1. Reads record
    2. Waits (simulating processing)
    3. Tries to write (fails if Transaction B ran in step 2)
    """
    conn = mysql.connector.connect(**DB_CONFIG)
    cur = conn.cursor()

    try:
        print("Tx A: Beginning transaction")
        cur.execute("START TRANSACTION")

        # 1. Read
        cur.execute("SELECT value, version FROM db_control WHERE id = 1")
        row = cur.fetchone()
        print(f"Tx A: Read value={row[0]}, version={row[1]}")

        # 2. Simulate processing time
        print("Tx A: Processing... (waiting 2s)")
        time.sleep(2)

        # 3. Write (Optimistic check happens here)
        print("Tx A: Attempting update")
        # The UPDATE carries no explicit version check; with
        # innodb_snapshot_isolation=ON, the engine itself detects that the
        # row has diverged from this transaction's snapshot and raises 1020.
        cur.execute("UPDATE db_control SET value = %s WHERE id = 1", (row[0] + 50,))

        conn.commit()
        print("Tx A: Committed")

    except mysql.connector.Error as e:
        print(f"Tx A: ERROR {e}")
        conn.rollback()
    finally:
        conn.close()

def transaction_b():
    """
    A competing transaction that modifies the record quickly.
    """
    conn = mysql.connector.connect(**DB_CONFIG)
    cur = conn.cursor()

    time.sleep(1) # Start slightly after A

    print("Tx B: Beginning transaction")
    cur.execute("START TRANSACTION")
    cur.execute("UPDATE db_control SET value = value + 10, version = version + 1 WHERE id = 1")
    conn.commit()
    print("Tx B: Committed update")
    conn.close()

if __name__ == "__main__":
    setup_db()

    t1 = threading.Thread(target=transaction_a)
    t2 = threading.Thread(target=transaction_b)

    t1.start()
    t2.start()

    t1.join()
    t2.join()

How Senior Engineers Fix It

There are two main approaches: preventing the collision or handling it gracefully.

  1. Transactional Queuing / Pessimistic Locking:
    Use SELECT ... FOR UPDATE immediately within the transaction. This takes an exclusive lock on the row (and, depending on the query, gap locks) so no other transaction can modify it until the first one commits.

    cur.execute("SELECT value FROM db_control WHERE id = 1 FOR UPDATE")
    # ... process ...
    cur.execute("UPDATE db_control SET value = %s WHERE id = 1", (new_val,))
    conn.commit()
  2. Retry Logic (Application Layer):
    Wrap the transaction in a loop that catches error 1020 and retries. This is often preferred for high-concurrency systems where locks hurt performance.

    import random

    def execute_with_retry(max_retries=5):
        for _ in range(max_retries):
            try:
                # ... Run the transaction code ...
                return
            except mysql.connector.Error as e:
                if e.errno == 1020:
                    # Roll back, back off with jitter, then retry
                    time.sleep(random.uniform(0.1, 0.5))
                    continue
                raise
  3. Configuration Change:
    If the application was not designed for concurrency, simply setting innodb_snapshot_isolation = OFF in the MariaDB configuration file (/etc/mysql/mariadb.conf.d/50-server.cnf) reverts to the previous behavior. However, this is a “band-aid” and not a fix for proper concurrency design.
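The retry pattern from approach 2 can be packaged as a reusable decorator. A minimal sketch, using a hypothetical ConflictError so it runs without a live database; with mysql.connector you would test e.errno == 1020 instead:

```python
import functools
import random
import time

class ConflictError(Exception):
    """Stand-in for mysql.connector.Error with errno 1020."""

def retry_on_conflict(max_retries=5, base_delay=0.01):
    """Retry the wrapped callable when it raises ConflictError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except ConflictError:
                    if attempt == max_retries - 1:
                        raise
                    # Jittered exponential backoff so competing
                    # workers desynchronize instead of re-colliding
                    time.sleep(base_delay * (2 ** attempt) * random.random())
        return wrapper
    return decorator

attempts = {"n": 0}

@retry_on_conflict()
def flaky_update():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConflictError("Record has changed since last read")
    return "committed"

print(flaky_update())  # -> committed (after two simulated conflicts)
```

Each retry must re-run the whole transaction, including the initial SELECT, so the logic operates on a fresh snapshot rather than the stale one that caused the conflict.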

Why Juniors Miss It

  • Assumption of Linearity: Junior developers often assume code executes linearly without interleaving from other processes. They don’t anticipate that a record read at the start of a function might be stale by the time they write to it.
  • Blind Reliance on Defaults: Changes in database defaults between major OS versions (like Raspberry Pi OS upgrades) are often overlooked. “It worked before” leads to investigating the code rather than the environment configuration.
  • Missing Locking Clauses: The concept of FOR UPDATE is often ignored until concurrency bugs appear. It seems like “extra work” until the system fails under load or timing conditions.