Resolving SQLite read‑only errors when using ZFS in Docker

Summary

An application running inside a Docker container (ezBookkeeping) failed to initialize its SQLite database after the underlying storage was moved from an ext4 filesystem to a ZFS dataset. Although the container ran as root and the directory had rwx permissions for all users, the application exited with a fatal error: “attempt to write a readonly database”. The issue points to a subtle mismatch between how SQLite manages concurrency through journal/WAL files and how the storage stack beneath it (ZFS plus the Docker runtime) handles atomic writes and file locking.

Root Cause

The failure does not behave like a traditional Linux permission issue; it looks like a locking and atomicity conflict lower in the storage stack. The likely contributing factors are:

  • SQLite WAL Mode vs. ZFS CoW: SQLite can run in Write-Ahead Logging (WAL) mode, which creates -wal and -shm files next to the database file. If SQLite cannot create or lock these auxiliary files, it cannot write to the database. On ZFS, the Copy-on-Write (CoW) design and the way locks are handled through the container runtime may conflict with SQLite's attempts to acquire exclusive locks on them.
  • ZFS Recordsize Mismatch: The default ZFS recordsize is 128k, whereas SQLite performs much smaller, page-based writes (4k by default). This causes write amplification during the frequent small updates required for SQLite’s journal commits. On its own this is mainly a performance problem, but it is worth ruling out while debugging.
  • Sync/Direct I/O Semantics: SQLite relies heavily on fsync() to ensure data integrity. While ZFS is extremely robust, the interaction between the Docker storage driver, the ZFS Intent Log (ZIL), and SQLite’s attempts to force synchronous writes can lead the SQLite engine to believe the filesystem has entered a read-only state if a lock cannot be safely negotiated.
  • Snap/Apt Docker Divergence: The user’s transition from Docker Snap to Docker Apt is a red flag. The Snap package of Docker runs under strict confinement (AppArmor and seccomp) that limits which host paths the daemon may access. A dataset mounted outside the allowed locations can produce “Permission Denied” errors that mask the underlying filesystem incompatibility.
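
Because SQLite creates its journal or -wal/-shm files next to the database, the directory has to accept new files, not just the database file itself. The following is a minimal sketch of a probe for that condition. The function name probe_sidecar_write and the ".probe" suffix are invented for illustration, and the probe only approximates SQLite's behavior (it takes no locks).

```shell
# probe_sidecar_write DB_PATH
# Tries to create, append to, and delete a scratch file next to DB_PATH,
# roughly what SQLite must do for its -journal, -wal and -shm files.
# Hypothetical helper: the name and the ".probe" suffix are illustrative.
probe_sidecar_write() {
  db_path="$1"
  dir=$(dirname -- "$db_path")
  scratch="$db_path-$$.probe"

  if [ ! -d "$dir" ]; then
    echo "missing-dir"
    return 1
  fi
  # Subshells keep a failed redirection from aborting the caller;
  # stderr is discarded so a read-only mount yields a clean verdict.
  if ! ( : > "$scratch" ) 2>/dev/null; then
    echo "cannot-create"
    return 1
  fi
  if ! ( echo probe >> "$scratch" ) 2>/dev/null; then
    rm -f -- "$scratch"
    echo "cannot-write"
    return 1
  fi
  rm -f -- "$scratch"
  echo "ok"
}
```

Running it against the database path on the host, and again from a shell inside the container, shows at which layer writes stop working.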

Why This Happens in Real Systems

This is a classic case of leaky abstractions. We assume that if chmod 777 is applied, the application can write. However:

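  • Filesystem Semantics: Not all filesystems are created equal. ext4 (journaling) and ZFS (CoW) commit data and metadata in different ways, and file locking behavior (fcntl/flock) should be verified rather than assumed once a container runtime sits between the application and the disk.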
  • Virtualization Layers: Docker adds a layer of abstraction (the storage driver, bind mounts, and namespace isolation). When you stack Container -> OverlayFS -> ZFS, you create a long chain of I/O operations in which a failure at the bottom (for example, a rejected lock on ZFS) surfaces at the top as a generic “read-only” error.
  • Kernel/Module Interactions: On Linux, ZFS runs as a separate kernel module with its own caching and transaction layers. Its handling of atomic writes and synchronous commits therefore follows a different path from the in-kernel code used by ext4.
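
One way to see which filesystem a path really lands on, and whether that mount is flagged read-only, is to read the mount table. The sketch below assumes the Linux /proc/self/mounts format. The function name mount_info is hypothetical, and mount points containing spaces (escaped as \040 in that file) are not handled.

```shell
# mount_info PATH [MOUNTS_FILE]
# Prints "<mountpoint> <fstype> <ro|rw>" for the mount that contains PATH.
# Hypothetical helper; MOUNTS_FILE defaults to /proc/self/mounts (Linux).
mount_info() {
  target="$1"
  mounts="${2:-/proc/self/mounts}"
  awk -v target="$target" '
    {
      mp = $2
      # A mount contains the target if it is "/", equals the target,
      # or is a directory prefix of the target.
      if (mp == "/" || target == mp || index(target, mp "/") == 1) {
        # Longest mount point wins; on ties the later entry shadows the earlier one.
        if (length(mp) >= best_len) {
          best_len = length(mp); best_mp = mp; best_fs = $3
          best_mode = "rw"
          n = split($4, opts, ",")
          for (i = 1; i <= n; i++) if (opts[i] == "ro") best_mode = "ro"
        }
      }
    }
    END { if (best_mp != "") print best_mp, best_fs, best_mode }
  ' "$mounts"
}
```

Called from inside the container with the database path, it reports the filesystem type backing that path and whether the mount itself is read-only, which chmod cannot change.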

Real-World Impact

  • Data Corruption Risk: Applications that ignore “readonly” errors or attempt to force writes on incompatible filesystems risk database corruption.
  • Service Unavailability: Critical microservices may enter a restart loop (CrashLoopBackOff under Kubernetes), causing cascading failures in a distributed system.
  • Operational Overhead: Engineers spend hours debugging permissions and ownership (the “obvious” culprits) when the actual issue is a low-level architectural mismatch between the database engine and the storage provider.

Example or Code

To test if the issue is related to ZFS recordsize or sync settings, a senior engineer would adjust the dataset properties:

# Tune the ZFS dataset specifically for SQLite workloads.
# Lower recordsize toward the SQLite page size (4k by default; 8k is a common compromise).
# The new value only affects files written after the change.
sudo zfs set recordsize=8k cortex/ezbookkeeping_data

# Keep the default sync behavior so that fsync() calls from SQLite are honored
sudo zfs set sync=standard cortex/ezbookkeeping_data

# Disable atime to avoid a metadata write on every read
sudo zfs set atime=off cortex/ezbookkeeping_data
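
Before changing properties, it can help to review what the dataset is currently set to, including the readonly property, which blocks writes regardless of file permissions. The sketch below filters the tab-separated output of zfs get -H -o property,value. The function name review_zfs_props and the choice of which values to flag are assumptions made for illustration.

```shell
# review_zfs_props: reads "property<TAB>value" lines on stdin, as printed by
#   zfs get -H -o property,value readonly,sync,recordsize,atime DATASET
# and prints one note per setting that deserves a second look.
# Hypothetical helper; what gets flagged here is an assumption, not a rule.
review_zfs_props() {
  while IFS="$(printf '\t')" read -r prop value; do
    case "$prop=$value" in
      readonly=on)     echo "readonly=on: the dataset rejects all writes" ;;
      sync=disabled)   echo "sync=disabled: synchronous write requests are ignored" ;;
      recordsize=128K) echo "recordsize=128K: default, much larger than a 4K SQLite page" ;;
      atime=on)        echo "atime=on: reads can trigger access-time updates" ;;
    esac
  done
}
```

A possible invocation, using the dataset name from the example above: zfs get -H -o property,value readonly,sync,recordsize,atime cortex/ezbookkeeping_data | review_zfs_props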

How Senior Engineers Fix It

A senior engineer looks past the “readonly database” error message and analyzes the I/O path:

  1. Isolate the Layer: Determine if the issue is the Container Runtime (Snap vs. Apt), the Filesystem (ZFS vs. ext4), or the Application (SQLite settings).
  2. Align Metadata/Data Blocks: They would tune the recordsize of the ZFS dataset to match the application’s page size to prevent massive write amplification and locking contention.
  3. Verify Locking Mechanisms: They would check if the ZFS dataset is mounted with options that might interfere with fcntl locking, which SQLite requires.
  4. Sanitize the Environment: They would insist on using native Docker (Apt/Binary) instead of Snap, so that sandbox confinement does not block the Docker daemon from reaching host paths on the ZFS dataset.
  5. Implement Proper Volume Management: Instead of a raw bind mount, they might use a dedicated ZFS dataset with optimized properties specifically for the database.
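
As a small aid for the first and fourth steps, the sketch below classifies a Docker CLI path as Snap or native. It assumes that Snap exposes its binaries under /snap/ (for example /snap/bin/docker) and that distribution or upstream packages install elsewhere, such as /usr/bin/docker. The function name docker_install_kind is hypothetical.

```shell
# docker_install_kind [DOCKER_PATH]
# Classifies a Docker CLI path as "snap" or "native" by where it lives.
# Assumption: Snap binaries live under /snap/, other installs do not.
# With no argument, the docker found on PATH is inspected.
docker_install_kind() {
  docker_path="${1:-$(command -v docker 2>/dev/null)}"
  case "$docker_path" in
    "")      echo "not-found" ;;
    /snap/*) echo "snap" ;;
    *)       echo "native" ;;
  esac
}
```

If the result is snap, the confinement of that package is worth ruling out before tuning ZFS or SQLite, since it may keep the daemon from writing to host paths outside the locations the sandbox allows.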

Why Juniors Miss It

  • Symptom-Focused Debugging: Juniors see “readonly database” and immediately try chmod, chown, or running as root (user: root in the Compose file, USER root in the Dockerfile). They treat the symptom (the error message) rather than the cause (the filesystem interaction).
  • Assumption of Uniformity: They assume that “all Linux filesystems behave like ext4” and that the abstraction provided by Docker makes the underlying storage invisible.
  • Tooling Blindness: They may not realize that Snap packages run in a highly restricted sandbox that behaves fundamentally differently from standard system packages, especially regarding hardware and filesystem access.
