Managing 100+ Tables: Modularization, Migrations, and DDD Techniques

Summary

A backend development team encountered significant cognitive load and deployment friction when transitioning from a small-scale prototype to a fully normalized production schema consisting of over 100 tables in a GBase 8s environment. The primary issue was not the database engine itself, but the management overhead and the lack of a structured strategy for schema evolution and developer ergonomics.

Root Cause

The complexity arose from several architectural and process-oriented factors:

  • Over-Normalization: While Third Normal Form (3NF) reduces redundancy, excessive normalization in a single flat namespace can lead to “join explosion,” making queries difficult to write and maintain.
  • Lack of Logical Grouping: Treating 100 tables as a single monolithic block without domain-driven boundaries makes it impossible for developers to maintain a mental model of the system.
  • Implicit Schema Evolution: Relying on manual SQL scripts or “adhoc” changes rather than a versioned migration strategy leads to environment drift between local, staging, and production.
  • Tooling Limitations: Relying on raw SQL editors for large schemas without hierarchical views or schema documentation increases the risk of human error.

Why This Happens in Real Systems

In real-world production environments, complexity is an inevitable byproduct of scale. This happens because:

  • Business Logic Complexity: As software matures, every new feature requires new entities, leading to a natural growth in the number of tables.
  • Compliance Requirements: In sectors like medical software, data integrity and auditability (e.g., tracking who changed what and when) require additional metadata tables, increasing schema size.
  • Decoupling vs. Performance: Engineers often struggle to find the balance between a clean, normalized design and the performance requirements of high-frequency read operations.

Real-World Impact

Failure to manage a large schema leads to:

  • Increased Onboarding Time: New engineers take weeks instead of days to understand the data model.
  • Deployment Fragility: A single incorrect ALTER TABLE statement in a large schema can cause cascading failures or long-running locks.
  • Query Performance Degradation: Developers may write inefficient, deep-join queries because they cannot easily visualize the relationships between distant tables.
  • Data Inconsistency: Without strict management, different microservices or modules might attempt to implement conflicting constraints on the same logical entities.

Example or Code (if necessary and relevant)

Instead of managing 100 tables as a flat list, engineers should use a migration-based approach to track changes:

-- Migration: 20231027_add_patient_vitals_table.sql
CREATE TABLE patient_vitals (
    vital_id INT PRIMARY KEY,
    patient_id INT NOT NULL,
    heart_rate INT,
    systolic_bp INT,
    diastolic_bp INT,
    recorded_at TIMESTAMP,
    CONSTRAINT fk_patient FOREIGN KEY (patient_id) REFERENCES patients(patient_id)
);

CREATE INDEX idx_patient_vitals_patient_id ON patient_vitals(patient_id);

How Senior Engineers Fix It

Senior engineers approach large schemas through modularization and automation:

  • Domain-Driven Design (DDD): They organize tables into bounded contexts. Instead of one giant schema, they treat groups of tables (e.g., PatientManagement, Billing, ClinicalRecords) as logical modules.
  • Database Migration Tools: They implement tools like Liquibase or Flyway. These tools treat database changes as code, allowing for versioning, rollbacks, and automated deployments.
  • View Layer Abstraction: To combat “join explosion,” they create Database Views. This provides a simplified, flattened interface for common queries, hiding the complexity of the underlying 100+ tables.
  • ERD Documentation as Code: They maintain living Entity Relationship Diagrams (ERD) that are automatically generated from the schema, ensuring the documentation never diverges from reality.
  • Strict Naming Conventions: They use prefixing or strict naming patterns to group related tables (e.g., med_prescription, med_dosage, med_patient_history).

Why Juniors Miss It

Junior developers often miss these patterns because:

  • Focus on Syntax over System: They focus on making a single SELECT statement work, whereas seniors focus on how that statement affects the lifecycle of the entire system.
  • Academic Bias: In school, normalization is taught as a mathematical ideal. In production, manageability and performance are often more important than reaching the highest level of normalization.
  • Lack of Operational Experience: Juniors have rarely experienced the “pain” of a failed production deployment caused by a manual schema change, so they don’t instinctively reach for migration tools.

Leave a Comment