pg_upgrade fails when upgrading from v14 to v16

Summary

pg_upgrade failed while upgrading a PostgreSQL server from v14 to v16. The installed extensions (pgvector and hydra) were not compatible across the two versions, and the schema restore step of the upgrade failed as a result.

Root Cause

The root cause was inconsistent schema definitions between PostgreSQL v14 and v16 for objects that depend on the installed extensions (pgvector and hydra). Specifically, a binary upgrade restores the schema from a dump that sets the attislocal attribute in the pg_attribute catalog table directly for inherited columns, and that update was not handled properly for the affected tables.

Why This Happens in Real Systems

  • Extension Incompatibility: Extensions like pgvector and hydra may not be fully compatible across major PostgreSQL versions, and the version installed on the old cluster may have no matching build for the new one.
  • Schema Changes: Differences between v14 and v16 in how inherited columns and catalog tables are handled can lead to conflicts during binary upgrades.
  • Insufficient Testing: Pre-upgrade checks such as pg_upgrade --check do not always detect subtle schema inconsistencies introduced by extensions, because they do not perform the actual schema restore.
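
The extension check can be made concrete by listing what is actually installed in the old cluster. The query below is a minimal sketch that uses only the standard pg_extension catalog and the pg_available_extensions view. It reports versions and nothing more, so it cannot by itself prove that a given extension version works on v16.

```sql
-- Run in every database of the old (v14) cluster.
-- installed_version is what CREATE EXTENSION recorded in this database;
-- packaged_default_version is what the server's installed packages offer.
SELECT e.extname,
       e.extversion      AS installed_version,
       a.default_version AS packaged_default_version
FROM pg_catalog.pg_extension AS e
LEFT JOIN pg_catalog.pg_available_extensions AS a
       ON a.name = e.extname
ORDER BY e.extname;
```

Running the same query against pg_available_extensions on a v16 server that has the extension packages installed shows whether each extension is available there at all. An extension missing from that list is a likely blocker for pg_upgrade.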

Real-World Impact

  • Downtime: The failed upgrade caused extended downtime for the database server.
  • Data Integrity Risk: A partially upgraded cluster is in an inconsistent state and must not be put into service; continuing with it risks data corruption.
  • Resource Waste: Engineering time and effort were spent on troubleshooting and rollback.

Example or Code

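-- Example of the catalog update that conflicted during the pg_upgrade schema restore.
-- It marks the column as inherited (not locally defined) on the child table.
UPDATE pg_catalog.pg_attribute
SET attislocal = false
WHERE attname = 'dspp_conflict_cd'
  AND attrelid = '"big_cust"."purchase_extension_1_82"'::regclass;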
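
To see which tables are subject to this kind of statement before attempting the upgrade, the inherited columns can be listed from the catalogs of the old cluster. The following is a sketch under the assumption that the problem is limited to columns that are inherited and not locally defined. It is read-only and does not modify any catalog.

```sql
-- Columns that are inherited from a parent table and not defined locally.
-- These are the columns for which the binary-upgrade restore needs
-- attislocal = false on the child table.
SELECT c.oid::regclass AS child_table,
       a.attname,
       a.attinhcount,
       a.attislocal
FROM pg_catalog.pg_attribute AS a
JOIN pg_catalog.pg_class AS c
  ON c.oid = a.attrelid
WHERE a.attnum > 0              -- skip system columns
  AND NOT a.attisdropped        -- skip dropped columns
  AND a.attinhcount > 0         -- inherited from at least one parent
  AND NOT a.attislocal          -- not also declared on the child itself
  AND c.relkind IN ('r', 'p')   -- ordinary and partitioned tables
ORDER BY child_table, a.attnum;
```

If the table from the failing statement appears in this output on v14 but the same column cannot be found after the schema restore on v16, that narrows the conflict to how the table is recreated rather than to the UPDATE itself.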

How Senior Engineers Fix It

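  1. Verify Extension Compatibility: Confirm that every installed extension has a release built for the target PostgreSQL version, and install it in the new cluster before running pg_upgrade.
  2. Manual Schema Migration: Perform manual schema adjustments before running pg_upgrade, for example updating an extension to a version supported on both sides or resolving the affected objects ahead of time.
  3. Use Logical Backup: Consider using pg_dump and pg_restore for a logical upgrade instead of a binary upgrade when extensions block pg_upgrade.
  4. Test Thoroughly: Run pg_upgrade in a staging environment with identical extensions and schema before touching production.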
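
Before making manual schema adjustments or switching to a logical upgrade, it helps to know which tables actually depend on the extensions. The queries below are a rough sketch against the standard catalogs. They assume the dependency shows up either as a column whose type belongs to an extension or as a table stored with a non-default access method, and they will miss other kinds of dependencies, such as array columns, functions, or operators.

```sql
-- Columns whose data type is owned by an extension.
SELECT e.extname,
       c.oid::regclass      AS table_name,
       a.attname,
       a.atttypid::regtype  AS column_type
FROM pg_catalog.pg_depend AS d
JOIN pg_catalog.pg_extension AS e
  ON d.refclassid = 'pg_catalog.pg_extension'::regclass
 AND d.refobjid = e.oid
JOIN pg_catalog.pg_type AS t
  ON d.classid = 'pg_catalog.pg_type'::regclass
 AND d.objid = t.oid
JOIN pg_catalog.pg_attribute AS a
  ON a.atttypid = t.oid
JOIN pg_catalog.pg_class AS c
  ON c.oid = a.attrelid
WHERE d.deptype = 'e'               -- object is a member of the extension
  AND a.attnum > 0
  AND NOT a.attisdropped
  AND c.relkind IN ('r', 'p', 'm')  -- tables, partitioned tables, matviews
ORDER BY e.extname, table_name, a.attnum;

-- Tables stored with a table access method other than the built-in heap.
SELECT c.oid::regclass AS table_name,
       am.amname       AS access_method
FROM pg_catalog.pg_class AS c
JOIN pg_catalog.pg_am AS am
  ON am.oid = c.relam
WHERE c.relkind IN ('r', 'm')
  AND am.amname <> 'heap'
ORDER BY table_name;
```

Tables returned by either query are the ones to include in the staging test, since they are the most likely to behave differently between the old and new clusters.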

Why Juniors Miss It

  • Overreliance on Checks: Assuming that passing pre-upgrade checks guarantees success, without considering extension compatibility.
  • Lack of Experience: Limited exposure to major version upgrades and extension-related pitfalls.
  • Insufficient Logging Analysis: Failing to correlate error logs with specific schema or extension issues.
