Prevent Rails 7 schema.rb collation drift in MySQL environments

Summary

Running rails db:migrate adds explicit collation definitions to schema.rb even though the local and production MySQL databases are identical. The extra lines pollute version control and cause schema drift warnings.

Root Cause

  • Rails 7 introspects the MySQL connection and writes the default collation it receives from SHOW CREATE TABLE.
  • In the local MySQL client the server reports the collation as utf8mb4_0900_ai_ci while production reports utf8mb4_unicode_ci.
  • The difference is not caused by migrations; it is a metadata mismatch between the client libraries / default session variables.

Why This Happens in Real Systems

  • MySQL 8 introduced new default collations. If the client library (e.g., mysql2 gem) is compiled against a different MySQL client version, the server may return a different default.
  • Rails’ schema dumper treats any deviation from the “native” collation as a change, even when the tables behave identically.
  • CI/CD pipelines often run on containers with a newer MySQL client than production servers, triggering the discrepancy.

Real-World Impact

  • Noisy commits: every migration adds dozens of lines to schema.rb, inflating pull requests.
  • Merge conflicts: concurrent branches modify the same collation block, leading to unnecessary merge work.
  • False alarms: schema diff tools flag changes that have zero functional impact, distracting the team.
  • Potential downtime: if a developer force‑pushes a migration that attempts to alter collations, MySQL may lock tables and degrade performance.

Example or Code (if necessary and relevant)

# db/migrate/20240615000000_add_users.rb
class AddUsers < ActiveRecord::Migration[7.0]
  def change
    create_table :users do |t|
      t.string :email, null: false
      t.timestamps
    end
  end
end

Running the migration locally produces the following snippet in schema.rb:

create_table "users", charset: "utf8mb4", collation: "utf8mb4_0900_ai_ci", force: :cascade do |t|
  t.string "email", null: false
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
end

In production the same migration yields:

create_table "users", charset: "utf8mb4", collation: "utf8mb4_unicode_ci", force: :cascade do |t|
  # …
end

How Senior Engineers Fix It

  • Standardize the client library: lock the mysql2 gem version and ensure the Docker image uses the same MySQL client library as production.

  • Force a uniform collation in an initializer:

    # config/initializers/mysql_collation.rb
    ActiveRecord::Base.establish_connection
    ActiveRecord::Base.connection.execute("SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci")
  • Patch the schema dumper: create a custom dumper that strips the collation: option unless it differs from the database default.

    # lib/tasks/custom_schema_dump.rake
    namespace :db do
      task :dump_schema do
        ActiveRecord::SchemaDumper.ignore_tables = %w[sqlite_master]
        Rake::Task["db:schema:dump"].invoke
      end
    end
  • Add a CI check that fails if schema.rb contains unexpected collations, prompting developers to run the fixer.

  • Document the expected collation in the README and lock it with a migration that explicitly sets it for new tables.

Why Juniors Miss It

  • They assume schema.rb is a pure representation of the database, not realizing Rails adds session‑specific metadata.
  • They often overlook environmental differences (client library versions, default session variables) and focus only on migration code.
  • Lack of experience with MySQL 8’s new defaults leads them to treat the collation lines as intentional schema changes.
  • Junior developers may not know how to customize the schema dumper or enforce consistent client configurations across environments.

Leave a Comment