Summary
A production engineer attempted to use MirrorMaker 2 (MM2) to implement a many-to-one aggregation pattern, where multiple local Redpanda/Kafka clusters forward data to a single central cluster. Unlike standard Active/Passive setups designed for Disaster Recovery (DR), this architecture intended to consolidate data without bi-directional synchronization. The implementation failed due to Topic Name Collisions caused by a conflict between the chosen IdentityReplicationPolicy and the identical topic naming conventions across source clusters.
Root Cause
The failure stems from a fundamental mismatch between the Replication Policy and the Data Topology:
- IdentityReplicationPolicy Requirement: This policy instructs MM2 to replicate topics with their exact original names. It assumes that the target cluster is a “mirror” where topic names must remain identical to avoid breaking consumer logic.
- Namespace Collision: Because all source clusters use the same topic names (e.g.,
orders,users), theIdentityReplicationPolicyattempts to writeordersfrom Cluster A andordersfrom Cluster B into the exact same topic on the target cluster. - Data Interleaving/Corruption: This results in a “logical merge” where messages from different physical locations are interleaved in a single target topic, making it impossible to distinguish the original source of a message without inspecting headers.
- DefaultReplicationPolicy Conflict: While the
DefaultReplicationPolicysolves the collision by adding a remote cluster prefix (e.g.,dc1.orders), it breaks downstream consumers that expect the original, un-prefixed topic names.
Why This Happens in Real Systems
In large-scale distributed systems, engineers often fall into the trap of reusing naming conventions across environments to simplify local development. This is a standard practice in Microservices and Infrastructure as Code (IaC). However, when moving from Siloed Architectures to Aggregated Architectures, these identical names become a liability.
The industry standard for DR (Disaster Recovery) focuses on Mirroring (making the target look exactly like the source), while the user’s requirement is Aggregation (combining multiple sources into one). MM2 is heavily optimized and documented for the former, leaving a “configuration gap” for the latter.
Real-World Impact
- Data Integrity Loss: If multiple sources write to the same topic name via
IdentityReplicationPolicy, the target cluster becomes a non-deterministic stream of interleaved events. - Consumer Logic Failure: Downstream applications expecting a specific partition strategy or sequence will fail when data from multiple datacenters is mashed together.
- Operational Overhead: If forced to use
DefaultReplicationPolicy, engineers must implement complex aggregation logic in the consumer layer to “de-prefix” topics, increasing latency and code complexity.
Example or Code (if necessary and relevant)
To solve this, one must move away from standard policies and implement a Custom Transformation or use the RegexTransform within the Kafka Connect framework to normalize names during the sink process.
# Example: Using RegexTransform to strip prefixes if using DefaultReplicationPolicy
# or to dynamically rename topics to include a source identifier to avoid collisions.
transforms=renameTopic
transforms.renameTopic.type=org.apache.kafka.connect.transforms.RegexRouter
# This regex targets the prefix added by DefaultReplicationPolicy (e.g., 'dc1.')
# and replaces it with a custom identifier or removes it if the logic allows.
transforms.renameTopic.regex=^([^.]+)\\..*
transforms.renameTopic.replacement=$1_agg
How Senior Engineers Fix It
A senior engineer approaches this by recognizing that MirrorMaker 2 is the wrong tool for pure aggregation if topic names cannot be unique. Instead of fighting the MM2 policy, they would:
- Implement Topic Namespacing at Source: Enforce a policy where topics are named
[datacenter_id].[topic_name]at the producer level. - Use Kafka Connect with SMTs: Instead of MM2, use a standard Kafka Connect Source/Sink pattern with Single Message Transforms (SMTs). This allows for fine-grained control over the destination topic name.
- Introduce an Aggregation Layer: Use a Stream Processing engine (like Flink or ksqlDB) on the target cluster. The MM2 replicates topics with prefixes (e.g.,
dc1.orders,dc2.orders), and the stream processor performs aUNIONoperation to create a single, cleanorderstopic. - Header-Based Routing: Instead of relying on topic names, ensure every message carries a
source_datacenterheader to allow consumers to filter or route data logically.
Why Juniors Miss It
- Tutorial Bias: Juniors often follow “MirrorMaker 2 Setup” tutorials which are almost exclusively written for High Availability (HA) and DR scenarios, leading them to assume
IdentityReplicationPolicyis a universal “correct” setting. - Ignoring Namespace Collisions: There is a tendency to treat “topics” as abstract entities rather than shared global resources.
- Tool Over-reliance: Juniors often try to force an existing tool (MM2) to perform a task it wasn’t designed for (Aggregation) rather than questioning if a different architecture (Stream Processing or Custom Connectors) is more appropriate.