
A single Kafka cluster, no matter how well provisioned, cannot satisfy every operational requirement across a distributed organisation. Cross-cluster replication addresses four distinct needs: disaster recovery (a warm standby in a second region), geo-locality (serving consumers from a nearby cluster), workload isolation (keeping analytics traffic off production clusters), and aggregation (collecting regional streams into a central cluster).

MirrorMaker 2 vs MirrorMaker 1
| Feature | MirrorMaker 1 | MirrorMaker 2 |
|---|---|---|
| Foundation | Standalone consumer + producer process | Built on Kafka Connect framework |
| Offset sync | Not supported | Native consumer group offset translation |
| Topic config sync | Not supported | Syncs retention, compaction, partition count |
| Active/Active | Cycle prevention is manual and fragile | Built-in cycle prevention via topic prefix policy |
| Monitoring | Limited JMX metrics | Rich Connect metrics + heartbeat topic |
| Status | Deprecated, removed in Kafka 4.0 | Actively maintained, recommended |
Confluent Replicator is a commercial alternative to MirrorMaker 2 that adds a graphical management UI and tighter integration with the Confluent Platform. For open-source deployments, MirrorMaker 2 is the standard choice.

MirrorMaker 2 is implemented as a set of Kafka Connect connectors. When you run connect-mirror-maker.sh, it starts an embedded Connect worker and deploys three connectors automatically for each configured replication flow.

| Connector | Role |
|---|---|
| MirrorSourceConnector | Reads records from the source cluster and writes them to the target cluster. Handles topic creation, partition count matching, and record-level replication. |
| MirrorCheckpointConnector | Periodically reads consumer group offsets from the source cluster, translates them to the equivalent target offsets, and writes checkpoints to the source-alias.checkpoints.internal topic on the target (us-east.checkpoints.internal in the examples below). Enables seamless consumer failover. |
| MirrorHeartbeatConnector | Produces synthetic heartbeat records to the heartbeats topic. These records propagate through the replication pipeline, letting you measure end-to-end replication latency. |

A fourth class, MirrorSinkConnector, exists in the Kafka codebase as a sink-side counterpart to MirrorSourceConnector, but it is not deployed by connect-mirror-maker.sh and is not configured in mm2.properties.
Remote Topic Naming

By default MM2 uses DefaultReplicationPolicy, which prefixes replicated topics with the source cluster alias and a dot separator. A topic named orders on cluster us-east becomes us-east.orders on the target cluster. This prefix is the key mechanism that prevents replication cycles in active/active topologies — a connector will never replicate a topic that already carries the target cluster's prefix.
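The naming scheme is easy to reason about in code. The sketch below mirrors the shape of DefaultReplicationPolicy's key methods (formatRemoteTopic, topicSource, upstreamTopic are real method names on that class); the class here is a simplified illustration, not the actual implementation.

```java
// Simplified illustration of DefaultReplicationPolicy's naming scheme.
// The real implementation is org.apache.kafka.connect.mirror.DefaultReplicationPolicy.
class PrefixPolicy {
    private final String separator = ".";

    // Name a topic gets on the target: "<sourceAlias><separator><topic>"
    String formatRemoteTopic(String sourceAlias, String topic) {
        return sourceAlias + separator + topic;
    }

    // Alias of the cluster a remote topic originated from; null for local topics
    String topicSource(String topic) {
        int i = topic.indexOf(separator);
        return i < 0 ? null : topic.substring(0, i);
    }

    // Original (upstream) topic name with one level of prefix stripped
    String upstreamTopic(String topic) {
        int i = topic.indexOf(separator);
        return i < 0 ? topic : topic.substring(i + 1);
    }
}
```

Note that the source alias is recovered purely from the topic name, which is why the dot separator must not collide with dots already used in your topic names.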

Offset Translation

Because a replicated topic on the target cluster has different internal offsets than the source (due to compaction differences, replication lag, and independent segment rolling), consumer group offsets cannot be directly reused. MirrorCheckpointConnector maintains a mapping by writing translated offset checkpoints to a dedicated internal topic. Consumers failing over to the target use these checkpoints to resume from the correct position.

In an active/passive topology, one cluster (us-east) is the primary where producers write. The second cluster (us-west) is a passive DR replica. Consumers normally read from us-east and fail over to us-west only when needed. Replication flows in one direction only: us-east → us-west.

Full mm2.properties — Active/Passive
# ── Cluster aliases ──────────────────────────────────────────────
clusters = us-east, us-west

# ── Bootstrap servers for each cluster ───────────────────────────
us-east.bootstrap.servers = kafka-east-1:9092,kafka-east-2:9092,kafka-east-3:9092
us-west.bootstrap.servers = kafka-west-1:9092,kafka-west-2:9092,kafka-west-3:9092

# ── Replication flow: us-east → us-west only ─────────────────────
us-east->us-west.enabled = true
us-west->us-east.enabled = false

# ── Topics to replicate (regex) ──────────────────────────────────
us-east->us-west.topics = .*

# ── Topics to exclude (named topics.blacklist before Kafka 3.0) ──
us-east->us-west.topics.exclude = .*\.internal, .*-changelog, connect-.*, __.*

# ── Replication factor for remote topics on target ───────────────
replication.factor = 3

# ── Consumer group offset sync ───────────────────────────────────
us-east->us-west.sync.group.offsets.enabled = true
us-east->us-west.sync.group.offsets.interval.seconds = 60

# ── Topic config and ACL sync ────────────────────────────────────
us-east->us-west.sync.topic.configs.enabled = true
us-east->us-west.sync.topic.acls.enabled = true

# ── Replication policy ───────────────────────────────────────────
# DefaultReplicationPolicy prefixes replicated topics: us-east.orders
replication.policy.class = org.apache.kafka.connect.mirror.DefaultReplicationPolicy
replication.policy.separator = .

# ── Heartbeat & checkpoint connectors ────────────────────────────
us-east->us-west.emit.heartbeats.enabled = true
us-east->us-west.emit.heartbeats.interval.seconds = 5
us-east->us-west.emit.checkpoints.enabled = true
us-east->us-west.emit.checkpoints.interval.seconds = 60

# ── Connect worker settings (shared across both clusters) ────────
offset.storage.replication.factor = 3
status.storage.replication.factor = 3
config.storage.replication.factor = 3

# ── Security (if cross-DC TLS is required) ───────────────────────
# us-east.security.protocol = SSL
# us-east.ssl.truststore.location = /etc/kafka/certs/truststore.jks
# us-east.ssl.truststore.password = changeit
# us-east.ssl.keystore.location = /etc/kafka/certs/keystore.jks
# us-east.ssl.keystore.password = changeit
#
# us-west.security.protocol = SSL
# us-west.ssl.truststore.location = /etc/kafka/certs/truststore.jks
# us-west.ssl.truststore.password = changeit
# us-west.ssl.keystore.location = /etc/kafka/certs/keystore.jks
# us-west.ssl.keystore.password = changeit
Starting MirrorMaker 2
# Run as a standalone MirrorMaker 2 process (embedded Connect worker)
bin/connect-mirror-maker.sh config/mm2.properties

# Alternatively: run against a specific pair of clusters only
bin/connect-mirror-maker.sh \
  --clusters us-east,us-west \
  config/mm2.properties
Resulting Topic Names on us-west

With DefaultReplicationPolicy, every replicated topic carries the source alias prefix. For example, a source topic orders appears on us-west as us-east.orders, payments as us-east.payments, and the heartbeats topic as us-east.heartbeats; checkpoints land in us-east.checkpoints.internal.
Consumer group offset sync runs on the interval defined by sync.group.offsets.interval.seconds. For tighter RPO targets, reduce this value, but be aware of the additional load on both clusters.

In an active/active topology, both clusters accept writes from producers. Each cluster replicates its local topics to the other. The critical challenge is cycle prevention — without it, a record produced on us-east would be replicated to us-west, then replicated back to us-east, and so on indefinitely.

Cycle Prevention via DefaultReplicationPolicy

DefaultReplicationPolicy prevents cycles by checking the topic name prefix before replicating. If a topic on us-west is named us-east.orders, the us-west→us-east connector recognises the us-east. prefix and skips it — the record has already originated from us-east and must not be sent back.
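The check described above can be demonstrated with a small simulation (hypothetical helper, not MM2 code): one replication hop either produces the prefixed remote name, or skips the topic when its embedded source alias equals the flow's target.

```java
// Hypothetical demonstration of prefix-based cycle prevention.
class CycleDemo {
    // Alias encoded in a remote topic name, or null for a local topic.
    static String topicSource(String topic) {
        int i = topic.indexOf('.');
        return i < 0 ? null : topic.substring(0, i);
    }

    // One replication hop for flow sourceAlias -> targetAlias.
    // Returns the topic's name on the target, or null when the flow
    // must skip it: the record originated on the target cluster, and
    // sending it back would create an infinite loop.
    static String replicate(String sourceAlias, String targetAlias, String topic) {
        if (targetAlias.equals(topicSource(topic))) {
            return null; // already originated from the target; do not send back
        }
        return sourceAlias + "." + topic;
    }
}
```

Tracing a record: orders on us-east replicates to us-west as us-east.orders; the reverse flow then sees the us-east. prefix and skips it, so the cycle never forms.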

Full mm2.properties — Active/Active
# ── Cluster aliases ──────────────────────────────────────────────
clusters = us-east, us-west

# ── Bootstrap servers ─────────────────────────────────────────────
us-east.bootstrap.servers = kafka-east-1:9092,kafka-east-2:9092,kafka-east-3:9092
us-west.bootstrap.servers = kafka-west-1:9092,kafka-west-2:9092,kafka-west-3:9092

# ── Bidirectional replication ─────────────────────────────────────
us-east->us-west.enabled = true
us-west->us-east.enabled = true

# ── Topics to replicate ───────────────────────────────────────────
us-east->us-west.topics = .*
us-west->us-east.topics = .*

# ── Exclude internal and already-replicated topics ────────────────
# DefaultReplicationPolicy auto-excludes prefixed topics, but an explicit
# exclude list guards against edge cases.
us-east->us-west.topics.exclude = .*\.internal, .*-changelog, connect-.*, __.*
us-west->us-east.topics.exclude = .*\.internal, .*-changelog, connect-.*, __.*

# ── Replication factor ────────────────────────────────────────────
replication.factor = 3

# ── Consumer group offset sync ────────────────────────────────────
us-east->us-west.sync.group.offsets.enabled = true
us-east->us-west.sync.group.offsets.interval.seconds = 60
us-west->us-east.sync.group.offsets.enabled = true
us-west->us-east.sync.group.offsets.interval.seconds = 60

# ── Topic config sync ─────────────────────────────────────────────
us-east->us-west.sync.topic.configs.enabled = true
us-west->us-east.sync.topic.configs.enabled = true

# ── ACL sync ──────────────────────────────────────────────────────
us-east->us-west.sync.topic.acls.enabled = true
us-west->us-east.sync.topic.acls.enabled = true

# ── Replication policy — MUST be DefaultReplicationPolicy for A/A ─
# IdentityReplicationPolicy does NOT add a prefix and will cause cycles.
replication.policy.class = org.apache.kafka.connect.mirror.DefaultReplicationPolicy
replication.policy.separator = .

# ── Heartbeat & checkpoint ────────────────────────────────────────
us-east->us-west.emit.heartbeats.enabled = true
us-east->us-west.emit.heartbeats.interval.seconds = 5
us-east->us-west.emit.checkpoints.enabled = true
us-east->us-west.emit.checkpoints.interval.seconds = 60
us-west->us-east.emit.heartbeats.enabled = true
us-west->us-east.emit.heartbeats.interval.seconds = 5
us-west->us-east.emit.checkpoints.enabled = true
us-west->us-east.emit.checkpoints.interval.seconds = 60

# ── Connect worker shared settings ───────────────────────────────
offset.storage.replication.factor = 3
status.storage.replication.factor = 3
config.storage.replication.factor = 3
DefaultReplicationPolicy vs IdentityReplicationPolicy
| Policy | Topic Naming | Cycle Safe | Use Case |
|---|---|---|---|
| DefaultReplicationPolicy | source-alias.topic-name | Yes | Active/active and active/passive. Standard choice. |
| IdentityReplicationPolicy | topic-name (unchanged) | No | Active/passive only, when consumers must not change their topic subscription. Requires disabling the reverse flow explicitly. |
Conflict Resolution in Active/Active

MirrorMaker 2 does not perform record-level merge or conflict resolution. If both clusters accept writes to the same logical topic (e.g., orders on us-east and orders on us-west), consumers on each cluster will see two physical topics for one logical stream: the local orders plus the replicated copy (us-west.orders on us-east, us-east.orders on us-west), with no ordering guarantee across the two.

Applications that need a unified view must read from both topics and deduplicate by record key or a business-level identifier. Conflict resolution logic (last-write-wins, CRDT, version vectors) is the responsibility of the application layer.
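A last-write-wins merge of the two physical topics can be sketched as below. This is an application-level pattern, not an MM2 feature; the class and record names are hypothetical, and the "timestamp" here stands in for whatever business-level version field you trust (wall-clock timestamps across regions are only approximately ordered).

```java
import java.util.*;

// Hypothetical application-level merge: last-write-wins by record timestamp
// across the local topic and its replicated counterpart.
class LwwMerge {
    record Rec(String key, String value, long timestampMs) {}

    // Returns the winning value per key across both streams.
    static Map<String, String> merge(List<Rec> local, List<Rec> replicated) {
        Map<String, Rec> latest = new HashMap<>();
        List<Rec> all = new ArrayList<>(local);
        all.addAll(replicated);
        for (Rec r : all) {
            // Keep the record with the larger timestamp; ties keep the earlier one.
            latest.merge(r.key(), r, (a, b) -> a.timestampMs() >= b.timestampMs() ? a : b);
        }
        Map<String, String> out = new HashMap<>();
        latest.forEach((k, r) -> out.put(k, r.value()));
        return out;
    }
}
```

In a real consumer this logic would run incrementally over records polled from both topics rather than over complete lists.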

For most active/active deployments, producers in each region only write to region-specific topics (e.g., orders-east, orders-west). MirrorMaker 2 then replicates each to the other cluster. This avoids same-key conflicts entirely.

The offset of a record on the target cluster is almost never the same as on the source cluster. Compaction, different retention periods, and independent log segment lifecycles cause offsets to diverge. MirrorCheckpointConnector bridges this gap.

How MirrorCheckpointConnector Works

1. MirrorSourceConnector emits "offset sync" records, pairs of (source offset, target offset) for replicated records, to an internal topic (mm2-offset-syncs.us-west.internal by default).
2. MirrorCheckpointConnector reads committed consumer group offsets from the source cluster.
3. Using the offset syncs, it translates each committed source offset to the closest safe target offset.
4. It writes checkpoint records to us-east.checkpoints.internal on the target cluster.
5. With sync.group.offsets.enabled = true, it also applies the translated offsets directly to the target cluster's consumer groups, so failover needs no manual translation step.
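Internally, translation is a floor lookup over the offset syncs. The sketch below is a hypothetical simplification of this mechanic (the real logic lives in MM2's internal OffsetSyncStore); the key property is that translation rounds down, so a failed-over consumer may re-read a few records (at-least-once, never data loss past the checkpoint).

```java
import java.util.*;

// Hypothetical sketch of MM2-style offset translation.
class OffsetTranslator {
    record OffsetSync(long upstreamOffset, long downstreamOffset) {}

    // upstream offset -> downstream offset of the same record
    private final NavigableMap<Long, Long> syncs = new TreeMap<>();

    void record(OffsetSync s) { syncs.put(s.upstreamOffset(), s.downstreamOffset()); }

    // Translate a committed upstream offset to a safe downstream offset.
    OptionalLong translate(long committedUpstream) {
        Map.Entry<Long, Long> floor = syncs.floorEntry(committedUpstream);
        if (floor == null) return OptionalLong.empty(); // no sync yet: cannot translate
        // Exactly at the sync point: same record. Past it: next downstream record,
        // which is conservative because offsets between syncs are unknown.
        long extra = committedUpstream > floor.getKey() ? 1 : 0;
        return OptionalLong.of(floor.getValue() + extra);
    }
}
```

The gap between syncs is why failover consumers should be idempotent: any records committed upstream since the last sync are replayed on the target.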
Programmatic Offset Translation

If you need to translate offsets manually (e.g., to determine the exact target partition/offset before a failover), use the RemoteClusterUtils utility class.

import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.mirror.RemoteClusterUtils;

Map<String, Object> props = new HashMap<>();
props.put("bootstrap.servers", "kafka-west-1:9092");

// Translate consumer group offsets from us-east to us-west.
// remoteClusterAlias = "us-east" (source cluster alias as seen from the target)
// consumerGroupId   = the group you want to translate
// Note: translateOffsets throws InterruptedException and TimeoutException;
// handle or declare them in real code.
Map<TopicPartition, OffsetAndMetadata> translatedOffsets =
    RemoteClusterUtils.translateOffsets(
        props,
        "us-east",            // remote (source) cluster alias
        "my-consumer-group",
        Duration.ofSeconds(10));

// Assign translated offsets to a new consumer on the target cluster
Map<String, Object> consumerProps = new HashMap<>(props);
consumerProps.put("key.deserializer",
    "org.apache.kafka.common.serialization.StringDeserializer");
consumerProps.put("value.deserializer",
    "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> drConsumer = new KafkaConsumer<>(consumerProps);
drConsumer.assign(translatedOffsets.keySet());
translatedOffsets.forEach(drConsumer::seek);
Consumer Failover Steps

1. Stop the consumers on the source cluster (or confirm they are down).
2. Obtain translated offsets on the target, either automatically (sync.group.offsets.enabled = true) or programmatically via RemoteClusterUtils.translateOffsets.
3. Point the consumer's bootstrap.servers at the target cluster and subscribe to the prefixed remote topic (us-east.orders rather than orders).
4. Resume consumption. Expect some duplicate records: checkpoint translation is conservative, so design the consumer to be idempotent.
The lag between when a record is produced on the source and when its translated offset checkpoint is written to the target is bounded by emit.checkpoints.interval.seconds. For critical DR scenarios, set this to 30 seconds or less.

MM2 exposes monitoring data through three channels: the heartbeat topic, JMX metrics from the embedded Connect worker, and standard Kafka consumer group lag tooling.

Heartbeat Topic

MirrorHeartbeatConnector produces a record to the heartbeats topic every emit.heartbeats.interval.seconds seconds. As these records propagate through the replication pipeline, they arrive on the target as us-east.heartbeats; comparing a record's timestamp with the current time on the target cluster gives the end-to-end replication latency.

# Consume heartbeats on the target cluster to verify replication is alive
bin/kafka-console-consumer.sh \
  --bootstrap-server kafka-west-1:9092 \
  --topic us-east.heartbeats \
  --from-beginning \
  --property print.timestamp=true

# One record should appear per heartbeat interval; each carries the source
# and target cluster aliases plus the emission timestamp.
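Beyond latency, the heartbeat stream gives a cheap liveness check: if no fresh heartbeat has arrived within a few emit intervals, replication is presumed stalled. A minimal sketch of such a monitor (hypothetical class; the threshold policy is an assumption, not an MM2 feature):

```java
// Hypothetical staleness check built on the replicated heartbeat stream.
class HeartbeatMonitor {
    private final long emitIntervalMs;
    private final int missedBeatsAllowed;

    HeartbeatMonitor(long emitIntervalMs, int missedBeatsAllowed) {
        this.emitIntervalMs = emitIntervalMs;
        this.missedBeatsAllowed = missedBeatsAllowed;
    }

    // End-to-end replication latency for one heartbeat record,
    // comparing its CreateTime to the observer's clock on the target.
    long latencyMs(long heartbeatCreateTimeMs, long observedAtMs) {
        return observedAtMs - heartbeatCreateTimeMs;
    }

    // True when the newest heartbeat seen on the target is too old:
    // more than missedBeatsAllowed intervals have elapsed without one.
    boolean stalled(long lastHeartbeatCreateTimeMs, long nowMs) {
        return nowMs - lastHeartbeatCreateTimeMs > emitIntervalMs * missedBeatsAllowed;
    }
}
```

Feed it the CreateTime of each record consumed from us-east.heartbeats; with a 5-second emit interval and 3 missed beats allowed, an alert fires after roughly 15 seconds of silence.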
Key JMX Metrics
| Metric | MBean | What it measures |
|---|---|---|
| Replication latency | kafka.connect.mirror:type=MirrorSourceConnector,target=*,topic=*,partition=* (replication-latency-ms) | Milliseconds between a record's timestamp on the source and its write to the target |
| Record count | kafka.connect.mirror:type=MirrorSourceConnector,... (record-count) | Number of records replicated per topic/partition |
| Connector status | kafka.connect:type=connector-metrics,connector=* | RUNNING / PAUSED / FAILED state of each connector |
| Consumer lag | kafka.consumer:type=consumer-fetch-manager-metrics,... | How far MM2's internal source-connector consumer is behind the source topic head |
Checking Consumer Lag on Source Connector
# Check lag of the MirrorSourceConnector's internal consumer on the source
# cluster (quote the group name: the -> would otherwise be a shell redirect;
# group naming varies by MM2 version, so run --list first if unsure)
bin/kafka-consumer-groups.sh \
  --bootstrap-server kafka-east-1:9092 \
  --group 'us-east->us-west' \
  --describe

# Check the translated consumer group on the target cluster
bin/kafka-consumer-groups.sh \
  --bootstrap-server kafka-west-1:9092 \
  --group my-app-consumer-group \
  --describe

High replication latency combined with zero consumer lag on the source connector usually indicates a slow network link between DCs, not an MM2 configuration problem. Consider tuning fetch.max.bytes on the flow's consumer and batch.size on its producer.

Beyond record replication, MM2 can synchronise topic configurations and ACLs from the source cluster to the target, reducing the operational overhead of keeping cluster configurations in lockstep.

What Gets Synced

- New topics matching the replication regex, created on the target under the prefixed name
- Partition count increases on replicated topics
- Topic-level configs such as retention.ms, cleanup.policy, and compression.type (when sync.topic.configs.enabled = true)
- Topic ACLs (when sync.topic.acls.enabled = true)

What Does Not Get Synced

- Topic deletions (a topic removed on the source remains on the target)
- Broker- and cluster-level configuration
- Consumer group ACLs and client quotas
- Schema Registry subjects (replicate these separately if you use them)
Configuration
# Enable topic config sync (retention, compaction, etc.)
us-east->us-west.sync.topic.configs.enabled = true

# Configs to exclude from sync (comma-separated regex)
us-east->us-west.config.properties.exclude = follower\.replication\.throttled\.replicas, \
                                             leader\.replication\.throttled\.replicas

# Enable ACL sync — requires Kafka 2.7+ on both clusters
us-east->us-west.sync.topic.acls.enabled = true

Even with sync.topic.configs.enabled = true, changes on the source propagate to the target only on the next config-sync cycle (default: every 10 minutes, governed by sync.topic.configs.interval.seconds). For immediate propagation, restart the MirrorSourceConnector or force a task restart via the Connect REST API.

MirrorMaker 2 can be deployed in two modes: as a standalone process that embeds its own Connect worker, or as a set of connectors deployed to an existing dedicated Connect cluster.

Mode 1 — Standalone (Embedded) Process

The simplest deployment. Run connect-mirror-maker.sh with your mm2.properties file. MM2 starts an embedded Connect worker, creates the required connectors, and manages their lifecycle. This mode is self-contained: the embedded worker still creates its own internal offset, status, and config topics, but it exposes no Connect REST API for managing the connectors directly.

# Single node: start the embedded MM2 process
bin/connect-mirror-maker.sh config/mm2.properties

# Restrict to a specific pair of clusters (useful when mm2.properties
# defines many clusters but a given node should only handle one flow)
bin/connect-mirror-maker.sh \
  --clusters us-east,us-west \
  config/mm2.properties
Mode 2 — Distributed Connect Cluster

For production, deploy MM2 connectors to a dedicated distributed Connect cluster. This provides horizontal scalability, fault tolerance via task redistribution, and a REST management API.

# Start a distributed Connect worker pointing to the MM2 config
bin/connect-distributed.sh config/connect-distributed.properties &

# Deploy the MirrorSourceConnector via the Connect REST API
curl -X POST http://connect-worker:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "us-east-to-us-west-source",
    "config": {
      "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
      "source.cluster.alias": "us-east",
      "target.cluster.alias": "us-west",
      "source.cluster.bootstrap.servers": "kafka-east-1:9092",
      "target.cluster.bootstrap.servers": "kafka-west-1:9092",
      "topics": ".*",
      "replication.factor": "3",
      "sync.topic.configs.enabled": "true",
      "replication.policy.class": "org.apache.kafka.connect.mirror.DefaultReplicationPolicy"
    }
  }'

# Deploy the MirrorCheckpointConnector
curl -X POST http://connect-worker:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "us-east-to-us-west-checkpoint",
    "config": {
      "connector.class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector",
      "source.cluster.alias": "us-east",
      "target.cluster.alias": "us-west",
      "source.cluster.bootstrap.servers": "kafka-east-1:9092",
      "target.cluster.bootstrap.servers": "kafka-west-1:9092",
      "sync.group.offsets.enabled": "true",
      "emit.checkpoints.interval.seconds": "60"
    }
  }'

# Deploy the MirrorHeartbeatConnector
curl -X POST http://connect-worker:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "us-east-to-us-west-heartbeat",
    "config": {
      "connector.class": "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector",
      "source.cluster.alias": "us-east",
      "target.cluster.alias": "us-west",
      "source.cluster.bootstrap.servers": "kafka-east-1:9092",
      "target.cluster.bootstrap.servers": "kafka-west-1:9092",
      "emit.heartbeats.interval.seconds": "5"
    }
  }'
Scaling
The --clusters flag in standalone mode controls which cluster pairs a given MM2 process handles. In large multi-region setups, deploy one MM2 process per region pair rather than one central process handling all flows — this avoids cross-region consumer traffic from the MM2 process itself.

The following issues account for the majority of MM2 support cases in production environments.

1. Topic Name Prefix Confusion in Active/Active

Consumers pointed at the remote cluster often subscribe to orders and silently read nothing, because the replicated data lives under us-east.orders. Subscribe to the prefixed topic explicitly, or use a pattern subscription that matches both the local and replicated names.

2. Consumer Offset Lag When DR Cluster is Behind

Checkpoints trail live traffic by up to emit.checkpoints.interval.seconds, and translation is deliberately conservative, so failed-over consumers typically reprocess a window of records. Design consumers to be idempotent rather than assuming exactly-once failover.

3. Partition Count Mismatch

MM2 creates remote topics with the source's partition count and propagates later increases. If the remote topic was pre-created with a different count, keyed records hash to different partitions on the two clusters and per-key ordering is lost across them. Let MM2 create remote topics, or create them with identical partition counts.

4. Forgetting to Enable the Heartbeat Connector

Without heartbeats there is no cheap end-to-end liveness signal: replication can stall silently until consumers notice missing data. Heartbeats also feed flow-discovery tooling such as RemoteClusterUtils. Leave emit.heartbeats.enabled = true.

5. Missing SSL Config for Cross-DC Communication
5. Missing SSL Config for Cross-DC Communication
# Both the source consumer AND the target producer inside MM2
# need their own SSL settings. Prefix with the cluster alias.
us-east.security.protocol = SSL
us-east.ssl.truststore.location = /etc/kafka/certs/east-truststore.jks
us-east.ssl.truststore.password = changeit
us-east.ssl.keystore.location = /etc/kafka/certs/east-keystore.jks
us-east.ssl.keystore.password = changeit
us-east.ssl.key.password = changeit

us-west.security.protocol = SSL
us-west.ssl.truststore.location = /etc/kafka/certs/west-truststore.jks
us-west.ssl.truststore.password = changeit
us-west.ssl.keystore.location = /etc/kafka/certs/west-keystore.jks
us-west.ssl.keystore.password = changeit
us-west.ssl.key.password = changeit

# SASL/PLAIN example for clusters using authentication
# us-east.security.protocol = SASL_SSL
# us-east.sasl.mechanism = PLAIN
# us-east.sasl.jaas.config = org.apache.kafka.common.security.plain.PlainLoginModule required \
#   username="mm2-user" password="mm2-secret";
6. Replication Factor Exceeds Target Cluster Broker Count

If replication.factor is larger than the number of brokers on the target cluster, remote topic creation fails with an InvalidReplicationFactorException and the flow stalls. Set replication.factor no higher than the broker count of the smallest target cluster.
Always run a full DR drill — stop the source cluster, trigger consumer failover, and verify end-to-end message flow on the DR cluster — before you need it in a real incident. Paper DR plans are not a substitute for a tested runbook.