Introducing a new cross-datacenter replication tool for Apache Kafka
Apache Kafka is the de-facto data streaming platform for high-performance data pipelines, streaming analytics, and mission-critical applications. As an enterprise's business grows, many scenarios require evolving from one Kafka instance to multiple instances. For example, critical services can be migrated to dedicated instances to achieve better performance and isolation and to satisfy Service Level Agreements (SLAs) or Objectives (SLOs).
Another example is Disaster Recovery (DR): the instance in a primary datacenter is continuously mirrored to a backup datacenter. When a disaster takes down the primary instance, applications (or “services”) quickly fail over to the backup datacenter and continue to operate with minimal downtime.
Last but not least, when a business operates across multiple datacenters, data is first routed to the geographically nearest datacenter for locality, then transferred to a central cluster in a remote datacenter, called the “aggregate cluster,” for a holistic and complete view of the data.
Any of the above scenarios demands a tool that replicates data in real time and meets five requirements: (1) fault tolerance and horizontal scalability, (2) low latency and high performance, (3) replication across datacenters, (4) strong message delivery guarantees, and (5) failover and failback that are simple and transparent to applications.
A legacy open-source tool called MirrorMaker can copy data from one Kafka instance to another. However, it has several shortcomings that make it challenging to maintain a low-latency multi-datacenter deployment and to build a transparent failover and failback plan, mostly because:
- There is no clean mechanism to migrate producers or consumers between mirrored Kafka instances: consumer offsets in one instance are meaningless in the other.
- Because it uses the high-level consumer API, rebalancing causes latency spikes, which may in turn trigger further rebalances.
uReplicator from Uber solves some of MirrorMaker's problems, but it depends on Apache Helix, which requires additional domain knowledge and maintenance.
Confluent Replicator is arguably a better solution, but it is proprietary enterprise software.
Instead, we want to promote MirrorMaker 2 (also called MM2), a new Kafka component that replaces the legacy MirrorMaker and satisfies all five requirements above for replicating data between Kafka instances across datacenters.
In the following, we will discuss three major practical use cases of MM2:
Migrate to a New Kafka Instance
As workload grows over time, running everything on one Kafka instance eventually exposes the following risks:
- any turbulence in the Kafka instance impacts all services and applications
- resource contention: services compete for shared resources
- unpredictable SLOs: one service can consume an unbounded amount of resources, starving other services and causing them to miss their SLOs
- slower recovery and maintenance: rebalancing the data partitions in a Kafka instance becomes slower as data volume and workload grow
- no “one size fits all”: one set of configurations cannot satisfy the conflicting expectations of different services (e.g. stability over performance, or performance over consistency)
- any maintenance of the Kafka instance (e.g. upgrade, node swap) requires coordination with all engineering teams
To mitigate the above risks, critical services can be moved to dedicated Kafka instances. For migrating from one Kafka instance to another, AWS provides a tutorial for its managed Kafka service, which generalizes to open-source Apache Kafka.
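As a concrete illustration, such a one-way migration can be sketched as a minimal MM2 properties file. The cluster aliases (`old`, `new`) and bootstrap addresses below are hypothetical placeholders; adjust them to your environment.

```properties
# Hypothetical cluster aliases and bootstrap addresses
clusters = old, new
old.bootstrap.servers = old-kafka.example.com:9092
new.bootstrap.servers = new-kafka.example.com:9092

# Replicate all topics one way, from the old instance to the new one
old->new.enabled = true
old->new.topics = .*
new->old.enabled = false

# Translate and sync consumer group offsets (Kafka 2.7+), so consumers
# can resume from the correct position after switching to the new instance
old->new.sync.group.offsets.enabled = true
```

Saved as, say, `mm2.properties`, this can be launched with the `connect-mirror-maker.sh` script shipped in the Kafka distribution: `./bin/connect-mirror-maker.sh mm2.properties`.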
Disaster Recovery with a Backup Kafka Instance
Though data in a Kafka instance typically has three replicas spread across brokers, the whole instance can still become unavailable: all brokers may sit in a single region that suddenly goes offline, or a rack outage may take down a majority of brokers. To achieve higher availability, it is therefore important to set up a backup Kafka instance and continuously replicate from the primary to the backup. When the primary is not available, all services are routed to the backup.
The simplest approach is to not let producers send new data to the backup while the primary is down. More realistically, producers are redirected and continue producing new data to the backup. When the primary is restored from the disaster with its data intact, only the new data generated during the outage needs to be mirrored back from the backup to the primary by MM2.
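The failback described above amounts to enabling a reverse replication flow once the primary is healthy again. A sketch with hypothetical aliases (`primary`, `backup`) and addresses:

```properties
# Hypothetical aliases; in normal operation only primary->backup runs
clusters = primary, backup
primary.bootstrap.servers = dc1-kafka.example.com:9092
backup.bootstrap.servers = dc2-kafka.example.com:9092

# Normal operation: continuously mirror the primary to the backup
primary->backup.enabled = true
primary->backup.topics = .*

# Failback: once the primary is restored, enable the reverse flow so the
# data produced during the outage is mirrored back to the primary
backup->primary.enabled = true
backup->primary.topics = .*
```

Note that MM2's default replication policy prefixes mirrored topics with the source cluster alias, so enabling both directions does not create an infinite replication loop.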
Active-Active Replication across Multi-Datacenters
In this active-active design, one MM2 instance copies data from origin DC-1 to destination DC-2, and another MM2 instance copies data from origin DC-2 to destination DC-1.
“Producer 1" writes to “Topic 1” in their local DC-1 and “Producer 2” writes to “Topic 2” in their local DC-2.
“Consumer 1” can read data from “Topic 1” that is produced by “Producer 1” in DC-1, and also read data from “Topic 2 mirrored” that is originally produced by “Producer 2” in DC-2 and then replicated to DC-1. Vice versa.
In the event of a disaster that brings down DC-1, “Producer 2” and “Consumer 2” in DC-2 can continue operating. If the DC-1 outage is short and the data produced to “Topic 1” is not critical, it is not always necessary to aggressively fail “Producer 1” over to DC-2, since DC-2 is still operating. When DC-1 recovers, the two MM2 instances catch up and continue replicating data across the datacenters.
From the application’s point of view, an active-active deployment increases both availability and performance: “Consumer 1” and “Consumer 2” receive the same data (though not necessarily in the same order) almost in real time, and one datacenter can fail completely without impacting data consumption at the other.
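The bidirectional setup above can be sketched as a single MM2 configuration with two enabled flows. The aliases (`dc1`, `dc2`) and addresses are again hypothetical:

```properties
# Hypothetical aliases for the two datacenters
clusters = dc1, dc2
dc1.bootstrap.servers = dc1-kafka.example.com:9092
dc2.bootstrap.servers = dc2-kafka.example.com:9092

# Bidirectional flows; the default replication policy prefixes remote
# topics with the source alias, so "Topic 2" from DC-2 appears in DC-1
# as "dc2.Topic 2" (this prefixing is what prevents replication loops)
dc1->dc2.enabled = true
dc1->dc2.topics = .*
dc2->dc1.enabled = true
dc2->dc1.topics = .*
```

With this naming scheme, “Consumer 1” in DC-1 subscribes to both the local topic and its prefixed mirror to see the complete stream from both datacenters.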
In the next few blogs, we will introduce several follow-up topics, including:
- how to set up MM2 in 5 minutes for a proof of concept (PoC)
- exactly-once message delivery guarantee across datacenters
- tools to migrate from existing mirroring solutions to MM2
Please stay tuned for more articles!
A Fault-tolerant Kafka Replication Tool across Multiple Datacenters that scales was originally published in Towards Data Science on Medium.