Mirroring data between Kafka clusters



We refer to the process of replicating data between Kafka
clusters as "mirroring" to avoid confusion with the replication that
happens amongst the nodes in a single cluster. Kafka comes with a tool
for mirroring data between Kafka clusters. The tool reads from one or
more source clusters and writes to a destination cluster.






A common use case for this kind of mirroring is to provide a replica in
another datacenter. This scenario will be discussed in more detail in
the next section.



You can run many such mirroring processes to increase throughput and for
fault-tolerance (if one process dies, the others will take over the
additional load).
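
As a rough sketch of running several mirroring processes, the script below starts three identical MirrorMaker instances. It assumes that consumer.properties sets a single shared group.id, so that the instances join one consumer group and divide the source partitions among themselves. The RUNNER variable, the instance count, and the file names are illustrative assumptions; by default the script only prints the command lines instead of executing them.

```shell
#!/bin/sh
# Print (dry run) or start several identical MirrorMaker processes.
# Assumption: consumer.properties sets one shared group.id, so the
# instances join the same consumer group and share the partitions.
# RUNNER is a hypothetical switch: "echo" prints the command lines,
# an empty value (RUNNER="") executes them.
RUNNER=${RUNNER-echo}
INSTANCES=3

start_mirror_makers() {
  i=1
  while [ "$i" -le "$INSTANCES" ]; do
    $RUNNER bin/kafka-run-class.sh kafka.tools.MirrorMaker \
      --consumer.config consumer.properties \
      --producer.config producer.properties \
      --whitelist my-topic &
    i=$((i + 1))
  done
  wait  # block until every background process has exited
}

start_mirror_makers
```

In practice each instance would more likely run under a process supervisor or on a separate host, so that a crashed instance is restarted while the remaining ones carry its share of the load.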








Data will be read from topics in the source cluster and written to a
topic with the same name in the destination cluster. In fact the mirror
maker is little more than a Kafka consumer and producer hooked together.




The source and destination clusters are completely independent entities:
they can have different numbers of partitions and the offsets will not
be the same. For this reason the mirror cluster is not really intended
as a fault-tolerance mechanism (as the consumer position will be
different); for that we recommend using normal in-cluster replication.
The mirror maker process will, however, retain and use the message key
for partitioning so order is preserved on a per-key basis.



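To illustrate why per-key order survives even when the partition counts differ, the sketch below maps keys to partitions by hashing the key and taking it modulo the partition count. The cksum hash is only a stand-in chosen for illustration (it is not the hash Kafka's producer uses), and the key names and partition counts are made up. The point is that a given key always lands in one partition of the destination, although that partition number need not match the one in the source.

```shell
#!/bin/sh
# Stand-in for key-based partitioning: hash the key, then take it modulo
# the partition count. cksum is used for illustration only; it is not
# the hash function used by Kafka's producer.
partition_for() {
  key=$1
  partitions=$2
  hash=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)
  echo $((hash % partitions))
}

# Hypothetical setup: the source topic has 3 partitions, the mirror has 5.
for key in user-1 user-2 user-1; do
  src=$(partition_for "$key" 3)
  dst=$(partition_for "$key" 5)
  echo "$key -> source partition $src, destination partition $dst"
done
```

Both lines for user-1 show the same destination partition, so messages with that key stay in their original relative order there.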








Here is an example showing how to mirror a single topic (named my-topic) from two input clusters:





 > bin/kafka-run-class.sh kafka.tools.MirrorMaker \
       --consumer.config consumer-1.properties --consumer.config consumer-2.properties \
       --producer.config producer.properties --whitelist my-topic


Note that we specify the list of topics with the --whitelist option. This option accepts any Java-style regular expression, so you could mirror two topics named A and B using --whitelist 'A|B', or mirror all topics using --whitelist '*'.
Make sure to quote any regular expression so that the shell doesn't
try to expand it as a file path. For convenience we allow the use of ','
instead of '|' to specify a list of topics.
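
The whitelist behaviour can be approximated with grep, as a minimal sketch. It assumes that a topic is mirrored only when its whole name matches the expression and that ',' is simply treated as '|'. grep's extended regular expressions are not identical to Java's, so this is an approximation that holds for simple alternations like the ones above; the function name and topic names are made up.

```shell
#!/bin/sh
# Emulate the whitelist check with grep. Assumptions: a topic qualifies
# only if its whole name matches, and ',' is treated like '|'.
matches_whitelist() {
  whitelist=$1
  regex=$(printf '%s' "$whitelist" | tr ',' '|')
  # -E: extended regex, -x: the pattern must match the entire line
  grep -E -x -- "$regex"
}

# The single quotes keep the shell from treating '|' as a pipe.
printf '%s\n' A B AB C | matches_whitelist 'A|B'   # prints A and B
printf '%s\n' A B AB C | matches_whitelist 'A,B'   # prints A and B again
```

The topic AB is not selected because the expression has to match the full topic name, not just a part of it.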








Sometimes it is easier to say what you don't want. Instead of using --whitelist to say what you want to mirror, you can use --blacklist to say what to exclude. This option also takes a regular expression argument. Note that --blacklist is currently not supported together with --new.consumer.
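
The effect of a blacklist can be sketched the same way, again using grep as a stand-in for the tool's own matching. The topic names and the pattern are invented for illustration, and the sketch assumes the expression must match a topic's full name for that topic to be excluded.

```shell
#!/bin/sh
# Emulate a blacklist: keep every topic whose full name does NOT match.
# Topic names and pattern are made up for illustration.
filter_blacklist() {
  # -v inverts the match, -x anchors the pattern to the whole line
  grep -E -v -x -- "$1"
}

printf '%s\n' orders payments test-1 test-2 | filter_blacklist 'test-.*'
# prints orders and payments
```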







Combining mirroring with the configuration auto.create.topics.enable=true makes it possible to have a replica cluster that will automatically
create and replicate all data in a source cluster even as new topics are
added.


