We are working on a replacement for our existing producer. The code is
available in trunk now and can be considered beta quality. Below is the
configuration for the new producer.
Name | Type | Default | Importance | Description |
bootstrap.servers | list | | high | A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. Data will be load balanced over all servers irrespective of which servers are specified here for bootstrapping; this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,.... Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down). If no server in this list is available, sending data will fail until one becomes available. |
acks | string | 1 | high | The number of acknowledgments the producer requires the leader to have received before considering a request complete. This controls the durability of records that are sent. The following settings are common:
buffer.memory | long | 33554432 | high | The total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server, the producer will either block or throw an exception based on the preference specified by block.on.buffer.full. |
compression.type | string | none | high | The compression type for all data generated by the producer. The default is none (i.e. no compression). Valid values are none, gzip, or snappy. Compression is applied to full batches of data, so the efficacy of batching will also impact the compression ratio (more batching means better compression). |
retries | int | 0 | high | Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries will potentially change the ordering of records because if two records are sent to a single partition, and the first fails and is retried but the second succeeds, then the second record may appear first. |
batch.size | int | 16384 | medium | The producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition. This helps performance on both the client and the server. This configuration controls the default batch size in bytes.
client.id | string | | medium | The id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included with the request. The application can set any string it wants, as this has no functional purpose other than in logging and metrics. |
linger.ms | long | 0 | medium |
max.request.size | int | 1048576 | medium | The maximum size of a request. This is also effectively a cap on the maximum record size. Note that the server has its own cap on record size which may be different from this. This setting will limit the number of record batches the producer will send in a single request to avoid sending huge requests. |
receive.buffer.bytes | int | 32768 | medium | The size of the TCP receive buffer to use when reading data. |
send.buffer.bytes | int | 131072 | medium | The size of the TCP send buffer to use when sending data. |
timeout.ms | int | 30000 | medium | This configuration controls the maximum amount of time the server will wait for acknowledgments from followers to meet the acknowledgment requirements the producer has specified with the acks configuration. If the requested number of acknowledgments is not met when the timeout elapses, an error will be returned. This timeout is measured on the server side and does not include the network latency of the request. |
block.on.buffer.full | boolean | true | low | When our memory buffer is exhausted we must either stop accepting new records (block) or throw errors. By default this setting is true and we block; however, in some scenarios blocking is not desirable and it is better to immediately give an error. Setting this to false will accomplish that: the producer will throw a BufferExhaustedException if a record is sent and the buffer space is full. |
metadata.fetch.timeout.ms | long | 60000 | low | The first time data is sent to a topic we must fetch metadata about that topic to know which servers host the topic's partitions. This configuration controls the maximum amount of time we will block waiting for the metadata fetch to succeed before throwing an exception back to the client. |
metadata.max.age.ms | long | 300000 | low | The period of time in milliseconds after which we force a refresh of metadata, even if we haven't seen any partition leadership changes, to proactively discover any new brokers or partitions. |
metric.reporters | list | [] | low | A list of classes to use as metrics reporters. Implementing the MetricReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics. |
metrics.num.samples | int | 2 | low | The number of samples maintained to compute metrics. |
metrics.sample.window.ms | long | 30000 | low | The metrics system maintains a configurable number of samples over a fixed window size. This configuration controls the size of the window. For example, we might maintain two samples, each measured over a 30-second period. When a window expires we erase and overwrite the oldest window. |
reconnect.backoff.ms | long | 10 | low | The amount of time to wait before attempting to reconnect to a given host when a connection fails. This avoids a scenario where the client repeatedly attempts to connect to a host in a tight loop. |
retry.backoff.ms | long | 100 | low | The amount of time to wait before attempting to retry a failed produce request to a given topic partition. This avoids repeated sending-and-failing in a tight loop. |
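
To make the settings above more concrete, here is a minimal sketch of how a handful of them might be collected and written out as key=value lines, the form commonly used for Java properties files. It is only an illustration under stated assumptions: the helper render_properties is hypothetical and not part of the producer, the host names and port 9092 are placeholders, and gzip and false are chosen simply to show non-default values. All other values are the defaults listed in the table.

```python
# Hypothetical helper (not part of the producer): renders a dict of settings
# from the table above as key=value lines.

def render_properties(config):
    """Return one 'key=value' line per setting, sorted by key."""
    lines = []
    for key in sorted(config):
        value = config[key]
        if isinstance(value, bool):
            # Booleans are written in lowercase, as in the table.
            value = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            # List-typed settings use the comma-separated form.
            value = ",".join(value)
        lines.append(f"{key}={value}")
    return "\n".join(lines)


producer_config = {
    # Placeholder hosts and port; more than one entry in case a server is down.
    "bootstrap.servers": ["host1:9092", "host2:9092"],
    "acks": "1",                    # table default
    "buffer.memory": 33554432,      # table default, in bytes
    "compression.type": "gzip",     # one of none, gzip, snappy
    "retries": 0,                   # table default; > 0 may reorder records
    "batch.size": 16384,            # table default, in bytes
    "block.on.buffer.full": False,  # throw BufferExhaustedException instead of blocking
}

properties_text = render_properties(producer_config)
print(properties_text)
```

Keeping the settings in a single mapping makes it easy to see which values deviate from the defaults before they are handed to a client.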