This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Configuration

1 - Configuration

Kafka uses key-value pairs in the property file format for configuration. These values can be supplied either from a file or programmatically.

Broker Configs

The essential configurations are the following:

  • broker.id
  • log.dirs
  • zookeeper.connect Topic-level configurations and defaults are discussed in more detail below.
    NameDescriptionTypeDefaultValid ValuesImportance
    zookeeper.connectZookeeper host stringstringhigh
    advertised.host.nameDEPRECATED: only used when `advertised.listeners` or `listeners` are not set. Use `advertised.listeners` instead. Hostname to publish to ZooKeeper for clients to use. In IaaS environments, this may need to be different from the interface to which the broker binds. If this is not set, it will use the value for `host.name` if configured. Otherwise it will use the value returned from java.net.InetAddress.getCanonicalHostName().stringnullhigh
    advertised.listenersListeners to publish to ZooKeeper for clients to use, if different than the `listeners` config property. In IaaS environments, this may need to be different from the interface to which the broker binds. If this is not set, the value for `listeners` will be used. Unlike `listeners` it is not valid to advertise the 0.0.0.0 meta-address.stringnullhigh
    advertised.portDEPRECATED: only used when `advertised.listeners` or `listeners` are not set. Use `advertised.listeners` instead. The port to publish to ZooKeeper for clients to use. In IaaS environments, this may need to be different from the port to which the broker binds. If this is not set, it will publish the same port that the broker binds to.intnullhigh
    auto.create.topics.enableEnable auto creation of topic on the serverbooleantruehigh
    auto.leader.rebalance.enableEnables auto leader balancing. A background thread checks and triggers leader balance if required at regular intervalsbooleantruehigh
    background.threadsThe number of threads to use for various background processing tasksint10[1,...]high
    broker.idThe broker id for this server. If unset, a unique broker id will be generated.To avoid conflicts between zookeeper generated broker id's and user configured broker id's, generated broker ids start from reserved.broker.max.id + 1.int-1high
    compression.typeSpecify the final compression type for a given topic. This configuration accepts the standard compression codecs ('gzip', 'snappy', 'lz4'). It additionally accepts 'uncompressed' which is equivalent to no compression; and 'producer' which means retain the original compression codec set by the producer.stringproducerhigh
    delete.topic.enableEnables delete topic. Delete topic through the admin tool will have no effect if this config is turned offbooleantruehigh
    host.nameDEPRECATED: only used when `listeners` is not set. Use `listeners` instead. hostname of broker. If this is set, it will only bind to this address. If this is not set, it will bind to all interfacesstring""high
    leader.imbalance.check.interval.secondsThe frequency with which the partition rebalance check is triggered by the controllerlong300high
    leader.imbalance.per.broker.percentageThe ratio of leader imbalance allowed per broker. The controller would trigger a leader balance if it goes above this value per broker. The value is specified in percentage.int10high
    listenersListener List - Comma-separated list of URIs we will listen on and the listener names. If the listener name is not a security protocol, listener.security.protocol.map must also be set. Specify hostname as 0.0.0.0 to bind to all interfaces. Leave hostname empty to bind to default interface. Examples of legal listener lists: PLAINTEXT://myhost:9092,SSL://:9091 CLIENT://0.0.0.0:9092,REPLICATION://localhost:9093stringnullhigh
    log.dirThe directory in which the log data is kept (supplemental for log.dirs property)string/tmp/kafka-logshigh
    log.dirsThe directories in which the log data is kept. If not set, the value in log.dir is usedstringnullhigh
    log.flush.interval.messagesThe number of messages accumulated on a log partition before messages are flushed to disklong9223372036854775807[1,...]high
    log.flush.interval.msThe maximum time in ms that a message in any topic is kept in memory before flushed to disk. If not set, the value in log.flush.scheduler.interval.ms is usedlongnullhigh
    log.flush.offset.checkpoint.interval.msThe frequency with which we update the persistent record of the last flush which acts as the log recovery pointint60000[0,...]high
    log.flush.scheduler.interval.msThe frequency in ms that the log flusher checks whether any log needs to be flushed to disklong9223372036854775807high
    log.flush.start.offset.checkpoint.interval.msThe frequency with which we update the persistent record of log start offsetint60000[0,...]high
    log.retention.bytesThe maximum size of the log before deleting itlong-1high
    log.retention.hoursThe number of hours to keep a log file before deleting it (in hours), tertiary to log.retention.ms propertyint168high
    log.retention.minutesThe number of minutes to keep a log file before deleting it (in minutes), secondary to log.retention.ms property. If not set, the value in log.retention.hours is usedintnullhigh
    log.retention.msThe number of milliseconds to keep a log file before deleting it (in milliseconds), If not set, the value in log.retention.minutes is usedlongnullhigh
    log.roll.hoursThe maximum time before a new log segment is rolled out (in hours), secondary to log.roll.ms propertyint168[1,...]high
    log.roll.jitter.hoursThe maximum jitter to subtract from logRollTimeMillis (in hours), secondary to log.roll.jitter.ms propertyint0[0,...]high
    log.roll.jitter.msThe maximum jitter to subtract from logRollTimeMillis (in milliseconds). If not set, the value in log.roll.jitter.hours is usedlongnullhigh
    log.roll.msThe maximum time before a new log segment is rolled out (in milliseconds). If not set, the value in log.roll.hours is usedlongnullhigh
    log.segment.bytesThe maximum size of a single log fileint1073741824[14,...]high
    log.segment.delete.delay.msThe amount of time to wait before deleting a file from the filesystemlong60000[0,...]high
    message.max.bytes

    The largest record batch size allowed by Kafka. If this is increased and there are consumers older than 0.10.2, the consumers' fetch size must also be increased so that the they can fetch record batches this large.

    In the latest message format version, records are always grouped into batches for efficiency. In previous message format versions, uncompressed records are not grouped into batches and this limit only applies to a single record in that case.

    This can be set per topic with the topic level max.message.bytes config.

    int1000012[0,...]high
    min.insync.replicasWhen a producer sets acks to "all" (or "-1"), min.insync.replicas specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful. If this minimum cannot be met, then the producer will raise an exception (either NotEnoughReplicas or NotEnoughReplicasAfterAppend).
    When used together, min.insync.replicas and acks allow you to enforce greater durability guarantees. A typical scenario would be to create a topic with a replication factor of 3, set min.insync.replicas to 2, and produce with acks of "all". This will ensure that the producer raises an exception if a majority of replicas do not receive a write.
    int1[1,...]high
    num.io.threadsThe number of threads that the server uses for processing requests, which may include disk I/Oint8[1,...]high
    num.network.threadsThe number of threads that the server uses for receiving requests from the network and sending responses to the networkint3[1,...]high
    num.recovery.threads.per.data.dirThe number of threads per data directory to be used for log recovery at startup and flushing at shutdownint1[1,...]high
    num.replica.fetchersNumber of fetcher threads used to replicate messages from a source broker. Increasing this value can increase the degree of I/O parallelism in the follower broker.int1high
    offset.metadata.max.bytesThe maximum size for a metadata entry associated with an offset commitint4096high
    offsets.commit.required.acksThe required acks before the commit can be accepted. In general, the default (-1) should not be overriddenshort-1high
    offsets.commit.timeout.msOffset commit will be delayed until all replicas for the offsets topic receive the commit or this timeout is reached. This is similar to the producer request timeout.int5000[1,...]high
    offsets.load.buffer.sizeBatch size for reading from the offsets segments when loading offsets into the cache.int5242880[1,...]high
    offsets.retention.check.interval.msFrequency at which to check for stale offsetslong600000[1,...]high
    offsets.retention.minutesOffsets older than this retention period will be discardedint1440[1,...]high
    offsets.topic.compression.codecCompression codec for the offsets topic - compression may be used to achieve "atomic" commitsint0high
    offsets.topic.num.partitionsThe number of partitions for the offset commit topic (should not change after deployment)int50[1,...]high
    offsets.topic.replication.factorThe replication factor for the offsets topic (set higher to ensure availability). Internal topic creation will fail until the cluster size meets this replication factor requirement.short3[1,...]high
    offsets.topic.segment.bytesThe offsets topic segment bytes should be kept relatively small in order to facilitate faster log compaction and cache loadsint104857600[1,...]high
    portDEPRECATED: only used when `listeners` is not set. Use `listeners` instead. the port to listen and accept connections onint9092high
    queued.max.requestsThe number of queued requests allowed before blocking the network threadsint500[1,...]high
    quota.consumer.defaultDEPRECATED: Used only when dynamic default quotas are not configured for or in Zookeeper. Any consumer distinguished by clientId/consumer group will get throttled if it fetches more bytes than this value per-secondlong9223372036854775807[1,...]high
    quota.producer.defaultDEPRECATED: Used only when dynamic default quotas are not configured for , or in Zookeeper. Any producer distinguished by clientId will get throttled if it produces more bytes than this value per-secondlong9223372036854775807[1,...]high
    replica.fetch.min.bytesMinimum bytes expected for each fetch response. If not enough bytes, wait up to replicaMaxWaitTimeMsint1high
    replica.fetch.wait.max.msmax wait time for each fetcher request issued by follower replicas. This value should always be less than the replica.lag.time.max.ms at all times to prevent frequent shrinking of ISR for low throughput topicsint500high
    replica.high.watermark.checkpoint.interval.msThe frequency with which the high watermark is saved out to disklong5000high
    replica.lag.time.max.msIf a follower hasn't sent any fetch requests or hasn't consumed up to the leaders log end offset for at least this time, the leader will remove the follower from isrlong10000high
    replica.socket.receive.buffer.bytesThe socket receive buffer for network requestsint65536high
    replica.socket.timeout.msThe socket timeout for network requests. Its value should be at least replica.fetch.wait.max.msint30000high
    request.timeout.msThe configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted.int30000high
    socket.receive.buffer.bytesThe SO_RCVBUF buffer of the socket sever sockets. If the value is -1, the OS default will be used.int102400high
    socket.request.max.bytesThe maximum number of bytes in a socket requestint104857600[1,...]high
    socket.send.buffer.bytesThe SO_SNDBUF buffer of the socket sever sockets. If the value is -1, the OS default will be used.int102400high
    transaction.max.timeout.msThe maximum allowed timeout for transactions. If a client’s requested transaction time exceed this, then the broker will return an error in InitProducerIdRequest. This prevents a client from too large of a timeout, which can stall consumers reading from topics included in the transaction.int900000[1,...]high
    transaction.state.log.load.buffer.sizeBatch size for reading from the transaction log segments when loading producer ids and transactions into the cache.int5242880[1,...]high
    transaction.state.log.min.isrOverridden min.insync.replicas config for the transaction topic.int2[1,...]high
    transaction.state.log.num.partitionsThe number of partitions for the transaction topic (should not change after deployment).int50[1,...]high
    transaction.state.log.replication.factorThe replication factor for the transaction topic (set higher to ensure availability). Internal topic creation will fail until the cluster size meets this replication factor requirement.short3[1,...]high
    transaction.state.log.segment.bytesThe transaction topic segment bytes should be kept relatively small in order to facilitate faster log compaction and cache loadsint104857600[1,...]high
    transactional.id.expiration.msThe maximum amount of time in ms that the transaction coordinator will wait before proactively expire a producer's transactional id without receiving any transaction status updates from it.int604800000[1,...]high
    unclean.leader.election.enableIndicates whether to enable replicas not in the ISR set to be elected as leader as a last resort, even though doing so may result in data lossbooleanfalsehigh
    zookeeper.connection.timeout.msThe max time that the client waits to establish a connection to zookeeper. If not set, the value in zookeeper.session.timeout.ms is usedintnullhigh
    zookeeper.session.timeout.msZookeeper session timeoutint6000high
    zookeeper.set.aclSet client to use secure ACLsbooleanfalsehigh
    broker.id.generation.enableEnable automatic broker id generation on the server. When enabled the value configured for reserved.broker.max.id should be reviewed.booleantruemedium
    broker.rackRack of the broker. This will be used in rack aware replication assignment for fault tolerance. Examples: `RACK1`, `us-east-1d`stringnullmedium
    connections.max.idle.msIdle connections timeout: the server socket processor threads close the connections that idle more than thislong600000medium
    controlled.shutdown.enableEnable controlled shutdown of the serverbooleantruemedium
    controlled.shutdown.max.retriesControlled shutdown can fail for multiple reasons. This determines the number of retries when such failure happensint3medium
    controlled.shutdown.retry.backoff.msBefore each retry, the system needs time to recover from the state that caused the previous failure (Controller fail over, replica lag etc). This config determines the amount of time to wait before retrying.long5000medium
    controller.socket.timeout.msThe socket timeout for controller-to-broker channelsint30000medium
    default.replication.factordefault replication factors for automatically created topicsint1medium
    delete.records.purgatory.purge.interval.requestsThe purge interval (in number of requests) of the delete records request purgatoryint1medium
    fetch.purgatory.purge.interval.requestsThe purge interval (in number of requests) of the fetch request purgatoryint1000medium
    group.initial.rebalance.delay.msThe amount of time the group coordinator will wait for more consumers to join a new group before performing the first rebalance. A longer delay means potentially fewer rebalances, but increases the time until processing begins.int3000medium
    group.max.session.timeout.msThe maximum allowed session timeout for registered consumers. Longer timeouts give consumers more time to process messages in between heartbeats at the cost of a longer time to detect failures.int300000medium
    group.min.session.timeout.msThe minimum allowed session timeout for registered consumers. Shorter timeouts result in quicker failure detection at the cost of more frequent consumer heartbeating, which can overwhelm broker resources.int6000medium
    inter.broker.listener.nameName of listener used for communication between brokers. If this is unset, the listener name is defined by security.inter.broker.protocol. It is an error to set this and security.inter.broker.protocol properties at the same time.stringnullmedium
    inter.broker.protocol.versionSpecify which version of the inter-broker protocol will be used. This is typically bumped after all brokers were upgraded to a new version. Example of some valid values are: 0.8.0, 0.8.1, 0.8.1.1, 0.8.2, 0.8.2.0, 0.8.2.1, 0.9.0.0, 0.9.0.1 Check ApiVersion for the full list.string1.0-IV0medium
    log.cleaner.backoff.msThe amount of time to sleep when there are no logs to cleanlong15000[0,...]medium
    log.cleaner.dedupe.buffer.sizeThe total memory used for log deduplication across all cleaner threadslong134217728medium
    log.cleaner.delete.retention.msHow long are delete records retained?long86400000medium
    log.cleaner.enableEnable the log cleaner process to run on the server. Should be enabled if using any topics with a cleanup.policy=compact including the internal offsets topic. If disabled those topics will not be compacted and continually grow in size.booleantruemedium
    log.cleaner.io.buffer.load.factorLog cleaner dedupe buffer load factor. The percentage full the dedupe buffer can become. A higher value will allow more log to be cleaned at once but will lead to more hash collisionsdouble0.9medium
    log.cleaner.io.buffer.sizeThe total memory used for log cleaner I/O buffers across all cleaner threadsint524288[0,...]medium
    log.cleaner.io.max.bytes.per.secondThe log cleaner will be throttled so that the sum of its read and write i/o will be less than this value on averagedouble1.7976931348623157E308medium
    log.cleaner.min.cleanable.ratioThe minimum ratio of dirty log to total log for a log to eligible for cleaningdouble0.5medium
    log.cleaner.min.compaction.lag.msThe minimum time a message will remain uncompacted in the log. Only applicable for logs that are being compacted.long0medium
    log.cleaner.threadsThe number of background threads to use for log cleaningint1[0,...]medium
    log.cleanup.policyThe default cleanup policy for segments beyond the retention window. A comma separated list of valid policies. Valid policies are: "delete" and "compact"listdelete[compact, delete]medium
    log.index.interval.bytesThe interval with which we add an entry to the offset indexint4096[0,...]medium
    log.index.size.max.bytesThe maximum size in bytes of the offset indexint10485760[4,...]medium
    log.message.format.versionSpecify the message format version the broker will use to append messages to the logs. The value should be a valid ApiVersion. Some examples are: 0.8.2, 0.9.0.0, 0.10.0, check ApiVersion for more details. By setting a particular message format version, the user is certifying that all the existing messages on disk are smaller or equal than the specified version. Setting this value incorrectly will cause consumers with older versions to break as they will receive messages with a format that they don't understand.string1.0-IV0medium
    log.message.timestamp.difference.max.msThe maximum difference allowed between the timestamp when a broker receives a message and the timestamp specified in the message. If log.message.timestamp.type=CreateTime, a message will be rejected if the difference in timestamp exceeds this threshold. This configuration is ignored if log.message.timestamp.type=LogAppendTime.The maximum timestamp difference allowed should be no greater than log.retention.ms to avoid unnecessarily frequent log rolling.long9223372036854775807medium
    log.message.timestamp.typeDefine whether the timestamp in the message is message create time or log append time. The value should be either `CreateTime` or `LogAppendTime`stringCreateTime[CreateTime, LogAppendTime]medium
    log.preallocateShould pre allocate file when create new segment? If you are using Kafka on Windows, you probably need to set it to true.booleanfalsemedium
    log.retention.check.interval.msThe frequency in milliseconds that the log cleaner checks whether any log is eligible for deletionlong300000[1,...]medium
    max.connections.per.ipThe maximum number of connections we allow from each ip addressint2147483647[1,...]medium
    max.connections.per.ip.overridesPer-ip or hostname overrides to the default maximum number of connectionsstring""medium
    num.partitionsThe default number of log partitions per topicint1[1,...]medium
    principal.builder.classThe fully qualified name of a class that implements the KafkaPrincipalBuilder interface, which is used to build the KafkaPrincipal object used during authorization. This config also supports the deprecated PrincipalBuilder interface which was previously used for client authentication over SSL. If no principal builder is defined, the default behavior depends on the security protocol in use. For SSL authentication, the principal name will be the distinguished name from the client certificate if one is provided; otherwise, if client authentication is not required, the principal name will be ANONYMOUS. For SASL authentication, the principal will be derived using the rules defined by sasl.kerberos.principal.to.local.rules if GSSAPI is in use, and the SASL authentication ID for other mechanisms. For PLAINTEXT, the principal will be ANONYMOUS.classnullmedium
    producer.purgatory.purge.interval.requestsThe purge interval (in number of requests) of the producer request purgatoryint1000medium
    queued.max.request.bytesThe number of queued bytes allowed before no more requests are readlong-1medium
    replica.fetch.backoff.msThe amount of time to sleep when fetch partition error occurs.int1000[0,...]medium
    replica.fetch.max.bytesThe number of bytes of messages to attempt to fetch for each partition. This is not an absolute maximum, if the first record batch in the first non-empty partition of the fetch is larger than this value, the record batch will still be returned to ensure that progress can be made. The maximum record batch size accepted by the broker is defined via message.max.bytes (broker config) or max.message.bytes (topic config).int1048576[0,...]medium
    replica.fetch.response.max.bytesMaximum bytes expected for the entire fetch response. Records are fetched in batches, and if the first record batch in the first non-empty partition of the fetch is larger than this value, the record batch will still be returned to ensure that progress can be made. As such, this is not an absolute maximum. The maximum record batch size accepted by the broker is defined via message.max.bytes (broker config) or max.message.bytes (topic config).int10485760[0,...]medium
    reserved.broker.max.idMax number that can be used for a broker.idint1000[0,...]medium
    sasl.enabled.mechanismsThe list of SASL mechanisms enabled in the Kafka server. The list may contain any mechanism for which a security provider is available. Only GSSAPI is enabled by default.listGSSAPImedium
    sasl.kerberos.kinit.cmdKerberos kinit command path.string/usr/bin/kinitmedium
    sasl.kerberos.min.time.before.reloginLogin thread sleep time between refresh attempts.long60000medium
    sasl.kerberos.principal.to.local.rulesA list of rules for mapping from principal names to short names (typically operating system usernames). The rules are evaluated in order and the first rule that matches a principal name is used to map it to a short name. Any later rules in the list are ignored. By default, principal names of the form {username}/{hostname}@{REALM} are mapped to {username}. For more details on the format please see security authorization and acls. Note that this configuration is ignored if an extension of KafkaPrincipalBuilder is provided by the principal.builder.class configuration.listDEFAULTmedium
    sasl.kerberos.service.nameThe Kerberos principal name that Kafka runs as. This can be defined either in Kafka's JAAS config or in Kafka's config.stringnullmedium
    sasl.kerberos.ticket.renew.jitterPercentage of random jitter added to the renewal time.double0.05medium
    sasl.kerberos.ticket.renew.window.factorLogin thread will sleep until the specified window factor of time from last refresh to ticket's expiry has been reached, at which time it will try to renew the ticket.double0.8medium
    sasl.mechanism.inter.broker.protocolSASL mechanism used for inter-broker communication. Default is GSSAPI.stringGSSAPImedium
    security.inter.broker.protocolSecurity protocol used to communicate between brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL. It is an error to set this and inter.broker.listener.name properties at the same time.stringPLAINTEXTmedium
    ssl.cipher.suitesA list of cipher suites. This is a named combination of authentication, encryption, MAC and key exchange algorithm used to negotiate the security settings for a network connection using TLS or SSL network protocol. By default all the available cipher suites are supported.listnullmedium
    ssl.client.authConfigures kafka broker to request client authentication. The following settings are common:
    • ssl.client.auth=required If set to required client authentication is required.
    • ssl.client.auth=requested This means client authentication is optional. unlike requested , if this option is set client can choose not to provide authentication information about itself
    • ssl.client.auth=none This means client authentication is not needed.
    stringnone[required, requested, none]medium
    ssl.enabled.protocolsThe list of protocols enabled for SSL connections.listTLSv1.2,TLSv1.1,TLSv1medium
    ssl.key.passwordThe password of the private key in the key store file. This is optional for client.passwordnullmedium
    ssl.keymanager.algorithmThe algorithm used by key manager factory for SSL connections. Default value is the key manager factory algorithm configured for the Java Virtual Machine.stringSunX509medium
    ssl.keystore.locationThe location of the key store file. This is optional for client and can be used for two-way authentication for client.stringnullmedium
    ssl.keystore.passwordThe store password for the key store file. This is optional for client and only needed if ssl.keystore.location is configured.passwordnullmedium
    ssl.keystore.typeThe file format of the key store file. This is optional for client.stringJKSmedium
    ssl.protocolThe SSL protocol used to generate the SSLContext. Default setting is TLS, which is fine for most cases. Allowed values in recent JVMs are TLS, TLSv1.1 and TLSv1.2. SSL, SSLv2 and SSLv3 may be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities.stringTLSmedium
    ssl.providerThe name of the security provider used for SSL connections. Default value is the default security provider of the JVM.stringnullmedium
    ssl.trustmanager.algorithmThe algorithm used by trust manager factory for SSL connections. Default value is the trust manager factory algorithm configured for the Java Virtual Machine.stringPKIXmedium
    ssl.truststore.locationThe location of the trust store file.stringnullmedium
    ssl.truststore.passwordThe password for the trust store file. If a password is not set access to the truststore is still available, but integrity checking is disabled.passwordnullmedium
    ssl.truststore.typeThe file format of the trust store file.stringJKSmedium
    alter.config.policy.class.nameThe alter configs policy class that should be used for validation. The class should implement the org.apache.kafka.server.policy.AlterConfigPolicy interface.classnulllow
    authorizer.class.nameThe authorizer class that should be used for authorizationstring""low
    create.topic.policy.class.nameThe create topic policy class that should be used for validation. The class should implement the org.apache.kafka.server.policy.CreateTopicPolicy interface.classnulllow
    listener.security.protocol.mapMap between listener names and security protocols. This must be defined for the same security protocol to be usable in more than one port or IP. For example, internal and external traffic can be separated even if SSL is required for both. Concretely, the user could define listeners with names INTERNAL and EXTERNAL and this property as: `INTERNAL:SSL,EXTERNAL:SSL`. As shown, key and value are separated by a colon and map entries are separated by commas. Each listener name should only appear once in the map. Different security (SSL and SASL) settings can be configured for each listener by adding a normalised prefix (the listener name is lowercased) to the config name. For example, to set a different keystore for the INTERNAL listener, a config with name `listener.name.internal.ssl.keystore.location` would be set. If the config for the listener name is not set, the config will fallback to the generic config (i.e. `ssl.keystore.location`).stringPLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSLlow
    metric.reportersA list of classes to use as metrics reporters. Implementing the org.apache.kafka.common.metrics.MetricsReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics.list""low
    metrics.num.samplesThe number of samples maintained to compute metrics.int2[1,...]low
    metrics.recording.levelThe highest recording level for metrics.stringINFOlow
    metrics.sample.window.msThe window of time a metrics sample is computed over.long30000[1,...]low
    quota.window.numThe number of samples to retain in memory for client quotasint11[1,...]low
    quota.window.size.secondsThe time span of each sample for client quotasint1[1,...]low
    replication.quota.window.numThe number of samples to retain in memory for replication quotasint11[1,...]low
    replication.quota.window.size.secondsThe time span of each sample for replication quotasint1[1,...]low
    ssl.endpoint.identification.algorithmThe endpoint identification algorithm to validate server hostname using server certificate.stringnulllow
    ssl.secure.random.implementationThe SecureRandom PRNG implementation to use for SSL cryptography operations.stringnulllow
    transaction.abort.timed.out.transaction.cleanup.interval.msThe interval at which to rollback transactions that have timed outint60000[1,...]low
    transaction.remove.expired.transaction.cleanup.interval.msThe interval at which to remove transactions that have expired due to transactional.id.expiration.ms passingint3600000[1,...]low
    zookeeper.sync.time.msHow far a ZK follower can be behind a ZK leaderint2000low

More details about broker configuration can be found in the scala class kafka.server.KafkaConfig.

Topic-Level Configs

Configurations pertinent to topics have both a server default as well an optional per-topic override. If no per-topic configuration is given the server default is used. The override can be set at topic creation time by giving one or more --config options. This example creates a topic named my-topic with a custom max message size and flush rate:

  > bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic my-topic --partitions 1
      --replication-factor 1 --config max.message.bytes=64000 --config flush.messages=1

Overrides can also be changed or set later using the alter configs command. This example updates the max message size for my-topic :

  > bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name my-topic
      --alter --add-config max.message.bytes=128000

To check overrides set on the topic you can do

  > bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name my-topic --describe

To remove an override you can do

  > bin/kafka-configs.sh --zookeeper localhost:2181  --entity-type topics --entity-name my-topic --alter --delete-config max.message.bytes

The following are the topic-level configurations. The server’s default configuration for this property is given under the Server Default Property heading. A given server default config value only applies to a topic if it does not have an explicit topic config override.

NameDescriptionTypeDefaultValid ValuesServer Default PropertyImportance
cleanup.policyA string that is either "delete" or "compact". This string designates the retention policy to use on old log segments. The default policy ("delete") will discard old segments when their retention time or size limit has been reached. The "compact" setting will enable log compaction on the topic.listdelete[compact, delete]log.cleanup.policymedium
compression.typeSpecify the final compression type for a given topic. This configuration accepts the standard compression codecs ('gzip', 'snappy', lz4). It additionally accepts 'uncompressed' which is equivalent to no compression; and 'producer' which means retain the original compression codec set by the producer.stringproducer[uncompressed, snappy, lz4, gzip, producer]compression.typemedium
delete.retention.msThe amount of time to retain delete tombstone markers for log compacted topics. This setting also gives a bound on the time in which a consumer must complete a read if they begin from offset 0 to ensure that they get a valid snapshot of the final stage (otherwise delete tombstones may be collected before they complete their scan).long86400000[0,...]log.cleaner.delete.retention.msmedium
file.delete.delay.msThe time to wait before deleting a file from the filesystemlong60000[0,...]log.segment.delete.delay.msmedium
flush.messagesThis setting allows specifying an interval at which we will force an fsync of data written to the log. For example if this was set to 1 we would fsync after every message; if it were 5 we would fsync after every five messages. In general we recommend you not set this and use replication for durability and allow the operating system's background flush capabilities as it is more efficient. This setting can be overridden on a per-topic basis (see the per-topic configuration section).long9223372036854775807[0,...]log.flush.interval.messagesmedium
flush.msThis setting allows specifying a time interval at which we will force an fsync of data written to the log. For example if this was set to 1000 we would fsync after 1000 ms had passed. In general we recommend you not set this and use replication for durability and allow the operating system's background flush capabilities as it is more efficient.long9223372036854775807[0,...]log.flush.interval.msmedium
follower.replication.throttled.replicasA list of replicas for which log replication should be throttled on the follower side. The list should describe a set of replicas in the form [PartitionId]:[BrokerId],[PartitionId]:[BrokerId]:... or alternatively the wildcard '*' can be used to throttle all replicas for this topic.list""[partitionId],[brokerId]:[partitionId],[brokerId]:...follower.replication.throttled.replicasmedium
index.interval.bytesThis setting controls how frequently Kafka adds an index entry to it's offset index. The default setting ensures that we index a message roughly every 4096 bytes. More indexing allows reads to jump closer to the exact position in the log but makes the index larger. You probably don't need to change this.int4096[0,...]log.index.interval.bytesmedium
leader.replication.throttled.replicasA list of replicas for which log replication should be throttled on the leader side. The list should describe a set of replicas in the form [PartitionId]:[BrokerId],[PartitionId]:[BrokerId]:... or alternatively the wildcard '*' can be used to throttle all replicas for this topic.list""[partitionId],[brokerId]:[partitionId],[brokerId]:...leader.replication.throttled.replicasmedium
max.message.bytes

The largest record batch size allowed by Kafka. If this is increased and there are consumers older than 0.10.2, the consumers' fetch size must also be increased so that the they can fetch record batches this large.

In the latest message format version, records are always grouped into batches for efficiency. In previous message format versions, uncompressed records are not grouped into batches and this limit only applies to a single record in that case.

int1000012[0,...]message.max.bytesmedium
message.format.versionSpecify the message format version the broker will use to append messages to the logs. The value should be a valid ApiVersion. Some examples are: 0.8.2, 0.9.0.0, 0.10.0, check ApiVersion for more details. By setting a particular message format version, the user is certifying that all the existing messages on disk are smaller or equal than the specified version. Setting this value incorrectly will cause consumers with older versions to break as they will receive messages with a format that they don't understand.string1.0-IV0log.message.format.versionmedium
message.timestamp.difference.max.msThe maximum difference allowed between the timestamp when a broker receives a message and the timestamp specified in the message. If message.timestamp.type=CreateTime, a message will be rejected if the difference in timestamp exceeds this threshold. This configuration is ignored if message.timestamp.type=LogAppendTime.long9223372036854775807[0,...]log.message.timestamp.difference.max.msmedium
message.timestamp.typeDefine whether the timestamp in the message is message create time or log append time. The value should be either `CreateTime` or `LogAppendTime`stringCreateTimelog.message.timestamp.typemedium
min.cleanable.dirty.ratioThis configuration controls how frequently the log compactor will attempt to clean the log (assuming log compaction is enabled). By default we will avoid cleaning a log where more than 50% of the log has been compacted. This ratio bounds the maximum space wasted in the log by duplicates (at 50% at most 50% of the log could be duplicates). A higher ratio will mean fewer, more efficient cleanings but will mean more wasted space in the log.double0.5[0,...,1]log.cleaner.min.cleanable.ratiomedium
min.compaction.lag.msThe minimum time a message will remain uncompacted in the log. Only applicable for logs that are being compacted.long0[0,...]log.cleaner.min.compaction.lag.msmedium
min.insync.replicasWhen a producer sets acks to "all" (or "-1"), this configuration specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful. If this minimum cannot be met, then the producer will raise an exception (either NotEnoughReplicas or NotEnoughReplicasAfterAppend).
When used together, min.insync.replicas and acks allow you to enforce greater durability guarantees. A typical scenario would be to create a topic with a replication factor of 3, set min.insync.replicas to 2, and produce with acks of "all". This will ensure that the producer raises an exception if a majority of replicas do not receive a write.
int1[1,...]min.insync.replicasmedium
preallocateTrue if we should preallocate the file on disk when creating a new log segment.booleanfalselog.preallocatemedium
retention.bytesThis configuration controls the maximum size a partition (which consists of log segments) can grow to before we will discard old log segments to free up space if we are using the "delete" retention policy. By default there is no size limit only a time limit. Since this limit is enforced at the partition level, multiply it by the number of partitions to compute the topic retention in bytes.long-1log.retention.bytesmedium
retention.msThis configuration controls the maximum time we will retain a log before we will discard old log segments to free up space if we are using the "delete" retention policy. This represents an SLA on how soon consumers must read their data.long604800000log.retention.msmedium
segment.bytesThis configuration controls the segment file size for the log. Retention and cleaning is always done a file at a time so a larger segment size means fewer files but less granular control over retention.int1073741824[14,...]log.segment.bytesmedium
segment.index.bytesThis configuration controls the size of the index that maps offsets to file positions. We preallocate this index file and shrink it only after log rolls. You generally should not need to change this setting.int10485760[0,...]log.index.size.max.bytesmedium
segment.jitter.msThe maximum random jitter subtracted from the scheduled segment roll time to avoid thundering herds of segment rollinglong0[0,...]log.roll.jitter.msmedium
segment.msThis configuration controls the period of time after which Kafka will force the log to roll even if the segment file isn't full to ensure that retention can delete or compact old data.long604800000[0,...]log.roll.msmedium
unclean.leader.election.enableIndicates whether to enable replicas not in the ISR set to be elected as leader as a last resort, even though doing so may result in data loss.booleanfalseunclean.leader.election.enablemedium

Producer Configs

Below is the configuration of the Java producer:

NameDescriptionTypeDefaultValid ValuesImportance
bootstrap.serversA list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,.... Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down).listhigh
key.serializerSerializer class for key that implements the org.apache.kafka.common.serialization.Serializer interface.classhigh
value.serializerSerializer class for value that implements the org.apache.kafka.common.serialization.Serializer interface.classhigh
acksThe number of acknowledgments the producer requires the leader to have received before considering a request complete. This controls the durability of records that are sent. The following settings are allowed:
  • acks=0 If set to zero then the producer will not wait for any acknowledgment from the server at all. The record will be immediately added to the socket buffer and considered sent. No guarantee can be made that the server has received the record in this case, and the retries configuration will not take effect (as the client won't generally know of any failures). The offset given back for each record will always be set to -1.
  • acks=1 This will mean the leader will write the record to its local log but will respond without awaiting full acknowledgement from all followers. In this case should the leader fail immediately after acknowledging the record but before the followers have replicated it then the record will be lost.
  • acks=all This means the leader will wait for the full set of in-sync replicas to acknowledge the record. This guarantees that the record will not be lost as long as at least one in-sync replica remains alive. This is the strongest available guarantee. This is equivalent to the acks=-1 setting.
string1[all, -1, 0, 1]high
buffer.memoryThe total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server the producer will block for max.block.ms after which it will throw an exception.

This setting should correspond roughly to the total memory the producer will use, but is not a hard bound since not all memory the producer uses is used for buffering. Some additional memory will be used for compression (if compression is enabled) as well as for maintaining in-flight requests.

long33554432[0,...]high
compression.typeThe compression type for all data generated by the producer. The default is none (i.e. no compression). Valid values are none, gzip, snappy, or lz4. Compression is of full batches of data, so the efficacy of batching will also impact the compression ratio (more batching means better compression).stringnonehigh
retriesSetting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries without setting max.in.flight.requests.per.connection to 1 will potentially change the ordering of records because if two batches are sent to a single partition, and the first fails and is retried but the second succeeds, then the records in the second batch may appear first.int0[0,...,2147483647]high
ssl.key.passwordThe password of the private key in the key store file. This is optional for client.passwordnullhigh
ssl.keystore.locationThe location of the key store file. This is optional for client and can be used for two-way authentication for client.stringnullhigh
ssl.keystore.passwordThe store password for the key store file. This is optional for client and only needed if ssl.keystore.location is configured.passwordnullhigh
ssl.truststore.locationThe location of the trust store file.stringnullhigh
ssl.truststore.passwordThe password for the trust store file. If a password is not set access to the truststore is still available, but integrity checking is disabled.passwordnullhigh
batch.sizeThe producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition. This helps performance on both the client and the server. This configuration controls the default batch size in bytes.

No attempt will be made to batch records larger than this size.

Requests sent to brokers will contain multiple batches, one for each partition with data available to be sent.

A small batch size will make batching less common and may reduce throughput (a batch size of zero will disable batching entirely). A very large batch size may use memory a bit more wastefully as we will always allocate a buffer of the specified batch size in anticipation of additional records.

int16384[0,...]medium
client.idAn id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging.string""medium
connections.max.idle.msClose idle connections after the number of milliseconds specified by this config.long540000medium
linger.msThe producer groups together any records that arrive in between request transmissions into a single batched request. Normally this occurs only under load when records arrive faster than they can be sent out. However in some circumstances the client may want to reduce the number of requests even under moderate load. This setting accomplishes this by adding a small amount of artificial delay—that is, rather than immediately sending out a record the producer will wait for up to the given delay to allow other records to be sent so that the sends can be batched together. This can be thought of as analogous to Nagle's algorithm in TCP. This setting gives the upper bound on the delay for batching: once we get batch.size worth of records for a partition it will be sent immediately regardless of this setting, however if we have fewer than this many bytes accumulated for this partition we will 'linger' for the specified time waiting for more records to show up. This setting defaults to 0 (i.e. no delay). Setting linger.ms=5, for example, would have the effect of reducing the number of requests sent but would add up to 5ms of latency to records sent in the absence of load.long0[0,...]medium
max.block.msThe configuration controls how long KafkaProducer.send() and KafkaProducer.partitionsFor() will block.These methods can be blocked either because the buffer is full or metadata unavailable.Blocking in the user-supplied serializers or partitioner will not be counted against this timeout.long60000[0,...]medium
max.request.sizeThe maximum size of a request in bytes. This setting will limit the number of record batches the producer will send in a single request to avoid sending huge requests. This is also effectively a cap on the maximum record batch size. Note that the server has its own cap on record batch size which may be different from this.int1048576[0,...]medium
partitioner.classPartitioner class that implements the org.apache.kafka.clients.producer.Partitioner interface.classorg.apache.kafka.clients.producer.internals.DefaultPartitionermedium
receive.buffer.bytesThe size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used.int32768[-1,...]medium
request.timeout.msThe configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted. This should be larger than replica.lag.time.max.ms (a broker configuration) to reduce the possibility of message duplication due to unnecessary producer retries.int30000[0,...]medium
sasl.jaas.configJAAS login context parameters for SASL connections in the format used by JAAS configuration files. JAAS configuration file format is described here. The format for the value is: ' (=)*;'passwordnullmedium
sasl.kerberos.service.nameThe Kerberos principal name that Kafka runs as. This can be defined either in Kafka's JAAS config or in Kafka's config.stringnullmedium
sasl.mechanismSASL mechanism used for client connections. This may be any mechanism for which a security provider is available. GSSAPI is the default mechanism.stringGSSAPImedium
security.protocolProtocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL.stringPLAINTEXTmedium
send.buffer.bytesThe size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used.int131072[-1,...]medium
ssl.enabled.protocolsThe list of protocols enabled for SSL connections.listTLSv1.2,TLSv1.1,TLSv1medium
ssl.keystore.typeThe file format of the key store file. This is optional for client.stringJKSmedium
ssl.protocolThe SSL protocol used to generate the SSLContext. Default setting is TLS, which is fine for most cases. Allowed values in recent JVMs are TLS, TLSv1.1 and TLSv1.2. SSL, SSLv2 and SSLv3 may be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities.stringTLSmedium
ssl.providerThe name of the security provider used for SSL connections. Default value is the default security provider of the JVM.stringnullmedium
ssl.truststore.typeThe file format of the trust store file.stringJKSmedium
enable.idempotenceWhen set to 'true', the producer will ensure that exactly one copy of each message is written in the stream. If 'false', producer retries due to broker failures, etc., may write duplicates of the retried message in the stream. Note that enabling idempotence requires max.in.flight.requests.per.connection to be less than or equal to 5, retries to be greater than 0 and acks must be 'all'. If these values are not explicitly set by the user, suitable values will be chosen. If incompatible values are set, a ConfigException will be thrown.booleanfalselow
interceptor.classesA list of classes to use as interceptors. Implementing the org.apache.kafka.clients.producer.ProducerInterceptor interface allows you to intercept (and possibly mutate) the records received by the producer before they are published to the Kafka cluster. By default, there are no interceptors.listnulllow
max.in.flight.requests.per.connectionThe maximum number of unacknowledged requests the client will send on a single connection before blocking. Note that if this setting is set to be greater than 1 and there are failed sends, there is a risk of message re-ordering due to retries (i.e., if retries are enabled).int5[1,...]low
metadata.max.age.msThe period of time in milliseconds after which we force a refresh of metadata even if we haven't seen any partition leadership changes to proactively discover any new brokers or partitions.long300000[0,...]low
metric.reportersA list of classes to use as metrics reporters. Implementing the org.apache.kafka.common.metrics.MetricsReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics.list""low
metrics.num.samplesThe number of samples maintained to compute metrics.int2[1,...]low
metrics.recording.levelThe highest recording level for metrics.stringINFO[INFO, DEBUG]low
metrics.sample.window.msThe window of time a metrics sample is computed over.long30000[0,...]low
reconnect.backoff.max.msThe maximum amount of time in milliseconds to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms.long1000[0,...]low
reconnect.backoff.msThe base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker.long50[0,...]low
retry.backoff.msThe amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios.long100[0,...]low
sasl.kerberos.kinit.cmdKerberos kinit command path.string/usr/bin/kinitlow
sasl.kerberos.min.time.before.reloginLogin thread sleep time between refresh attempts.long60000low
sasl.kerberos.ticket.renew.jitterPercentage of random jitter added to the renewal time.double0.05low
sasl.kerberos.ticket.renew.window.factorLogin thread will sleep until the specified window factor of time from last refresh to ticket's expiry has been reached, at which time it will try to renew the ticket.double0.8low
ssl.cipher.suitesA list of cipher suites. This is a named combination of authentication, encryption, MAC and key exchange algorithm used to negotiate the security settings for a network connection using TLS or SSL network protocol. By default all the available cipher suites are supported.listnulllow
ssl.endpoint.identification.algorithmThe endpoint identification algorithm to validate server hostname using server certificate.stringnulllow
ssl.keymanager.algorithmThe algorithm used by key manager factory for SSL connections. Default value is the key manager factory algorithm configured for the Java Virtual Machine.stringSunX509low
ssl.secure.random.implementationThe SecureRandom PRNG implementation to use for SSL cryptography operations.stringnulllow
ssl.trustmanager.algorithmThe algorithm used by trust manager factory for SSL connections. Default value is the trust manager factory algorithm configured for the Java Virtual Machine.stringPKIXlow
transaction.timeout.msThe maximum amount of time in ms that the transaction coordinator will wait for a transaction status update from the producer before proactively aborting the ongoing transaction.If this value is larger than the transaction.max.timeout.ms setting in the broker, the request will fail with a `InvalidTransactionTimeout` error.int60000low
transactional.idThe TransactionalId to use for transactional delivery. This enables reliability semantics which span multiple producer sessions since it allows the client to guarantee that transactions using the same TransactionalId have been completed prior to starting any new transactions. If no TransactionalId is provided, then the producer is limited to idempotent delivery. Note that enable.idempotence must be enabled if a TransactionalId is configured. The default is null, which means transactions cannot be used. Note that transactions requires a cluster of at least three brokers by default what is the recommended setting for production; for development you can change this, by adjusting broker setting `transaction.state.log.replication.factor`.stringnullnon-empty stringlow

For those interested in the legacy Scala producer configs, information can be found here.

Consumer Configs

In 0.9.0.0 we introduced the new Java consumer as a replacement for the older Scala-based simple and high-level consumers. The configs for both new and old consumers are described below.

New Consumer Configs

Below is the configuration for the new consumer:

NameDescriptionTypeDefaultValid ValuesImportance
bootstrap.serversA list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,.... Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down).listhigh
key.deserializerDeserializer class for key that implements the org.apache.kafka.common.serialization.Deserializer interface.classhigh
value.deserializerDeserializer class for value that implements the org.apache.kafka.common.serialization.Deserializer interface.classhigh
fetch.min.bytesThe minimum amount of data the server should return for a fetch request. If insufficient data is available the request will wait for that much data to accumulate before answering the request. The default setting of 1 byte means that fetch requests are answered as soon as a single byte of data is available or the fetch request times out waiting for data to arrive. Setting this to something greater than 1 will cause the server to wait for larger amounts of data to accumulate which can improve server throughput a bit at the cost of some additional latency.int1[0,...]high
group.idA unique string that identifies the consumer group this consumer belongs to. This property is required if the consumer uses either the group management functionality by using subscribe(topic) or the Kafka-based offset management strategy.string""high
heartbeat.interval.msThe expected time between heartbeats to the consumer coordinator when using Kafka's group management facilities. Heartbeats are used to ensure that the consumer's session stays active and to facilitate rebalancing when new consumers join or leave the group. The value must be set lower than session.timeout.ms, but typically should be set no higher than 1/3 of that value. It can be adjusted even lower to control the expected time for normal rebalances.int3000high
max.partition.fetch.bytesThe maximum amount of data per-partition the server will return. Records are fetched in batches by the consumer. If the first record batch in the first non-empty partition of the fetch is larger than this limit, the batch will still be returned to ensure that the consumer can make progress. The maximum record batch size accepted by the broker is defined via message.max.bytes (broker config) or max.message.bytes (topic config). See fetch.max.bytes for limiting the consumer request size.int1048576[0,...]high
session.timeout.msThe timeout used to detect consumer failures when using Kafka's group management facility. The consumer sends periodic heartbeats to indicate its liveness to the broker. If no heartbeats are received by the broker before the expiration of this session timeout, then the broker will remove this consumer from the group and initiate a rebalance. Note that the value must be in the allowable range as configured in the broker configuration by group.min.session.timeout.ms and group.max.session.timeout.ms.int10000high
ssl.key.passwordThe password of the private key in the key store file. This is optional for client.passwordnullhigh
ssl.keystore.locationThe location of the key store file. This is optional for client and can be used for two-way authentication for client.stringnullhigh
ssl.keystore.passwordThe store password for the key store file. This is optional for client and only needed if ssl.keystore.location is configured.passwordnullhigh
ssl.truststore.locationThe location of the trust store file.stringnullhigh
ssl.truststore.passwordThe password for the trust store file. If a password is not set access to the truststore is still available, but integrity checking is disabled.passwordnullhigh
auto.offset.resetWhat to do when there is no initial offset in Kafka or if the current offset does not exist any more on the server (e.g. because that data has been deleted):
  • earliest: automatically reset the offset to the earliest offset
  • latest: automatically reset the offset to the latest offset
  • none: throw exception to the consumer if no previous offset is found for the consumer's group
  • anything else: throw exception to the consumer.
stringlatest[latest, earliest, none]medium
connections.max.idle.msClose idle connections after the number of milliseconds specified by this config.long540000medium
enable.auto.commitIf true the consumer's offset will be periodically committed in the background.booleantruemedium
exclude.internal.topicsWhether records from internal topics (such as offsets) should be exposed to the consumer. If set to true the only way to receive records from an internal topic is subscribing to it.booleantruemedium
fetch.max.bytesThe maximum amount of data the server should return for a fetch request. Records are fetched in batches by the consumer, and if the first record batch in the first non-empty partition of the fetch is larger than this value, the record batch will still be returned to ensure that the consumer can make progress. As such, this is not a absolute maximum. The maximum record batch size accepted by the broker is defined via message.max.bytes (broker config) or max.message.bytes (topic config). Note that the consumer performs multiple fetches in parallel.int52428800[0,...]medium
isolation.level

Controls how to read messages written transactionally. If set to read_committed, consumer.poll() will only return transactional messages which have been committed. If set to read_uncommitted (the default), consumer.poll() will return all messages, even transactional messages which have been aborted. Non-transactional messages will be returned unconditionally in either mode.

Messages will always be returned in offset order. Hence, in read_committed mode, consumer.poll() will only return messages up to the last stable offset (LSO), which is the one less than the offset of the first open transaction. In particular any messages appearing after messages belonging to ongoing transactions will be withheld until the relevant transaction has been completed. As a result, read_committed consumers will not be able to read up to the high watermark when there are in flight transactions.

Further, when in read_committed the seekToEnd method will return the LSO

stringread_uncommitted[read_committed, read_uncommitted]medium
max.poll.interval.msThe maximum delay between invocations of poll() when using consumer group management. This places an upper bound on the amount of time that the consumer can be idle before fetching more records. If poll() is not called before expiration of this timeout, then the consumer is considered failed and the group will rebalance in order to reassign the partitions to another member.int300000[1,...]medium
max.poll.recordsThe maximum number of records returned in a single call to poll().int500[1,...]medium
partition.assignment.strategyThe class name of the partition assignment strategy that the client will use to distribute partition ownership amongst consumer instances when group management is usedlistclass org.apache.kafka.clients.consumer.RangeAssignormedium
receive.buffer.bytesThe size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used.int65536[-1,...]medium
request.timeout.msThe configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted.int305000[0,...]medium
sasl.jaas.configJAAS login context parameters for SASL connections in the format used by JAAS configuration files. JAAS configuration file format is described here. The format for the value is: ' (=)*;'passwordnullmedium
sasl.kerberos.service.nameThe Kerberos principal name that Kafka runs as. This can be defined either in Kafka's JAAS config or in Kafka's config.stringnullmedium
sasl.mechanismSASL mechanism used for client connections. This may be any mechanism for which a security provider is available. GSSAPI is the default mechanism.stringGSSAPImedium
security.protocolProtocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL.stringPLAINTEXTmedium
send.buffer.bytesThe size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used.int131072[-1,...]medium
ssl.enabled.protocolsThe list of protocols enabled for SSL connections.listTLSv1.2,TLSv1.1,TLSv1medium
ssl.keystore.typeThe file format of the key store file. This is optional for client.stringJKSmedium
ssl.protocolThe SSL protocol used to generate the SSLContext. Default setting is TLS, which is fine for most cases. Allowed values in recent JVMs are TLS, TLSv1.1 and TLSv1.2. SSL, SSLv2 and SSLv3 may be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities.stringTLSmedium
ssl.providerThe name of the security provider used for SSL connections. Default value is the default security provider of the JVM.stringnullmedium
ssl.truststore.typeThe file format of the trust store file.stringJKSmedium
auto.commit.interval.msThe frequency in milliseconds that the consumer offsets are auto-committed to Kafka if enable.auto.commit is set to true.int5000[0,...]low
check.crcsAutomatically check the CRC32 of the records consumed. This ensures no on-the-wire or on-disk corruption to the messages occurred. This check adds some overhead, so it may be disabled in cases seeking extreme performance.booleantruelow
client.idAn id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging.string""low
fetch.max.wait.msThe maximum amount of time the server will block before answering the fetch request if there isn't sufficient data to immediately satisfy the requirement given by fetch.min.bytes.int500[0,...]low
interceptor.classesA list of classes to use as interceptors. Implementing the org.apache.kafka.clients.consumer.ConsumerInterceptor interface allows you to intercept (and possibly mutate) records received by the consumer. By default, there are no interceptors.listnulllow
metadata.max.age.msThe period of time in milliseconds after which we force a refresh of metadata even if we haven't seen any partition leadership changes to proactively discover any new brokers or partitions.long300000[0,...]low
metric.reportersA list of classes to use as metrics reporters. Implementing the org.apache.kafka.common.metrics.MetricsReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics.list""low
metrics.num.samplesThe number of samples maintained to compute metrics.int2[1,...]low
metrics.recording.levelThe highest recording level for metrics.stringINFO[INFO, DEBUG]low
metrics.sample.window.msThe window of time a metrics sample is computed over.long30000[0,...]low
reconnect.backoff.max.msThe maximum amount of time in milliseconds to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms.long1000[0,...]low
reconnect.backoff.msThe base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker.long50[0,...]low
retry.backoff.msThe amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios.long100[0,...]low
sasl.kerberos.kinit.cmdKerberos kinit command path.string/usr/bin/kinitlow
sasl.kerberos.min.time.before.reloginLogin thread sleep time between refresh attempts.long60000low
sasl.kerberos.ticket.renew.jitterPercentage of random jitter added to the renewal time.double0.05low
sasl.kerberos.ticket.renew.window.factorLogin thread will sleep until the specified window factor of time from last refresh to ticket's expiry has been reached, at which time it will try to renew the ticket.double0.8low
ssl.cipher.suitesA list of cipher suites. This is a named combination of authentication, encryption, MAC and key exchange algorithm used to negotiate the security settings for a network connection using TLS or SSL network protocol. By default all the available cipher suites are supported.listnulllow
ssl.endpoint.identification.algorithmThe endpoint identification algorithm to validate server hostname using server certificate.stringnulllow
ssl.keymanager.algorithmThe algorithm used by key manager factory for SSL connections. Default value is the key manager factory algorithm configured for the Java Virtual Machine.stringSunX509low
ssl.secure.random.implementationThe SecureRandom PRNG implementation to use for SSL cryptography operations.stringnulllow
ssl.trustmanager.algorithmThe algorithm used by trust manager factory for SSL connections. Default value is the trust manager factory algorithm configured for the Java Virtual Machine.stringPKIXlow

Old Consumer Configs

The essential old consumer configurations are the following:

  • group.id
  • zookeeper.connect
    PropertyDefaultDescription
    group.idA string that uniquely identifies the group of consumer processes to which this consumer belongs. By setting the same group id multiple processes indicate that they are all part of the same consumer group.
    zookeeper.connectSpecifies the ZooKeeper connection string in the form hostname:port where host and port are the host and port of a ZooKeeper server. To allow connecting through other ZooKeeper nodes when that ZooKeeper machine is down you can also specify multiple hosts in the form hostname1:port1,hostname2:port2,hostname3:port3. The server may also have a ZooKeeper chroot path as part of its ZooKeeper connection string which puts its data under some path in the global ZooKeeper namespace. If so the consumer should use the same chroot path in its connection string. For example to give a chroot path of /chroot/path you would give the connection string as hostname1:port1,hostname2:port2,hostname3:port3/chroot/path.
    consumer.idnullGenerated automatically if not set.
    socket.timeout.ms30 * 1000The socket timeout for network requests. The actual timeout set will be max.fetch.wait + socket.timeout.ms.
    socket.receive.buffer.bytes64 * 1024The socket receive buffer for network requests
    fetch.message.max.bytes1024 * 1024The number of bytes of messages to attempt to fetch for each topic-partition in each fetch request. These bytes will be read into memory for each partition, so this helps control the memory used by the consumer. The fetch request size must be at least as large as the maximum message size the server allows or else it is possible for the producer to send messages larger than the consumer can fetch.
    num.consumer.fetchers1The number fetcher threads used to fetch data.
    auto.commit.enabletrueIf true, periodically commit to ZooKeeper the offset of messages already fetched by the consumer. This committed offset will be used when the process fails as the position from which the new consumer will begin.
    auto.commit.interval.ms60 * 1000The frequency in ms that the consumer offsets are committed to zookeeper.
    queued.max.message.chunks2Max number of message chunks buffered for consumption. Each chunk can be up to fetch.message.max.bytes.
    rebalance.max.retries4When a new consumer joins a consumer group the set of consumers attempt to “rebalance” the load to assign partitions to each consumer. If the set of consumers changes while this assignment is taking place the rebalance will fail and retry. This setting controls the maximum number of attempts before giving up.
    fetch.min.bytes1The minimum amount of data the server should return for a fetch request. If insufficient data is available the request will wait for that much data to accumulate before answering the request.
    fetch.wait.max.ms100The maximum amount of time the server will block before answering the fetch request if there isn’t sufficient data to immediately satisfy fetch.min.bytes
    rebalance.backoff.ms2000Backoff time between retries during rebalance. If not set explicitly, the value in zookeeper.sync.time.ms is used.
    refresh.leader.backoff.ms200Backoff time to wait before trying to determine the leader of a partition that has just lost its leader.
    auto.offset.resetlargestWhat to do when there is no initial offset in ZooKeeper or if an offset is out of range:
  • smallest : automatically reset the offset to the smallest offset
  • largest : automatically reset the offset to the largest offset
  • anything else: throw exception to the consumer
    consumer.timeout.ms | -1 | Throw a timeout exception to the consumer if no message is available for consumption after the specified interval
    exclude.internal.topics | true | Whether messages from internal topics (such as offsets) should be exposed to the consumer.
    client.id | group id value | The client id is a user-specified string sent in each request to help trace calls. It should logically identify the application making the request.
    zookeeper.session.timeout.ms | 6000 | ZooKeeper session timeout. If the consumer fails to heartbeat to ZooKeeper for this period of time it is considered dead and a rebalance will occur.
    zookeeper.connection.timeout.ms | 6000 | The max time that the client waits while establishing a connection to zookeeper.
    zookeeper.sync.time.ms | 2000 | How far a ZK follower can be behind a ZK leader
    offsets.storage | zookeeper | Select where offsets should be stored (zookeeper or kafka).
    offsets.channel.backoff.ms | 1000 | The backoff period when reconnecting the offsets channel or retrying failed offset fetch/commit requests.
    offsets.channel.socket.timeout.ms | 10000 | Socket timeout when reading responses for offset fetch/commit requests. This timeout is also used for ConsumerMetadata requests that are used to query for the offset manager.
    offsets.commit.max.retries | 5 | Retry the offset commit up to this many times on failure. This retry count only applies to offset commits during shut-down. It does not apply to commits originating from the auto-commit thread. It also does not apply to attempts to query for the offset coordinator before committing offsets. i.e., if a consumer metadata request fails for any reason, it will be retried and that retry does not count toward this limit.
    dual.commit.enabled | true | If you are using “kafka” as offsets.storage, you can dual commit offsets to ZooKeeper (in addition to Kafka). This is required during migration from zookeeper-based offset storage to kafka-based offset storage. With respect to any given consumer group, it is safe to turn this off after all instances within that group have been migrated to the new version that commits offsets to the broker (instead of directly to ZooKeeper).
    partition.assignment.strategy | range | Select between the “range” or “roundrobin” strategy for assigning partitions to consumer streams.The round-robin partition assignor lays out all the available partitions and all the available consumer threads. It then proceeds to do a round-robin assignment from partition to consumer thread. If the subscriptions of all consumer instances are identical, then the partitions will be uniformly distributed. (i.e., the partition ownership counts will be within a delta of exactly one across all consumer threads.) Round-robin assignment is permitted only if: (a) Every topic has the same number of streams within a consumer instance (b) The set of subscribed topics is identical for every consumer instance within the group. Range partitioning works on a per-topic basis. For each topic, we lay out the available partitions in numeric order and the consumer threads in lexicographic order. We then divide the number of partitions by the total number of consumer streams (threads) to determine the number of partitions to assign to each consumer. If it does not evenly divide, then the first few consumers will have one extra partition.

More details about consumer configuration can be found in the scala class kafka.consumer.ConsumerConfig.

Kafka Connect Configs

Below is the configuration of the Kafka Connect framework.

NameDescriptionTypeDefaultValid ValuesImportance
config.storage.topicThe name of the Kafka topic where connector configurations are storedstringhigh
group.idA unique string that identifies the Connect cluster group this worker belongs to.stringhigh
key.converterConverter class used to convert between Kafka Connect format and the serialized form that is written to Kafka. This controls the format of the keys in messages written to or read from Kafka, and since this is independent of connectors it allows any connector to work with any serialization format. Examples of common formats include JSON and Avro.classhigh
offset.storage.topicThe name of the Kafka topic where connector offsets are storedstringhigh
status.storage.topicThe name of the Kafka topic where connector and task status are storedstringhigh
value.converterConverter class used to convert between Kafka Connect format and the serialized form that is written to Kafka. This controls the format of the values in messages written to or read from Kafka, and since this is independent of connectors it allows any connector to work with any serialization format. Examples of common formats include JSON and Avro.classhigh
internal.key.converterConverter class used to convert between Kafka Connect format and the serialized form that is written to Kafka. This controls the format of the keys in messages written to or read from Kafka, and since this is independent of connectors it allows any connector to work with any serialization format. Examples of common formats include JSON and Avro. This setting controls the format used for internal bookkeeping data used by the framework, such as configs and offsets, so users can typically use any functioning Converter implementation.classlow
internal.value.converterConverter class used to convert between Kafka Connect format and the serialized form that is written to Kafka. This controls the format of the values in messages written to or read from Kafka, and since this is independent of connectors it allows any connector to work with any serialization format. Examples of common formats include JSON and Avro. This setting controls the format used for internal bookkeeping data used by the framework, such as configs and offsets, so users can typically use any functioning Converter implementation.classlow
bootstrap.serversA list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,.... Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down).listlocalhost:9092high
heartbeat.interval.msThe expected time between heartbeats to the group coordinator when using Kafka's group management facilities. Heartbeats are used to ensure that the worker's session stays active and to facilitate rebalancing when new members join or leave the group. The value must be set lower than session.timeout.ms, but typically should be set no higher than 1/3 of that value. It can be adjusted even lower to control the expected time for normal rebalances.int3000high
rebalance.timeout.msThe maximum allowed time for each worker to join the group once a rebalance has begun. This is basically a limit on the amount of time needed for all tasks to flush any pending data and commit offsets. If the timeout is exceeded, then the worker will be removed from the group, which will cause offset commit failures.int60000high
session.timeout.msThe timeout used to detect worker failures. The worker sends periodic heartbeats to indicate its liveness to the broker. If no heartbeats are received by the broker before the expiration of this session timeout, then the broker will remove the worker from the group and initiate a rebalance. Note that the value must be in the allowable range as configured in the broker configuration by group.min.session.timeout.ms and group.max.session.timeout.ms.int10000high
ssl.key.passwordThe password of the private key in the key store file. This is optional for client.passwordnullhigh
ssl.keystore.locationThe location of the key store file. This is optional for client and can be used for two-way authentication for client.stringnullhigh
ssl.keystore.passwordThe store password for the key store file. This is optional for client and only needed if ssl.keystore.location is configured.passwordnullhigh
ssl.truststore.locationThe location of the trust store file.stringnullhigh
ssl.truststore.passwordThe password for the trust store file. If a password is not set access to the truststore is still available, but integrity checking is disabled.passwordnullhigh
connections.max.idle.msClose idle connections after the number of milliseconds specified by this config.long540000medium
receive.buffer.bytesThe size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used.int32768[0,...]medium
request.timeout.msThe configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted.int40000[0,...]medium
sasl.jaas.configJAAS login context parameters for SASL connections in the format used by JAAS configuration files. JAAS configuration file format is described here. The format for the value is: ' (=)*;'passwordnullmedium
sasl.kerberos.service.nameThe Kerberos principal name that Kafka runs as. This can be defined either in Kafka's JAAS config or in Kafka's config.stringnullmedium
sasl.mechanismSASL mechanism used for client connections. This may be any mechanism for which a security provider is available. GSSAPI is the default mechanism.stringGSSAPImedium
security.protocolProtocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL.stringPLAINTEXTmedium
send.buffer.bytesThe size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used.int131072[0,...]medium
ssl.enabled.protocolsThe list of protocols enabled for SSL connections.listTLSv1.2,TLSv1.1,TLSv1medium
ssl.keystore.typeThe file format of the key store file. This is optional for client.stringJKSmedium
ssl.protocolThe SSL protocol used to generate the SSLContext. Default setting is TLS, which is fine for most cases. Allowed values in recent JVMs are TLS, TLSv1.1 and TLSv1.2. SSL, SSLv2 and SSLv3 may be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities.stringTLSmedium
ssl.providerThe name of the security provider used for SSL connections. Default value is the default security provider of the JVM.stringnullmedium
ssl.truststore.typeThe file format of the trust store file.stringJKSmedium
worker.sync.timeout.msWhen the worker is out of sync with other workers and needs to resynchronize configurations, wait up to this amount of time before giving up, leaving the group, and waiting a backoff period before rejoining.int3000medium
worker.unsync.backoff.msWhen the worker is out of sync with other workers and fails to catch up within worker.sync.timeout.ms, leave the Connect cluster for this long before rejoining.int300000medium
access.control.allow.methodsSets the methods supported for cross origin requests by setting the Access-Control-Allow-Methods header. The default value of the Access-Control-Allow-Methods header allows cross origin requests for GET, POST and HEAD.string""low
access.control.allow.originValue to set the Access-Control-Allow-Origin header to for REST API requests.To enable cross origin access, set this to the domain of the application that should be permitted to access the API, or '*' to allow access from any domain. The default value only allows access from the domain of the REST API.string""low
client.idAn id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging.string""low
config.storage.replication.factorReplication factor used when creating the configuration storage topicshort3[1,...]low
metadata.max.age.msThe period of time in milliseconds after which we force a refresh of metadata even if we haven't seen any partition leadership changes to proactively discover any new brokers or partitions.long300000[0,...]low
metric.reportersA list of classes to use as metrics reporters. Implementing the org.apache.kafka.common.metrics.MetricsReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics.list""low
metrics.num.samplesThe number of samples maintained to compute metrics.int2[1,...]low
metrics.recording.levelThe highest recording level for metrics.stringINFO[INFO, DEBUG]low
metrics.sample.window.msThe window of time a metrics sample is computed over.long30000[0,...]low
offset.flush.interval.msInterval at which to try committing offsets for tasks.long60000low
offset.flush.timeout.msMaximum number of milliseconds to wait for records to flush and partition offset data to be committed to offset storage before cancelling the process and restoring the offset data to be committed in a future attempt.long5000low
offset.storage.partitionsThe number of partitions used when creating the offset storage topicint25[1,...]low
offset.storage.replication.factorReplication factor used when creating the offset storage topicshort3[1,...]low
plugin.pathList of paths separated by commas (,) that contain plugins (connectors, converters, transformations). The list should consist of top level directories that include any combination of: a) directories immediately containing jars with plugins and their dependencies b) uber-jars with plugins and their dependencies c) directories immediately containing the package directory structure of classes of plugins and their dependencies Note: symlinks will be followed to discover dependencies or plugins. Examples: plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectorslistnulllow
reconnect.backoff.max.msThe maximum amount of time in milliseconds to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms.long1000[0,...]low
reconnect.backoff.msThe base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker.long50[0,...]low
rest.advertised.host.nameIf this is set, this is the hostname that will be given out to other workers to connect to.stringnulllow
rest.advertised.portIf this is set, this is the port that will be given out to other workers to connect to.intnulllow
rest.host.nameHostname for the REST API. If this is set, it will only bind to this interface.stringnulllow
rest.portPort for the REST API to listen on.int8083low
retry.backoff.msThe amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios.long100[0,...]low
sasl.kerberos.kinit.cmdKerberos kinit command path.string/usr/bin/kinitlow
sasl.kerberos.min.time.before.reloginLogin thread sleep time between refresh attempts.long60000low
sasl.kerberos.ticket.renew.jitterPercentage of random jitter added to the renewal time.double0.05low
sasl.kerberos.ticket.renew.window.factorLogin thread will sleep until the specified window factor of time from last refresh to ticket's expiry has been reached, at which time it will try to renew the ticket.double0.8low
ssl.cipher.suitesA list of cipher suites. This is a named combination of authentication, encryption, MAC and key exchange algorithm used to negotiate the security settings for a network connection using TLS or SSL network protocol. By default all the available cipher suites are supported.listnulllow
ssl.endpoint.identification.algorithmThe endpoint identification algorithm to validate server hostname using server certificate.stringnulllow
ssl.keymanager.algorithmThe algorithm used by key manager factory for SSL connections. Default value is the key manager factory algorithm configured for the Java Virtual Machine.stringSunX509low
ssl.secure.random.implementationThe SecureRandom PRNG implementation to use for SSL cryptography operations.stringnulllow
ssl.trustmanager.algorithmThe algorithm used by trust manager factory for SSL connections. Default value is the trust manager factory algorithm configured for the Java Virtual Machine.stringPKIXlow
status.storage.partitionsThe number of partitions used when creating the status storage topicint5[1,...]low
status.storage.replication.factorReplication factor used when creating the status storage topicshort3[1,...]low
task.shutdown.graceful.timeout.msAmount of time to wait for tasks to shutdown gracefully. This is the total amount of time, not per task. All task have shutdown triggered, then they are waited on sequentially.long5000low

Kafka Streams Configs

Below is the configuration of the Kafka Streams client library.

NameDescriptionTypeDefaultValid ValuesImportance
application.idAn identifier for the stream processing application. Must be unique within the Kafka cluster. It is used as 1) the default client-id prefix, 2) the group-id for membership management, 3) the changelog topic prefix.stringhigh
bootstrap.serversA list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,.... Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down).listhigh
replication.factorThe replication factor for change log topics and repartition topics created by the stream processing application.int1high
state.dirDirectory location for state store.string/tmp/kafka-streamshigh
cache.max.bytes.bufferingMaximum number of memory bytes to be used for buffering across all threadslong10485760[0,...]medium
client.idAn ID prefix string used for the client IDs of internal consumer, producer and restore-consumer, with pattern '-StreamThread--'.string""medium
default.deserialization.exception.handlerException handling class that implements the org.apache.kafka.streams.errors.DeserializationExceptionHandler interface.classorg.apache.kafka.streams.errors.LogAndFailExceptionHandlermedium
default.key.serdeDefault serializer / deserializer class for key that implements the org.apache.kafka.common.serialization.Serde interface.classorg.apache.kafka.common.serialization.Serdes$ByteArraySerdemedium
default.timestamp.extractorDefault timestamp extractor class that implements the org.apache.kafka.streams.processor.TimestampExtractor interface.classorg.apache.kafka.streams.processor.FailOnInvalidTimestampmedium
default.value.serdeDefault serializer / deserializer class for value that implements the org.apache.kafka.common.serialization.Serde interface.classorg.apache.kafka.common.serialization.Serdes$ByteArraySerdemedium
num.standby.replicasThe number of standby replicas for each task.int0medium
num.stream.threadsThe number of threads to execute stream processing.int1medium
processing.guaranteeThe processing guarantee that should be used. Possible values are at_least_once (default) and exactly_once. Note that exactly-once processing requires a cluster of at least three brokers by default what is the recommended setting for production; for development you can change this, by adjusting broker setting `transaction.state.log.replication.factor`.stringat_least_once[at_least_once, exactly_once]medium
security.protocolProtocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL.stringPLAINTEXTmedium
application.serverA host:port pair pointing to an embedded user defined endpoint that can be used for discovering the locations of state stores within a single KafkaStreams applicationstring""low
buffered.records.per.partitionThe maximum number of records to buffer per partition.int1000low
commit.interval.msThe frequency with which to save the position of the processor. (Note, if 'processing.guarantee' is set to 'exactly_once', the default value is 100, otherwise the default value is 30000.long30000low
connections.max.idle.msClose idle connections after the number of milliseconds specified by this config.long540000low
key.serdeSerializer / deserializer class for key that implements the org.apache.kafka.common.serialization.Serde interface. This config is deprecated, use default.key.serde insteadclassnulllow
metadata.max.age.msThe period of time in milliseconds after which we force a refresh of metadata even if we haven't seen any partition leadership changes to proactively discover any new brokers or partitions.long300000[0,...]low
metric.reportersA list of classes to use as metrics reporters. Implementing the org.apache.kafka.common.metrics.MetricsReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics.list""low
metrics.num.samplesThe number of samples maintained to compute metrics.int2[1,...]low
metrics.recording.levelThe highest recording level for metrics.stringINFO[INFO, DEBUG]low
metrics.sample.window.msThe window of time a metrics sample is computed over.long30000[0,...]low
partition.grouperPartition grouper class that implements the org.apache.kafka.streams.processor.PartitionGrouper interface.classorg.apache.kafka.streams.processor.DefaultPartitionGrouperlow
poll.msThe amount of time in milliseconds to block waiting for input.long100low
receive.buffer.bytesThe size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used.int32768[0,...]low
reconnect.backoff.max.msThe maximum amount of time in milliseconds to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms.long1000[0,...]low
reconnect.backoff.msThe base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker.long50[0,...]low
request.timeout.msThe configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted.int40000[0,...]low
retry.backoff.msThe amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios.long100[0,...]low
rocksdb.config.setterA Rocks DB config setter class or class name that implements the org.apache.kafka.streams.state.RocksDBConfigSetter interfaceclassnulllow
send.buffer.bytesThe size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used.int131072[0,...]low
state.cleanup.delay.msThe amount of time in milliseconds to wait before deleting state when a partition has migrated. Only state directories that have not been modified for at least state.cleanup.delay.ms will be removedlong600000low
timestamp.extractorTimestamp extractor class that implements the org.apache.kafka.streams.processor.TimestampExtractor interface. This config is deprecated, use default.timestamp.extractor insteadclassnulllow
upgrade.fromAllows upgrading from version 0.10.0 to version 0.10.1 (or newer) in a backward compatible way. Default is null. Accepted values are "0.10.0" (for upgrading from 0.10.0.x).stringnull[null, 0.10.0]low
value.serdeSerializer / deserializer class for value that implements the org.apache.kafka.common.serialization.Serde interface. This config is deprecated, use default.value.serde insteadclassnulllow
windowstore.changelog.additional.retention.msAdded to a windows maintainMs to ensure data is not deleted from the log prematurely. Allows for clock drift. Default is 1 daylong86400000low
zookeeper.connectZookeeper connect string for Kafka topics management. This config is deprecated and will be ignored as Streams API does not use Zookeeper anymore.string""low

AdminClient Configs

Below is the configuration of the Kafka Admin client library.

NameDescriptionTypeDefaultValid ValuesImportance
bootstrap.serversA list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,.... Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down).listhigh
ssl.key.passwordThe password of the private key in the key store file. This is optional for client.passwordnullhigh
ssl.keystore.locationThe location of the key store file. This is optional for client and can be used for two-way authentication for client.stringnullhigh
ssl.keystore.passwordThe store password for the key store file. This is optional for client and only needed if ssl.keystore.location is configured.passwordnullhigh
ssl.truststore.locationThe location of the trust store file.stringnullhigh
ssl.truststore.passwordThe password for the trust store file. If a password is not set access to the truststore is still available, but integrity checking is disabled.passwordnullhigh
client.idAn id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging.string""medium
connections.max.idle.msClose idle connections after the number of milliseconds specified by this config.long300000medium
receive.buffer.bytesThe size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used.int65536[-1,...]medium
request.timeout.msThe configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted.int120000[0,...]medium
sasl.jaas.configJAAS login context parameters for SASL connections in the format used by JAAS configuration files. JAAS configuration file format is described here. The format for the value is: ' (=)*;'passwordnullmedium
sasl.kerberos.service.nameThe Kerberos principal name that Kafka runs as. This can be defined either in Kafka's JAAS config or in Kafka's config.stringnullmedium
sasl.mechanismSASL mechanism used for client connections. This may be any mechanism for which a security provider is available. GSSAPI is the default mechanism.stringGSSAPImedium
security.protocolProtocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL.stringPLAINTEXTmedium
send.buffer.bytesThe size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used.int131072[-1,...]medium
ssl.enabled.protocolsThe list of protocols enabled for SSL connections.listTLSv1.2,TLSv1.1,TLSv1medium
ssl.keystore.typeThe file format of the key store file. This is optional for client.stringJKSmedium
ssl.protocolThe SSL protocol used to generate the SSLContext. Default setting is TLS, which is fine for most cases. Allowed values in recent JVMs are TLS, TLSv1.1 and TLSv1.2. SSL, SSLv2 and SSLv3 may be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities.stringTLSmedium
ssl.providerThe name of the security provider used for SSL connections. Default value is the default security provider of the JVM.stringnullmedium
ssl.truststore.typeThe file format of the trust store file.stringJKSmedium
metadata.max.age.msThe period of time in milliseconds after which we force a refresh of metadata even if we haven't seen any partition leadership changes to proactively discover any new brokers or partitions.long300000[0,...]low
metric.reportersA list of classes to use as metrics reporters. Implementing the org.apache.kafka.common.metrics.MetricsReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics.list""low
metrics.num.samplesThe number of samples maintained to compute metrics.int2[1,...]low
metrics.recording.levelThe highest recording level for metrics.stringINFO[INFO, DEBUG]low
metrics.sample.window.msThe window of time a metrics sample is computed over.long30000[0,...]low
reconnect.backoff.max.msThe maximum amount of time in milliseconds to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms.long1000[0,...]low
reconnect.backoff.msThe base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker.long50[0,...]low
retriesThe maximum number of times to retry a call before failing it.int5[0,...]low
retry.backoff.msThe amount of time to wait before attempting to retry a failed request. This avoids repeatedly sending requests in a tight loop under some failure scenarios.long100[0,...]low
sasl.kerberos.kinit.cmdKerberos kinit command path.string/usr/bin/kinitlow
sasl.kerberos.min.time.before.reloginLogin thread sleep time between refresh attempts.long60000low
sasl.kerberos.ticket.renew.jitterPercentage of random jitter added to the renewal time.double0.05low
sasl.kerberos.ticket.renew.window.factorLogin thread will sleep until the specified window factor of time from last refresh to ticket's expiry has been reached, at which time it will try to renew the ticket.double0.8low
ssl.cipher.suitesA list of cipher suites. This is a named combination of authentication, encryption, MAC and key exchange algorithm used to negotiate the security settings for a network connection using TLS or SSL network protocol. By default all the available cipher suites are supported.listnulllow
ssl.endpoint.identification.algorithmThe endpoint identification algorithm to validate server hostname using server certificate.stringnulllow
ssl.keymanager.algorithmThe algorithm used by key manager factory for SSL connections. Default value is the key manager factory algorithm configured for the Java Virtual Machine.stringSunX509low
ssl.secure.random.implementationThe SecureRandom PRNG implementation to use for SSL cryptography operations.stringnulllow
ssl.trustmanager.algorithmThe algorithm used by trust manager factory for SSL connections. Default value is the trust manager factory algorithm configured for the Java Virtual Machine.stringPKIXlow