| 1 | += KRaft |
| 2 | +:description: Apache Kafka KRaft mode with the Stackable Operator for Apache Kafka |
| 3 | + |
| 4 | +Apache Kafka's KRaft mode (Kafka Raft Metadata mode) replaces Apache ZooKeeper with Kafka’s own built-in consensus mechanism based on the Raft protocol. |
| 5 | +This simplifies Kafka’s architecture, reducing operational complexity by consolidating cluster metadata management into Kafka itself. |
| 6 | + |
| 7 | +WARNING: The Stackable Operator for Apache Kafka currently does not support automatic cluster upgrades from Apache ZooKeeper to KRaft. |
| 8 | + |
| 9 | +== Overview |
| 10 | + |
| 11 | +* Introduced: Kafka 2.8.0 (early preview, not production-ready). |
| 12 | +* Matured: Kafka 3.3.x (production-ready, though ZooKeeper is still supported). |
| 13 | +* Default & Recommended: Kafka 3.5+ strongly recommends KRaft for new clusters. |
| 14 | +* Full Replacement: Kafka 4.0.0 (2025) removes ZooKeeper completely. |
| 15 | +* Migration: Tools exist to migrate from ZooKeeper to KRaft, but new deployments should start with KRaft. |
| 16 | + |
| 17 | +== Configuration |
| 18 | + |
| 19 | +The Stackable Kafka operator introduces a new xref:concepts:roles-and-role-groups.adoc[role] in the KafkaCluster CRD, the KRaft `Controller`, configured under `spec.controllers`. |
| 20 | +Configuring the `Controller` role puts Kafka into KRaft mode, so Apache ZooKeeper is no longer required. |
| 21 | + |
| 22 | +[source,yaml] |
| 23 | +---- |
| 24 | +apiVersion: kafka.stackable.tech/v1alpha1 |
| 25 | +kind: KafkaCluster |
| 26 | +metadata: |
| 27 | +  name: kafka |
| 28 | +spec: |
| 29 | +  image: |
| 30 | +    productVersion: "3.9.1" |
| 31 | +  brokers: |
| 32 | +    roleGroups: |
| 33 | +      default: |
| 34 | +        replicas: 1 |
| 35 | +  controllers: |
| 36 | +    roleGroups: |
| 37 | +      default: |
| 38 | +        replicas: 3 |
| 39 | +---- |
| 40 | + |
| 41 | +NOTE: The `controllers` role is mutually exclusive with `spec.clusterConfig.zookeeperConfigMapName`. |
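| | + |
| | +For comparison, a ZooKeeper-based cluster sets the ZooKeeper discovery ConfigMap and omits the `controllers` role. |
| | +The following is a minimal sketch under assumptions, not a verified manifest: the ConfigMap name `kafka-znode` is hypothetical and has to match a discovery ConfigMap that exists in your cluster. |
| | + |
| | +[source,yaml] |
| | +---- |
| | +apiVersion: kafka.stackable.tech/v1alpha1 |
| | +kind: KafkaCluster |
| | +metadata: |
| | +  name: kafka |
| | +spec: |
| | +  image: |
| | +    productVersion: "3.9.1" |
| | +  clusterConfig: |
| | +    # hypothetical ConfigMap name, replace with your own |
| | +    zookeeperConfigMapName: kafka-znode |
| | +  brokers: |
| | +    roleGroups: |
| | +      default: |
| | +        replicas: 1 |
| | +  # no controllers role: the cluster runs in ZooKeeper mode |
| | +---- |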
| 42 | + |
| 43 | +=== Recommendations |
| 44 | + |
| 45 | +A minimal KRaft setup consisting of at least 3 Controllers has the following https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/[resource requirements]: |
| 46 | + |
| 47 | +* `600m` CPU request |
| 48 | +* `3000m` CPU limit |
| 49 | +* `3000Mi` memory request and limit |
| 50 | +* `6Gi` persistent storage |
| 51 | + |
| 52 | +NOTE: The total number of Controller replicas across all role groups should be odd, because an even count adds no fault tolerance to the Raft quorum. |
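| | + |
| | +When Controllers are spread over several role groups, the total across all groups is what counts. |
| | +A minimal sketch, assuming two hypothetical role groups named `zone-a` and `zone-b`: 2 + 1 replicas give a quorum of 3 Controllers, which tolerates the loss of one Controller. |
| | + |
| | +[source,yaml] |
| | +---- |
| | +controllers: |
| | +  roleGroups: |
| | +    zone-a: # hypothetical role group name |
| | +      replicas: 2 |
| | +    zone-b: # hypothetical role group name |
| | +      replicas: 1 |
| | +---- |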
| 53 | + |
| 54 | +=== Resources |
| 55 | + |
| 56 | +Corresponding to the values above, the operator uses the following resource defaults: |
| 57 | + |
| 58 | +[source,yaml] |
| 59 | +---- |
| 60 | +controllers: |
| 61 | +  config: |
| 62 | +    resources: |
| 63 | +      memory: |
| 64 | +        limit: 1Gi |
| 65 | +      cpu: |
| 66 | +        min: 250m |
| 67 | +        max: 1000m |
| 68 | +      storage: |
| 69 | +        logDirs: |
| 70 | +          capacity: 2Gi |
| 71 | +---- |
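| | + |
| | +These defaults can be adjusted for an individual role group. |
| | +The fragment below is a sketch that assumes the role group accepts the same `config.resources` structure as the role level shown above; the values are illustrative, not recommendations. |
| | + |
| | +[source,yaml] |
| | +---- |
| | +controllers: |
| | +  roleGroups: |
| | +    default: |
| | +      replicas: 3 |
| | +      config: |
| | +        resources: |
| | +          memory: |
| | +            limit: 2Gi # illustrative value |
| | +          cpu: |
| | +            min: 500m # illustrative value |
| | +            max: 2000m # illustrative value |
| | +          storage: |
| | +            logDirs: |
| | +              capacity: 4Gi # illustrative value |
| | +---- |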
| 72 | + |
| 73 | +=== Affinities |
| 74 | + |
| 75 | +=== PDBs |
| 76 | + |
| 77 | +=== Overrides |
| 78 | + |
| 79 | +== Internal operator details |
| 80 | + |
| 81 | +KRaft mode requires major configuration changes compared to ZooKeeper: |
| 82 | + |
| 83 | +* `cluster-id`: Set to the `metadata.uid` of the KafkaCluster resource during initial formatting. |
| 84 | +* `node.id`: A calculated integer, hashed from the `role`, `rolegroup` and `replica` id. |
| 85 | +* `process.roles`: Always either `broker` or `controller`. Mixed `broker,controller` servers are not possible. |
| 86 | + |
| 87 | +== Troubleshooting |
| 88 | + |
| 89 | +=== Cluster does not start |
| 90 | + |
| 91 | +Check that at least a quorum (majority) of controllers are reachable. |
| 92 | + |
| 93 | +=== Frequent leader elections |
| 94 | + |
| 95 | +Likely caused by controller resource starvation or unstable Kubernetes scheduling. |
| 96 | + |
| 97 | +=== Migration issues (ZooKeeper to KRaft) |
| 98 | + |
| 99 | +Ensure that you run Kafka version 3.9.x or higher, and follow the official migration documentation. |
| 100 | + |
| 101 | +=== Scaling issues |
| 102 | + |
| 103 | +https://developers.redhat.com/articles/2024/11/27/dynamic-kafka-controller-quorum[Dynamic scaling] of the controller quorum is only supported as of Kafka version 3.9.0. |
| 104 | +With older versions, automatic scaling (e.g. adding or removing controller replicas) may not work properly. |