Dynamic Kafka controller quorum

With dynamic quorum configuration, KRaft clusters can now scale controller nodes without downtime.

November 27, 2024
Federico Valeri, Luke Chen
Related topics:
Kafka, Operators, Stream processing
Related products:
Streams for Apache Kafka


    Scalability is the measure of a system's ability to increase or decrease in performance, cost, and availability in response to changes in application and system processing demands. In distributed systems, this can be done by scaling existing server resources up or down (vertical scaling), or by scaling the number of servers out or in (horizontal scaling). The former is much simpler but limited, while the latter is more expensive but enables much larger scale.

    Dynamic scalability means that scaling can be done without requiring system downtime. In some environments, these systems can also scale autonomously based on the actual load. In this article, we look at the new dynamic quorum configuration and its use cases.

    Controllers in KRaft are Kafka nodes that use the Raft consensus algorithm to elect a leader (the active controller) and replicate the cluster metadata. Before Kafka v3.9.0, KRaft-based clusters only allowed a static quorum configuration, where the set of controller nodes (a.k.a. voters) is fixed and cannot be changed without a restart.
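
    For reference, a static quorum is declared by listing every voter in the controller.quorum.voters property of each node. A minimal sketch (hostnames and ports are illustrative):

    # Static quorum: the voter set is hardcoded in every node's configuration
    controller.quorum.voters=0@localhost:8000,1@localhost:8001,2@localhost:8002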

    Kafka v3.9.0 introduces support for dynamic quorums with KIP-853. This was one of the missing features required to reach feature parity with ZooKeeper-based clusters. From now on, a dynamic quorum should be the preferred way of configuring a KRaft-based cluster, as changing the quorum doesn't require cluster downtime.

    Dynamic cluster scaling is important for a number of use cases:

    • Scale: The operator wants to scale the number of controllers by adding or removing a controller (this is pretty rare in practice, as the controller quorum is set once and seldom changed).
    • Replace: The operator wants to replace a controller because of a disk or hardware failure.
    • Migrate: The operator wants to migrate the cluster from old to new machines, or change KRaft architecture (e.g., moving from dedicated controllers to combined mode nodes).

    Implementation

    The dynamic KRaft quorum extends the basic consensus algorithm. Specifically, it introduces three additional remote procedure calls (RPCs): AddRaftVoter, RemoveRaftVoter, and UpdateRaftVoter. Additionally, two new metadata records have been added: KRaftVersionRecord and VotersRecord.

    In addition to the RPCs and metadata records, which enable this new feature, there are important safety and availability constraints to be considered.

    Safety

    Cluster metadata is replicated asynchronously through the single-partition __cluster_metadata topic (a.k.a. the metadata log), so it is not possible to atomically switch all controller configurations at once. For this reason, adding or removing many controllers in one go is not safe, as the cluster can potentially split into two independent majorities (disjoint majorities). See Figure 1.

    Figure 1: Disjoint majorities problem. Slightly modified version of figure 10 from the Raft paper by Diego Ongaro and John Ousterhout.

    To address this issue, KIP-853 adds the following restriction: only one controller can be added or removed from the cluster at a time. In other words, it is only safe to start another quorum change once a majority of the old quorum has moved to operating under the new quorum. This rule ensures overlap between the majority of the old and new quorums, maintaining a controlling majority during the transition. The active controller acknowledges the configuration change by committing to the metadata log after hearing back from the new majority.

    More complex quorum changes are implemented as a series of single-controller changes. In this case, it is preferable to add controllers before removing them. For example, to replace a controller in a three-controller cluster, adding the new controller first and then removing the old one allows the system to tolerate one controller failure at any point throughout the process.

    Availability

    It's important to let new controllers catch up with the active controller's metadata log before joining the quorum. A three-controller cluster can tolerate one failure. However, if a fourth controller with an empty log is added to the same cluster, and one of the original three controllers fails, the cluster will be temporarily unable to commit new metadata entries. A similar availability issue can occur if many new controllers are added to a cluster in quick succession, where the new controllers are needed to form a majority of the cluster.

    To avoid this issue, KIP-853 adds an additional phase before the quorum configuration change, in which a new controller always joins the cluster as an observer. This means that the new controller starts to replicate the metadata log, but does not yet count towards majorities. Once the new controller has caught up with the active controller, it can be added to the quorum set. The active controller aborts the change if the new controller is unavailable (e.g., hardware failure, configuration error), or it is so slow that it will never catch up.

    Removing a controller can also be disruptive, as it could lead to unnecessary and repeated leader elections. Until KIP-996 has been implemented and released, it is recommended to shut down the controller to be removed before running the remove-controller command. If the removed controller is the active controller, it will only resign from the quorum when the new quorum is committed, but it won't count itself when computing the high watermark (commit check).
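
    For example, the recommended removal order looks like the following sketch (the node and directory IDs are illustrative; the remove-controller command is shown in context later in this article):

    # 1. Stop the Kafka process of the controller to be removed (here, node 2).
    # 2. Then remove it from the quorum via any remaining controller:
    $ bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:8000 \
        remove-controller --controller-id 2 --controller-directory-id <directory-id>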

    Practical examples

    The Kafka documentation contains some basic command-line examples showing how to scale the controller set up or down. Here, we instead present a couple of practical examples of how dynamic scaling can be used to perform common maintenance tasks without cluster downtime:

    1. Replace a controller with a failed disk.
    2. Migrate to a different KRaft architecture.

    To configure a dynamic quorum, the kraft.version feature must be at version 1 or above on all cluster nodes. Note that downgrading is not supported. You can find the cluster's finalized KRaft version by running the following command:

    $ bin/kafka-features.sh --bootstrap-controller localhost:8000 describe | grep kraft.version
    Feature: kraft.version  SupportedMinVersion: 0  SupportedMaxVersion: 1  FinalizedVersionLevel: 1  Epoch: 5

    When using a static quorum, the configuration file of each node must list all controllers in controller.quorum.voters. When using a dynamic quorum, you instead set controller.quorum.bootstrap.servers, and either use the standalone flag or pass the initial controller set when formatting the disk. In the following examples, we'll pass the initial controller set to format the controllers and quickly form a three-node quorum.
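
    For comparison, the standalone approach formats only the first controller with the --standalone flag, making it a single-voter quorum that the remaining controllers can later be added to. A minimal sketch, assuming CLUSTER_ID already holds the cluster UUID:

    $ bin/kafka-storage.sh format \
        --config /opt/kafka/server0/config/server.properties \
        --cluster-id "$CLUSTER_ID" \
        --standalone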

    Create a test cluster

    Let's first create a brand new cluster with three controllers (0, 1, 2) and three brokers (7, 8, 9). In this case, we are running all cluster nodes on localhost. We show the configuration of the first controller and broker as a reference for creating the other nodes. The quorum bootstrap servers are configured similarly to the bootstrap servers configuration in Kafka clients. See below:

    # controller
    node.id=0
    process.roles=controller
    listeners=CONTROLLER://localhost:8000
    listener.security.protocol.map=CONTROLLER:PLAINTEXT
    controller.listener.names=CONTROLLER
    controller.quorum.bootstrap.servers=localhost:8000,localhost:8001,localhost:8002
    metadata.log.dir=/opt/kafka/server0/metadata
    log.dirs=/opt/kafka/server0/data

    # broker
    node.id=7
    process.roles=broker
    listeners=REPLICATION://localhost:8007,CLIENT://localhost:9092
    listener.security.protocol.map=CONTROLLER:PLAINTEXT,REPLICATION:PLAINTEXT,CLIENT:PLAINTEXT
    advertised.listeners=REPLICATION://localhost:8007,CLIENT://localhost:9092
    controller.listener.names=CONTROLLER
    inter.broker.listener.name=REPLICATION
    controller.quorum.bootstrap.servers=localhost:8000,localhost:8001,localhost:8002
    metadata.log.dir=/opt/kafka/server7/metadata
    log.dirs=/opt/kafka/server7/data

    Before starting each controller node, we have to format the disk. This can be done via "standalone" mode or "initial controllers" mode. Here, we use the latter by passing the initial controller set. Both cluster and directory IDs can be generated using the storage tool. Dynamic controllers include a directory UUID within their meta.properties. This unique identifier helps to differentiate log directories in the controller (see the disk replacement example below):

    $ CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
    $ DIR_ID_0="$(bin/kafka-storage.sh random-uuid)"
    $ DIR_ID_1="$(bin/kafka-storage.sh random-uuid)"
    $ DIR_ID_2="$(bin/kafka-storage.sh random-uuid)"
    $ bin/kafka-storage.sh format \
        --config /opt/kafka/server0/config/server.properties \
        --cluster-id "$CLUSTER_ID" \
        --initial-controllers "0@localhost:8000:$DIR_ID_0,1@localhost:8001:$DIR_ID_1,2@localhost:8002:$DIR_ID_2"

    After that, there will be some files generated under the metadata log directory:

    $ tree /opt/kafka/server0/metadata
    /opt/kafka/server0/metadata
    ├── bootstrap.checkpoint
    ├── __cluster_metadata-0
    │   ├── 00000000000000000000-0000000000.checkpoint
    │   ├── 00000000000000000000.index
    │   ├── 00000000000000000000.log
    │   ├── 00000000000000000000.timeindex
    │   ├── leader-epoch-checkpoint
    │   ├── partition.metadata
    │   └── quorum-state
    └── meta.properties

    2 directories, 9 files

    The meta.properties file is a plaintext file that contains the metadata of the node, including the node ID, directory ID, and cluster ID:

    $ cat /opt/kafka/server0/metadata/meta.properties
    #
    #Thu Nov 07 15:19:01 CST 2024
    node.id=0
    directory.id=pbvuBlaTTwKRxS5NLJwRFQ
    version=1
    cluster.id=ucil9Sd9R7Ss8st7gV-Krg

    The bootstrap.checkpoint file is a binary file that contains the metadata version of this node:

    $ bin/kafka-dump-log.sh --cluster-metadata-decoder \
        --files /opt/kafka/server0/metadata/bootstrap.checkpoint | grep metadata.version
    | offset: 1 CreateTime: 1731059573622 keySize: -1 valueSize: 23 sequence: -1 headerKeys: [] payload: {"type":"FEATURE_LEVEL_RECORD","version":0,"data":{"name":"metadata.version","featureLevel":21}}

    The 00000000000000000000-0000000000.checkpoint snapshot file is new in v3.9.0, and its purpose is to store the KRaftVersionRecord and VotersRecord records mentioned above. Here we can see it contains the information for all of the controllers, including their endpoints:

    $ bin/kafka-dump-log.sh --cluster-metadata-decoder \
        --files /opt/kafka/server0/metadata/__cluster_metadata-0/00000000000000000000-0000000000.checkpoint | grep "KRaftVersion\|KRaftVoters"
    | offset: 1 CreateTime: 1731059573639 keySize: 4 valueSize: 5 sequence: -1 headerKeys: [] KRaftVersion {"version":0,"kRaftVersion":1}
    | offset: 2 CreateTime: 1731059573639 keySize: 4 valueSize: 157 sequence: -1 headerKeys: [] KRaftVoters {"version":0,"voters":[{"voterId":0,"voterDirectoryId":"YQfYGinOSneT_INBzNf-Ew","endpoints":[{"name":"CONTROLLER","host":"localhost","port":8000}],"kRaftVersionFeature":{"minSupportedVersion":0,"maxSupportedVersion":1}},{"voterId":1,"voterDirectoryId":"bnbzYpVnQwGU10389c73eg","endpoints":[{"name":"CONTROLLER","host":"localhost","port":8001}],"kRaftVersionFeature":{"minSupportedVersion":0,"maxSupportedVersion":1}},{"voterId":2,"voterDirectoryId":"yHVmF0CFTtunPZU31gkobQ","endpoints":[{"name":"CONTROLLER","host":"localhost","port":8002}],"kRaftVersionFeature":{"minSupportedVersion":0,"maxSupportedVersion":1}}]}

    Once all cluster nodes are started, the quorum should be formed and in-sync:

    $ bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:8000 describe --re --hu
    NodeId  DirectoryId             LogEndOffset  Lag  LastFetchTimestamp  LastCaughtUpTimestamp  Status
    0       pbvuBlaTTwKRxS5NLJwRFQ  105           0    7 ms ago            7 ms ago               Leader
    1       QjRpFtVDTtCa8OLXiSbmmA  105           0    84 ms ago           84 ms ago              Follower
    2       slcsM5ZAR0SMIF_u__MAeg  105           0    84 ms ago           84 ms ago              Follower
    8       aXLz3ixjqzXhCYqKHRD4WQ  105           0    85 ms ago           85 ms ago              Observer
    7       KCriHQZm3TlxvEVNgyWKJw  105           0    85 ms ago           85 ms ago              Observer
    9       v5nnIwK8r0XqjyqlIPW-aw  105           0    86 ms ago           86 ms ago              Observer

    Replace a controller with a failed disk

    When one of the controllers has a disk failure, the operator can replace this disk with a new one, as shown in Figure 2.

    Figure 2: Controller disk replacement.

    The replaced disk needs to be formatted with a new directory ID:

    $ CLUSTER_ID="$(bin/kafka-cluster.sh cluster-id --bootstrap-server localhost:9092 | awk -F': ' '{print $2}')"
    $ bin/kafka-storage.sh format \
        --config /opt/kafka/server2/config/server.properties \
        --cluster-id "$CLUSTER_ID" \
        --no-initial-controllers \
        --ignore-formatted
    Formatting metadata directory /opt/kafka/server2/metadata with metadata.version 3.9-IV0.

    After restarting the controller, the quorum will contain two nodes with ID 2: the original incarnation with the failed disk, an ever-growing lag, and Follower status, plus a new one with a different directory ID and Observer status. See below:

    $ bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:8000 describe --re --hu
    NodeId  DirectoryId             LogEndOffset  Lag  LastFetchTimestamp  LastCaughtUpTimestamp  Status
    0       pbvuBlaTTwKRxS5NLJwRFQ  535           0    6 ms ago            6 ms ago               Leader
    1       QjRpFtVDTtCa8OLXiSbmmA  535           0    283 ms ago          283 ms ago             Follower
    2       slcsM5ZAR0SMIF_u__MAeg  407           128  63307 ms ago        63802 ms ago           Follower
    2       wrqMDI1WDsqaooVSOtlgYw  535           0    281 ms ago          281 ms ago             Observer
    8       aXLz3ixjqzXhCYqKHRD4WQ  535           0    284 ms ago          284 ms ago             Observer
    7       KCriHQZm3TlxvEVNgyWKJw  535           0    284 ms ago          284 ms ago             Observer
    9       v5nnIwK8r0XqjyqlIPW-aw  535           0    284 ms ago          284 ms ago             Observer

    Once the new controller is in sync with the leader (lag near zero), we scale the quorum down to remove the old controller, and then scale it up again to add the new controller to the quorum set:

    $ bin/kafka-metadata-quorum.sh \
        --bootstrap-controller localhost:8000 \
        remove-controller \
        --controller-id 2 \
        --controller-directory-id slcsM5ZAR0SMIF_u__MAeg
    Removed KRaft controller 2 with directory id slcsM5ZAR0SMIF_u__MAeg

    $ bin/kafka-metadata-quorum.sh \
        --bootstrap-controller localhost:8000 \
        --command-config /opt/kafka/server2/config/server.properties \
        add-controller
    Added controller 2 with directory id wrqMDI1WDsqaooVSOtlgYw and endpoints: CONTROLLER://localhost:8002

    We have now successfully replaced the failed disk and restored the original quorum:

    $ bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:8000 describe --re --hu
    NodeId  DirectoryId             LogEndOffset  Lag  LastFetchTimestamp  LastCaughtUpTimestamp  Status
    0       pbvuBlaTTwKRxS5NLJwRFQ  3367          0    7 ms ago            7 ms ago               Leader
    1       QjRpFtVDTtCa8OLXiSbmmA  3367          0    229 ms ago          229 ms ago             Follower
    2       wrqMDI1WDsqaooVSOtlgYw  3367          0    230 ms ago          230 ms ago             Follower
    8       aXLz3ixjqzXhCYqKHRD4WQ  3367          0    230 ms ago          230 ms ago             Observer
    7       KCriHQZm3TlxvEVNgyWKJw  3367          0    230 ms ago          230 ms ago             Observer
    9       v5nnIwK8r0XqjyqlIPW-aw  3367          0    230 ms ago          230 ms ago             Observer

    Migrate to a different KRaft architecture

    Migrating from a ZooKeeper-based cluster to a KRaft-based cluster requires three dedicated controller nodes. Once the migration is completed, the operator may want to reduce resource usage to just three nodes, where each cluster node acts as both controller and broker (combined mode). See Figure 3.

    Figure 3: KRaft architecture migration.

    First, we update all controllers, adding the broker role and the missing configuration. Here is the diff against the previous configuration of the first controller:

    $ diff prev.properties updated.properties
    2,4c2,5
    < process.roles=controller
    < listeners=CONTROLLER://localhost:8000
    < listener.security.protocol.map=CONTROLLER:PLAINTEXT
    ---
    > process.roles=controller,broker
    > listeners=CONTROLLER://localhost:8000,REPLICATION://localhost:8010,CLIENT://localhost:9095
    > listener.security.protocol.map=CONTROLLER:PLAINTEXT,REPLICATION:PLAINTEXT,CLIENT:PLAINTEXT
    > advertised.listeners=REPLICATION://localhost:8010,CLIENT://localhost:9095
    5a7
    > inter.broker.listener.name=REPLICATION

    This configuration change requires a rolling update in which you restart all controllers one at a time. To minimize the number of leader elections, you can restart the current active controller last:

    $ bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:8000 describe --re --hu
    NodeId  DirectoryId             LogEndOffset  Lag  LastFetchTimestamp  LastCaughtUpTimestamp  Status
    2       wrqMDI1WDsqaooVSOtlgYw  10009         0    6 ms ago            6 ms ago               Leader
    0       pbvuBlaTTwKRxS5NLJwRFQ  10009         0    362 ms ago          362 ms ago             Follower
    1       QjRpFtVDTtCa8OLXiSbmmA  10009         0    362 ms ago          362 ms ago             Follower
    8       aXLz3ixjqzXhCYqKHRD4WQ  10009         0    363 ms ago          363 ms ago             Observer
    7       KCriHQZm3TlxvEVNgyWKJw  10009         0    363 ms ago          363 ms ago             Observer
    9       v5nnIwK8r0XqjyqlIPW-aw  10009         0    363 ms ago          363 ms ago             Observer
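
    As an aside, the active controller can also be identified directly from the LeaderId field reported by the describe --status subcommand of the same tool:

    # LeaderId in the status output is the node ID of the active controller
    $ bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:8000 describe --status | grep LeaderId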

    Now, we have to move all topic replicas from nodes 7, 8, and 9 to nodes 0, 1, and 2 using the partition reassignment tool. In our cluster we only have a few topics, but on a real cluster we recommend using the remove broker endpoint of the Cruise Control project, which moves all topics while keeping the cluster balanced. Once the reassignment is completed, the topics should have partitions spread among the combined nodes, as in the following example:

    $ bin/kafka-topics.sh --bootstrap-server localhost:9095 --describe --topic my-topic
    Topic: my-topic  TopicId: RHzP92QGSwy-RTcm5a5IpA  PartitionCount: 5  ReplicationFactor: 3  Configs:
        Topic: my-topic  Partition: 0  Leader: 2  Replicas: 2,1,0  Isr: 0,1,2  Elr:  LastKnownElr:
        Topic: my-topic  Partition: 1  Leader: 1  Replicas: 1,0,2  Isr: 0,1,2  Elr:  LastKnownElr:
        Topic: my-topic  Partition: 2  Leader: 0  Replicas: 0,2,1  Isr: 0,1,2  Elr:  LastKnownElr:
        Topic: my-topic  Partition: 3  Leader: 1  Replicas: 1,0,2  Isr: 0,1,2  Elr:  LastKnownElr:
        Topic: my-topic  Partition: 4  Leader: 2  Replicas: 2,1,0  Isr: 0,1,2  Elr:  LastKnownElr:
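
    For reference, here is a minimal sketch of driving such a reassignment with the standard tool (the topics-to-move.json file listing the topics to move is illustrative):

    # topics-to-move.json: {"version":1,"topics":[{"topic":"my-topic"}]}
    # Generate a candidate reassignment targeting the combined nodes 0, 1, and 2
    $ bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9095 \
        --topics-to-move-json-file topics-to-move.json --broker-list "0,1,2" --generate
    # Save the proposed assignment to reassignment.json, then execute and verify it
    $ bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9095 \
        --reassignment-json-file reassignment.json --execute
    $ bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9095 \
        --reassignment-json-file reassignment.json --verify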

    At this point, all data has been moved to the combined nodes, so we can stop the dedicated brokers and scale down the cluster. Before stopping them, we first have to unregister each broker to avoid issues such as newly created partitions still being assigned to the removed replicas, or errors when the metadata version is updated after a cluster upgrade:

    $ bin/kafka-cluster.sh unregister --bootstrap-server localhost:9095 --id 7 \
        && pkill -SIGKILL -ef "server7" && sleep 5
    Broker 7 is no longer registered.
    $ bin/kafka-cluster.sh unregister --bootstrap-server localhost:9095 --id 8 \
        && pkill -SIGKILL -ef "server8" && sleep 5
    Broker 8 is no longer registered.
    $ bin/kafka-cluster.sh unregister --bootstrap-server localhost:9095 --id 9 \
        && pkill -SIGKILL -ef "server9" && sleep 5
    Broker 9 is no longer registered.

    This is the final result with only three cluster nodes:

    $ bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:8000 describe --re --hu
    NodeId  DirectoryId             LogEndOffset  Lag  LastFetchTimestamp  LastCaughtUpTimestamp  Status
    2       wrqMDI1WDsqaooVSOtlgYw  13840         0    6 ms ago            6 ms ago               Leader
    0       pbvuBlaTTwKRxS5NLJwRFQ  13840         0    387 ms ago          387 ms ago             Follower
    1       QjRpFtVDTtCa8OLXiSbmmA  13840         0    387 ms ago          387 ms ago             Follower

    Limitations and future work

    At the time of writing, the dynamic controller quorum feature has the following limitations:

    • It is not possible to convert KRaft-based clusters using a static controller quorum to a dynamic controller quorum configuration (KAFKA-16538).
    • To avoid disrupting or slowing down leader elections and metadata changes, the operator needs to ensure that the new controller is able to keep up with the leader before adding it to the quorum set.

    In future releases, the Kafka community will focus on the following improvements:

    • Converting KRaft-based clusters using a static controller quorum to a dynamic controller quorum configuration. This is important because it allows KRaft quorums formatted in v3.8.1 or earlier to upgrade to a dynamic controller quorum.
    • Adding new metrics for the dynamic controller quorum. During a membership change, it is helpful to have metrics that let the operator know the current status, such as the number of uncommitted voter changes and the number of voters.
    • Implementing the controller.quorum.auto.join.enable configuration to control whether a KRaft controller should automatically join the cluster metadata partition based on its cluster ID. This lets a new controller catch up with the leader and then be added to the quorum set automatically.

    Conclusion

    Dynamically scaling Kafka clusters offers significant improvements in flexibility and efficiency, especially for handling changing workloads, hardware failures, and cluster migrations. With the dynamic quorum configuration introduced in Kafka v3.9.0, KRaft clusters can now scale controller nodes without downtime.

    This new functionality enables Kafka operators to scale and manage clusters more effectively, whether by adding or removing controllers to meet demand or by conducting maintenance tasks like disk replacement and KRaft architecture migration.

    While there are limitations and safety precautions, such as the inability to migrate from a static to a dynamic quorum and the need to let new controllers synchronize before joining the quorum, the dynamic KRaft quorum greatly enhances Kafka's resilience and adaptability.
