Kueue: Scheduling and Queuing for Batch Workloads

Kueue is a Kubernetes-native job queue that controls how batch and HPC workloads are admitted to the cluster. This module introduces why a queue layer is needed on top of the Kubernetes scheduler and outlines the core concepts that make Kueue work.

Why Kueue Is Needed

The default Kubernetes scheduler places Pods onto nodes as soon as resources are available. For batch and HPC jobs, that behavior leads to problems:

  • No admission control — Every submitted job competes immediately for capacity. A large batch job can starve smaller jobs or cause resource fragmentation.

  • No fair sharing — There is no built-in way to give different teams or projects a share of cluster capacity or to prioritize certain workloads.

  • No queue semantics — Users expect to submit jobs to a queue and have them start when capacity allows, rather than having jobs fail or sit unscheduled indefinitely.

  • Mismatch with batch lifecycle — Batch jobs have a clear start and end. The system should hold them until there is enough capacity to run them (often with gang scheduling), then release them as a unit.

Kueue adds a queue layer between job submission and the scheduler. It holds workloads in queues, respects quotas and capacity, and admits them only when the cluster can run them. The Kubernetes scheduler then places the admitted Pods. Kueue integrates with the scheduler and with job frameworks (e.g., Job API, Kubeflow, Ray) so that batch workloads get predictable, fair, and efficient scheduling.

Core Concepts

Kueue organizes capacity and workloads around a few key resources. Understanding these gives you the foundation for configuring and using Kueue on OpenShift.

ClusterQueue

A ClusterQueue is a cluster-scoped resource that represents a pool of capacity. It defines:

  • Resource quotas — How much of each resource (CPU, memory, GPUs, etc.) this queue can use, often broken down by resource flavor (e.g., GPU nodes vs CPU-only nodes).

  • Fair sharing — How capacity is shared among multiple ClusterQueues (e.g., via cohorts and borrowing limits).

  • Admission rules — Which resource flavors can be used together and in what order.

ClusterQueues are typically created and managed by cluster or batch administrators. Users do not submit jobs directly to a ClusterQueue; they submit to a LocalQueue that points to a ClusterQueue.
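A minimal ClusterQueue manifest illustrates these pieces. The queue name, flavor name, and quota values below are illustrative, not prescriptive:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-queues              # illustrative name
spec:
  namespaceSelector: {}          # accept workloads from any namespace
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
    flavors:
    - name: default-flavor       # must match an existing ResourceFlavor
      resources:
      - name: "cpu"
        nominalQuota: 40
      - name: "memory"
        nominalQuota: 160Gi
      - name: "nvidia.com/gpu"
        nominalQuota: 8
```

To let several ClusterQueues share unused capacity, give them the same spec.cohort value; borrowing limits then control how much each queue can take beyond its nominal quota.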

LocalQueue

A LocalQueue is a namespace-scoped resource that represents a queue for a tenant, team, or project. It:

  • References a ClusterQueue — All workloads in the LocalQueue draw capacity from that ClusterQueue.

  • Groups related workloads — Users submit jobs to a LocalQueue (e.g., team-alpha) so that workloads are organized by ownership or purpose.

Users and applications submit batch jobs to a LocalQueue. Kueue then considers them for admission against the ClusterQueue’s quotas and policies.
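A LocalQueue is a small object: essentially a named pointer from a namespace to a ClusterQueue. In this sketch the names and namespace are illustrative:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: team-alpha               # the queue users submit to
  namespace: team-alpha-ns       # LocalQueues are namespace-scoped
spec:
  clusterQueue: team-queues      # the ClusterQueue this queue draws capacity from
```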

ResourceFlavor

A ResourceFlavor is a cluster-scoped resource that describes what kind of capacity exists in the cluster. It is usually tied to node characteristics, for example:

  • Availability — On-demand vs spot or preemptible.

  • Hardware — GPU model, CPU architecture (e.g., x86 vs ARM).

  • Cost or tier — Different “flavors” for different pricing or SLA tiers.

ClusterQueues define quotas per ResourceFlavor. When a workload requests, for example, “4 GPUs,” Kueue matches that to a flavor (e.g., gpu-nvidia-a100) and checks whether the ClusterQueue has quota for that flavor. ResourceFlavors let you separate capacity (e.g., GPU vs CPU-only) and apply different limits and policies to each.
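A ResourceFlavor can carry node labels that tie the flavor to a set of nodes; when a workload is admitted under the flavor, Kueue injects those labels into the Pods' node selectors. The flavor name and label below are illustrative:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: gpu-nvidia-a100          # referenced by ClusterQueue quotas
spec:
  nodeLabels:                    # added to admitted Pods' node selectors
    instance-type: gpu-a100      # illustrative node label
```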

Admission

Admission is the process by which Kueue decides that a workload is allowed to run. Until a workload is admitted, its Pods are not created; for example, Kueue keeps a Kubernetes Job suspended (spec.suspend: true) while it waits in the queue. When Kueue admits a workload, it unsuspends the job so that the Kubernetes scheduler can place the Pods.

Admission typically involves:

  • Quota check — Does the ClusterQueue have enough quota (for the requested ResourceFlavors) for this workload?

  • Admission checks (optional) — Custom checks (e.g., cluster capacity, external approvals) that must pass before the workload can start.

Once admitted, the workload consumes quota from the ClusterQueue until it finishes; then the quota is released for the next workload in the queue.
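Putting the pieces together, a user targets a LocalQueue with the kueue.x-k8s.io/queue-name label and submits the Job suspended; Kueue unsuspends it on admission. The names, namespace, and image here are illustrative:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  generateName: sample-job-
  namespace: team-alpha-ns
  labels:
    kueue.x-k8s.io/queue-name: team-alpha   # target LocalQueue
spec:
  suspend: true                  # held until Kueue admits the workload
  parallelism: 1
  completions: 1
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: registry.example.com/train:latest   # illustrative image
        resources:
          requests:
            cpu: "4"
            nvidia.com/gpu: "1"
```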

Workload

In Kueue, a Workload is the unit of work that is queued and admitted. For a Kubernetes Job, Kueue creates or manages a corresponding Workload that represents that job in the queue. The Workload tracks the job’s resource requests, which LocalQueue (and thus ClusterQueue) it belongs to, and its status (e.g., queued, admitted, finished). Job frameworks that integrate with Kueue (e.g., Kubeflow Training Operator, Ray) submit jobs in a way that creates these Workloads so they are subject to Kueue’s queuing and admission.
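For orientation, this is roughly the shape of the Workload object Kueue generates for such a Job (the name is generated; values here mirror the job's requests and are illustrative):

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: Workload
metadata:
  name: job-sample-job-abc12     # generated from the owning Job
  namespace: team-alpha-ns
spec:
  queueName: team-alpha          # the LocalQueue the Job targeted
  podSets:                       # resource requests Kueue checks against quota
  - name: main
    count: 1
    template:
      spec:
        containers:
        - name: main
          resources:
            requests:
              cpu: "4"
              nvidia.com/gpu: "1"
```

You normally do not create Workloads by hand; inspecting them (for example with kubectl get workloads) is the main way to see why a job is still queued.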

Multikueue and Multi-Cluster Scheduling with ACM

When you have multiple OpenShift or Kubernetes clusters (e.g., on-premises and in the cloud, or several regions), you want batch jobs to use capacity across the fleet rather than only on a single cluster. Multikueue extends Kueue to multi-cluster: a single logical queue can distribute workloads across many clusters. Together with Red Hat Advanced Cluster Management (ACM), you get a full multi-cluster scheduling story—ACM manages the cluster fleet and optional placement policies, while Multikueue handles queue-based admission and job dispatch across that fleet.

Multikueue in a Nutshell

Multikueue uses a manager cluster and one or more worker clusters:

  • Manager cluster — Runs Kueue and the Multikueue controllers. Users submit jobs here to LocalQueues that back ClusterQueues configured for Multikueue. The manager holds the queue, reserves quota, and dispatches workloads to worker clusters. It creates and monitors remote Workloads (and eventually Jobs) on workers and keeps status in sync.

  • Worker clusters — Each worker is a normal Kueue cluster with its own capacity and queues. The manager does not run workloads itself; it copies Workloads (and when one is admitted, the corresponding Job) to the chosen worker. The first worker to admit a workload “wins”; the manager then removes the workload from the other workers and tracks the job on the selected cluster until it finishes.

So from a user’s perspective: submit a job to the manager as usual. Kueue queues it; when quota is available, Multikueue sends it to one or more workers; one worker admits it and runs the job. No change to the job spec is required.
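On the manager, Multikueue is wired up with three objects: a MultiKueueCluster per worker (pointing at a kubeconfig Secret), a MultiKueueConfig grouping workers, and an AdmissionCheck that a ClusterQueue references via spec.admissionChecks. The sketch below uses illustrative names, and API versions may vary by Kueue release:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: MultiKueueCluster
metadata:
  name: worker1                      # illustrative worker cluster
spec:
  kubeConfig:
    locationType: Secret
    location: worker1-kubeconfig     # Secret with the worker's kubeconfig
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: MultiKueueConfig
metadata:
  name: multikueue-config
spec:
  clusters: ["worker1"]              # workers eligible for dispatch
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: AdmissionCheck
metadata:
  name: multikueue-check
spec:
  controllerName: kueue.x-k8s.io/multikueue
  parameters:
    apiGroup: kueue.x-k8s.io
    kind: MultiKueueConfig
    name: multikueue-config
```

A ClusterQueue on the manager opts into Multikueue by listing multikueue-check under spec.admissionChecks; workloads admitted there are then dispatched to the configured workers.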

Dispatching and Cluster Selection

Multikueue supports several dispatching strategies for choosing which worker clusters see a workload:

  • AllAtOnce (default) — The workload is sent to all worker clusters as soon as it has quota in the manager. The first cluster to admit it runs the job; others are cleaned up. This minimizes admission latency.

  • Incremental — Worker clusters are tried in rounds (e.g., a few at a time). If no cluster admits within a time window, more clusters are added in the next round. This can reduce load on worker clusters when many jobs are queued.

  • External — An external controller sets which worker clusters are nominated for each workload. Multikueue then only dispatches to those clusters. This allows custom placement logic—including integration with placement systems such as Open Cluster Management (OCM) Placement, used by ACM.

The external dispatching mode is where integration with ACM’s placement and cluster selection fits: you can use ACM to decide which clusters are valid targets, and an external controller (or integration layer) can feed that into Multikueue’s nominated clusters.

Using Multikueue with Red Hat Advanced Cluster Management

Red Hat Advanced Cluster Management for Kubernetes (ACM) provides multi-cluster lifecycle, governance, and observability. It is built on Open Cluster Management (OCM) and exposes concepts such as ManagedClusters, Placement, and PlacementDecision. You can use ACM together with Multikueue to provide multi-cluster scheduling for batch workloads in the following way:

  • Cluster discovery and management — ACM discovers and manages your OpenShift (or Kubernetes) clusters as managed clusters. Those clusters can be registered as Multikueue worker clusters. So the set of workers Multikueue uses is the set of clusters you already manage with ACM (or a subset of them).

  • Placement-driven selection — OCM Placement selects a set of managed clusters based on predicates and priorities (e.g., region, labels, capacity). A PlacementDecision lists the chosen clusters. You can use this to decide which clusters are eligible for certain batch jobs—for example, “only GPU clusters in US-East.” With Multikueue’s External dispatching mode, an external controller can read PlacementDecisions (or ACM placement policies) and set the nominated worker clusters for each workload. Multikueue then dispatches only to those clusters, so placement and queue-based admission work together.

  • Single control plane — Users submit batch jobs to the manager cluster (which may be the ACM hub or a cluster that uses ACM-managed clusters as workers). Kueue and Multikueue handle queuing and cross-cluster dispatch; ACM handles cluster lifecycle, policies, and optional placement. The result is multi-cluster batch scheduling: one queue, many clusters, with capacity and placement under your control.

In practice, you configure Kueue and Multikueue on the manager, register ACM-managed clusters as Multikueue workers, and optionally wire an external dispatcher to ACM Placement so that workload placement respects your multi-cluster policies.
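As a sketch of the placement side, an OCM Placement on the hub might select the eligible clusters like this; the names, cluster set, and label are illustrative:

```yaml
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: gpu-clusters-us-east     # illustrative
  namespace: batch-placement     # namespace bound to the cluster set
spec:
  numberOfClusters: 3
  clusterSets:
  - gpu-clusters                 # illustrative ManagedClusterSet
  predicates:
  - requiredClusterSelector:
      labelSelector:
        matchLabels:
          region: us-east        # illustrative cluster label
```

The resulting PlacementDecision lists the chosen clusters; an external dispatcher can read it and nominate those clusters to Multikueue.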

Summary

Kueue provides scheduling and queuing for batch workloads on Kubernetes and OpenShift by introducing a queue layer with quotas, flavors, and admission control. ClusterQueues define capacity pools and policies; LocalQueues give users a place to submit work; ResourceFlavors describe types of capacity; and admission controls when work can run. With these concepts in place, you can configure fair sharing, multi-tenant queues, and predictable behavior for HPC and batch jobs. Multikueue extends Kueue to multi-cluster: a manager cluster queues and dispatches workloads to worker clusters, with dispatching modes (AllAtOnce, Incremental, or External) controlling which workers are tried. Used with Red Hat Advanced Cluster Management (ACM), Multikueue’s worker set can be your ACM-managed clusters, and optional placement-driven selection (e.g., via OCM Placement and External dispatching) gives you multi-cluster scheduling that respects both queue capacity and placement policies. Later sections in this showroom cover installing Kueue, defining queues and flavors, and integrating with job frameworks and Multikueue.