KubeRay

KubeRay provides a Kubernetes-native way to run vLLM workloads on Ray clusters. You declare a Ray cluster in YAML, and the KubeRay operator handles pod scheduling, networking configuration, restarts, and blue-green deployments, all while preserving the familiar Kubernetes experience.

Why KubeRay instead of manual scripts?

| Feature | Manual scripts | KubeRay |
|---|---|---|
| Cluster bootstrap | Manually SSH into every node and run a script | One command to create or update the whole cluster: `kubectl apply -f cluster.yaml` |
| Autoscaling | Manual | Automatically patches the RayCluster custom resource to adjust cluster size |
| Upgrades | Tear down and re-create manually | Blue/green deployment updates supported |
| Declarative config | Bash flags and environment variables | GitOps-friendly YAML custom resources (RayCluster/RayService) |

Using KubeRay reduces the operational burden and simplifies integration of Ray + vLLM with existing Kubernetes workflows (CI/CD, secrets, storage classes, etc.).
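
As a rough illustration of the declarative workflow, a minimal `cluster.yaml` might look like the sketch below. It assumes the KubeRay operator and its CRDs are already installed in the cluster. The resource name, group name, image, replica counts, and GPU request are placeholders chosen for this example and are not taken from the KubeRay or vLLM documentation.

```yaml
# Minimal RayCluster sketch. Every concrete value here is an assumption
# for illustration and should be adapted to your environment.
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: vllm-raycluster            # hypothetical name
spec:
  enableInTreeAutoscaling: true    # lets the Ray autoscaler resize worker groups
  headGroupSpec:
    rayStartParams:
      dashboard-host: "0.0.0.0"
    template:
      spec:
        containers:
          - name: ray-head
            # Placeholder image: must contain matching Ray and vLLM versions.
            image: your-registry/ray-vllm:latest
  workerGroupSpecs:
    - groupName: gpu-workers       # hypothetical group name
      replicas: 1
      minReplicas: 1
      maxReplicas: 4
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-worker
              image: your-registry/ray-vllm:latest
              resources:
                limits:
                  nvidia.com/gpu: 1   # assumes the NVIDIA device plugin is installed
```

Applying the file with `kubectl apply -f cluster.yaml` creates the cluster, and re-applying an edited copy updates it. Keeping the manifest in version control is what makes the setup fit existing CI/CD and GitOps workflows.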

Learn more