Add a Karta workload definition for KubeRay's RayService (Ray Serve) under docs/samples/rayservice.yaml, completing the Ray family — we already ship raycluster.yaml and rayjob.yaml.
Why
RayService is the standard way to run Ray Serve inference apps on Kubernetes. A Karta definition lets schedulers, dashboards, and autoscalers consume it without RayService-specific code.
Details (verified against KubeRay source)
- apiVersion / kind: ray.io/v1 / RayService (GA v1, not v1alpha1)
- Head pod template: spec.rayClusterConfig.headGroupSpec.template
- Worker groups: spec.rayClusterConfig.workerGroupSpecs[].template; replicas .replicas, min .minReplicas, max .maxReplicas; group name .groupName
- Pod labels (same as RayCluster): ray.io/node-type=head|worker, ray.io/cluster, ray.io/group
- Status: status.serviceStatus ("Running"); prefer status.conditions[] (Ready, UpgradeInProgress) since v1.3.0
- Ownership chain: RayService → RayCluster → Pods. Caveat: the head/worker Pods are owned by the child RayCluster (named in status.activeServiceStatus.rayClusterName), not by the RayService directly. During zero-downtime upgrades a second pending RayCluster exists (status.pendingServiceStatus).
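To make the field paths above concrete, here is a minimal Python sketch that walks a RayService manifest (already parsed into a dict) and collects the head and worker pod templates with their replica bounds. The function name extract_pod_groups and the sample manifest are hypothetical and exist only for illustration; this is not Karta's schema or API, only a check that the listed paths line up.

```python
from typing import Any, Dict, List


def extract_pod_groups(rayservice: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect pod groups from a RayService manifest dict (hypothetical helper)."""
    cluster = rayservice["spec"]["rayClusterConfig"]

    # Head group: spec.rayClusterConfig.headGroupSpec.template
    groups = [{
        "role": "head",
        "groupName": None,
        "template": cluster["headGroupSpec"]["template"],
        "replicas": None,
        "minReplicas": None,
        "maxReplicas": None,
    }]

    # Worker groups: spec.rayClusterConfig.workerGroupSpecs[]
    for wg in cluster.get("workerGroupSpecs") or []:
        groups.append({
            "role": "worker",
            "groupName": wg.get("groupName"),
            "template": wg["template"],
            # Replica fields are read with .get() in case the manifest omits them.
            "replicas": wg.get("replicas"),
            "minReplicas": wg.get("minReplicas"),
            "maxReplicas": wg.get("maxReplicas"),
        })
    return groups


# Illustrative manifest; names and images are made up for this sketch.
sample_rayservice = {
    "apiVersion": "ray.io/v1",
    "kind": "RayService",
    "metadata": {"name": "demo"},
    "spec": {
        "rayClusterConfig": {
            "headGroupSpec": {
                "template": {"spec": {"containers": [{"name": "ray-head"}]}},
            },
            "workerGroupSpecs": [{
                "groupName": "small-group",
                "replicas": 2,
                "minReplicas": 1,
                "maxReplicas": 5,
                "template": {"spec": {"containers": [{"name": "ray-worker"}]}},
            }],
        },
    },
}

for group in extract_pod_groups(sample_rayservice):
    print(group["role"], group["groupName"], group["replicas"])
```

The same traversal is what a childComponents mapping would have to encode declaratively; only the root prefix differs from the RayCluster case.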
Suggested approach
Reuse raycluster.yaml's childComponents mapping with paths shifted under spec.rayClusterConfig, rooted at RayService. Add a gangScheduling podGroup grouped by ray.io/cluster.
Open question for the implementer: how to express the RayService→RayCluster→Pod indirection — model the RayCluster as an intermediate component, or select Pods directly via the ray.io/cluster label. Worth confirming with maintainers.
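As a rough illustration of the second option (selecting Pods directly by label), the sketch below derives ray.io/cluster label selectors from a RayService's status. The helper name cluster_pod_selectors and the sample status are hypothetical. The sketch assumes that pendingServiceStatus carries a rayClusterName field in the same way activeServiceStatus does, which should be confirmed against the KubeRay types.

```python
from typing import Any, Dict


def cluster_pod_selectors(rayservice: Dict[str, Any]) -> Dict[str, str]:
    """Map each child RayCluster slot to a Pod label selector (hypothetical helper)."""
    status = rayservice.get("status") or {}
    selectors = {}
    for slot in ("activeServiceStatus", "pendingServiceStatus"):
        # Assumption: both slots expose the child cluster name as rayClusterName.
        name = (status.get(slot) or {}).get("rayClusterName")
        if name:
            selectors[slot] = f"ray.io/cluster={name}"
    return selectors


# Illustrative status during an upgrade; cluster names are made up.
upgrading_rayservice = {
    "status": {
        "activeServiceStatus": {"rayClusterName": "demo-raycluster-abc12"},
        "pendingServiceStatus": {"rayClusterName": "demo-raycluster-def34"},
    },
}

for slot, selector in cluster_pod_selectors(upgrading_rayservice).items():
    print(slot, selector)
```

During an upgrade this yields two selectors, one per RayCluster. This is the case a gangScheduling podGroup keyed on ray.io/cluster would need to keep separate.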
References
docs/samples/raycluster.yaml, docs/samples/rayjob.yaml