Add a Karta workload definition for SchedMD Slinky NodeSet (slurm-operator) under docs/samples/nodeset.yaml.
Why
NodeSet manages Slurm compute nodes (slurmd pods) on Kubernetes — the core scalable unit of Slinky. A Karta definition surfaces it to schedulers, dashboards, and autoscalers as a standard pod-set workload.
Details (verified against SlinkyProject/slurm-operator source)
- apiVersion / kind:
slinky.slurm.net/v1beta1 / NodeSet (v1beta1, not v1alpha1)
- Manages a homogeneous set of
slurmd pods. StatefulSet-like by default (stable identity, ordinals, PVC templates). spec.scalingMode = StatefulSet (default) | DaemonSet.
- Pod template:
spec.template (a custom PodTemplate type; pod spec at spec.template.spec, labels at spec.template.metadata.labels — same shape as a podTemplateSpec for podTemplateSpecPath purposes)
- Replicas:
spec.replicas (ignored in DaemonSet mode). No min/max fields — autoscaled via the HPA scale subresource (.spec.replicas / .status.replicas / .status.selector).
- Status:
status.replicas, readyReplicas, availableReplicas, updatedReplicas, desired; Slurm-specific status.slurmIdle / slurmAllocated / slurmDown / slurmDrain; status.conditions[]
- Ownership: NodeSet directly owns its pods (flat StatefulSet-like set, ordinal naming
<nodeset>-<ordinal>); no intermediate StatefulSet object.
- Pod selector labels:
app.kubernetes.io/name=slurmd, app.kubernetes.io/instance=<nodeset name>, app.kubernetes.io/component=worker, slinky.slurm.net/cluster=<controller name>
Suggested mapping
rootComponent = NodeSet; one childComponent (Pod) with podTemplateSpecPath: .spec.template, replicasPath: .spec.replicas, scale-subresource paths for autoscaling; podSelector by app.kubernetes.io/instance. Status phase from conditions / readyReplicas vs replicas. Optional: surface Slurm node states (slurmIdle/Allocated/Down/Drain) if the status mapping can carry them.
References
Add a Karta workload definition for SchedMD Slinky NodeSet (slurm-operator) under
docs/samples/nodeset.yaml.Why
NodeSet manages Slurm compute nodes (
slurmdpods) on Kubernetes — the core scalable unit of Slinky. A Karta definition surfaces it to schedulers, dashboards, and autoscalers as a standard pod-set workload.Details (verified against SlinkyProject/slurm-operator source)
slinky.slurm.net/v1beta1/NodeSet(v1beta1, notv1alpha1)slurmdpods. StatefulSet-like by default (stable identity, ordinals, PVC templates).spec.scalingMode=StatefulSet(default) |DaemonSet.spec.template(a customPodTemplatetype; pod spec atspec.template.spec, labels atspec.template.metadata.labels— same shape as a podTemplateSpec forpodTemplateSpecPathpurposes)spec.replicas(ignored inDaemonSetmode). No min/max fields — autoscaled via the HPA scale subresource (.spec.replicas/.status.replicas/.status.selector).status.replicas,readyReplicas,availableReplicas,updatedReplicas,desired; Slurm-specificstatus.slurmIdle/slurmAllocated/slurmDown/slurmDrain;status.conditions[]<nodeset>-<ordinal>); no intermediate StatefulSet object.app.kubernetes.io/name=slurmd,app.kubernetes.io/instance=<nodeset name>,app.kubernetes.io/component=worker,slinky.slurm.net/cluster=<controller name>Suggested mapping
rootComponent= NodeSet; onechildComponent(Pod) withpodTemplateSpecPath: .spec.template,replicasPath: .spec.replicas, scale-subresource paths for autoscaling;podSelectorbyapp.kubernetes.io/instance. Status phase from conditions /readyReplicasvsreplicas. Optional: surface Slurm node states (slurmIdle/Allocated/Down/Drain) if the status mapping can carry them.References
docs/samples/raycluster.yaml(childComponent / scale shape)