Kubernetes 1.35: The Release That Finally Gets AI Workloads Right

I’ve been running mixed clusters with ML training jobs and regular services for about two years. Scheduling has been the biggest headache. A distributed training run would get only some pods placed, GPUs would sit there doing nothing, and everyone would lose time. Kubernetes 1.35 came out last week, so I spent the weekend testing it on our staging cluster. A few of these changes are genuinely useful.

Gang Scheduling Finally Exists

The biggest addition is workload-aware scheduling with gang scheduling support. It’s still alpha, so I would not put it in production yet, but the model is exactly what we needed: a group of pods either gets scheduled together, or not at all. ...
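For a sense of what "all or nothing" looks like in practice, the out-of-tree scheduler-plugins project has offered gang scheduling for a while via a PodGroup CRD with a minMember threshold. The new in-tree alpha API may well look different, so take this as a sketch of the existing out-of-tree shape, not the 1.35 API:

```yaml
# PodGroup from the out-of-tree scheduler-plugins project, shown for
# illustration only -- the in-tree 1.35 alpha API may differ.
apiVersion: scheduling.x-k8s.io/v1alpha1
kind: PodGroup
metadata:
  name: training-run
spec:
  minMember: 8        # place all 8 workers together, or place none
---
# Workers opt in by labeling themselves with the group name.
apiVersion: v1
kind: Pod
metadata:
  name: worker-0
  labels:
    scheduling.x-k8s.io/pod-group: training-run
spec:
  schedulerName: scheduler-plugins-scheduler  # assumes the coscheduling plugin is deployed
  containers:
  - name: trainer
    image: example.com/trainer:latest         # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 1
```

Until minMember pods from the group can all fit, none of them are bound, which is exactly what prevents the half-placed training runs described above.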

February 25, 2026

Reclaiming Idle GPUs in Kubernetes Before They Burn Your Budget

Last month I finally looked at our GPU utilization dashboards properly. What I saw made me physically uncomfortable: 14 A100 GPUs across our cluster, average utilization hovering around 15%. We were paying for dedicated hardware that spent most of its time doing absolutely nothing. This is embarrassingly common. Teams request a full GPU for a workload that uses it for training bursts of 20 minutes, then idles for hours. Kubernetes treats GPUs as integer resources — you either have one or you don’t. There’s no native way to share. ...
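One stopgap for the integer-GPU problem is time-slicing in NVIDIA's k8s-device-plugin: a small config tells the plugin to advertise each physical GPU as several schedulable replicas. A minimal sketch (replica count is a tuning choice, and note that time-slicing provides no memory or fault isolation between sharers):

```yaml
# Config for the NVIDIA k8s-device-plugin: advertise each physical GPU
# as 4 shared replicas. Pods still request nvidia.com/gpu: 1, but four
# such pods can now land on one card. No memory isolation -- fine for
# bursty dev/training workloads, risky for latency-sensitive serving.
version: v1
sharing:
  timeSlicing:
    resources:
    - name: nvidia.com/gpu
      replicas: 4
```

With bursty 20-minute training jobs like the ones above, oversubscribing a card this way is often the difference between 15% utilization and something defensible.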

February 15, 2026

Kubernetes Node Readiness Controller: Finally, Proper Node Bootstrap Gates

Last week I ran into a familiar mess: pods landing on nodes before the CNI plugin was actually ready. Kubelet marks the node as Ready, the scheduler starts placing workloads, then everything sits in ContainerCreating because Calico is still coming up. I have worked around this with init containers and postStart tricks for far too long. Then I came across the Node Readiness Controller announcement on the Kubernetes blog. It is a new SIG project (v0.1.1), and it is basically what I wanted: custom readiness gates for nodes, managed through a CRD. ...
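Before a dedicated controller existed, the usual workaround was a startup taint: register each node with a NoSchedule taint, let only the CNI DaemonSet tolerate it, and remove the taint once networking is actually up. A sketch of that pattern, with an illustrative taint key (Cilium, for example, ships its own):

```yaml
# On every node, via a kubelet flag (or kubeadm nodeRegistration):
#   --register-with-taints=example.com/cni-not-ready=true:NoSchedule
# (the taint key here is illustrative -- pick your own)
#
# The CNI DaemonSet is the only workload tolerating the taint, so it is
# the only thing scheduled before networking is ready.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: calico-node
  namespace: kube-system
spec:
  selector:
    matchLabels: {k8s-app: calico-node}
  template:
    metadata:
      labels: {k8s-app: calico-node}
    spec:
      tolerations:
      - key: example.com/cni-not-ready
        operator: Exists
        effect: NoSchedule
      containers:
      - name: calico-node
        image: calico/node:latest   # placeholder tag
# Once the CNI reports healthy, an agent (or the CNI itself) removes the
# taint and regular pods start scheduling.
```

The readiness-gate CRD approach generalizes this to arbitrary bootstrap conditions instead of one hand-rolled taint per dependency.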

February 13, 2026