Kubernetes 1.35: The Release That Finally Gets AI Workloads Right

I’ve been running mixed clusters with ML training jobs and regular services for about two years. Scheduling has been the biggest headache. A distributed training run would get only some pods placed, GPUs would sit there doing nothing, and everyone would lose time. Kubernetes 1.35 came out last week, so I spent the weekend testing it on our staging cluster. A few of these changes are genuinely useful.

Gang Scheduling Finally Exists

The biggest addition is workload-aware scheduling with gang scheduling support. It’s still alpha, so I would not put it in production yet, but the model is exactly what we needed: a group of pods either gets scheduled together, or not at all. ...

February 25, 2026

Reclaiming Idle GPUs in Kubernetes Before They Burn Your Budget

Last month I finally looked at our GPU utilization dashboards properly. What I saw made me physically uncomfortable: 14 A100 GPUs across our cluster, average utilization hovering around 15%. We were paying for dedicated hardware that spent most of its time doing absolutely nothing. This is embarrassingly common. Teams request a full GPU for a workload that uses it for training bursts of 20 minutes, then idles for hours. Kubernetes treats GPUs as integer resources — you either have one or you don’t. There’s no native way to share. ...

February 15, 2026