* Understanding which workloads share a node's memory/CPU, and isolating certain workloads for security reasons
* Running specific workloads on specific instance types (e.g. with GPU or extra CPU)
* Configuring network policy between workloads
* Airgapping certain workloads
* Setting priority levels for different workloads, so some scale more rapidly while others have to wait for a new node to be provisioned
* Customized scaling behavior (e.g. based on the depth of a queue or latency metrics)
* Multi-region support for disaster recovery (DR)

I could probably go on :)
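
To make the queue-depth item a bit more concrete, here is a rough sketch of the kind of calculation a custom scaler might do. It is not any particular autoscaler's API: the function name `desired_replicas`, the parameter `target_per_replica`, and the min/max defaults are all made up for illustration, and a real setup would feed this from whatever queue metrics you actually have.

```python
import math


def desired_replicas(queue_depth: int, target_per_replica: int,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Return how many replicas are needed so that each one handles
    roughly `target_per_replica` queued items (hypothetical policy)."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Round up so a partially filled "slot" still gets a replica.
    wanted = math.ceil(queue_depth / target_per_replica)
    # Clamp to the allowed range so we never scale to zero or run away.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    for depth in (0, 250, 5000):
        print(depth, desired_replicas(depth, target_per_replica=100))
```

The clamp is the part that matters in practice: the floor keeps something warm when the queue is empty, and the ceiling stops a backlog spike from requesting more capacity than the cluster can provision.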