Is the answer the same as for the question "who needs a microservice architecture"?
A bit OT, but I'd like to see how this works...
Ah, very cool - https://www.youtube.com/watch?v=wyY-lTmgb8c
It works, but the GPUs aren't very stable at 4x vs. a normal 16x.
Another use case is media transcoding. Orchestrating transcoding at scale is not a trivial job, and Kubernetes, with or without GPUs, is an excellent solution for it because setting up a completely automated job queue is trivial.
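To make the job-queue idea a bit more concrete, here is a rough sketch of what one queued transcode could look like as a Kubernetes `batch/v1` Job, built as a plain Python dict. The image name `example/ffmpeg:latest`, the URLs and paths, and the helper `transcode_job` are all made up for illustration. The `nvidia.com/gpu` resource name assumes the NVIDIA device plugin is installed on the cluster; older clusters exposed GPUs under a different name.

```python
import json


def transcode_job(name, source_url, output_path,
                  image="example/ffmpeg:latest", gpus=0):
    """Build a batch/v1 Job manifest for a single transcode task.

    All arguments are illustrative; the image is a hypothetical one
    that is assumed to ship an ffmpeg binary.
    """
    container = {
        "name": "transcode",
        "image": image,
        "command": ["ffmpeg", "-i", source_url, output_path],
    }
    if gpus > 0:
        # Assumes the NVIDIA device plugin exposes this resource name.
        container["resources"] = {"limits": {"nvidia.com/gpu": gpus}}

    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Retry a failed transcode a couple of times before giving up.
            "backoffLimit": 2,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [container],
                }
            },
        },
    }


if __name__ == "__main__":
    job = transcode_job(
        "transcode-clip-001",
        "https://example.com/in.mov",
        "/out/clip.mp4",
        gpus=1,
    )
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(job, indent=2))
```

You would generate one of these per input file and submit them (for example by piping the JSON to `kubectl apply -f -`); the cluster then works through them as capacity frees up. Storage for the input and output files is left out here and would need volumes in a real setup.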
HPC will eventually be another interesting field, but it has compute scheduling constraints that K8s does not satisfy at this point in time. I think there is a pluggable scheduler in the works, and that should eventually help. The LXD example is a nice optimization, but it would not replace the scheduler in any way.
[0]: https://medium.com/intuitionmachine/gpus-kubernetes-for-deep...
[1]: https://hackernoon.com/job-concurrency-in-kubernetes-lxd-and...