Putting GPUs to work with Kubernetes

(medium.com)

133 points · marcoceppi · 9y ago · 52 comments

13 comments · 5 top-level

shaklee3 · 9y ago · 3 in thread

Is the author of this working on official support or just testing? I know there's a GPU roadmap for k8s, but I can't tell from this blog post whether this was part of it.

samnco · 9y ago

Canonical will officially support GPUs when they land GA upstream. The flag is beta as of now in the Canonical Distribution of Kubernetes. Paying customers of either the managed or the supported solution get best-effort support for GPUs, and the feature is enabled by default.

puzzle · 9y ago

What is the requirement for privileged containers? The post never explains it.

1 more reply

stubish · 9y ago

It's the Canonical Distribution of Kubernetes. Supported by Canonical.

Seanny123 · 9y ago · 2 in thread

I keep seeing Kubernetes appear on HackerNews. Is there something quick I can read that explains why everyone's so excited about it? I know it's container orchestration, but I'm not sure what people are using it for or what pain point it is relieving.

moondev · 9y ago

https://vishh.github.io/docs/concepts/overview/what-is-kuber...

Seanny123 · 9y ago

If you'd like to engage with me further, how does a company know it needs Kubernetes? If I'm Soylent and I'm processing a few orders a minute, I'm probably safe with a few redundant monoliths. Do I have to be Uber? What's the middle-ground between Soylent and Uber that would still need this?

Is the answer the same as the question "who needs a microservice architecture"?

2 more replies

nrki · 9y ago · 2 in thread

"1060GTX at home but on consumer grade Intel NUC"

A bit OT, but I'd like to see how this works...

Ah, very cool - https://www.youtube.com/watch?v=wyY-lTmgb8c

samnco · 9y ago

Actually, it was a fun DIY project I did a while ago. You can read about it here: https://hackernoon.com/installing-a-diy-bare-metal-gpu-clust...

It works, but the GPUs aren't very stable at 4x vs. a normal 16x.

jacquesm · 9y ago

That's one problem; another is the size of the power supply. And maybe that's the only problem: I don't see why a GPU would become unstable when using fewer lanes. All it should do is get slower.
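For a rough sense of how much slower "fewer lanes" means, here is a back-of-the-envelope sketch. It assumes a PCIe 3.0 link (8 GT/s per lane, 128b/130b encoding) and ignores protocol overhead; whether the NUC build discussed here actually ran at PCIe 3.0 is an assumption, not something stated in the thread.

```python
# Rough per-direction PCIe 3.0 link bandwidth.
# Assumptions: 8 GT/s per lane and 128b/130b line encoding;
# packet/protocol overhead is ignored, so real throughput is lower.
GT_PER_S = 8e9          # raw transfers per second per lane (PCIe 3.0)
ENCODING = 128 / 130    # usable fraction after 128b/130b encoding


def pcie3_bandwidth_gb_s(lanes: int) -> float:
    """Approximate usable bandwidth in GB/s for a PCIe 3.0 link."""
    bits_per_s = GT_PER_S * ENCODING * lanes
    return bits_per_s / 8 / 1e9


x4 = pcie3_bandwidth_gb_s(4)
x16 = pcie3_bandwidth_gb_s(16)
print(f"x4:  {x4:.2f} GB/s")
print(f"x16: {x16:.2f} GB/s")
print(f"x16 / x4: {x16 / x4:.0f}x")
```

Bandwidth scales linearly with lane count, so an x4 link has a quarter of the host-to-GPU bandwidth of x16. That affects transfer speed only; it says nothing about stability.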

1 more reply

Tossrock · 9y ago · 1 in thread

Unprofitable for ETH mining maybe, but it seems like a natural fit to rent time on it to deep learning people with slow-training models. That could still be unprofitable after the cost of electricity, though; I guess it's a question of market size and demand. A lot of deep learning already happens at big infrastructure players, who wouldn't need the service anyway, leaving academics and smaller companies. But maybe some people would find a reliable, scalable GPU cluster valuable.

samnco · 9y ago

Ahah, good point. Really, the ETH stuff was "because I can". But in the same charts repository you will find a TensorFlow chart. My previous series of blog posts [0] was about exactly that. Another nice addition for compute-intensive workloads is the use of LXD [1].

Another use case is transcoding in media. Orchestrating transcoding at scale is not a trivial job, and Kubernetes, with or without GPUs, is an excellent fit because it makes it easy to set up a completely automated job queue.

Another interesting field will eventually be HPC, but there are some compute constraints that K8s does not yet satisfy scheduling-wise. I think there is a pluggable scheduler in the works, which should eventually help. The LXD example is a nice optimization, but it would not replace the scheduler in any way.

[0]: https://medium.com/intuitionmachine/gpus-kubernetes-for-deep...

[1]: https://hackernoon.com/job-concurrency-in-kubernetes-lxd-and...
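To make the transcoding job queue idea a bit more concrete, here is a minimal sketch that builds one Kubernetes `batch/v1` Job manifest per clip. It is not taken from the linked posts. The image name, job name, and source URL are hypothetical placeholders, and the `nvidia.com/gpu` resource name assumes a cluster running the NVIDIA device plugin (older clusters exposed GPUs under a different resource name).

```python
import json


def transcode_job(name: str, source_url: str, gpus: int = 1) -> dict:
    """Build a batch/v1 Job manifest for a single transcoding task."""
    container = {
        "name": "transcoder",
        # Hypothetical image; substitute a real transcoding image.
        "image": "example.com/transcoder:latest",
        "args": [source_url],
    }
    if gpus > 0:
        # Assumes the NVIDIA device plugin, which exposes GPUs
        # as the extended resource "nvidia.com/gpu".
        container["resources"] = {"limits": {"nvidia.com/gpu": gpus}}
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 3,  # retry a failed task up to 3 times
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [container],
                },
            },
        },
    }


# Hypothetical clip name and URL, for illustration only.
manifest = transcode_job("transcode-clip-001", "https://example.com/clip-001.mp4")
print(json.dumps(manifest, indent=2))
```

Since `kubectl apply -f` accepts JSON as well as YAML, the printed manifest can be submitted as-is; passing `gpus=0` drops the GPU limit for the "without GPUs" case.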

aub3bhat · 9y ago

Great read. On a smaller scale, I have found nvidia-docker with nvidia-docker-compose to be a great solution for deploying Docker containers on AWS P2 machines with 8 GPUs.
