My Experience and Advice for Using GPUs in Deep Learning: Which GPU to get

(timdettmers.com)

158 points · michaeln · 7y ago · 32 comments

lern_too_spel · 7y ago

The "I have almost no money" recommendation should include Colab. https://medium.com/deep-learning-turkey/google-colab-free-gp...

Somebody who has almost no money isn't going to be able to equip a desktop with a GTX 1050 Ti ($175), fast disk ($50), and RAM ($50) on an entry level cpu/motherboard/power supply/case/monitor/peripherals ($300) and pay for the electricity used during training. Colab can be accessed from a free public computer or a cheap Chromebook ($200).
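A quick sketch of the arithmetic in the comment above, using only the prices it quotes. Electricity is left out because the comment gives no figure for it, and the variable names are illustrative.

```python
# Prices (USD) as quoted in the comment; electricity is not included
# because no figure is given for it.
desktop_parts = {
    "GTX 1050 Ti": 175,
    "fast disk": 50,
    "RAM": 50,
    "CPU/motherboard/PSU/case/monitor/peripherals": 300,
}
chromebook = 200  # cheap Chromebook used to reach Colab

desktop_total = sum(desktop_parts.values())
extra_cost = desktop_total - chromebook

print(desktop_total)  # 575
print(extra_cost)     # 375
```

On these numbers the entry-level desktop costs nearly three times as much as the Chromebook before any power bill.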

gaius · 7y ago

A cheap (but not free) option is leadergpu.com - no affiliation, they just seem like super nice people and have per-minute billing. They are Dutch.

andy_ppp · 7y ago

What are the rules about datasets I upload to this free service? Do Google now own them?

ColanR · 7y ago

My guess is that it's the standard caveat: if you don't pay for the service, you and your stuff are the commodity.

lern_too_spel · 7y ago

I would imagine no more than it owns the files you upload to Google Drive. The disk on the Colab instance is ephemeral, so you will need external storage for your dataset anyway.

abcdefgh214 · 7y ago

If you have the programming skills necessary to develop deep learning applications, it should be assumed that you can also easily get a well-paying job so this isn't really even relevant.

giomasce · 7y ago

Maybe you want to learn. Maybe you want to keep your current job because you like it, even if it pays less than you could earn elsewhere, and you would still like to develop your other skills. Or maybe you want to stay close to your family, or in a place where they speak your mother tongue. Or maybe you just want to spend your money carefully even if you have plenty.

ageitgey · 7y ago

This is a great article and I highly respect his opinions.

However, since you are probably eagerly reading this to see how fast the new RTX cards are, you should know upfront that the numbers he has so far are just estimates based on specs:

> Note that the numbers for the RTX 2080 and RTX 2080 Ti should be taken with a grain of salt since no hard performance numbers existed. I estimated performance according to a roofline model of matrix multiplication and convolution under this hardware together with Tensor Core benchmarks from the V100 and Titan V.

shaklee3 · 7y ago

I'd guess that the performance could be slightly better than the 1080 scaled by cores/MHz/FLOPS. The reason being that the memory bandwidth is higher on the 2080, and that's hard to model unless the person knows exactly how efficient the kernel is and if it's memory bound.
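For readers unfamiliar with the roofline model mentioned in the quoted passage, here is a minimal sketch of the idea. It is the textbook formula, not the article author's actual estimation code, and every number in it is hypothetical rather than a real GPU spec.

```python
def roofline_tflops(peak_tflops, bandwidth_tb_s, flops_per_byte):
    """Attainable throughput under a simple roofline model.

    A kernel is limited either by peak compute or by how fast memory
    can feed it: bandwidth (TB/s) * arithmetic intensity (FLOP/byte).
    """
    return min(peak_tflops, bandwidth_tb_s * flops_per_byte)


# Hypothetical card: 10 TFLOPS peak, 0.5 TB/s memory bandwidth.
memory_bound = roofline_tflops(10.0, 0.5, flops_per_byte=4)     # low-intensity kernel
compute_bound = roofline_tflops(10.0, 0.5, flops_per_byte=100)  # high-intensity kernel

print(memory_bound)   # 2.0
print(compute_bound)  # 10.0
```

This illustrates the point made in the comment: for a memory-bound kernel the result depends on bandwidth rather than peak FLOPS, so the estimate is only as good as the assumed arithmetic intensity of the kernel.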

steve_musk · 7y ago

Plus the architecture improvements. Do we know how many cores per SM? They’ve decoupled int and FP execution units which could give larger increases for certain kernels (although FP heavy deep learning kernels aren’t likely to benefit as much, they will still get address calculation benefits).

nolok · 7y ago

A great way to turn a listing you can trust enough to use as one basis for comparison into a listing made up of imaginary marketing numbers.

I guess the clickbaiting is needed / the best option, but I hate that it's what most web resources are like now.

r1nkgrl · 7y ago

But the article isn't hiding the fact that the numbers are estimates. People are curious how the new cards will stack up, and this article provides the best evaluation of that given the information they have available.

p1necone · 7y ago

The clock rates, number of CUDA cores, memory size/type etc in the new cards aren't really "imaginary marketing numbers". NVidia could have changed their hardware so they could put bigger numbers on paper without corresponding real world performance gains, but that's a big assumption for you to seemingly take as fact.

alkonaut · 7y ago

No one minds comparing products whose figures are guesses, estimates, or extrapolations against products with real performance figures, so long as it's clear which products have which type of figure.

fermienrico · 7y ago

The cost/performance plot - shouldn't it be "Lower is better"? It says "Higher is better".

Lower value would indicate lower cost per unit level of performance.

It should be "Lower is better" or the plot needs to say "Performance/Cost". Am I missing something?
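The labelling issue can be checked with a small sketch. The card names, performance scores, and prices below are made up for illustration and are not taken from the article's plot.

```python
# Hypothetical cards: (relative performance, price in USD).
cards = {
    "card_a": (100.0, 400.0),
    "card_b": (150.0, 900.0),
    "card_c": (60.0, 200.0),
}

# Performance per dollar: higher is better.
perf_per_cost = {name: perf / cost for name, (perf, cost) in cards.items()}
# Dollars per unit of performance: lower is better.
cost_per_perf = {name: cost / perf for name, (perf, cost) in cards.items()}

best_first_by_perf_per_cost = sorted(cards, key=perf_per_cost.get, reverse=True)
best_first_by_cost_per_perf = sorted(cards, key=cost_per_perf.get)

print(best_first_by_perf_per_cost)  # ['card_c', 'card_a', 'card_b']
print(best_first_by_cost_per_perf)  # ['card_c', 'card_a', 'card_b']
```

Both metrics give the same ranking, but only when each is read in the right direction, which is why the axis label and the "higher/lower is better" note have to agree.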

timdettmers · 7y ago

Thanks for your feedback! Someone mentioned this on twitter as well and I thought it was a good point so I implemented that change.

fermienrico · 7y ago

Thanks for being receptive. I wouldn’t call it a “good point” if it was a mistake that was corrected.

KSS42 · 7y ago

Do you mean "Figure 3: Normalized performance/cost numbers"?

It's performance/cost and not cost/performance.

Or maybe the author fixed a typo?

songgao · 7y ago

I think it used to be cost/performance and was later fixed. The GP left the comment before the fix.

wmf · 7y ago

You're missing the principle of charity.

fermienrico · 7y ago

huh?

sabalaba · 7y ago

The 2080Ti numbers are likely going to be a lot lower than that.

We’ve benched the 1080Ti vs the Titan V and the Titan V is nowhere near 2x faster at training than the 1080Ti as suggested in that graph. We observed a 30% to 40% speedup during our benchmarking:

https://deeptalk.lambdalabs.com/t/benchmarking-the-titan-v-v...

This is consistent with the 32% increase in FP32 flops from 11.3TFlops for the 1080Ti to 15TFlops for the Titan V. Additional speedups can be explained by the increase in memory bandwidth for HBM2 and the mixed precision fused multiply adds provided by the TensorCores.

Thus, given the quoted 13Tflop numbers for the 2080Ti, I would expect the 2080Ti to present something more like a 15-20% speedup over the 1080Ti. So 2080Ti is less bang for your buck. But benchmarking is the only way to tell what’s better on a FLOPS/$ basis.
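The FLOPS-only scaling behind these estimates can be reproduced in a few lines. This is a naive sketch using only the TFLOPS figures quoted in the comment; it ignores memory bandwidth and Tensor Cores, which the comment itself says account for the additional speedup observed on the Titan V.

```python
def relative_speedup(new_tflops, old_tflops):
    """Expected speedup if training time scaled with FP32 FLOPS alone."""
    return new_tflops / old_tflops - 1.0


GTX_1080TI_TFLOPS = 11.3  # figures as quoted in the comment
TITAN_V_TFLOPS = 15.0
RTX_2080TI_TFLOPS = 13.0

titan_v_gain = relative_speedup(TITAN_V_TFLOPS, GTX_1080TI_TFLOPS)
rtx_2080ti_gain = relative_speedup(RTX_2080TI_TFLOPS, GTX_1080TI_TFLOPS)

print(f"{titan_v_gain:.1%}")     # 32.7%
print(f"{rtx_2080ti_gain:.1%}")  # 15.0%
```

The 15% figure is the lower end of the 15-20% range given above; the upper end presumably allows for bandwidth and mixed-precision gains, which this sketch does not model.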

timdettmers · 7y ago

Your data are inconsistent with the benchmarks that I mention in the blog post: https://github.com/u39kun/deep-learning-benchmark

You also do not benchmark LSTMs: https://www.xcelerit.com/computing-benchmarks/insights/bench...

If you put both of those benchmarks together, my conclusion is quite reasonable. But I see that you could also come to your conclusion with your benchmarks. It is just a question of which benchmarks are less biased, and that is too difficult to evaluate.

I guess we have to wait for real data, but thanks for putting your data out there to get a discussion going.

syntaxing · 7y ago

Hacker News hug of death? Does anyone here have any experience using AMD cards with something like PlaidML? I have a 1050Ti SSC, but I'm starting to feel its limitations as my models' complexity grows. But getting a 1080 is a bit out of my budget right now. I'm tempted to get the recently released Vega 56.

steve_musk · 7y ago

You could wait and see how Pascal prices fall after Turing comes out.

dostres · 7y ago

An open question for me is the performance of two 2080tis using NVLink as one virtual GPU. I imagine it’ll be close to linear, but I’ll be interested to know for sure.

shaklee3 · 7y ago

It won't be linear for memory-bound applications. The V100 was able to make it close to linear with large enough transfer sizes, but it has 50% more memory bandwidth than these.

pirocks · 7y ago

Seems down for me:

https://web.archive.org/web/20180821173206/http://timdettmer...

scottlegrand2 · 7y ago

The biggest advance here is that Nvidia has produced a consumer card that has all the high-end deep-learning features. This was missing in both the Pascal and Volta generations, even though in Pascal FP32 was full power. I think the TPU scared them, and that's a good thing.

KayL · 7y ago

Good article, but as a new learner, I'm interested in (your experience of) how long it takes to train a model for a common task. If it's 1 minute vs. 2 minutes, I'll probably get a cheaper GPU, but if it's 5 hours vs. 10 hours or 1 day vs. 2 days, I'd save up more money for one with good performance.
