Volta: Advanced Data Center GPU

kanwisher9y ago

I've thought that, but the per-unit volume is huge. Every game console, phone, tablet, and PC needs a GPU. Even low-end devices are expected to run games. That's billions of units, albeit at lower margins.

https://finance.yahoo.com/news/heres-much-nvidia-will-make-o...

Cshelton9y ago

I mean, at what point will we come full circle, back to a "mainframe" model where consumers don't really own/possess the computing power and it sits in datacenters instead? Like, you play your game through a VM basically, and your personal computer is just an AWS instance...

ben1749y ago

They've done quite well on the Nintendo Switch.

gruturo9y ago

Not necessarily. So many of the improvements would anyway have a dual use, and it's not like their margins in the gaming/end user GPU business are razor thin. Moreover, the volume is probably immensely higher, so despite lower per-unit profit, they probably make it up in quantity. Won't be like this forever given the different speeds at which the 2 sectors grow, but it's gonna be some time before the roles are reversed.

hatsunearu9y ago

Perhaps, but the desktop gaming market is still growing and is a huge part of NVIDIA's income.

zokier9y ago

Isn't that what this post was all about? Releasing brand new architecture on compute first seems to me pretty much like prioritizing compute market over consumers.

coldtea9y ago

Citation needed for the "much more profitable" part.

kakarot9y ago

Nah. This is NVIDIA. They will just continue to focus on both markets as long as they're kicking ass in them.

AndrewKemendo9y ago

It can only play Crysis on 50% texture.

josephpmay9y ago

I know you're getting downvotes, but in the Keynote they showed a cinematic-quality live rendered "gaming demo" scene

Tossrock9y ago

For those wondering, this was (around) the 44 minute mark.

arca_vorago9y ago· 11 in thread

More great hardware being stuck behind proprietary CUDA when OpenCL is the thing they should be helping with. Once again proprietary lock in that will result in inflexibility and digital blow-back in the long run. Yes I understand OpenCL has some issues and CUDA tends to be a bit easier and less buggy, but that doesn't detract from the principles of my statement.

nicwilson9y ago

I am the author of DCompute, a compiler/library/runtime framework for abstracting OpenCL/CUDA for D. You can write kernels already, although the API automation is still a work in progress. I'm hoping that this should level the field a bit, because let's face it, people use CUDA for two reasons: the OpenCL driver API sucks; and the utility libraries (cuDNN et al) for CUDA. Possibly driver quality as well.

By having an API that's not horrible to use, that advantage is gone. The utility libraries will be more of a challenge to undermine, but since it targets CUDA natively there is no disadvantage to users of nvidia's hardware, though there is no advantage to others yet (see GLAS[1] for what is possible with relative ease). Using D as the kernel language will also bring significant advantages over C/C++: static reflection, sane templates and compile time code generation, to name a few.

You can find it at https://github.com/libmir/dcompute.

If you have any questions, please ask!

[1] https://github.com/libmir/mir-glas

slizard9y ago

^ This!

Please read this before moving on: https://twitter.com/jrprice89/status/667466444355993600

Also, NVIDIA's CUDA compilers are built on clang, which does have an OpenCL frontend, so all they would need to do is to put some resources into making that frontend work with their current nvcc toolchain.

Many request and want this, but instead they are trying hard to hold back OpenCL, just because providing OpenCL 2.0 support (and extensions for their GPUs' features) may help adoption of OpenCL, which in turn may end up helping other folks and companies too.

MichaelBurge9y ago

Nobody else is even bothering to compete, so standards don't really matter. Let them do their job: I'd rather have faster GPUs.

tanderson929y ago

Standards matter if you care about software and hardware freedom.

slizard9y ago

I don't wish you the suffering vendor lock-in can cause after 10 years (hell, even less) of faithfully following the NVIDIA path, but... actually I do, because that's probably the best way to realize what's wrong with proprietary systems that pitch themselves as "de-facto" standards.

paulsutter9y ago

Build your systems around GEMM/blas. Every vendor will give you a fast GEMM, and you'll be set for basically all the architectures that are coming out.

arcanus9y ago

Except that not all problems in computation are GEMMs. CNNs in machine learning certainly are, but many 'real' systems cannot be posed in such a manner.

In supercomputing this is the problem with using high performance linpack for benchmarks, which typically exceeds actual scientific codes by an order of magnitude in terms of floating point operations per second.
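For readers wondering what "CNNs are GEMMs" means in practice, here is a minimal sketch of the usual lowering: sliding windows of the input are stacked into a matrix (often called im2col) so that a convolution becomes one matrix multiply. The function names are made up for illustration, the example is 1D and pure Python, and it is not taken from any particular BLAS or deep learning library.

```python
def matmul(a, b):
    """Plain triple-loop GEMM: (m x k) times (k x n) gives (m x n)."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def im2col_1d(signal, width):
    """Stack every sliding window of `signal` as one row of a matrix."""
    return [signal[i:i + width] for i in range(len(signal) - width + 1)]


def conv1d_as_gemm(signal, filters):
    """'Valid' cross-correlation of one signal with several filters via GEMM.

    Returns a matrix with one row per output position and one column
    per filter.
    """
    width = len(filters[0])
    windows = im2col_1d(signal, width)               # positions x width
    weights = [list(col) for col in zip(*filters)]   # width x num_filters
    return matmul(windows, weights)


def conv1d_direct(signal, filt):
    """Reference implementation: the usual sliding dot product."""
    width = len(filt)
    return [sum(signal[i + j] * filt[j] for j in range(width))
            for i in range(len(signal) - width + 1)]


signal = [1, 2, 3, 4, 5]
filters = [[1, 0, -1], [1, 1, 1]]
result = conv1d_as_gemm(signal, filters)
print(result)
```

The counterpoint in the comment above still stands: workloads such as sparse or irregular solvers do not reduce to a dense matrix multiply this neatly.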

ahelwer9y ago

I thought NVIDIA GPUs support OpenCL? Or do they not do that anymore?

eslaught9y ago

It's always been 10-20% slower than CUDA and frankly NVIDIA doesn't have an incentive to make it faster than that.

On the other hand, I believe Google is working on a CUDA compiler [1] so we may actually see meaningful improvement in the sense that it may become possible to run CUDA on other GPUs. (Edit: And Google actually has an incentive to achieve performance parity, so it might really happen.)

[1]: https://research.google.com/pubs/pub45226.html

hawski9y ago

Can Vulkan fill that space?

nicwilson9y ago

Nvidia will have to support the SPIR-V Vulkan environment that is different to the OpenCL SPIR-V environment. But Vulkan is a graphics API not a compute API. Yes in theory you can write compute shaders but from my experience if you have a compute workload: use a compute API, they're much more suited for the job.

So, no.

hesdeadjim9y ago· 11 in thread

I find it so cool that technology created to make games like Quake look pretty has ended up becoming a core foundation of high performance computing and AI.

Florin_Andrei9y ago

I think it's even cooler how matrix multiplication dominates both the universe at large, and the systems that understand it (neural networks).

nostrebored9y ago

Well a large portion of that is desiring the data to be in that form. BLAS operations are ruthlessly efficient and use the system hardware so well.

thearn49y ago

Linear algebra is the ultimate common variable in technical computing and applied mathematics at large.

flukus9y ago

I find it incredible that even with all these cool applications of matrix multiplication it gets taught so horribly in schools.

halflings9y ago

Relevant article: https://www.technologyreview.com/s/602344/the-extraordinary-...

anonymfus9y ago

It's backwards: the first common application of the technology gave it the name.

Like in one of Stanisław Lem's stories about Ijon Tichy, where people call intelligent anthropomorphic robots washing machines.

matt40779y ago

You'll be delighted to hear that traffic signals are called "robots" in South Africa.

fulafel9y ago

Yet another step in the progression in which mass market GPU silicon kills traditional vector and memory bandwidth rich HPC/supercomputing hardware. Cray-on-a-chip.

Edit: traditional vector machines like the NEC SX still hold the programmability crown because you get a usable single system image, right?

pgodzin9y ago

Matrix multiplication is important for graphics and important for finding the weights of a neural network

hesdeadjim9y ago

Yep, hard to imagine though that the original creators of the Nvidia TNT or Voodoo had any idea that GPUs would become fully programmable computing hardware used for non-graphical applications.

dom09y ago

https://www.youtube.com/watch?v=ooLO2xeyJZA

gigatexal9y ago· 8 in thread

These tensor cores sound exotic:

"Each Tensor Core performs 64 floating point FMA mixed-precision operations per clock (FP16 multiply and FP32 accumulate) and 8 Tensor Cores in an SM perform a total of 1024 floating point operations per clock. This is a dramatic 8X increase in throughput for deep learning applications per SM compared to Pascal GP100 using standard FP32 operations, resulting in a total 12X increase in throughput for the Volta V100 GPU compared to the Pascal P100 GPU. Tensor Cores operate on FP16 input data with FP32 accumulation. The FP16 multiply results in a full precision result that is accumulated in FP32 operations with the other products in a given dot product for a 4x4x4 matrix multiply."

Curious to see how the ML groups and others take to this. Certainly ML and other GPGPU usage has helped Nvidia climb in value. I wonder if Nvidia saw the writing on the wall, so to speak, with Google releasing their specialty hardware, called the Tensor hardware, and so decided to use the name in their branding as well.
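The numbers in that quote can be checked with a few lines, and the "FP16 multiply, FP32 accumulate" idea can be emulated in software. The sketch below is only an illustration: the function names are invented, rounding is emulated with the standard library's `struct` half and single precision formats, and it says nothing about how the hardware is actually wired.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mma_4x4x4(a, b, c):
    """Emulate D = A*B + C with FP16 inputs and FP32 accumulation."""
    n = 4
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = to_fp32(c[i][j])
            for k in range(n):
                # The product of two FP16 values fits exactly in FP32,
                # which matches the "full precision result" in the quote.
                prod = to_fp16(a[i][k]) * to_fp16(b[k][j])
                acc = to_fp32(acc + prod)   # accumulate at FP32 precision
            d[i][j] = acc
    return d


# Throughput arithmetic from the quoted figures.
fma_per_tensor_core = 4 * 4 * 4      # one 4x4x4 multiply is 64 multiply-adds
tensor_cores_per_sm = 8
flops_per_fma = 2                    # one multiply plus one add
flops_per_sm_per_clock = fma_per_tensor_core * tensor_cores_per_sm * flops_per_fma
print(flops_per_sm_per_clock)

identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
b = [[float(i + j) for j in range(4)] for i in range(4)]
ones = [[1.0] * 4 for _ in range(4)]
d = mma_4x4x4(identity, b, ones)
```

Note that inputs such as 0.1 are not representable in FP16, so `to_fp16(0.1)` differs slightly from 0.1; that input rounding, rather than the accumulation, is where most of the precision is given up in this model.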

deepnotderp9y ago

"Tensor hardware" is a very vague term that's more marketing than an actual hardware type. I guarantee you that these are really SIMD or matrix units like the Google TPU that they just decided to call "Tensor", because, you know, it sells.

Symmetry9y ago

They're matrix units just like in the Google TPU but the TPU stands for "Tensor Processing Unit" so that's consistent. There's no reason to add special SIMD units when the entire core is already running in SIMT mode and by establishing a dataflow for NxNxN matrix multiplies you can reduce your register read bandwidth by a factor of N. Which isn't as huge for NVidia's N=4 as for Google's N=256 but is still a big deal, and diminishing returns might mean that NVidia is getting most of the possible benefit when stopping at 4 and preserving more flexibility for other workloads.
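A rough way to see the "factor of N" claim is to count operand fetches. The sketch below assumes a simplified model that is not from the article: a scalar FMA pipeline fetches one A operand and one B operand from registers for every multiply-add, a matrix unit fetches each element of A and B exactly once, and accumulator traffic is ignored in both cases.

```python
def operand_reads_scalar_fma(n):
    """n^3 multiply-adds, each fetching one A operand and one B operand."""
    return 2 * n ** 3


def operand_reads_matrix_unit(n):
    """Each of the n^2 elements of A and of B is fetched once and reused."""
    return 2 * n ** 2


def read_reduction(n):
    """How many times fewer operand reads the matrix dataflow needs."""
    return operand_reads_scalar_fma(n) // operand_reads_matrix_unit(n)


for n in (4, 256):
    print(n, operand_reads_scalar_fma(n), operand_reads_matrix_unit(n),
          read_reduction(n))
```

Under this model the saving is 4x for N=4 and 256x for N=256, which is the trade-off the comment describes between the two designs.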

gigatexal9y ago

For me, a layman, reading the matrix multiply stuff, that's what it sounded like as well, given my understanding of SIMD and such. Especially when they made mention of BLAS. But I am no expert.

bmiranda9y ago

Google's hardware is for inference, not training.

josephpmay9y ago

Volta is for both inferencing and training, but has an emphasis on inferencing

gigatexal9y ago

thanks for clarifying.

It doesn't matter, operations are the same in forward and backward mode.

"Made for inference" just means "too slow for training" if you are pessimistic or "optimized for power efficiency" if you are optimistic.

Otherwise training and inference are basically the same

Symmetry9y ago

It's really cool how much performance you can get out of hardware dataflows.

mattnewton9y ago· 7 in thread

Wow, this is just Nvidia running laps around themselves at this point. Xeon Phi still not competitive, AMD focused on the consumer space, looks like the future of training hardware (and maybe even inferencing) belongs to Nvidia. (Disclosure: I am and have been long Nvidia since I found out cuDNN existed and how far ahead it was)

coldtea9y ago

>Xeon Phi still not competitive, AMD focused on the consumer space, looks like the future of training hardware (and maybe even inferencing) belongs to Nvidia.

Assuming there's a big future to training hardware and inferencing. Many of those "new paradigms" / "silver bullet technologies" have come and gone in the last decades.

mattnewton9y ago

That's true, but there is reason to believe this time is different™, with killer applications in medical image understanding, natural language understanding, and self driving cars, all of which could drive demand of these chips by themselves. It is possible we will discover new dominant architectures that don't use this hardware well but I am putting my money on us coming up with even more applications that do use this hardware well.

deepnotderp9y ago

There's something coming for them: deep learning processors.

I'm biased, since I'm part of one, but there's little to no modification of the software stack necessary, so it's a credible threat to nvidia.

mattnewton9y ago

I hope so, if only because it keeps them running at this pace! Kudos for charging the 800lb gorilla head on.

p1esk9y ago

What do you think about them open sourcing DLA of Xavier?

ndesaulniers9y ago

> I am and have been long Nvidia

Today was a great day to be!

kobeya9y ago

The potential disrupter here is RISC-V with vector extensions, which are currently being standardized.

lowglow9y ago· 4 in thread

I'm really happy our startup didn't go all in on Tesla (Pascal architecture) yet. These look amazing.

mattnewton9y ago

I feel like every time I buy cards, Nvidia announces the successor with absurd improvements.

lowglow9y ago

Yeah, I just sprung for a Titan Xp -- waiting for it to become obsolete next month.

dom09y ago

OTOH improvements in the mainstream segments seem to go slower: Mainstream cards are about twice as fast now as they were five years ago.

deepnotderp9y ago

Cuda should run on both, right? Unless you're talking about shader assembly or hardware.

randyrand9y ago· 3 in thread

What are the silver boxes that line both sides of the card? Huge Capacitors?

smitty11109y ago

Ferrite chokes, part of the power delivery system.

hatsunearu9y ago

Inductors, not chokes. Part of the buck converter that creates Vcore.

randyrand9y ago

Why are they needed?

1024core9y ago· 3 in thread

FTA: "GV100 supports up to 6 NVLink links at 25 GB/s for a total of 300 GB/s."

The math doesn't add up.

p1esk9y ago

Bidirectional bandwidth.

orik9y ago

Maybe 25 GB/s each way?

1024core9y ago

That's what I thought too, but then why would they quote unidirectional b/w in one part of the sentence, and bidirectional in the other?
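The reading suggested in the replies can be checked with simple arithmetic. This sketch assumes the 25 GB/s figure is per link and per direction, and that the 300 GB/s total counts both directions; the function name is made up for illustration.

```python
def nvlink_aggregate_gb_per_s(links, gb_per_s_each_way, directions=2):
    """Aggregate bandwidth across all links, counting `directions` directions."""
    return links * gb_per_s_each_way * directions


one_way = nvlink_aggregate_gb_per_s(6, 25, directions=1)
both_ways = nvlink_aggregate_gb_per_s(6, 25)
print(one_way, both_ways)
```

Under that assumption the quoted sentence is consistent: 150 GB/s in each direction, 300 GB/s in total.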

gwbas1c9y ago· 3 in thread

How long until Tesla sues for trademark infringement? "from detecting lanes on the road to teaching autonomous cars to drive" makes it sound like there is an awful lot of overlap in product function.

cr0sh9y ago

I doubt anything like that would happen. While Tesla Motors was founded prior to the creation of the Tesla GPU architecture, there's not really any overlap - in fact, I wouldn't be surprised if Tesla Motors was using something like this from NVidia:

http://www.nvidia.com/object/drive-px.html

As far as any overlap software-wise is concerned, while it isn't super clear what Tesla Motors is doing for their self-driving systems, based on what I've seen it seems like they are using only "basic" lane-detection and identification along with some other algorithmic vision-based systems. I'm not saying that's everything they are doing, just what I have seen released publicly on their vehicle platform.

NVidia, on the other hand, has been experimenting with using neural networks (deep learning CNNs specifically) to drive vehicles using only camera information:

https://arxiv.org/abs/1604.07316

This is actually a fun CNN to implement - I (and many others) implemented variations of it in the first term on Udacity's Self-Driving Car Engineer Nanodegree. We weren't told to do it this way, but I chose to do so after reviewing the various literature, plus it seemed like a challenge (and it was for me). Udacity supplied a simulator:

https://github.com/udacity/self-driving-car-sim

...and we wrote code in Python (Tensorflow and Keras) to train and drive the virtual car. For my part, I had set up my home workstation with CUDA so that Tensorflow would utilize my GPU (a lowly GTX 750 TI SC - though it seems like it might have a similar GPU capability as NVidia's Drive-PX system, based on what I've researched - a Mini-ITX mobo, a PCI-E slot riser, and a GTX 750 would make a decent low-end deep-learning platform for self-driving vehicle experiments, and cost a fraction of what the Drive-PX sells for).

sargun9y ago

Tesla Motors uses Tegra chips to power their console. So, nVidia is probably okay.

What hardware do you think Tesla is using...?

bmiranda9y ago· 2 in thread

815 mm^2 die size!

That's at the reticle limit of TSMC, a truly absurd chip.

kurthr9y ago

I agree... there's not much more they can do to scale since off die is still slow. Unless they stitch across the exposure boundary!

However, they have been at the reticle limit since they were in 28nm. GM200 (980 Ti and Titan X) was 601 mm^2 at TSMC... the maximum possible at the time.

tostitos19799y ago

I've seen some huge mainframe die back in the day. What is reticle limit exactly? Thanks for educating a SW guy :)

https://devblogs.nvidia.com/parallelforall/inside-volta/

arnon9y ago· 2 in thread

This is odd for NVIDIA. They usually push out revised versions in the second year, not change the entire architecture to the new one.

Feels like they're feeling AMD breathing down their necks with their VEGA architecture, which should be very interesting.

AMD have also stepped up their game with ROCm which might take a chunk out of CUDA.

Robadob9y ago

As I recall, Volta (3D memory) has been delayed multiple times due to supply, and this is only a very limited release of their highest-end hardware for deep learning, a field where they don't really have any competition, all pegged for a Q3/Q4 release.

Can't imagine we will be seeing any Volta GeForce cards released till next year.

dogma11389y ago

Volta GeForce will come early 2018 likely with GDDR6 at this point.

Athas9y ago· 2 in thread

Does this architecture improve on 64-bit integer performance? Have any of the GPU manufacturers said anything about that? At some point it becomes a necessity for address calculations on large arrays.

sipherhex9y ago

"With independent, parallel integer and floating point datapaths, the Volta SM is also much more efficient on workloads with a mix of computation and addressing calculations"

Under "New SM" in "Key Features" section

jabl9y ago

But if you read the article it seems the integer units are int32, so not capable of 64-bit computations.
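For context on why int32 units do not rule out 64-bit address math, only make it slower: a 64-bit add can be composed from two 32-bit adds plus a carry. The sketch below shows the general technique in Python with explicit 32-bit masking. It is an illustration of the idea, not a description of the instructions Volta actually issues, and all names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def split64(x):
    """Split a 64-bit unsigned value into (low, high) 32-bit halves."""
    return x & MASK32, (x >> 32) & MASK32


def join64(lo, hi):
    """Combine (low, high) 32-bit halves into one 64-bit unsigned value."""
    return (hi << 32) | lo


def add64_with_32bit_ops(a_lo, a_hi, b_lo, b_hi):
    """64-bit wrapping add using only 32-bit adds and a carry bit."""
    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0          # low half wrapped around
    hi = (a_hi + b_hi + carry) & MASK32
    return lo, hi


base = 0x00000001FFFFFFF0     # hypothetical array base address
offset = 0x20                 # hypothetical byte offset into the array
address = join64(*add64_with_32bit_ops(*split64(base), *split64(offset)))
print(hex(address))
```

Each 64-bit add costs at least two dependent 32-bit operations in this scheme, which is why native 64-bit integer throughput matters for indexing very large arrays.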

caenorst9y ago· 2 in thread

Did they communicate any release date and price during the show?

abhshkdzOP9y ago

DGX-1 with Volta — $149k, Q3; DGX Home Station with Volta — $69k, Q3

tanderson929y ago

Any information about when this architecture will make it onto Tesla or Quadro products available to "mass" market?

grondilu9y ago· 1 in thread

I was wondering if this will be used in supercomputers. Apparently yes: https://en.wikipedia.org/wiki/Summit_(supercomputer)

> Summit is a supercomputer being developed by IBM for use at Oak Ridge National Laboratory.[1][2][3] The system will be powered by IBM's POWER9 CPUs and Nvidia Volta GPUs.

Summit is supposed to be finished in 2017, though. I'm quite surprised this is possible since the Volta architecture has only just now been announced.

Scaevolus9y ago

The Summit contract was signed in November 2014: http://www.anandtech.com/show/8727/nvidia-ibm-supercomputers

Supercomputers have very long planning and development cycles. So do GPUs and CPUs. The contract specified chips that didn't yet exist (Volta and POWER9) as much more than codenames on a roadmap.

Etheryte9y ago· 1 in thread

Interesting to note that Nvidia's stock rose about 18% (!, 102.94USD on May 9, 121.29USD on May 10) in a single day after this announcement. I expected the market to react, but this seems disproportionate.

virtuallynathan9y ago

They announced this the day after earnings, earnings caused the jump, this compounded (maybe).

Symmetry9y ago

I wonder if the individual lane PCs will pave the way for implementing some of Andy Glew's ideas for increased lane utilization in future revisions?

http://parlab.eecs.berkeley.edu/sites/all/parlab/files/20090...

braindead_in9y ago

So when are the new AWS instances coming?

boulos9y ago

My favorite outcome of Volta is that it's the first GPU they've produced that actually can claim this SIMT thing due to its separate program counters (we had a spirited debate about whether or not just doing masking but presenting the programming model meant the chip was SIMT or just that CUDA was but GPUs weren't).
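The distinction in that debate can be illustrated with a toy model. The first function below runs a divergent branch the "masking" way, with one shared program counter and an active mask per pass; the second gives every lane its own program counter and lets a trivial scheduler issue one PC value at a time. Both are simplified sketches with invented names and are not a model of Volta's real scheduler.

```python
def run_masked(values):
    """One shared PC: each branch side is issued once under an active mask."""
    cond = [v % 2 == 0 for v in values]
    out = list(values)
    passes = 0
    for taken, op in ((True, lambda v: v * 2), (False, lambda v: v + 1)):
        mask = [c == taken for c in cond]
        if any(mask):
            passes += 1
            out = [op(v) if m else v for v, m in zip(out, mask)]
    return out, passes


def run_with_per_thread_pc(values):
    """Every lane has its own PC; lanes sharing a PC are issued together.

    Toy program: 0 = branch on parity, 1 = double (even lanes),
    2 = increment (odd lanes), 3 = done.
    """
    pcs = [0] * len(values)
    out = list(values)
    issued = 0
    while any(pc != 3 for pc in pcs):
        pc = min(p for p in pcs if p != 3)   # pick one PC to issue
        issued += 1
        for lane, lane_pc in enumerate(pcs):
            if lane_pc != pc:
                continue
            if pc == 0:
                pcs[lane] = 1 if out[lane] % 2 == 0 else 2
            elif pc == 1:
                out[lane] *= 2
                pcs[lane] = 3
            else:
                out[lane] += 1
                pcs[lane] = 3
    return out, issued


warp = [1, 2, 3, 4]
masked_out, masked_passes = run_masked(warp)
pc_out, pc_issued = run_with_per_thread_pc(warp)
print(masked_out, pc_out)
```

Both models compute the same values; what changes is where the control state lives, which is the point of contention over whether masking alone qualifies as SIMT.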

tobyhinloopen9y ago· 13 in thread

Time to play some games on it

mtgx9y ago

I have a feeling eventually Nvidia will, like Intel, de-prioritize the consumer market in favor of the much more profitable server/machine learning market.

jra1019y ago

Gaming GPUs still more than 50% of revenue for NVIDIA:

http://www.anandtech.com/show/11361/nvidia-announces-earning...
