http://www.anandtech.com/show/11361/nvidia-announces-earning...
https://finance.yahoo.com/news/heres-much-nvidia-will-make-o...
By having an API that's not horrible to use, that advantage is gone. The utility libraries will be more of a challenge to undermine. Since it targets CUDA natively, there is no disadvantage to users of NVIDIA's hardware, though there is no advantage to others yet (see GLAS[1] for what is possible with relative ease). Using D as the kernel language will also bring significant advantages over C/C++: static reflection, sane templates, and compile-time code generation, to name a few.
You can find it at https://github.com/libmir/dcompute.
If you have any questions, please ask!
Please read this before moving on: https://twitter.com/jrprice89/status/667466444355993600
Also, NVIDIA's CUDA compilers are built on clang, which does have an OpenCL frontend, so all they would need to do is put some resources into making that frontend work with their current nvcc toolchain.
Many people have requested this, but instead NVIDIA is trying hard to hold back OpenCL, just because providing OpenCL 2.0 support (and extensions for their GPUs' features) may help adoption of OpenCL, which in turn may end up helping other folks and companies too.
In supercomputing, this is the problem with using High-Performance Linpack (HPL) for benchmarks: it typically exceeds actual scientific codes by an order of magnitude in floating-point operations per second.
On the other hand, I believe Google is working on a CUDA compiler [1] so we may actually see meaningful improvement in the sense that it may become possible to run CUDA on other GPUs. (Edit: And Google actually has an incentive to achieve performance parity, so it might really happen.)
So, no.
https://www.technologyreview.com/s/602344/the-extraordinary-...
It's like one of Stanisław Lem's stories about Ijon Tichy, where people call intelligent anthropomorphic robots washing machines.
Edit: traditional vector machines like the NEC SX still hold the programmability crown because you get a usable single system image, right?
"Made for inference" just means "too slow for training" if you are pessimistic or "optimized for power efficiency" if you are optimistic.
Otherwise, training and inference are basically the same.
That's assuming there's a big future for training and inference hardware. Many of those "new paradigms" / "silver bullet technologies" have come and gone in the last few decades.
I'm biased, since I'm part of one, but there's little to no modification of the software stack necessary, so it's a credible threat to nvidia.
Today was a great day to be!
The math doesn't add up.
http://www.nvidia.com/object/drive-px.html
As far as any overlap software-wise is concerned, while it isn't super clear what Tesla Motors is doing for their self-driving systems, based on what I've seen it seems like they are using only "basic" lane-detection and identification along with some other algorithmic vision-based systems. I'm not saying that's everything they are doing, just what I have seen released publicly on their vehicle platform.
NVIDIA, on the other hand, has been experimenting with using neural networks (deep learning CNNs specifically) to drive vehicles using only camera information:
https://arxiv.org/abs/1604.07316
This is actually a fun CNN to implement - I (and many others) implemented variations of it in the first term of Udacity's Self-Driving Car Engineer Nanodegree. We weren't told to do it this way, but I chose to do so after reviewing the literature, plus it seemed like a challenge (and it was for me). Udacity supplied a simulator:
https://github.com/udacity/self-driving-car-sim
...and we wrote code in Python (TensorFlow and Keras) to train and drive the virtual car. For my part, I had set up my home workstation with CUDA so that TensorFlow would use my GPU, a lowly GTX 750 Ti SC. From what I've researched, it seems to have GPU capability similar to NVIDIA's Drive-PX system: a Mini-ITX mobo, a PCI-E slot riser, and a GTX 750 would make a decent low-end deep-learning platform for self-driving vehicle experiments, and cost a fraction of what the Drive-PX sells for.
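For anyone curious about the size of that network, here is a rough sketch that traces the feature-map shapes through the convolutional stack. The layer sizes (a 66x200x3 input, three 5x5 stride-2 convolutions, then two 3x3 stride-1 convolutions, all unpadded) are my reading of the arXiv paper linked above, so treat them as assumptions rather than the exact network anyone trained. The helper names `conv_out`, `trace_shapes`, and `flattened_size` are made up for illustration. It is plain Python and needs neither TensorFlow nor a GPU.

```python
# Sketch only: layer sizes are assumed from arXiv:1604.07316, not from any
# particular implementation. All convolutions are unpadded ("valid").

# (filters, kernel size, stride) for each convolutional layer
CONV_LAYERS = [(24, 5, 2), (36, 5, 2), (48, 5, 2), (64, 3, 1), (64, 3, 1)]


def conv_out(size, kernel, stride):
    """Output length along one axis of an unpadded convolution."""
    return (size - kernel) // stride + 1


def trace_shapes(height=66, width=200, channels=3):
    """Return the (height, width, channels) shape after the input and each conv layer."""
    shapes = [(height, width, channels)]
    for filters, kernel, stride in CONV_LAYERS:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
        shapes.append((height, width, filters))
    return shapes


def flattened_size(shapes):
    """Number of values fed to the first fully connected layer."""
    height, width, channels = shapes[-1]
    return height * width * channels


if __name__ == "__main__":
    shapes = trace_shapes()
    for shape in shapes:
        print(shape)
    print("flattened:", flattened_size(shapes))
```

Under those assumptions the last feature map is tiny, which is part of why a low-end card can handle this model.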
That's at the reticle limit of TSMC, a truly absurd chip.
However, they have been at the reticle limit since they were on 28nm. GM200 (980 Ti and Titan X) was 601 mm^2 at TSMC... the maximum possible at the time.
Feels like they're feeling AMD breathing down their necks with their Vega architecture, which should be very interesting.
AMD have also stepped up their game with ROCm, which might take a chunk out of CUDA.
Can't imagine we will be seeing any Volta GeForce cards released till next year.
https://devblogs.nvidia.com/parallelforall/inside-volta/
Under "New SM" in "Key Features" section
> Summit is a supercomputer being developed by IBM for use at Oak Ridge National Laboratory.[1][2][3] The system will be powered by IBM's POWER9 CPUs and Nvidia Volta GPUs.
https://en.wikipedia.org/wiki/Summit_(supercomputer)
Summit is supposed to be finished in 2017, though. I'm quite surprised this is possible since the Volta architecture has only just now been announced.
Supercomputers have very long planning and development cycles. So do GPUs and CPUs. The contract specified chips (Volta and POWER9) that didn't yet exist as much more than codenames on a roadmap.
http://parlab.eecs.berkeley.edu/sites/all/parlab/files/20090...