
rm445 · 12y ago

I'm daring to post from a position of ignorance, trusting intuition to be correct. But this is a framework for shipping (graphical and other) workloads over to the GPU. Surely it isn't mobile-only?

Just for instance, Apple sell an extremely expensive workstation with two very powerful graphics cards. Providing better ways to use a Mac Pro's GPUs is a good way to sell more Mac Pros, and a way to unlock more computational power as the power of CPU cores has levelled off.


14 comments · 5 top-level

sliverstorm · 12y ago

What we know: the stated goal of Metal is exactly the same as the goal of Mantle: reducing CPU overhead.

Hypothesis 1: Metal aims to replicate that for iOS, while Mantle can be used in the Mac Pro (which uses ATI cards)

Hypothesis 2: Metal could wrap around Mantle on OS X, and around some other similar interface on iOS where Mantle is not available, giving a unified Apple interface without having to write their own ATI drivers

wmf · 12y ago

I suspect Apple wants to hedge their bets between ATI and Nvidia and thus will not support Mantle (or CUDA, which exists on OS X but appears to receive no love from Apple).

TazeTSchnitzel · 12y ago

Don't Apple already write their own graphics drivers on OS X?

sliverstorm · 12y ago

Sorry, I was unintentionally vague, I meant without having to write their own version of Mantle. The other upside being if Metal wraps around Mantle, they can still expose Mantle to software that targets Mantle.


statusgraph · 12y ago

This may be the future, but recall that the GPUs in Macs are not made by Apple or exclusive to their platform in the way they are for the iDevices.

modeless · 12y ago

The GPUs in iOS devices are not made by Apple or exclusive to their platform. They are PowerVR GPUs designed by Imagination Technologies (and fabricated by Samsung). Many Android devices feature the same GPU designs that Apple uses.

statusgraph · 12y ago

How interesting! I believed too much of the A7 stuff =)

amaranth · 12y ago

But you can't replace the GPU and Apple has full source code to the driver stack (and probably wrote most of it themselves instead of using the source from ImgTec).


probablybanned · 12y ago

I agree with you. I'll bet this is coming to OS X sooner rather than later, and people are largely missing its significance as a GPGPU programming environment.

Despite Apple's best efforts, OpenCL uptake seems to be sluggish. CUDA continues to dominate developer mindshare, by providing a far better language, API, and toolchain.

Compare the C++11 subset supported by the Metal shading language to the device language of CUDA C++. Templates are a huge feature. Ahead-of-time compilation is huge (vs shipping strings to the driver like OpenCL). It retains the basic workgroup structure of OpenCL with local and global memory, so it looks feasible to map to NVIDIA and AMD hardware. Is there really anything PowerVR specific in here? People seem to be inferring an awful lot from the name, but nothing sticks out at first glance.

The features of the shading language would make porting applications from CUDA less painful. If they went all in on Xcode dev tools to make it a rival to Nsight for profiling/debugging, maybe those Mac Pro GPUs wouldn't seem so neglected.

foxhill · 12y ago

> Despite Apple's best efforts, OpenCL uptake seems to be sluggish. CUDA continues to dominate developer mindshare, by providing a far better language, API, and toolchain.

i'm afraid i have to disagree with you there. over the past 5 years, CUDA popularity has peaked and is actually starting to decline. i would cite my source for that, but i'm on my phone.

aot vs online compilation is another kettle of fish. aot isn't necessarily better, although it is a more attractive offer for developers if they don't wish to ship kernel source. regardless, OpenCL 2.0 has SPIR, an LLVM IR dialect that addresses this issue.

templates are not a big deal. GPGPU silicon is not well suited for complex computation (for some values of "complex") at least. i really wish C++ language features wouldn't get into the language. it's not going to be good.

if metal can replace OpenGL, then maybe i can get behind this. the failure of Longs Peak really set into motion its relative demise. the API is in serious need of work.

probablybanned · 12y ago

For me, AOT compilation is not about shipping binaries vs. source. The software I write is mostly used for internal research, so it's not going anywhere. It's the difference between getting compilation errors from 'make' or having to pull them out of the OpenCL C API at runtime. Setting a breakpoint in an OpenCL kernel... I remember Intel's stuff basically working fine, but you had to pass a vendor-specific option to hint at the file path. In all these little ways, the workflow is just sucky. Much of it can be worked around, but I'm too lazy to write more application code to do all the chores that the CUDA toolchain takes care of already. I'm glad to hear that 2.0 is fixing this.

What's your alternative to templates for generic code, exactly? C macros? Scripted pre-processing that further screws up the already marginal tool support for debugging and profiling? Copy+paste? Templates are completely orthogonal to "complex computation" -- I just want to use a device function on different data types without run-time overhead.

On the topic of complex computation, I'm constantly surprised at the kind of features NV adds to CUDA and how well they actually work. I'm also surprised at the kinds of things people do on the hardware. If someone implements a high performance lock-free data structure on the GPU, you can't look at that and say oh, that's too complex, you shouldn't do that.

Also, until we're all working on computers that look like the PS4 with a unified global memory, there's a huge incentive to cram the awkward bits of your program onto the GPU any way you can, even if it drags a bit, because that's where the data is.

NVIDIA has really gone to some extremes. malloc in device functions. vtable support. Dynamic parallelism. Metal has none of this stuff.

<EDIT: Just now saw your other replies downthread. I'm leaving this comment because it reflects my personal experience and opinions, but don't feel like you need to repeat yourself to clarify your position re: templates, etc>
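
A minimal sketch of the template point made in this comment, written in plain host C++ so that it builds without a GPU toolchain. The names `fused_scale_add` and `scale_add_all` are hypothetical and do not come from the thread. In CUDA C++ the helper would presumably carry a `__device__` qualifier and the loop would be replaced by a kernel launch; both are left out here as an assumption-laden stand-in.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical generic helper (not from the thread). The compiler emits one
// specialization per element type, so there is no run-time type dispatch.
// In CUDA C++ this is where a `__device__` qualifier would go.
template <typename T>
inline T fused_scale_add(T a, T x, T y) {
    return a * x + y;
}

// Hypothetical host-side stand-in for a kernel: each loop iteration does the
// work that a single GPU thread would do. Both inputs must have equal length.
template <typename T>
std::vector<T> scale_add_all(T a, const std::vector<T>& x,
                             const std::vector<T>& y) {
    assert(x.size() == y.size());
    std::vector<T> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = fused_scale_add(a, x[i], y[i]);
    }
    return out;
}
```

Calling `scale_add_all<float>(...)` and `scale_add_all<int>(...)` instantiates two separately compiled versions of the same source, which is what a macro or copy+paste approach would otherwise have to reproduce by hand.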

fleitz · 12y ago

The Metal API is akin to an ObjC port of OpenGL.

If you have access to OpenGL/CL, Metal isn't going to give you much (other than perhaps a prettier API). Since most PC/Mac games are written in OpenGL/CL it will only make things slower.

The Metal API only really gives you extra perf if you're currently using SceneKit.

valleyer · 12y ago

That’s not what they said in the keynote… they compared it favorably in terms of performance to GL

cwyers · 12y ago

The name "Metal" gives a hint, though -- the API provides better performance by getting you closer to the metal. The metal in the A7 is different than the metal in those Mac Pros.
