Show HN: Eyot, A programming language where the GPU is just another thread

(cowleyforniastudios.com)

79 points · steeleduncan · 3mo ago · 18 comments

18 comments · 6 top-level

sourcegrift · 3mo ago · 5 in thread

Don't mean to be a Rust fanatic or whatever, but does anyone know of anything similar for Rust?

embedding-shape · 3mo ago

Not similar in the way of "Decorate any function and now it's a thread on the GPU", but Candle has been pretty neat for experimenting with ML in Rust, and it makes it easy to move things between CPU and GPU. It's more of a library than a DSL, though: https://github.com/huggingface/candle

notnullorvoid · 3mo ago

It seems somewhat similar to rust-gpu https://github.com/Rust-GPU/rust-gpu

steeleduncan (OP) · 3mo ago

I'm not totally sure what it is, but I believe there is something for running Rust code on the GPU easily

ModernMech · 3mo ago

You could use wgpu to replicate this demo.

https://wgpu.rs

wingertge · 3mo ago

I hate doing self-promotion, but this is basically exactly what CubeCL does. CubeCL is a bit more limited because as a proc macro we can't see any real type info, but it's the closest thing I'm aware of. Other solutions need a bunch of boilerplate and custom (nightly-only) compiler backends.

MeteorMarc · 3mo ago · 4 in thread

That is fun: it lends c-style block markers (curly braces) and python-style line separation (new lines). No objection.

steeleduncan (OP) · 3mo ago

It uses the same trick as Go [1]. The grammar has semicolons, but the tokeniser silently inserts them for ease of use. I think quite a few languages do it now.

[1] https://go.dev/doc/effective_go#semicolons
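
A minimal sketch of the semicolon-insertion trick described above, loosely modelled on Go's rule: a semicolon is added after a line's final token when that token could end a statement. The token set and the names `insert_semicolons` and `ends_statement` are assumptions for illustration, not Eyot's actual tokeniser. It treats every identifier or number as a possible statement end (Go only does so for a few keywords), and it ignores string literals and comments.

```python
import re

# Closing brackets and postfix operators that may end a statement.
CLOSERS = {")", "]", "}", "++", "--"}

# Numbers, identifiers, the two-character operators, then any other symbol.
TOKEN_RE = re.compile(r"\d+|[A-Za-z_]\w*|\+\+|--|[^\s\w]")


def ends_statement(tok):
    """True if a newline after this token should terminate the statement."""
    return tok in CLOSERS or tok[0].isalnum() or tok[0] == "_"


def insert_semicolons(source):
    """Tokenise line by line, appending ';' where the rule above applies."""
    out = []
    for line in source.splitlines():
        toks = TOKEN_RE.findall(line)
        out.extend(toks)
        if toks and ends_statement(toks[-1]):
            out.append(";")
    return out


# A trailing '+' keeps the expression open, so no semicolon lands after line 1.
tokens = insert_semicolons("x = 1 +\n2\nf(x)")
print(tokens)
# ['x', '=', '1', '+', '2', ';', 'f', '(', 'x', ')', ';']
```

Because the parser only ever sees the token stream, the grammar itself can keep explicit semicolons while the source files rarely need them.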

maxloh · 3mo ago

JavaScript and Kotlin do that too.

NuclearPM · 3mo ago

Lends? What does that mean?

skavi · 3mo ago

blends

shubhamintech · 3mo ago · 2 in thread

The latency point matters more than it looks, imo. GPU work isn't just async CPU work at a different speed; the cost model is completely different. In LLM inference, the hard scheduling problem is batching non-uniform requests where prompt lengths and generation lengths vary, and treating that like normal thread scheduling leads to terrible utilization. Would be curious if Eyot has anything to say about non-uniform work units.
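
To put rough numbers on the utilization point, here is a small sketch under a deliberately simple assumption: each batch is padded to its longest member, so a batch costs `max(length) * batch_size` token slots. The prompt lengths and the `padded_utilization` helper are hypothetical, and real inference servers use more elaborate schedulers than this.

```python
def padded_utilization(lengths, batch_size):
    """Fraction of padded token slots that carry real tokens."""
    useful = padded = 0
    for i in range(0, len(lengths), batch_size):
        batch = lengths[i:i + batch_size]
        useful += sum(batch)
        padded += max(batch) * len(batch)  # every request padded to the longest
    return useful / padded


# Hypothetical prompt lengths arriving in mixed order.
lengths = [10, 1000, 20, 900, 15, 950, 30, 1000]

naive = padded_utilization(lengths, batch_size=2)             # arrival order
bucketed = padded_utilization(sorted(lengths), batch_size=2)  # grouped by length

print(f"arrival order: {naive:.2f}, bucketed by length: {bucketed:.2f}")
# arrival order: 0.51, bucketed by length: 0.98
```

Scheduling in arrival order, as a generic thread scheduler would, wastes about half the slots in this toy case; grouping similar lengths recovers nearly all of them.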

steeleduncan (OP) · 3mo ago

Not right now, it is far too early days. I'm currently working through bugs and missing stdlib pieces to get a simple backpropagation network running efficiently. Once I'm happy with that, I'd like to move on to more complex models.

CyberDildonics · 3mo ago

What is the new language doing that can't be done with an already established language that is worth sacrificing an entire standard library?

LorenDB · 3mo ago · 1 in thread

This reminds me that I'd love to see SYCL get more love. Right now, out of the computer hardware manufacturers, it seems that only Intel is putting any effort into it.

jamiejquinn · 3mo ago

CUDA having had such a wide moat for so long has completely warped the GPU software ecosystem. There just isn't any incentive for Nvidia to meaningfully contribute to any external, standards-driven effort like SYCL or OpenCL. Real shame, because it leads to a tonne of duplicated effort as AMD and Intel try to reimplement the exact same libraries as Nvidia (and usually worse, because neither seems to prioritise good software for whatever reason).

teleforce · 3mo ago

Perhaps any new language targeting GPU acceleration could consider the tile-based concepts and primitives recently supported by major GPU vendors, including Nvidia [1],[2],[3],[4].

For more generic GPU targets there's Triton [5],[6].

[1] NVIDIA CUDA 13.1 Powers Next-Gen GPU Programming with NVIDIA CUDA Tile and Performance Gains: https://developer.nvidia.com/blog/nvidia-cuda-13-1-powers-ne...

[2] Nvidia Tilus: A Tile-Level GPU Kernel Programming Language: https://github.com/NVIDIA/tilus

[3] Simplify GPU Programming with NVIDIA CUDA Tile in Python: https://developer.nvidia.com/blog/simplify-gpu-programming-w...

[4] Tile Language: https://github.com/tile-ai/tilelang

[5] Triton: An Intermediate Language and Compiler for Tiled Neural Network Computations: https://dl.acm.org/doi/10.1145/3315508.3329973

[6] Triton: https://github.com/triton-lang/triton
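
For readers unfamiliar with the tile idea, a plain-Python sketch of a tiled (blocked) matrix multiply follows. It only illustrates the general concept of working on small sub-blocks at a time; it does not use or imitate the APIs of CUDA Tile, Tilus, TileLang, or Triton, and the function name and `tile` parameter are assumptions made for this example.

```python
def matmul_tiled(a, b, tile=2):
    """Multiply matrices (lists of rows) one tile-sized block at a time."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    # Outer loops walk over tiles of the output and of the shared dimension.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # Inner loops stay within one tile; on a GPU this is the part
                # that would operate on data held in fast on-chip memory.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
print(matmul_tiled(a, b))
# [[58, 64], [139, 154]]
```

Tile-level languages let the programmer write roughly the outer loops and leave the mapping of each tile onto threads and memory to the compiler.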

CyberDildonics · 3mo ago

Every time someone does something with threading and makes it a language feature it always seems like it could just be done with stock C++.

Whatever this is doing could be wrapped up in another language.

Either way, it's arguable whether that is even a good idea, since dealing with a regular thread in the same memory space, getting data to and from the GPU, and doing computations on the GPU are all completely separate and have different latency characteristics.
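
A back-of-the-envelope sketch of why those three costs behave so differently. Every figure below (launch overhead, transfer bandwidth, CPU and GPU throughput) is a hypothetical placeholder, not a measurement, and the model ignores overlap between transfers and compute.

```python
def gpu_time(n_bytes, flops, launch_s=10e-6, bytes_per_s=16e9, gpu_flops=10e12):
    """Launch overhead + copying data in and out + compute (hypothetical rates)."""
    return launch_s + 2 * n_bytes / bytes_per_s + flops / gpu_flops


def cpu_time(flops, cpu_flops=100e9):
    """Compute only: a CPU thread already shares the caller's memory."""
    return flops / cpu_flops


def should_offload(n_bytes, flops):
    """True when the modelled GPU path beats the modelled CPU path."""
    return gpu_time(n_bytes, flops) < cpu_time(flops)


# Tiny job: the fixed launch and transfer costs dominate, so the CPU wins.
small = should_offload(n_bytes=4_096, flops=1e4)
# Large, compute-heavy job: the GPU's throughput pays for the transfer.
large = should_offload(n_bytes=4_000_000, flops=1e10)

print(small, large)
# False True
```

Under these assumed numbers the same "spawn a thread" call is a loss for small jobs and a large win for big ones, which is the difference in latency characteristics the comment points at.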
