I can say it is a delight to work with. All the usual GPU tips and tricks still apply, of course, and you need to pay careful attention to sequential memory accesses and so on (as with all GPU programming). But staying in one high-level language is a real boon, and having access to native types and methods directly in my kernels is fantastic. I can't speak highly enough of it.
As for performance, I see a speedup of 3 to 4 orders of magnitude, about as fast as native CUDA.
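On the sequential memory access point, here is a toy model in plain Python of why it matters. It is not real GPU code, only address arithmetic. The warp size of 32, the 4-byte elements, and the 128-byte transaction size are assumptions typical of NVIDIA hardware, and `segments_touched` is a made-up helper name.

```python
WARP_SIZE = 32        # threads per warp (assumed, typical for NVIDIA GPUs)
ELEM_BYTES = 4        # one 32-bit float per thread (assumed)
SEGMENT_BYTES = 128   # size of one memory transaction (assumed)


def segments_touched(stride):
    """Count the distinct memory segments one warp touches
    when thread t reads element t * stride of an aligned array."""
    addresses = (t * stride * ELEM_BYTES for t in range(WARP_SIZE))
    return len({addr // SEGMENT_BYTES for addr in addresses})


sequential = segments_touched(1)   # neighbouring threads read neighbouring elements
strided = segments_touched(32)     # each thread jumps 32 elements ahead
```

In this model the sequential pattern is served by a single segment, while the strided one needs a separate segment per thread, which is roughly why strided kernels end up bandwidth-starved.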
One of Julia's strengths is its macro and type system. StructArrays.jl uses them to create a struct-of-arrays (SoA) out of an array-of-structs (AoS). This is a killer feature that generally requires some form of code generation in C/C++.
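To make the AoS-to-SoA idea concrete, here is a rough sketch in plain Python. It is not how StructArrays.jl is implemented and it does not use its API; `Particle` and `to_soa` are hypothetical names chosen only to show the change in layout.

```python
from array import array
from dataclasses import dataclass, fields


@dataclass
class Particle:
    x: float
    y: float
    mass: float


def to_soa(records, record_type):
    """Turn a list of records (AoS) into one contiguous
    array of doubles per field (SoA), keyed by field name."""
    names = [f.name for f in fields(record_type)]
    return {name: array("d", (getattr(r, name) for r in records))
            for name in names}


# AoS: each element carries all of its fields together.
aos = [Particle(float(i), 2.0 * i, 1.0) for i in range(4)]

# SoA: each field lives in its own contiguous buffer.
soa = to_soa(aos, Particle)
total_mass = sum(soa["mass"])  # touches only the mass buffer
```

A loop over a single field now walks one contiguous buffer instead of hopping across whole records, which is the property that makes the layout friendly to SIMD and GPUs.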
Even if you're just doing something on the CPU, it should set you up to be both SIMD- and GPU-friendly. They have a guide on how to swap out the underlying array storage from CPU to GPU memory.
fwiw, CUDA is a "Tier 1" supported architecture[1], where "Tier 1" is defined as:
> Tier 1: Julia is guaranteed to build from source and pass all tests on these platforms when built with the default options. Official binaries are always available and CI is run on every commit to ensure support is actively maintained.
Do you happen to have some examples of these you could share? Sounds interesting. Why is a GPU needed for radio imaging?
Why is bounds checking a performance killer?
It's also great to see how well CUDA is supported in Julia. I've started to pick up Julia lately, and find it incredibly pleasant to work with. It feels like a lovely mix of Haskell, Lisp, and Python, with a really nice REPL.
https://rogerjohansson.blog/2008/12/07/genetic-programming-e...