If you want cross-platform compatibility (kinda), go for OpenCL; if you want the best performance, go for Metal. Both use a very similar C-like language for kernels, but Metal is generally more efficient on Apple hardware.
> Have you had any luck?
Not in ML, but I do a lot of GPGPU on Metal, and I recently started doing it in Rust. It's a bit less convenient than with Swift/Objective-C, but still possible. Worst case, you'll have to add an .mm file and bridge it with `extern "C"`. That said, doing GPGPU is not doing ML, and most ML libraries are in Python.
> I also got confused as to whether a 'shader' was more for the visual GPU output of things, or if it was also a building block for model training/networks/machine learning/etc.
A shader is basically a function that runs once for every element of the output buffer. We generally call them kernels for GPGPU, and shaders (geometry, vertex, fragment) for graphics stuff. You have to write them in a language that kinda looks like C (OpenGL GLSL, DirectX HLSL, Metal MSL), but is designed around the SIMT (single instruction, multiple threads) execution model of GPUs.
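To make the "function that runs for every element" idea concrete, here is a rough CPU-only sketch in plain Python. It does not touch any GPU API: `dispatch` and `saxpy_kernel` are hypothetical names made up for illustration, and the loop stands in for what a GPU would run in parallel.

```python
# CPU-only emulation of the kernel execution model; no GPU API is involved.
# `dispatch` and `saxpy_kernel` are hypothetical names used for illustration.

def dispatch(kernel, output, *inputs):
    """Call `kernel` once per output index (a GPU would do this in parallel)."""
    for gid in range(len(output)):  # gid plays the role of the thread id
        kernel(gid, output, *inputs)


def saxpy_kernel(gid, out, a, x, y):
    # Each invocation writes exactly one output element and does not
    # depend on what the other invocations write.
    out[gid] = a * x[gid] + y[gid]


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
result = [0.0] * len(x)

dispatch(saxpy_kernel, result, 2.0, x, y)
print(result)  # [12.0, 24.0, 36.0, 48.0]
```

In a real kernel language, the body of `saxpy_kernel` is roughly what you would write, and the runtime hands you the index instead of a loop.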
Learning shaders will let you run code on the GPU. To do ML, you also need to learn what tensors are, how to compute with them on the GPU, and how to build ML systems using them.
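As a hedged illustration of what "computing tensors on the GPU" tends to look like, here is a matrix multiply written kernel-style, again emulated on the CPU in plain Python. The names `matmul_kernel` and `dispatch_2d` are hypothetical, and real ML libraries use far more optimized kernels (tiling, shared memory, and so on); this only shows the one-invocation-per-output-element shape.

```python
# CPU-only emulation of a 2D kernel launch; names are hypothetical.

def matmul_kernel(row, col, out, a, b, inner):
    # One invocation computes one element of the output tensor:
    # the dot product of a row of `a` and a column of `b`.
    acc = 0.0
    for k in range(inner):
        acc += a[row][k] * b[k][col]
    out[row][col] = acc


def dispatch_2d(kernel, out, *args):
    # Stand-in for a 2D grid launch: one "thread" per (row, col) pair.
    for row in range(len(out)):
        for col in range(len(out[0])):
            kernel(row, col, out, *args)


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[0.0, 0.0], [0.0, 0.0]]

dispatch_2d(matmul_kernel, c, a, b, 2)
print(c)  # [[19.0, 22.0], [43.0, 50.0]]
```

Layers in a neural network mostly boil down to operations of this kind, which is why they map well onto GPUs.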
I recommend ShaderToy [0] if you want a cool way to understand and play with shaders.