But if you are after performance, how do you do the following in Java? Build an AoS (array of structures) so that memory access is linear with respect to the cache. Prefetch. Use things like _mm_stream_ps() to tell the CPU that the cache line you're writing to doesn't need to be fetched first. Share a buffer of memory between processes by atomically incrementing a head pointer.
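To make the last item concrete, here is a rough C++ sketch of the "atomically increment a head pointer" idea. All the names (BumpBuffer, reserve, demo_concurrent_reserve) are made up for illustration, and it uses threads as a stand-in for processes; for real cross-process sharing you would presumably place the same struct in a shared memory mapping, which is not shown here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical fixed-size buffer that several writers carve up
// by atomically bumping a head offset.
struct BumpBuffer {
    static constexpr std::size_t kCapacity = 1 << 16;

    std::atomic<std::size_t> head{0};
    alignas(64) std::uint8_t bytes[kCapacity];

    // Reserve `size` bytes. Returns a pointer to a region that no other
    // writer will receive, or nullptr once the buffer is exhausted.
    // Simplification: a failed reservation still advances `head`.
    std::uint8_t* reserve(std::size_t size) {
        std::size_t offset = head.fetch_add(size, std::memory_order_relaxed);
        if (offset + size > kCapacity) {
            return nullptr;
        }
        return bytes + offset;
    }
};

// Spawns `writers` threads that each reserve `per_writer` regions of
// `size` bytes and fill them. Returns the final head offset.
std::size_t demo_concurrent_reserve(int writers, int per_writer, std::size_t size) {
    auto buf = std::make_unique<BumpBuffer>();
    std::vector<std::thread> threads;
    for (int w = 0; w < writers; ++w) {
        threads.emplace_back([&buf, per_writer, size, w] {
            for (int i = 0; i < per_writer; ++i) {
                std::uint8_t* region = buf->reserve(size);
                if (region != nullptr) {
                    std::memset(region, w, size);  // regions never overlap
                }
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return buf->head.load();
}
```

The point is that a single fetch_add hands each writer a disjoint region with no lock, and the writes within a region are contiguous in memory.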
I'm pretty sure you could build an indie game without low-level C++, but there is a reason that commercial gamedev is typically C++.