The thing that blew my mind when I ran into this was that the JVM had already been NUMA-aware for many years at that point, but it was all turned off by default. This was in the JDK 8 days, so things are probably better now.
After I figured out the root cause, I spent about a day logging results in Excel as I tested various GC options. The only saving grace was that, as I recall, the Oracle documentation on GC options and their NUMA effects was pretty good. That optimisation time was really well spent though: I got meaningful improvements in both service latency and throughput.
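
For anyone wanting to check whether NUMA support is actually switched on in a running JVM, here is a minimal sketch. It assumes a HotSpot-based JVM (it relies on the `com.sun.management` diagnostic bean, which other JVMs may not provide), and the class name `NumaFlagCheck` is made up for illustration. It only reads flag values; it does not measure anything.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: prints the current values of a couple of HotSpot flags.
public class NumaFlagCheck {

    // Returns the current value of a HotSpot VM option as a string, e.g. "true" or "false".
    // Throws IllegalArgumentException if the running JVM does not know the option.
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // UseNUMA is the switch for the JVM's NUMA-aware allocation.
        System.out.println("UseNUMA       = " + vmOption("UseNUMA"));
        // Which collector is active matters, since NUMA support depends on the GC in use.
        System.out.println("UseParallelGC = " + vmOption("UseParallelGC"));
    }
}
```

Running it as `java -XX:+UseNUMA -XX:+UseParallelGC NumaFlagCheck` and again without the flags shows the difference. Which collectors honour `-XX:+UseNUMA` varies by JDK version, so check the documentation for the release you are on.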