I am a well-known OSS developer with hundreds of commits in OpenZFS and many commits in other projects like Gentoo and the Linux kernel. You keep misreading what I wrote and insisting that I said something I did not. The issue is your lack of understanding, not mine.
I said that supporting 2 AVX-512 reads per cycle instead of 1 AVX-512 read per cycle does not actually matter very much for performance. You decided that means I said that AVX-512 does not matter. These are very different things.
If you try to use 2 AVX-512 reads per cycle for some workload (e.g. checksumming, GEMV, memcpy, etc.), you are going to be memory-bandwidth bound, so the code will run no faster than if it did 1 AVX-512 read per cycle. I have written SIMD-accelerated code for CPUs, and the CPU being able to issue 2 SIMD reads per cycle would make zero difference for performance in all the cases where I would want to use it. The only way 2 AVX-512 reads per cycle would be useful is if system memory could keep up, but it cannot.
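To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 4 GHz clock and the dual-channel DDR5-5600 figure are illustrative assumptions, not measurements, and the model assumes a streaming workload whose data does not fit in cache:

```python
# Back-of-the-envelope: how fast one core can request data vs. what DRAM can deliver.
# All figures are illustrative assumptions, not measurements of any specific CPU.

AVX512_BYTES = 64                      # one 512-bit read moves 64 bytes
CLOCK_HZ = 4.0e9                       # assumed core clock: 4 GHz
DRAM_BYTES_PER_SEC = 2 * 8 * 5600e6    # assumed dual-channel DDR5-5600, theoretical peak


def load_bandwidth(reads_per_cycle):
    """Peak bytes/second one core could request at the given read issue rate."""
    return reads_per_cycle * AVX512_BYTES * CLOCK_HZ


def streaming_throughput(reads_per_cycle):
    """A streaming kernel runs at the slower of the core limit and the DRAM limit."""
    return min(load_bandwidth(reads_per_cycle), DRAM_BYTES_PER_SEC)


if __name__ == "__main__":
    for n in (1, 2):
        print(f"{n} read(s)/cycle: core can request {load_bandwidth(n) / 1e9:.1f} GB/s, "
              f"streaming kernel gets {streaming_throughput(n) / 1e9:.1f} GB/s")
```

Under these assumed numbers, a single core at 1 read per cycle can already request 256 GB/s, well above the 89.6 GB/s theoretical DRAM peak, so in this model doubling the read rate does not change the achieved throughput.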