I think the reason for reducing clock speed when the vector units are in heavy use is to keep power draw, and with it heat, in check.
You might also find https://blog.cloudflare.com/on-the-dangers-of-intels-frequen... helpful. It goes into detail about a specific case where dynamic frequency scaling resulted in AVX-512 code running slower than AVX2 code.
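
To make the "wider vectors but slower overall" outcome a bit more concrete, here's a toy model in Python. It's only a sketch: all the cycle counts and clock speeds are made-up numbers (not taken from the article), the function name is my own, and it assumes the reduced clock applies to the whole run, scalar code included, which is a simplification.

```python
def runtime_seconds(scalar_cycles, vector_cycles, clock_hz):
    """Total runtime if every cycle, scalar or vector, runs at clock_hz."""
    return (scalar_cycles + vector_cycles) / clock_hz


# Hypothetical workload: mostly scalar code with a small vectorized part.
scalar_cycles = 9e9
avx2_vector_cycles = 1e9
# Assumption: AVX-512 halves the cycles of the vectorized part (ideal 2x).
avx512_vector_cycles = avx2_vector_cycles / 2

# Hypothetical clocks: the core runs slower while AVX-512 is in use.
avx2_clock_hz = 3.0e9
avx512_clock_hz = 2.5e9

avx2_time = runtime_seconds(scalar_cycles, avx2_vector_cycles, avx2_clock_hz)
avx512_time = runtime_seconds(scalar_cycles, avx512_vector_cycles, avx512_clock_hz)

print(f"AVX2:    {avx2_time:.2f} s")
print(f"AVX-512: {avx512_time:.2f} s")
```

With these made-up inputs the AVX-512 run comes out slower (about 3.8 s versus about 3.3 s), because the time saved in the small vectorized part is outweighed by everything else running at the lower clock.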