Also, it would just make sense. If copying an entire block or memory page, as in a "BitBlt", is a single command, why should the CPU spend cycles actually doing it? It seems like the lowest-hanging fruit to automate inside the SDRAM itself.
It just seems like the easiest example of SIMD.
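
To make the point concrete, here is roughly what a BitBlt-style copy looks like when the CPU does the work. This is only a sketch in C, not the actual Windows `BitBlt`; `blit_rows` and its parameters are names I made up for illustration, and it assumes one byte per pixel. What looks like "one command" from the caller's side is a loop in which every row is read and written through the CPU's load/store path:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a real API): copy a width x height rectangle
 * of bytes from src to dst. Each buffer has its own stride, i.e. the
 * number of bytes from the start of one row to the start of the next. */
static void blit_rows(uint8_t *dst, size_t dst_stride,
                      const uint8_t *src, size_t src_stride,
                      size_t width, size_t height)
{
    /* A row must fit inside one stride of each buffer. */
    assert(width <= dst_stride && width <= src_stride);

    for (size_t y = 0; y < height; y++) {
        /* One memcpy per row: the CPU loads the bytes and stores them again. */
        memcpy(dst + y * dst_stride, src + y * src_stride, width);
    }
}
```

Calling `blit_rows(dst + 1 * 6 + 1, 6, src, 4, 2, 2)` would copy the top-left 2x2 corner of a 4-byte-wide source into a 6-byte-wide destination at position (1, 1). The data itself never changes, which is why it feels like something the memory could do on its own.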