This is exactly the problem with CORDIC. Each of its 52 adds depends on the result of the one before it, so the data has to move from a register to the ALU and back 52 times, one after another.
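
To make the dependency chain concrete, here is a minimal sketch of rotation-mode CORDIC in fixed point. It is an illustration under my own assumptions, not any particular hardware or library implementation: the names (`cordic_sin_cos`, `FRAC_BITS`, `ANGLES`), the 60 fractional bits, and the choice of rotation mode are all hypothetical. Only the 52-iteration count comes from the comment above.

```python
import math

N = 52          # iteration count, taken from the comment above
FRAC_BITS = 60  # assumed fixed-point precision (hypothetical choice)
ONE = 1 << FRAC_BITS

# Lookup table of atan(2^-i) in fixed point.
ANGLES = [round(math.atan(2.0 ** -i) * ONE) for i in range(N)]

# Aggregate gain correction K = prod(1 / sqrt(1 + 2^-2i)).
K = 1.0
for i in range(N):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))
K_FIXED = round(K * ONE)


def cordic_sin_cos(theta):
    """Return (sin, cos) of theta, for theta roughly within [-pi/2, pi/2]."""
    x, y, z = K_FIXED, 0, round(theta * ONE)
    for i in range(N):
        # Iteration i cannot start until iteration i-1 has produced
        # x, y, and z: the sign of z picks the direction, and the
        # shifted x and y feed the next adds.
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - ANGLES[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ANGLES[i]
    return y / ONE, x / ONE


if __name__ == "__main__":
    s, c = cordic_sin_cos(0.5)
    print(s, math.sin(0.5))
    print(c, math.cos(0.5))
```

Every pass through the loop only shifts and adds, but none of the passes can overlap: each one reads the `x`, `y`, and `z` written by the previous pass. That serial chain is the cost the comment is pointing at.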