Seriously, this keeps getting overhyped as some gigantic insight when it was really just a consequence of the Pentium having been released in 1993. With the Pentium you got both reliable FPU availability (none of the 486SX pain) and an FDIV cycle count that dropped by almost 50% (73->39 IIRC).
Everybody doing 3d gfx knew you needed a perspective divide and was looking at ways to do that cheaply. Interpolation + a long-latency instruction that doesn't block the main pipelines is a fairly straightforward answer.
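
For anyone who hasn't seen the trick written out, here's a rough sketch in C of the interpolation half. This is not anybody's actual engine code: the function names, the parameter layout, and the span length are all made up for illustration. It relies on u/z and 1/z being linear in screen space, does one real divide per sub-span, and linearly interpolates u in between. C can't show the other half (the divide for the next sub-span running in the FPU while the integer pipes draw the current one); that part is down to instruction scheduling.

```c
#include <assert.h>

/* Reference version (hypothetical helper): one divide per pixel.
   uoz = u/z and ooz = 1/z, both linear across the scanline. */
void span_u_exact(float uoz0, float uoz_step, float ooz0, float ooz_step,
                  int count, float *u_out)
{
    assert(count > 0);
    for (int x = 0; x < count; x++) {
        float uoz = uoz0 + uoz_step * (float)x;
        float ooz = ooz0 + ooz_step * (float)x;
        u_out[x] = uoz / ooz;
    }
}

/* Subdivided version (hypothetical helper): the perspective divide happens
   only at sub-span boundaries, and u is stepped linearly in between.
   span is the sub-span length in pixels, e.g. 8 or 16. */
void span_u_subdivided(float uoz0, float uoz_step, float ooz0, float ooz_step,
                       int count, int span, float *u_out)
{
    assert(count > 0 && span > 0);
    float u_left = uoz0 / ooz0;          /* exact u at the first pixel */
    int x = 0;
    while (x < count) {
        int n = (count - x < span) ? (count - x) : span;

        /* The one perspective divide for this sub-span: exact u at its
           right edge. On a Pentium this is the long-latency FDIV that
           could be issued early and left to finish in the background. */
        float uoz_r = uoz0 + uoz_step * (float)(x + n);
        float ooz_r = ooz0 + ooz_step * (float)(x + n);
        float u_right = uoz_r / ooz_r;

        /* Affine step inside the sub-span. With a fixed power-of-two span
           this divide by n would be a shift or a constant multiply. */
        float du = (u_right - u_left) / (float)n;
        float u = u_left;
        for (int i = 0; i < n; i++) {
            u_out[x + i] = u;
            u += du;
        }

        x += n;
        u_left = u_right;                /* next sub-span starts exact */
    }
}
```

The output is exact at every sub-span boundary and drifts in between, so a shorter span means less visible warping at the cost of more divides. With span set to 1 it degenerates into the per-pixel version.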