I wish Intel had spent some transistors on an arbitrary-precision decimal floating-point unit. That would have helped scientific processing, but in the past it has been 'too expensive' in terms of transistors to implement. Now that we have more transistors than we know what to do with, it seems like that should be revisited.
Then generalize the vector coprocessing abilities of the GPU, and that would be a pretty flexible base to work from.
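To make the decimal point concrete, here is a rough sketch of what software does today in place of such hardware, using Python's standard `decimal` module. The function names (`binary_sum`, `decimal_sum`, `one_third`) are just made up for illustration, and this says nothing about how a hardware unit would actually be designed; it only shows the behavior I'd like to get without the software overhead.

```python
from decimal import Decimal, localcontext


def binary_sum():
    # Ordinary hardware binary floating point: 0.1 and 0.2 have no exact
    # binary representation, so the sum is slightly off from 0.3.
    return 0.1 + 0.2


def decimal_sum():
    # Software decimal arithmetic: 0.1 and 0.2 are represented exactly.
    return Decimal("0.1") + Decimal("0.2")


def one_third(digits):
    # Precision is chosen by the caller rather than fixed by the hardware
    # (illustrative helper, not part of the decimal module).
    with localcontext() as ctx:
        ctx.prec = digits
        return Decimal(1) / Decimal(3)


if __name__ == "__main__":
    print(binary_sum() == 0.3)              # False
    print(decimal_sum() == Decimal("0.3"))  # True
    print(one_third(50))                    # 1/3 to 50 significant digits
```

All of this runs in software on top of the integer units, which is exactly the cost a dedicated unit would be meant to remove.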