The other big, close-to-impossible problem is that most code isn't written in a way that could take advantage of autoparallelization even in theory. You really want data to be in struct-of-arrays format, but most code tends to be written in array-of-structs format. This means that even when vectorization has been proven legal (whether by user assertion or by a sufficiently smart compiler), the vectorization cost model sees that it needs to do a lot of gathers and scatters, and it gives up very quickly on what could have been a viable path to vectorization.
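To make the layout difference concrete, here's a minimal sketch in C. The type and function names (`point_aos`, `len2_aos`, `len2_soa`) are made up for illustration, and whether a given compiler actually vectorizes either loop depends on the target and flags. The point is only the access pattern the cost model has to price.

```c
#include <stddef.h>

/* Array-of-structs: the x, y, z of one point sit next to each other,
   so consecutive x values are sizeof(struct point_aos) bytes apart. */
struct point_aos { float x, y, z; };

/* AoS loop: filling a vector register with x values from consecutive
   points needs strided loads (gathers or shuffles), which the cost
   model may judge too expensive to be worth vectorizing. */
void len2_aos(const struct point_aos *p, float *restrict out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = p[i].x * p[i].x + p[i].y * p[i].y + p[i].z * p[i].z;
}

/* SoA loop: each field lives in its own contiguous array, so every
   access is unit-stride and maps onto plain vector loads and stores.
   `restrict` is the user asserting that the arrays do not overlap. */
void len2_soa(const float *restrict x, const float *restrict y,
              const float *restrict z, float *restrict out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = x[i] * x[i] + y[i] * y[i] + z[i] * z[i];
}
```

Both functions compute the same thing. The only difference is the memory layout, and that's the part that usually can't be fixed without rewriting the data structures throughout the program.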