I'm going off a first pass through the paper, but it appears to show that the training error can be 0 on an entirely randomized data set, while the generalization error (the difference between the error on the test set and the error on the training set) increases dramatically as label corruption increases.
My understanding is that cross-validation repeatedly splits the input data into training and test sets in different combinations... so if cross-validation measures the generalization error, wouldn't it catch the low predictive value resulting from randomization of labels or input?
I'm not saying the paper doesn't have value, but I think it's more about the fact that neural nets can obtain a training error of zero on randomized data, not a test error of zero (or a generalization error of zero, where generalization error is the difference between training error and test error, as far as I can tell).
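Here is a rough sketch of what I mean, not the paper's setup. As an assumption for illustration, I'm using a 1-nearest-neighbour classifier as a stand-in for "a model with enough capacity to memorize the training set", on made-up random points with coin-flip labels. The helper names (`nearest_label`, `error_rate`, `k_fold_errors`) are my own. If my reading is right, the training error should be exactly zero while the held-out error from cross-validation sits near chance.

```python
import random


def nearest_label(train, x):
    # 1-nearest-neighbour: return the label of the closest training point
    # (squared Euclidean distance). This memorizes the training set.
    best = min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    return best[1]


def error_rate(train, test):
    # Fraction of points in `test` that are misclassified using `train`.
    wrong = sum(1 for x, y in test if nearest_label(train, x) != y)
    return wrong / len(test)


def k_fold_errors(data, k=5):
    # Split the data into k contiguous folds; each fold is held out once.
    # Returns (mean training error, mean held-out error) over the folds.
    fold = len(data) // k
    train_errs, test_errs = [], []
    for i in range(k):
        test = data[i * fold:(i + 1) * fold]
        train = data[:i * fold] + data[(i + 1) * fold:]
        train_errs.append(error_rate(train, train))
        test_errs.append(error_rate(train, test))
    return sum(train_errs) / k, sum(test_errs) / k


rng = random.Random(0)
# Hypothetical data: 200 random 5-dimensional points with random 0/1 labels,
# so there is nothing real to learn.
data = [([rng.random() for _ in range(5)], rng.randint(0, 1)) for _ in range(200)]

train_err, test_err = k_fold_errors(data, k=5)
gap = test_err - train_err  # the "generalization error" in the sense used above
print("train error:", train_err)
print("held-out error:", test_err)
print("gap:", gap)
```

The training error is zero because every training point is its own nearest neighbour, while the held-out error should land around 0.5 since the labels are coin flips. So cross-validation does flag the problem here; the training error alone does not.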
To be clear, I'm not an expert, and this is just what I gleaned from a first pass over the paper.