I acknowledge that the reported accuracy of a system will be inflated if you report the maximum accuracy across 10 methods that all share the same 'true' accuracy plus some noise. Still, in my opinion (as someone currently in the ML field who has also worked in computational neuroscience), the results presented in Table 5 of the paper are very unlikely to be due solely to randomly trying different ML techniques, with no genuine difference between the target classes in the underlying data.
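To make the first point concrete, here is a small Monte Carlo sketch of the best-of-10 inflation effect. The true accuracy, test-set size, and number of methods are my own illustrative assumptions, not values from the paper:

```python
import random

random.seed(0)

TRUE_ACC = 0.70    # assumed shared 'true' accuracy of every method
N_TEST = 100       # assumed test-set size
N_METHODS = 10     # methods tried; only the best is reported
N_TRIALS = 10_000  # Monte Carlo repetitions

def observed_accuracy():
    """Accuracy measured on a finite test set: binomial noise around TRUE_ACC."""
    correct = sum(random.random() < TRUE_ACC for _ in range(N_TEST))
    return correct / N_TEST

# Reporting a single method's accuracy is unbiased on average...
single = sum(observed_accuracy() for _ in range(N_TRIALS)) / N_TRIALS

# ...but reporting the max over N_METHODS equally good methods is not.
best_of_10 = sum(
    max(observed_accuracy() for _ in range(N_METHODS))
    for _ in range(N_TRIALS)
) / N_TRIALS

print(f"mean reported accuracy, single method: {single:.3f}")
print(f"mean reported accuracy, best of {N_METHODS}:   {best_of_10:.3f}")
```

Under these assumptions the best-of-10 number comes out several points above the true 0.70, which is the kind of gap this concern predicts; it would not, on its own, explain a table-wide jump far beyond that.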
If these results are independently replicated on a different dataset, the magnitude of any overselling of the method will become apparent. I just don't think it makes sense to doubt the core result (i.e., that with a grid of EEG sensors and band-power features it is possible to identify a portion of autism cases) on the basis of this factor alone.