CNNs exploit spatial locality and LSTMs exploit temporal locality. The SOTA models are architected with even stronger assumptions about the nature of the task. Methods like neural networks, random forests and SVMs, when used as unconstrained universal function approximators on unstructured data, only learn some non-linear (polynomial / exponential / logarithmic) combination of the data itself, without much nuance.
It is critical to help a model out by constraining the space of models it searches over to find the right answer. I think that unless we figure out a way to constrain architectures so they exploit specific traits of the task they are trying to solve, ML of the universal-function-approximator type won't succeed in the same way that it has in vision / language.
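To make the "constrain the search space" point concrete, here is a minimal sketch in plain Python. It shows that a 1D convolution is just a dense layer whose weight matrix is forced to be banded with tied weights. The signal, the kernel and all function names are made up for illustration; nothing here comes from a particular library.

```python
def dense_param_count(n_in, n_out):
    """Free weights in an unconstrained dense layer (bias ignored)."""
    return n_in * n_out


def conv1d(signal, kernel):
    """'Valid' 1D cross-correlation: the same kernel is applied at every position."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def conv_as_dense_matrix(kernel, n_in):
    """The dense weight matrix equivalent to conv1d: banded, with shared entries."""
    k = len(kernel)
    n_out = n_in - k + 1
    return [
        [kernel[c - r] if 0 <= c - r < k else 0.0 for c in range(n_in)]
        for r in range(n_out)
    ]


def matvec(matrix, vec):
    """Plain matrix-vector product, i.e. a dense layer without bias or activation."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


# Illustrative input and a finite-difference kernel (both are assumptions).
signal = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]
kernel = [-1.0, 1.0]

conv_out = conv1d(signal, kernel)
dense_out = matvec(conv_as_dense_matrix(kernel, len(signal)), signal)

print(conv_out == dense_out)                      # same function, two parameterizations
print(dense_param_count(len(signal), len(conv_out)), "free weights if unconstrained")
print(len(kernel), "free weights with locality + weight sharing")
```

Both layers can represent this function, but the convolution only has to search over 2 numbers instead of 30. That reduction is the architectural prior doing the work.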
As of now, the alternative is to use PGMs, where the model is fully interpretable as a graph-structured combination of explicitly parameterized random variables. PGMs work well with little data and give really good uncertainty estimates, which help in evaluating the quality of a model. PGMs of course suffer from being excruciatingly slow on large datasets, and they require a decent amount of prior knowledge about the problem to explicitly define the graph structure and the random variables we are going to use.
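As a rough illustration of the "low data plus uncertainty estimates" point, here is a sketch of about the smallest graphical model there is: one latent success rate with a Beta prior and a handful of Bernoulli observations. The counts and the uniform prior are assumptions picked for the example, and real PGMs have far more structure than this.

```python
import math


def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Posterior mean and standard deviation of a Bernoulli rate under a Beta prior.

    Uses the conjugate update Beta(prior_a + successes, prior_b + failures).
    """
    a = prior_a + successes
    b = prior_b + (trials - successes)
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)


# Hypothetical datasets with the same observed rate (70%) but different sizes.
small_mean, small_std = beta_posterior(successes=7, trials=10)
large_mean, large_std = beta_posterior(successes=700, trials=1000)

print(f"10 samples:   mean={small_mean:.3f}, std={small_std:.3f}")
print(f"1000 samples: mean={large_mean:.3f}, std={large_std:.3f}")
```

With 10 samples the model still gives an answer, and the wide posterior tells you how much to trust it. The explicit prior is also where the "decent amount of prior knowledge" has to go in.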
I think ML is most certainly capable of solving this problem, but the community is probably waiting for another breakthrough along the lines of AlexNet / LSTMs before that is the case.