Sort of. They can readily do really basic feature engineering, along the lines of nonlinear transformations of the input. With a bit more work on the architecture, they can do spatial feature engineering (e.g., convolutional nets), and with more foresight and planning they can learn the kinds of complex "hidden Markov process"-style features you typically use in natural language processing.
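To make the "nonlinear transformations of the input" point concrete, here is a minimal sketch of a two-unit hidden layer whose outputs act as engineered features for XOR, which no linear model on the raw inputs can fit. The weights are hand-picked for illustration rather than trained (an assumption made to keep the example deterministic), and the names `hidden_features` and `xor_net` are hypothetical.

```python
def relu(z):
    """Rectified linear unit: the nonlinearity applied by each hidden unit."""
    return max(0.0, z)


def hidden_features(x1, x2):
    """Two hidden units; weights are hand-picked for illustration, not learned."""
    h1 = relu(x1 + x2)        # fires when at least one input is on
    h2 = relu(x1 + x2 - 1.0)  # fires only when both inputs are on
    return h1, h2


def xor_net(x1, x2):
    """Linear readout on the hidden features; equals XOR on binary inputs."""
    h1, h2 = hidden_features(x1, x2)
    return h1 - 2.0 * h2


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, hidden_features(a, b), xor_net(a, b))
```

A trained network would have to find weights like these on its own; the sketch only shows that the hidden layer is where the "feature" lives.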
But, as far as I'm aware, they don't necessarily do a great job with things like irregular time series (which make up a huge chunk of big data), so you're still stuck doing some of that basic feature engineering by hand. And I hesitate to call some of the fancier architectures, like LSTMs, a turnkey solution for feature engineering, considering how much thought, effort, and pre-existing knowledge and theory about what the engineered features should look like had to go into designing them in the first place. So I feel like the "they can learn their own features" thing is a bit overhyped.
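For a sense of what that hand-done feature engineering on irregular time series can look like, here is a rough sketch of two common steps: computing the gaps between observations and linearly interpolating onto a regular grid. It is one possible approach under stated assumptions, not a recommendation: timestamps are assumed sorted and strictly increasing, and the helper names `time_deltas` and `resample_linear` are made up for this example.

```python
from bisect import bisect_right


def time_deltas(timestamps):
    """Gaps between consecutive observations; often useful as a feature itself."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def resample_linear(timestamps, values, step):
    """Linearly interpolate irregular samples onto a regular grid.

    Assumes timestamps are sorted and strictly increasing (illustrative only).
    """
    grid, resampled = [], []
    start, end = timestamps[0], timestamps[-1]
    k = 0
    while start + k * step <= end:
        t = start + k * step
        # Index of the last observation at or before t.
        i = bisect_right(timestamps, t) - 1
        if i >= len(timestamps) - 1:
            v = values[-1]
        else:
            t0, t1 = timestamps[i], timestamps[i + 1]
            w = (t - t0) / (t1 - t0)
            v = values[i] + w * (values[i + 1] - values[i])
        grid.append(t)
        resampled.append(v)
        k += 1
    return grid, resampled


if __name__ == "__main__":
    ts = [0.0, 1.0, 5.0, 9.0]    # irregularly spaced observation times
    vs = [0.0, 2.0, 10.0, 18.0]  # observed values
    print(time_deltas(ts))
    print(resample_linear(ts, vs, step=3.0))
```

The choices baked in here (linear interpolation, the grid step, whether to keep the gaps as inputs) are exactly the kind of decisions the network doesn't make for you.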