To me, taking a "principled approach" means you understand and can justify the eventual outcome of the approach, or at least guarantee that the outcome satisfies some constraints. How would you justify the number of channels in each layer of a convolutional network? The number of self-attention heads in a transformer? The depth? Can you certify the network's prediction performance?
Yes, the "just add more layers" approach typically works (in a very narrow sense of the word "works"), but we don't really understand why. We likewise don't understand the failure modes of the system, so we cannot engineer around them. Thus, in my view, it's not really principled.