The way our brains have evolved to build up a functional language model is by observing lots and lots (and lots) of examples of the language being used for communication. That implies that, even from the very beginning, graded readers, level-appropriate dialogues, etc. should be the foundation of a language study program and not, as most language instruction courses and apps treat them, just a little bit of icing on top.
The primacy of input is also kind of a big deal, and, at least for me, it took a long time before I was willing to let go of forced production exercises such as "translate this sentence into your target language", perhaps in part because they're so endemic to language learning resources: you can't abandon them without also abandoning Duolingo and most formal classroom programs. But the problem with these kinds of practice exercises is that we now know it's normal for there to be a long delay between when someone can comprehend a grammatical structure and when they can use it in a natural setting. (Anyone with kids over a certain age should be familiar with the phenomenon.) Forced production relies on, and reinforces, that aforementioned misplaced encoding, and there's a mountain of research demonstrating that skill in performing those sorts of exercises simply doesn't correlate with the development of communicative fluency.
Tangentially, the transformer architecture that's taken natural language processing by storm has some interesting similarities to the leading model among SLA researchers for how language is represented in the human brain. That representational model isn't the monitor model itself, but it might hint at a mechanism for a few parts of the monitor model, such as acquisition order.