Agreed, understanding how a method works and how it is implemented helps with developing an intuition for its limitations -- what it can and can't do.
I wouldn't pretrain from scratch, but continued pretraining is pretty popular for adapting LLMs to recent and/or custom data. (Sometimes this is referred to as 'finetuning', which is not to be confused with 'instruction finetuning'.)
contains hundreds of video lessons, seemingly originating from Sebastian Raschka's teaching at the University of Wisconsin-Madison (before he went full-time entrepreneur).