A "small Large Language Model", you say? So a "Language Model"? ;-)
> Such an LLM could have handled grammar and code autocompletion, basic linting, or documentation queries and summarization.
No, not even close. You're off by roughly 3 orders of magnitude in model size if you want even the most basic text understanding, 4 OOM if you want anything slightly more complex (like code autocompletion), and 5–6 OOM for good speech recognition and generation. Hardware was very much a limiting factor.
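A quick back-of-envelope sketch (my own illustrative numbers, not from the thread above): just holding fp32 weights in memory, before any activations or optimizer state, shows why each order of magnitude of parameters mattered on period hardware.

```python
# Weight-memory estimate at various parameter counts.
# fp32 = 4 bytes/parameter; weights only, no activations/optimizer state.
BYTES_PER_PARAM = 4

for params in (10**6, 10**7, 10**8, 10**9):
    gib = params * BYTES_PER_PARAM / 2**30  # bytes -> GiB
    print(f"{params:>13,} params -> {gib:8.3f} GiB of fp32 weights")
```

A 10^9-parameter model needs ~3.7 GiB for weights alone, which already exceeds the total RAM of most machines from the era being discussed; each extra OOM multiplies that by ten.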