At the moment ChatGPT and other instruction-tuned models show what's possible with modern LLMs. Although most SOTA models still require cloud compute for inference, it seems reasonable to assume they could run on beefy standard desktops within the next 5 to 10 years (IMHO).
Now, historically we had:
- thin-client mainframe architectures (1970s - 1980s)
- fat-client "home computers" (1980s - 2010s)
- thin-client SaaS software platforms (2010s - mid-2020s)
- fat-client LLM inference engines (mid-2020s - ?)
In particular, I think companies selling LLMs as SaaS will face a lot of ethical questions and legal work, and out of fear of "recommending stuff against the status quo", their models might end up inferior to "open" (unconstrained) ones, which might only be feasible for private individuals (at first). Just my 2 cents, what do you think?