Anyone willing to share an estimate of how costs will come down each year as hardware keeps improving and new methodologies are found?
Personally, I would not be surprised if it is possible to train on the same dataset for half the cost 12 months from now.
That requires a breakthrough in software: new, more efficient methods for training and fine-tuning these AI models. Right now there is no way around training the whole thing and burning millions in the process.
Until then, unless you are a big tech company that can eat the cost, it doesn't seem wise to burn your entire VC funding on expensive fine-tuning and inference as your AI model scales to millions of users.
We're already starting to see that with a few projects, and I think once the scale tips and it becomes practical to train something of GPT-4 quality for under $10k, the main focus of current research will shift to generating new models trained on commodity hardware.
My true hope is that the entire problem domain eventually falls within the range of commodity hardware, and FANG finds that, despite its superior compute resources, it can't really add any value (other than perhaps convenience), resulting in massive democratization of this technology.
That will of course open things up and make LLMs more accessible to bad actors, but that is ultimately much better than the likes of FANG / OpenAI / etc. being the sole gatekeepers of this tech. Just as Google has very little real motivation to fight click fraud (there have been rumors for years that it accounts for a double-digit percentage of their revenue), these mega-corporations will have very little real motivation to stop "bad actors" from paying to use their APIs. Since bad actors are going to use the tech either way, the democratized situation is ultimately the less Orwellian one.
Going from that to the dozens or hundreds of milliseconds of latency over the internet, or the hours of turnaround if you do classical SETI@Home, is a big step. There are people working on it, though.
Also, gradient updates from all nodes would need to be combined at least every few training steps, and syncing them across the network would take a while.
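A minimal sketch of that trade-off, assuming a toy linear-regression model, simulated nodes, and numpy only (the node count, shard sizes, and local_steps are made-up illustration parameters, not from any real distributed-training stack): each node runs several cheap local SGD steps, and only the averaging step would have to cross the slow internet link, so the less often you sync, the less WAN latency hurts, at the cost of the nodes' copies drifting apart.

    # Toy "local SGD with periodic averaging" sketch; all names/values illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    true_w = np.array([2.0, -3.0])
    n_nodes, local_steps, rounds, lr = 4, 8, 50, 0.05

    # Each simulated node gets its own private data shard.
    shards = []
    for _ in range(n_nodes):
        X = rng.normal(size=(256, 2))
        y = X @ true_w + rng.normal(scale=0.1, size=256)
        shards.append((X, y))

    w = np.zeros(2)  # shared starting point, "broadcast" to all nodes

    for _ in range(rounds):
        local_ws = []
        for X, y in shards:
            w_local = w.copy()
            for _ in range(local_steps):  # cheap: runs entirely on the node
                grad = 2 * X.T @ (X @ w_local - y) / len(y)
                w_local -= lr * grad
            local_ws.append(w_local)
        # The expensive part: this averaging is what would traverse the WAN,
        # so it happens once per round rather than once per gradient step.
        w = np.mean(local_ws, axis=0)

    print("recovered weights:", w)  # should land close to true_w

Sync every step and this degenerates into ordinary data-parallel SGD with the network as the bottleneck; let the nodes run locally for too long and their copies diverge. That tension is roughly what the comment above is pointing at.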
I mean, don't get me wrong, I'm all for improvements in AI efficiency, but maybe there isn't that much low-hanging fruit to pick? Tons of papers get published on transformer optimization techniques and barely any of them seem to stick.
Next thought: how to do "SETI@Home for model training", distributing compute to idle resources around the world.
This. Most startups claiming to be AI companies (90% of them) won't bother to train or fine-tune AI models, given the massive costs involved, and will just take an off-the-shelf model from HuggingFace anyway.
But what the AI bros won't tell you is that there is an incredible amount of risk when it all goes wrong after training, as you pointed out. That's $20M down the drain if the results are sub-optimal, and it's even worse when the 'researchers' can't explain why the 'AI' is underperforming beyond saying it's 'hallucinating' or just flat-out buggy.
This training route is only available to those who can afford to foot the bill, and it is still a giant waste of electricity and effort in the end, thanks to decade-long inefficiencies and the lack of better alternatives for these operations (training, fine-tuning, inference, etc.) in deep learning.
I'm planning an all-in strategy with AI, but I expect the next 2 years to be lean. Hopefully by then the price of fine-tuning will have come down enough for medium-sized businesses outside the early-adopter niche to give it a try. We'll have had a couple of rounds of failures and successes, so most people will have a decent roadmap for building successful products (and avoiding complete failures). We should also have a significant ecosystem of options in both OSS and commercial variations.
I feel like this is equivalent to the Internet in 1998. We're looking at the Yahoos, the AOLs, and the Pets.com crop of businesses. But things won't really heat up for a while. Still plenty of time to grow into this space.
Is it really their own infrastructure, or are they using a cloud provider and just wrapping it up to provide convenience for a price?
It does seem like they should run their own storage nodes, with the sheer quantity of models they host...
Typically, small companies get rebates on NVIDIA GPUs, but big established ones do not. So I would expect a startup with 100 GPUs to pay less per GPU than Azure.