This is a system that lets you talk to NPCs in video games. It's a collection of off-the-shelf components held together by some Python code. The components do this:
- Listen to the user talking and convert speech to text.
- Watch the user's facial expressions via webcam.
- Watch the game, and use face recognition on the game images to determine what character is being addressed.
- Run the user's text through an LLM preloaded with about 30 lines of info about the NPC to generate a reply.
- Generate voice output in a voice synthesized to match the character's persona.
- Modify the image of the character on screen to animate their facial expressions to match the voice output. This is done on the output image, not by animating the 3D character.
Five years ago, that was science fiction. A year ago, half that stuff wouldn't work right. Now it's someone's hobby project.
[1] https://github.com/AkshitIreddy/Interactive-LLM-Powered-NPCs
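For anyone who wants to see how little glue this takes, here is a minimal sketch of one conversational turn through the steps listed above. It is not the linked project's code: every name here (`NPC`, `Components`, `build_prompt`, `run_turn`, `voice_id`) is hypothetical, each component is just a plain callable standing in for a real off-the-shelf model, and feeding the player's expression into the prompt is my assumption about how that signal might be used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class NPC:
    name: str
    persona_lines: List[str]  # the ~30 lines of info about the character
    voice_id: str             # hypothetical handle for a persona-matched voice


@dataclass
class Components:
    # Each field stands in for one off-the-shelf component (all hypothetical).
    speech_to_text: Callable[[bytes], str]         # mic audio -> user text
    read_expression: Callable[[bytes], str]        # webcam frame -> expression label
    identify_npc: Callable[[bytes], str]           # game frame -> NPC key
    generate_reply: Callable[[str], str]           # prompt -> LLM reply
    text_to_speech: Callable[[str, str], bytes]    # (text, voice_id) -> audio
    animate_face: Callable[[bytes, bytes], bytes]  # (game frame, audio) -> edited frame


def build_prompt(npc: NPC, user_text: str, expression: str) -> str:
    """Preload the persona lines, then append what the player said."""
    persona = "\n".join(npc.persona_lines)
    return (
        f"You are {npc.name}.\n{persona}\n"
        f"The player looks {expression} and says: {user_text}\n"
        f"{npc.name}:"
    )


def run_turn(
    c: Components,
    npcs: Dict[str, NPC],
    mic_audio: bytes,
    webcam_frame: bytes,
    game_frame: bytes,
) -> Tuple[str, bytes, bytes]:
    """Run one turn: listen, watch, identify, reply, speak, animate."""
    user_text = c.speech_to_text(mic_audio)
    expression = c.read_expression(webcam_frame)
    npc = npcs[c.identify_npc(game_frame)]
    reply = c.generate_reply(build_prompt(npc, user_text, expression))
    voice = c.text_to_speech(reply, npc.voice_id)
    # The edit happens on the output image, not on the 3D character.
    frame = c.animate_face(game_frame, voice)
    return reply, voice, frame


# Stub components so the sketch runs without any models installed.
stubs = Components(
    speech_to_text=lambda audio: audio.decode(),
    read_expression=lambda frame: "curious",
    identify_npc=lambda frame: "blacksmith",
    generate_reply=lambda prompt: "Aye, I can mend that blade.",
    text_to_speech=lambda text, voice_id: f"{voice_id}:{text}".encode(),
    animate_face=lambda frame, voice: frame + b"+lipsync",
)
npcs = {"blacksmith": NPC("Brakka", ["Brakka is a gruff blacksmith."], "gruff_low")}
reply, voice, frame = run_turn(stubs, npcs, b"Can you fix my sword?", b"webcam", b"frame")
```

Swapping a stub for a real speech, vision, or LLM component only means replacing one callable, which is roughly why this kind of thing can be a hobby project.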
So the majority of the value gets captured not by companies focused on writing new components that nobody else can match, but by the incumbents with the wealth of proprietary data to feed into the components, or the infrastructure to run the more infrastructure-dependent models at scale, or the customer base to milk with AI-enabled versions of their existing products or with AI consulting services.
Don't get me wrong, it is cool for indie game developers to be able to procedurally generate NPC conversations. But indie game developers are not likely to capture more value than Microsoft from being able to generate stuff very easily.
It just never got widespread adoption because it's just not interesting, and LLMs are no different here. The dialogue is still empty, despite being deeper and more grammatically complex than previous attempts.
If every farmer in an RPG hands out the same "collect 20 bear asses" quest it doesn't matter if they all have "detailed" randomly generated backstories and can opine about the game world, real world philosophy, or the 2024 US elections.
Have you ever gone to a living history museum? (Old Sturbridge Village is one example, and my favorite of the ones I've been to.) All these people in character, able to talk about the period: it makes for an amazing experience.
In traditional video games, if we try to or even accidentally push any deeper, we see the cracks in the universe. "Oh, I spoke to this person again, and they said the same thing to me." AI can help fix those cracks, and fill them in wherever the player ventures.
This certainly doesn't change Fortnite, but I think it could change immersive RPGs and MMOs.
True. There have been NPC systems where the NPCs had motivations and a life of their own, even when no one was around. Those haven't helped gameplay much.
The current problem is that LLMs don't know enough about the game world. Recent progress on that.[1]
I work on applying AI to a specific enterprise domain. It's not like you can crawl our data on the open web. Getting any data from clients takes years of lawyer struggles and chicken-and-egg problems to solve. As for fine-tuning models to client expectations, they are not going to go through that process again with someone else.
And the moat in B2C AI is owning tons of your personal data and habits, as Google and Facebook do. It's just not truly utilized with GPT models yet.
Instead, the best moat is to know that your product isn't a thin replicable wrapper for ChatGPT but instead has a large surface area, with lots of well-built features. Continue building those features at a fast pace, and you can win.
The new AI platform may over time enable more products than even the shift to mobile.
ADDED:
What's implied is that the moat could be built and that some kinds of moats are not yet well-known or prevalent. Proprietary data is often mentioned. But the application on top of LLMs (or LFMs) also need not be just a thin layer with little technical barrier.
Now that the era of free money is over, and you have to pay nonzero interest, profits matter again. Is anybody in the AI space actually profitable? Is OpenAI losing money on every token to build volume?
This. Why does Silicon Valley always miss the effing obvious?
The best way to assess a startup's value is to play a bank evaluating them for a traditional no-frills loan. How risky a debt the bank considers it is a fair measure of the value of the company (and in most non-public cases, it will be negative EV, future revenues be damned). Not the BS analyses made by IBD teams at banks, and not the "valuations" ascribed to the startup by its cash-rich, opportunity-deprived VCs.
In Shoe Dog by Nike co-founder Phil Knight, he describes how the banks kept refusing to lend them money because they refused to value the company based on future cash flows. Eventually, Nike switched to another bank. The first bank could have made a lot of money there.
In general, even Buffett, after years of very conservative valuations (Cigar Butt Investing), switched to "buying great companies at fair prices". Why would you buy a company that barely keeps up with inflation if you could buy one that literally grows exponentially? If you hop from Cigar Butt to Cigar Butt, you can also grow your money exponentially, but it's harder because you pay more taxes and brokerage fees, and have to work more on finding the right entries and exits. Conversely, if you had been as clever as the Nomad Investment Partnership and just bought and held Costco, Berkshire, and Amazon from 2005 to now, you would have gotten great returns on investment without having to do a thing.
There are an estimated 1 billion knowledge workers worldwide. The alleged operating costs of OpenAI are around $700,000/day. That's about $0.26 billion / year. Add to that salaries and retraining. Salaries: 375 employees at an average $350,000 / year comes to $0.13 billion / year. And retraining cost seems to be on the order of tens of millions per training run.
With the right subscription fee it does seem possible to balance the books and be profitable. Especially when they start selling bulk contracts to governments and schools and big corporations.
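To make that back-of-the-envelope estimate concrete, here is a rough sketch of the break-even arithmetic. The daily cost, headcount, salary, and knowledge-worker figures are the alleged numbers quoted above. The $50 million annual retraining cost (one run per year at "tens of millions") and the $20/month subscription fee are my own assumptions, used only for illustration.

```python
import math

# Figures quoted above (alleged, not verified).
DAILY_OPERATING_COST = 700_000      # USD per day
EMPLOYEES = 375
AVG_SALARY = 350_000                # USD per year
KNOWLEDGE_WORKERS = 1_000_000_000

# Assumptions for illustration only.
RETRAINING_PER_YEAR = 50_000_000    # assumed: one run a year at "tens of millions"
MONTHLY_FEE = 20                    # assumed subscription price in USD

operating = DAILY_OPERATING_COST * 365           # 255,500,000
salaries = EMPLOYEES * AVG_SALARY                # 131,250,000
total_cost = operating + salaries + RETRAINING_PER_YEAR

# Paying subscribers needed to cover the yearly cost.
break_even_users = math.ceil(total_cost / (MONTHLY_FEE * 12))
share_of_workers = break_even_users / KNOWLEDGE_WORKERS

print(f"Annual cost: ${total_cost:,}")
print(f"Break-even subscribers: {break_even_users:,}")
print(f"Share of knowledge workers: {share_of_workers:.2%}")
```

Under those assumptions, break-even is about 1.8 million paying subscribers, or roughly 0.18% of the estimated knowledge workers. The result is only as good as the inputs, and it ignores anything not listed, such as the marginal compute cost of serving each additional user.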
And they're all our customers! I've seen pitch decks like that.
How many paid users do they have?
Likely Nvidia, AWS, Azure, and Google Cloud will capture a lot of the value. OpenAI might, but they are playing a game of tennis where they are at "Advantage" but could still lose.
1. A lot of value was supposed to come from selling to enterprises. The narrative was that they would move slowly and hence nimble startups could sell to them and generate quick revenue. Those assumptions are really being tested. First, the virality and popularity meant any engineering leader working on AI-related projects got social capital and prestige (and a promotion) inside the company, making it preferable for companies to build rather than buy. An API form factor helped immensely in getting to a POC within a day. Second, for those buying, many startups (in LLMOps) ended up selling the same thing, so buyers slowed down to evaluate. Third, the data privacy issues meant no enterprise was willing to go for cloud solutions.
2. A lot of startups never picked up the tougher problems. E.g., training an open source model, or finetuning as a service: the core work of changing the underlying behavior of a model was picked up in open source, but most startups never picked that part up. That is partly to do with what got hyped. An LLM wrapper would show off a cool demo and get shared widely, thus encouraging others to build something similar rather than go deep. A very clear indication of this was how OpenAI and then Anthropic stopped offering finetuning services on newer models, electing to just enable zero/few-shot learning and bigger context windows. Easy for them, but tough for consumers who really wanted a customized solution.
There are still very cool moonshots out there, and they will probably unlock the value not captured by incumbents. At this point, my working assumption is that for an AI startup to capture value, they would have to go deeper into the stack, and offer a service their competitors would take effort to replicate (and, by extension, that enterprises would take time to build themselves). E.g., the ability to train an open source model locally for search and summarization based on proprietary data. I know BCG[1] did it pretty well and got spectacular results.
[1] https://bcg.com/press/10may2023-intel-bcg-announce-collabora...
Google and Facebook are today's knowledge dealers. They do not profit from providing an LLM that sidesteps all their products and gives you the answer you are searching for directly. They want to influence your eyeballs. They will try to do this by injecting their own thought manipulation crap in their LLMs. I instinctively would not trust them. I would want an LLM that is pure in some sense. Unfortunately, even OpenAI is already debased, but for another reason.
But here you can see the value that a startup can provide over the current incumbents. A startup can provide an unadulterated knowledge base of the internet and be profitable. Whether that is OpenAI or one of its competitors I do not know, but Google and Facebook cannot do that. There is no gain for them.
At the end of the day, the incumbents are mostly providing the APIs or the hardware to do anything significant. There may be a handful of outliers, but it seems a vast majority of AI startups these days are a new iteration of resellers.
Before, it was hosting that was resold. Now it is APIs or, if their customers have any gumption, TPUs and GPUs, which is arguably still hosting.
I don’t see startups ruling AI.
Also, a lot of people misattribute the label of startup to established companies. I could go on a rant about tech journalists being the cause, but I’ll just say OpenAI is not a startup.
ChatGPT changed that because they developed an algorithm/model good enough to offer a consumer-grade conversational interface, and they scraped the web to train it.
That is, they offered a whole product and nailed distribution so they could own the relationship with the user.