Like, are people actually using LLMs for this? Please do not, it won't work.
Does the model say it can't do that when asked? No, it answers confidently.
Also, it's easy to trust it if you don't know how it works.
I came across a LinkedIn post a couple days ago where someone had asked ChatGPT, "What are the top things you get asked about $NICHE_INDUSTRY_THING_I_AM_SELLING?"
As if there is introspection like that at the meta level, where ChatGPT could actually provide hard numbers around its own usage and request patterns.
The fact that these products work with natural language beguiles people into thinking they are, indeed, magic oracles.
Anthropic's trillion-dollar valuation hinges on the idea that it is just that: a magic oracle that can replace any worker for any type of task. Any programmer, any author, any musician, any kind of clerical work. All we've asked here is "sudo evaluate me a sandwich", the sort of estimation task that humans with internet access might reasonably be expected to do, and it's given up?
(It would be fun to compare this to sending the picture out on Mechanical Turk and asking humans to eyeball the calorie count of said sandwich...)
Cal AI, which claims to generate a nutritional breakdown based on a photo, has $30 million in annual recurring revenue.
https://techcrunch.com/2025/03/16/photo-calorie-app-cal-ai-d...
As far as consumers know, LLMs can identify the towns where pictures were taken (without metadata), can summarize entire movies, can generate clips of your kid flying a rocket to the moon, can translate text in images from any language imaginable, but somehow they cannot estimate the calories in a cheese sandwich.
The supposed professional posting about an LLM deleting their prod database for their non-existent company asked the AI to explain itself. That's the level of LLM knowledge you should expect from most people who actually work with these tools.
Truth is, the LLM is good at making intelligent decisions. But in order to make an intelligent decision, you need context.
If you give proper context -> ask the LLM -> get an almost perfect result every time.
Anything else is rolling dice: a very special type of dice, but dice anyhow. Not magic.
And a person with sufficient knowledge could easily give a rough estimate of the calories. A slice of store-bought sandwich bread of a given thickness generally has calories within a certain range. So do cheese slices. It's elementary school health class material. We all learn how to calculate the calories in a meal. Food packaging also always lists calories, so clearly people know how to estimate them fairly accurately.
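The estimate described above is literal grade-school arithmetic. A minimal sketch, where every per-item figure is an illustrative assumption pulled from typical nutrition labels, not data from this thread:

```python
# Back-of-envelope calorie estimate for a plain cheese sandwich.
# All per-item kcal values are assumed "typical label" figures.

ITEMS = {
    "bread slice":  (2, 80),  # (count, kcal each) ~ typical sandwich bread
    "cheese slice": (1, 90),  # ~ one slice of processed cheese
    "butter pat":   (1, 35),  # ~ 5 g of butter
}

total_kcal = sum(count * kcal for count, kcal in ITEMS.values())
print(f"rough estimate: {total_kcal} kcal")  # 2*80 + 90 + 35 = 285
```

Swap in the label values for whatever bread and cheese you actually have and the answer moves, but the method does not.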
If a fifth grader can calculate it but an AI can't, that says a lot about how bad these AIs are. We'll get another series of paid and bought articles saying "AI analyzed IMPOSSIBLE math problem beyond human comprehension and solved it with FACTS and LOGIC", while at the same time being told "bro no you can't expect an ai to calculate calories in a sandwich bro that's impossible bro if you even try that then you're insane for even thinking ai should be used that way bro". These companies need to decide: is AI smart enough to solve hard questions, or is it too useless to calculate something any kid could do by googling calories in a slice of bread and doing some basic arithmetic?
That's not done by looking at it and guessing (or at least it _shouldn't_ be; manufacturers have been known to do this but it's bad practice and may cause them regulatory problems). In an ideal world it's done with one of these: https://en.wikipedia.org/wiki/Calorimeter ; less ideally it can be estimated based on the ingredients.
But does training LLMs to be better at this improve their world model, or does it only make changes at the surface?
The problem itself is unsolvable given the data provided.
You could conceivably make it better at guessing, but they will inherently always be guesses that will sometimes be wildly off.
https://www-users.york.ac.uk/~ss44/joke/3.htm "There is at least one field, containing at least one sheep, of which at least one side is black."
Extreme example perhaps, but no, you can't just turn pixels into calories. Right now I'd be impressed if we could reliably estimate volume to within 30% from a photo, but even with that correct the contents of the food can easily be way off without visible sign.
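To see why "within 30% on volume" still doesn't rescue the calorie number: the volume error and the hidden-composition error multiply. A quick sketch with assumed numbers (the energy density and error figures are illustrative, not measurements):

```python
# Sketch: how uncertainties compound when "turning pixels into calories".
# All figures below are illustrative assumptions.

volume_ml = 300.0    # estimated volume of the sandwich from the photo
vol_err = 0.30       # +/-30% volume error (the optimistic case above)
kcal_per_ml = 1.2    # assumed average energy density of the sandwich
density_err = 0.25   # hidden butter/mayo/cheese type shifts this invisibly

best = volume_ml * kcal_per_ml
low  = volume_ml * (1 - vol_err) * kcal_per_ml * (1 - density_err)
high = volume_ml * (1 + vol_err) * kcal_per_ml * (1 + density_err)
print(f"estimate: {best:.0f} kcal, plausible range {low:.0f}-{high:.0f}")
# estimate: 360 kcal, plausible range 189-585
```

Even granting the 30% volume figure, the plausible range spans roughly 3x, which is the difference between a snack and a meal.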
I'm sure one could produce a CV model that was a lot better at guessing here than these LLMs are, but fundamentally it is still guessing.
They are surprised and upset when the Oracle is not perfect.
Go ahead and search around on Hacker News; you'll see precisely the same pattern with people who are ostensibly engineers and hackers.
It's actually pretty mind-boggling, but then again, humans never fail to surprise and disappoint.
Some people have a very poor understanding of what LLMs are good for. Some people do see them as magic oracles.
Well, firstly, the average IQ is 100. And also because people market products to consumers that claim to be able to count carbs from images. If you don't know the limitations of LLMs, there would be little reason to doubt it, for an uninformed or below-average-intelligence person, of which there are hundreds of millions.