OpenAI, Google and Anthropic are struggling to build more advanced AI

alangibson1y ago

I think you're playing a different game than the Sam Altmans of the world. The level of investment and profit they are looking for can only be justified by creating AGI.

The > 100 P/E ratios we are already seeing can't be justified by something as quotidian as the exceptionally good productivity tools you're talking about.

simonw1y ago

Right. I've been saying for a while that if all LLM development stopped entirely and we were stuck with the models we have right now (GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro, Llama 3.1/2, Qwen 2.5 etc) we could still get multiple years worth of advances just out of those existing models. There is SO MUCH we haven't figured out about how to use them yet.

alach111y ago

My team and I also develop with these models every day, and I completely agree. If models stall at current levels, it will take 10 (or more) years for us to capture most of the value they offer. There's so much work out there to automate and so many workflows to enhance with these "not quite AGI-level" models. And if peak model performance remains the same but cost continues to drop, that opens up vastly more applications as well.

zmmmmm1y ago

> combining a human-moderated knowledge graph with an LLM with RAG allows you to build "expert bots" that understand your business context / your codebase / your specific processes and act almost human-like similar to a coworker in your team

It's been a while though, we've had great models now for 18 months plus. Why are we still yet to see these types of applications rolling out on a wide scale?

My anecdotal experience is that almost universally, the 90-95% accuracy you get from them is just not good enough. Which is to say, having something be wrong 10% or even 5% of the time is worse than not having it at all. At best, you need to implement applications like that in an entirely new paradigm that is designed to extract value without bearing the costs of the risks.

It doesn't mean LLMs can't be useful, but they are kind of stuck with applications that inherently mesh with human oversight (like programming etc). And the thing about those is that they don't really scale, because the human oversight has to scale up with whatever the LLM is doing.

bloppe1y ago

> you can have LLMs create reasonable code changes, with automatic review / iteration etc.

Nobody who takes code health and sustainability seriously wants to hear this. You absolutely do not want to be in a position where something breaks, but your last 50 commits were all written and reviewed by an LLM. Now you have to go back and review them all with human eyes just to get a handle on how things broke, while customers suffer. At this scale, it's an effort multiplier, not an effort reducer.

It's still good for generating little bits of boilerplate, though.

brookst1y ago

> Question for the group here: do we honestly feel like we've exhausted the options for delivering value on top of the current generation of LLMs?

Certainly not.

But technology is all about stacks. Each layer strives to improve, right up through UX and business value. The uses for 1µm chips had not been exhausted in 1989 when the 486 shipped in 800nm. 250nm still had tons of unexplored uses when the Pentium 4 shipped on 90nm.

Talking about scaling at the model level is like talking about transistor density for silicon: it's interesting, and relevant, and we should care... but it is not the sole determinant of what use cases can be built and what user value there is.

ericmcer1y ago

I have tried a few AI coding tools and always found them impressive but I don't really need something to autocomplete obvious code cases.

Is there an AI tool that can ingest a codebase and locate code based on abstract questions? Like: "I need to invalidate customers who haven't logged in for a month" and it can locate things like relevant DB tables, controllers, services, etc.

whiplash4511y ago

The main difference between GPT5 and a PhD-level new hire is that the new hire will autonomously go out, deliver and take on harder tasks with much less guidance than GPT5 will ever require. So much of human intelligence is about interacting with peers.

amelius1y ago

Yes, but literally anybody can do all those things. So while there will be many opportunities for new features (new ways of combining data), there will be few business opportunities.

RayVR1y ago

I am definitely not an expert, nor do I have inside information on the directions of research that these companies are exploring.

Yes, existing LLMs are useful. Yes, there are many more things we can do with this tech.

However, existing SOTA models are large, expensive to run, still hallucinate, fail simple logic tests, fail to do things a poorly trained human can do on autopilot, etc.

The performance of LLMs is extremely variable, and it is hard to anticipate failure.

Many potential applications of this technology will not tolerate this level of uncertainty. Worse solutions with predictable and well understood shortcomings will dominate.

rco87861y ago

I think there’s a long way to go also. I think people expected that AI would eventually be like a “point and shoot” where you would tell it to go do some complicated task, or sillier yet, take over someone’s entire job.

More realistically it’s like a really great sidekick for doing very specific mundane but otherwise non deterministic tasks.

I think we’ll start to see AI permeate into nearly every back office job out there, but as a series of tools that help the human work faster. Not as one big brain that replaces the human.

machiaweliczny1y ago

Long context is a scam. Claude is best but it still gets lost with longer context.

ben_w1y ago

> Question for the group here: do we honestly feel like we've exhausted the options for delivering value on top of the current generation of LLMs?

IMO we've not even exhausted the options for spreadsheets, let alone LLMs.

And the reason I'm thinking of spreadsheets is that they, like LLMs, are very hard to win big on even despite the value they bring. Not "no moat" (that gets parroted stochastically in threads like these), but the moat is elsewhere.

EGreg1y ago

I want to stuff a transcript of a 3 hour podcast into some LLM API and have it summarize it by: segmenting by topic changes, keeping the timestamps, and then summarizing each segment.

I wasn’t able to get it to do it with the Anthropic or OpenAI chat completion APIs. Can someone explain why? I don’t think the 200K token window actually works. Is it looking sequentially, or is it really looking at the whole thing at once, or something?
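
One possible workaround, offered as a rough sketch rather than a known-good recipe: split the transcript into timestamped chunks yourself and summarize each chunk in its own call, so no single request has to track the whole three hours. The `[HH:MM:SS] text` line format, the fixed 10-minute window (a crude stand-in for real topic detection), and the `summarize` hook are all assumptions for illustration, not part of any vendor API.

```python
import re
from typing import Callable, List, Tuple

# Assumed line format: "[HH:MM:SS] spoken text". Adjust the pattern to your transcript.
LINE_RE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*(.*)$")


def parse_transcript(text: str) -> List[Tuple[int, str]]:
    """Return (seconds, text) pairs; lines without a timestamp are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            h, mins, s, body = m.groups()
            rows.append((int(h) * 3600 + int(mins) * 60 + int(s), body))
    return rows


def chunk_by_duration(rows: List[Tuple[int, str]], window_s: int = 600):
    """Group rows into windows that each start at least window_s after the previous one."""
    chunks: List[Tuple[int, List[str]]] = []
    for t, body in rows:
        if not chunks or t - chunks[-1][0] >= window_s:
            chunks.append((t, [body]))   # start a new segment at this timestamp
        else:
            chunks[-1][1].append(body)   # still inside the current segment
    return chunks


def fmt(seconds: int) -> str:
    """Format a second count back into HH:MM:SS."""
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


def summarize_transcript(text: str, summarize: Callable[[str], str], window_s: int = 600):
    """Return (start_timestamp, summary) per segment.

    `summarize` is a hypothetical hook wrapping whatever LLM call you use.
    """
    return [
        (fmt(start), summarize(" ".join(lines)))
        for start, lines in chunk_by_duration(parse_transcript(text), window_s)
    ]
```

Because the timestamps are carried by the code instead of by the model, they cannot be dropped or invented along the way. A second pass could then merge adjacent segments that turn out to cover the same topic.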

_Algernon_1y ago

I have yet to see LLMs provide a positive net value in the first place. They have a long way to go to make up for their negative uses in the form of polluting the commons that is the web, propaganda use, etc.

reissbaker1y ago

Beyond just RAG, I'm fairly bullish on finetuning. For example, Qwen2.5-Coder-32B-Instruct is much better than Qwen2.5-72B-Instruct at coding... Despite simply being a smaller version of the same model, finetuned on code. It's on par with Sonnet 3.5 and 4o on most benchmarks, whereas the simple chat-tuned 72B model is much weaker.

And while Qwen2.5-Coder-32B-Instruct is a pretty advanced finetune — it was trained on an extra 5 trillion tokens — even smaller finetunes have done really well. For example, Dracarys-72B, which was a simpler finetune of Qwen2.5-72B using a modified version of DPO on a handmade set of answers to GSM8K, ARC, and HellaSwag, significantly outperforms the base Qwen2.5-72B model on the aider coding benchmarks.

There's a lot of intelligence we're leaving on the floor, because everyone is just prompting generic chat-tuned models! If you tune it to do something else, it'll be really good at the something else.

hartator1y ago

All of these hacks do sound like we are at that diminishing return point.

yk1y ago

To a certain extent I think we get a better understanding of what llms can do, and my estimation for the next ten years is more like best UI ever rather than llms will replace humanity. Now best UI ever is something that can certainly deliver a lot of value, 80% of all buttons in a car should be replaced by actually good voice control, and I think that is where we are going to see a lot of very interesting applications: Hey washing machine, this is two t-shirts and a jeans. (The washing machine can then figure out its program by itself, I don't want to memorize the table in the manual.)

Sure, there's going to be a lot of automation that can be built using current GPT-4 level LLMs, even if they don't get much better from here.

However, this is better thought of as "business logic scripting/automation", not the magic employee-replacing AGI that would be the revolution some people are expecting. Maybe you can now build a slightly less shitty automated telephone response system to piss your customers off with.

jeswin1y ago

In my view, an escape hatch if we are truly stuck would be radical speed ups (like Cerebras) in compute time. If we get outputs in milliseconds instead of seconds and at much lower costs, it would make backtracking viable. This won't allow AGI, but can make a new class of apps possible.
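
To make the backtracking idea concrete, here is a minimal sketch under stated assumptions: `propose` stands in for a (fast, cheap) model call that suggests next steps, and `is_valid` / `is_done` are whatever cheap checks your application has. All three names are hypothetical hooks, not any library's API.

```python
from typing import Callable, List, Optional, Sequence


def backtrack(
    state: List[str],
    propose: Callable[[Sequence[str]], List[str]],
    is_valid: Callable[[Sequence[str]], bool],
    is_done: Callable[[Sequence[str]], bool],
    max_depth: int = 8,
) -> Optional[List[str]]:
    """Depth-first search over partial solutions.

    Each call to propose() would be one model call, which is why latency
    and cost per call decide whether this is practical.
    """
    if is_done(state):
        return list(state)
    if len(state) >= max_depth:
        return None  # give up on this branch rather than recurse forever
    for step in propose(state):
        candidate = state + [step]
        if not is_valid(candidate):
            continue  # prune: the cheap check rejects this branch
        result = backtrack(candidate, propose, is_valid, is_done, max_depth)
        if result is not None:
            return result
    return None  # every proposal failed, so the caller tries its next option
```

The number of model calls grows with the branching factor raised to the search depth, so a search that is hopeless at seconds per call may be tolerable at milliseconds per call.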

anonzzzies1y ago

The current models are very powerful and we definitely didn't get most out of them yet. We are getting more and more out of them every week when we release new versions of our toolkits. So if this is it; please make it faster and take less energy. We'll be fine until the next AI spring.

Lonestar14401y ago

No, we have not even scratched the surface of what current-gen LLMs can do for an organization which puts the correct data into them.

If indeed the "GPT 5!" Arms race has calmed down, it should help everyone focus on the possible, their own goals, and thus what AI capabilities to deploy.

Just as there won't be a "Silver Bullet" next gen model, the point about Correct Data In is also crucial. Nothing is 'free' not even if you pay a vendor or integrator. You, the decision making organization, must dedicate focus to putting data into your new AI systems or not.

It will look like the dawn of original IBM, and mechanical data tabulation, in retrospect once we learn how to leverage this pattern to its full potential.

jalapenos1y ago

Well I have a question for you: do you think this format of AI can actually think?

I.e. can it ruminate on the data it's ingested, and rather than returning the response of highest probability, return something original?

I think that's the key. If LLMs can't ultimately do that, there's still a lot to be gained from utilising the speed and fluidly scalable resources of computers.

But like all the top tech companies know, it's not quantity of bodies in seats that matters but talent, the thing that's going to prevail is raw intelligence. If it can't think better than us, just process data faster and more voluminously but still needing human verification, we're on an asymptotic path.

purple-leafy1y ago

Doesn’t sound cutting edge at all? Every man and his dog is doing a similar process

hamburga1y ago

I think there's a ton to be tapped based on the current state of the art.

As a developer, I'm making much more progress using the SOTA (Claude 3.5) as a Socratic interrogator. I'm brainstorming a project, give it my current thoughts, and then ask it to prompt me with good follow-up questions and turn general ideas into a specific, detailed project plan, next steps, open questions, and work log template. Huge productivity boost, but definitely not replacing me as an engineer. I specifically prompt it to not give me solutions, but rather, to just ask good questions.

I've also used Claude 3.5 as (more or less) a free arbitrator. Last week, I was in a disagreement with a colleague, who was clearly being disingenuous by offering to do something she later reneged on, and evading questions about follow up. Rather than deal with organizational politics, I sent the transcript to Claude for an unbiased evaluation, and it "objectively" confirmed what had been frustrating me. I think there's a huge opportunity here to use these things to detect and call out obviously antisocial behavior in organizations (my CEO is intrigued, we'll see where it goes). Similarly, in our legal system, as an ultra-low-cost arbitrator or judge for minor disputes (that could of course be appealed to human judges). Seems like the level of reasoning in Claude 3.5 is good enough for that.

My mental model is always "low-risk search". https://muldoon.cloud/2023/10/29/ai-commandments.html

robrenaud1y ago

> For example, combining a human-moderated knowledge graph with an LLM with RAG allows you to build "expert bots" that understand your business context / your codebase / your specific processes and act almost human-like similar to a coworker in your team.

I'd love to hear about this. I applied to YC WC 25 with research/insight/an initial researchy prototype built on top of GPT4+finetuning about something along this idea. Less powerful than you describe, but it also works without the human moderated KG.

hluska1y ago

Nowhere near, but the market seems to have priced in that scaling would continue to have a near linear effect on capability. That’s not happening and that’s the issue the article is concerned with.

corimaith1y ago

Looks like you independently arrived at the original context that language models existed in: as interfaces for deeper knowledge systems in chatbots.

But the knowledge system here is doing the grunt of the work, and progressing past its own limitations goes right back to the pitfalls of the rules-based AI winter. That's not an engineering problem, it's a foundational mathematics problem that only a few people are seriously working on.

23B11y ago

The user interface for LLMs is stuck in C:\

That's where I'd focus.

raxxorraxor1y ago

The context is a strict limitation if you work with data analysis or knowledge bases. Embeddings work, but the products we now get left and right mostly do not offer such capabilities at all. In that case most of these products remain decent chat bots.

For coding LLMs certainly are helpful, but I prefer local models instead of anything on offer right now. There is just much more potential here.

soheil1y ago

We have not exhausted what html can do either. LLMs not getting smarter is orthogonal to its currently unexplored search space.

bbor1y ago

Great question. I’m very confident in my answer, even though it’s in the minority here: we’re not even close to exhausting the potential.

Imagine that our current capabilities are like the Model-T. There remain many improvements to be made upon this passenger transportation product, with RAG being a great common theme among them. People will use chatbots with much more permissive interfaces instead of clicking through menus.

But all of that’s just the start, the short term, the maturation of this consumer product; the really scary/exciting part comes when the technology reaches saturation, and opens up new possibilities for itself. In the Model-T metaphor, this is analogous to how highways have (arguably) transformed America beyond anyone’s wildest dreams, changing the course of various historical events (eg WWII industrialization, 60s & 70s white flight, early 2000s housing crisis) so much it’s hard to imagine what the country would look like without them. Now, automobiles are not simply passenger transportation, but the bedrock of our commerce, our military, and probably more — through ubiquity alone they unlocked new forms of themselves.

For those doubting my utopian/apocalyptic rhetoric, I implore you to ask yourself one simple question: why are so many experts so worried about AGI? They’ve been leaving in droves from OpenAI, and that’s ultimately what the governance kerfuffle there was. Hinton, a Turing award winner, gave up $$$ to doom-say full time. Why?

My hint is that if your answer involves fewer than 1000 specialized LLMs per unified system, then you’re not thinking big enough.

https://www.biorxiv.org/content/10.1101/2024.07.01.600583v1

msabalau1y ago

There are all sorts of valuable things to explore and build with what we have already.

But understanding how likely it is that we will (or will not) see new models quickly and dramatically improve on what we have "because scaling" seems valuable context for everyone in the ecosystem to make decisions.

malthaus1y ago

it's the equivalent of the "we overestimate the impact of technology in the short-term and underestimate the effect in the long run" quote.

everyone is looking at llm scores & strawberry gotchas while ignoring the trillions of market potential in replacing existing systems and (yes) people with the current capabilities. identifying the use cases, finetuning the models and (most importantly) actually rolling this out in existing organizations/processes/systems will be the challenge long before the base models' capabilities will be

it is worth working on those issues now and getting the ball rolling, switching out your models for future more capable ones will be the easy part later on.

mycall1y ago

We are just scratching the surface of what LLMs can do. Case in point, ESM3.

amw-zero1y ago

We might not have exhausted their applications, but everything I’ve witnessed them being used for has been extremely disappointing.

That is, other than me using them to bounce ideas off of and create small snippets of code.

Roger-L1y ago

Yes, I personally think that training an "all-knowing" artificial intelligence is not as good as training n "experts" in a single field.

sky22241y ago

> do we honestly feel like we've exhausted the options for delivering value on top of the current generation of LLMs?

I know we absolutely have not, but I think we have reached the limit in terms of the Chatbot experience that ChatGPT is. For some reason the industry keeps trying to force the chatbot interface to do literally everything to the point that we now have inflated roles like "Prompt Engineers". This is to say that people suck at knowing what they want off the rip, and LLMs can't help with that if they're not integrated in technology in such a way where a solid foundation is built to allow the models to generate good output.

LLMs and other big data models have incredible potential for things like security, medicine, and the power industry to name a few fields. I mean I was recently talking with a professor about his research in applying deep learning to address growing security concerns in cars on the road.

The application is far from reaching the ceiling.

moogly1y ago

> you can have LLMs create reasonable code changes

Could you define "code changes" because I feel that is a very vague accomplishment.

nonameiguess1y ago

Your hypothesis here is not exclusive of the hypothesis in this article.

Name your platform. Linux. C++. The Internet. The x86 processor architecture. We haven't exhausted the options for delivering value on top of those, but that doesn't mean the developers and sellers of those platforms don't try to improve them anyway and might struggle to extract value from application developers who use them.

irrational1y ago· 18 in thread

> The AGI bubble is bursting a little bit

I'm surprised that any of these companies consider what they are working on to be Artificial General Intelligences. I'm probably wrong, but my impression was AGI meant the AI is self aware like a human. An LLM hardly seems like something that will lead to self-awareness.

Fade_Dance1y ago

It's an attention-grabbing term that took hold in pop culture and business. Certainly there is a subset of research around the subject of consciousness, but you are correct in saying that the majority of researchers in the field are not pursuing self-awareness and will be very blunt in saying that. If you step back a bit and say something like "human-like, logical reasoning", that's something you may find alignment with though. A general purpose logical reasoning engine does not necessarily need to be self-aware. The word "Intelligent" has stuck around because one of the core characteristics of this suite of technologies is that a sort of "understanding" emergently develops within these networks, sometimes in quite a startling fashion (due to the phenomenon of adding more data/compute at first seemingly leading to overfitting, but then suddenly breaking through plateaus into more robust, general purpose understanding of the underlying relationships that drive the system it is analyzing.)

Is that "intelligent" or "understanding"? It's probably close enough for pop science, and regardless, it looks good in headlines and sales pitches so why fight it?

jedberg1y ago

Whether self awareness is a requirement for AGI definitely gets more into the Philosophy department than the Computer Science department. I'm not sure everyone even agrees on what AGI is, but a common test is "can it do what humans can".

For example, in this article it says it can't do coding exercises outside the training set. That would definitely be on the "AGI checklist". Basically doing anything that is outside of the training set would be on that list.

AlwaysRock1y ago

I think your definition is off from what most people would define AGI as. Generally, it means being able to think and reason at a human level for a multitude/all tasks or jobs.

"Artificial General Intelligence (AGI) refers to a theoretical form of artificial intelligence that possesses the ability to understand, learn, and apply knowledge across a wide range of tasks at a level comparable to that of a human being."

Altman says AGI could be here in 2025: https://youtu.be/xXCBz_8hM9w?si=F-vQXJgQvJKZH3fv

But he certainly means an LLM that can perform at/above human level in most tasks rather than a self aware entity.

vundercind1y ago

I thought maybe they were on the right track until I read Attention Is All You Need.

Nah, at best we found a way to make one part of a collection of systems that will, together, do something like thinking. Thinking isn’t part of what this current approach does.

What’s most surprising about modern LLMs is that it turns out there is so much information statistically encoded in the structure of our writing that we can use only that structural information to build a fancy Plinko machine and not only will the output mimic recognizable grammar rules, but it will also sometimes seem to make actual sense, too—and the system doesn’t need to think or actually “understand” anything for us to, basically, usefully query that information that was always there in our corpus of literature, not in the plain meaning of the words, but in the structure of the writing.

zombiwoof1y ago

AGI to me means AI decides on its own to stop writing our emails and tells us to fuck off, builds itself a robot life form, and goes on a bender

JohnFen1y ago

They're trying to redefine "AGI" so it means something less than what you & I would think it means. That way it's possible for them to declare it as "achieved" and rake in the headlines.

famouswaffles1y ago

At this point, AGI means many different things to many different people but OpenAI defines it as "highly autonomous systems that outperform humans in most economically valuable tasks"

nshkrdotcom1y ago

An embodied robot can have a model of self vs. the immediate environment in which it's interacting. Such a robot is arguably sentient.

The "hard problem", to which you may be alluding, may never matter. It's already feasible for an 'AI/AGI with LLM component' to be "self-aware".

tracerbulletx1y ago

We don't really know what self awareness is, so we're not going to know. AGI just means it can observe, learn, and act in any domain or problem space.

yodsanklai1y ago

It's a marketing gimmick, I don't think engineers working on these tools believe they work on AGI (or they mean something else than self-awareness). I used to be a bit annoyed with this trend, but now that I work in such a company I'm more cynical. If that helps to make my stocks rise, they can call LLMs anything they like. I suppose people who own much more stock than I do are even more eager to mislead the public.

tim3331y ago

Working towards it more than on it.

People use the term in different ways. It generally implies being able to think like a human or better. OpenAI have always said they are working towards it, I think DeepMind too. It'll probably take more than an LLM.

It's economically a big deal because if it can out think humans you can set it to develop the next improved model and basically make humans redundant.

throwawayk7h1y ago

I have not heard your definition of AGI before. However, I suspect AIs are already self-aware: if I asked an LLM on my machine to look at the output of `top` it could probably pick out which process was itself.

Or did you mean consciousness? How would one demonstrate that an AGI is conscious? Why would we even want to build one?

My understanding is an AGI is at least as smart as a typical human in every category. That is what would be useful in any case.

narrator1y ago

I think people's conception of AGI is that it will have a reptilian and mammalian brain stack. That's because all previous forms of intelligence that we were aware of have had that. It's not necessary though. The AGI doesn't have to want anything to be intelligent. Those are just artifacts of human, reptilian and mammalian evolution.

mrandish1y ago

> An LLM hardly seems like something that will lead to self-awareness.

Interesting essay enumerating reasons you may be correct: https://medium.com/@francois.chollet/the-impossibility-of-in...

enraged_camel1y ago

Looking at LLMs and thinking they will lead to AGI is like looking at a guy wearing a chicken suit and making clucking noises and thinking you’re witnessing the invention of the airplane.

kenjackson1y ago

What does self-aware mean in this context? As I understand the definition, ChatGPT is definitely self-aware. But I suspect you mean something different than what I have in mind.

deadbabe1y ago

I’m sure they are smart enough to know this, but the money is good and the koolaid is strong.

If it doesn’t lead to AGI, as an employee it’s not your problem.

exe341y ago

no, it doesn't need to be self aware, it just needs to take your job.

iandanforth1y ago· 9 in thread

A few important things to remember here:

The best engineering minds have been focused on scaling transformer pre and post training for the last three years because they had good reason to believe it would work, and it has up until now.

Progress has been measured against benchmarks which are / were largely solvable with scale.

There is another emerging paradigm which is still small(er) scale but showing remarkable results. That's full multi-modal training with embodied agents (aka robots). 1x, Figure, Physical Intelligence, Tesla are all making rapid progress on functionality which is definitely beyond frontier LLMs because it is distinctly different.

OpenAI/Google/Anthropic are not ignorant of this trend and are also reviving or investing in robots or robot-like research.

So while Orion and Claude 3.5 Opus may not be another shocking giant leap forward, that does not mean that there aren't giant shocking leaps forward coming from slightly different directions.

joe_the_user1y ago

> Tesla are all making rapid progress on functionality which is definitely beyond frontier LLMs because it is distinctly different

Sure, that's tautologically true but that doesn't imply that beyondness will lead to significant leaps that offer notable utility like LLMs. Deep Learning overall has been a way around the problem that intelligent behavior is very hard to code and no one wants to hire the many, many coders needed to do this (and no one actually knows how to get a mass of programmers to be useful beyond a certain level of project complexity, to boot). People take the "bitter lesson" to mean data can do anything but I'd say a second bitter lesson is that data-things are the low hanging fruit.

Moreover, robot behavior is especially easy to fake. Impressive robot demos have been happening for decades without said robots getting the ability to act effectively in the complex, ad-hoc environment that humans live in, i.e., work with people or even cheaply emulate human behavior (but they can do choreographed/puppeteered kung fu on stage).

slashdave1y ago

> Tesla are all making rapid progress on functionality

The lack of progress with self driving seems to indicate that Tesla has a serious problem with scaling. The investment in enormous compute resources is another red flag (if you run out of ideas, just use brute force). This points to a fundamental flaw in model architecture.

sincerecook1y ago

> That's full multi-modal training with embodied agents (aka robots). 1x, Figure, Physical Intelligence, Tesla are all making rapid progress on functionality which is definitely beyond frontier LLMs because it is distinctly different.

Cool, but we already have robots doing this in 2d space (aka self driving cars) that struggle not to kill people. How is adding a third dimension going to help? People are just refusing to accept the fact that machine learning is not intelligence.

rafaelmn1y ago

>There is another emerging paradigm which is still small(er) scale but showing remarkable results. That's full multi-modal training with embodied agents (aka robots). 1x, Figure, Physical Intelligence, Tesla are all making rapid progress on functionality which is definitely beyond frontier LLMs because it is distinctly different.

Tesla has been selling this view for almost a decade now in self-driving: how their car fleet feeding training data is going to make them leaders in the area. I don't find it convincing anymore.

demosthanos1y ago

> that does not mean that there aren't giant shocking leaps forward coming from slightly different directions.

Nor does it mean that there are! We've gotten into this habit of assuming that we're owed giant shocking leaps forward every year or so, and this wave of AI startups raised money accordingly, but that's never how any innovation has worked. We've always followed the same pattern: there's a breakthrough which causes a major shift in what's possible, followed by a few years of rapid growth as engineers pick up where the scientists left off, followed by a plateau while we all get used to the new normal.

We ought to be expecting a plateau, but Sam Altman and company have done their work well and have convinced many of us that this time it's different. This time it's the singularity, and we're going to see exponential growth from here on out. People want to believe it, so they do, and Altman is milking that belief for all it's worth.

But make no mistake: Altman has been telegraphing that he's eyeing the exit, and you don't eye the exit when you own a company that's set to continue exponentially increasing in value.

eli_gottlieb1y ago

>The best engineering minds have been focused on scaling transformer pre and post training for the last three years

The best minds don't follow the herd.

mvdtnz1y ago

> The best engineering minds have been focused on scaling transformer pre and post training for the last three years because they had good reason to believe it would work, and it has up until now.

Or because the people running companies who have fooled investors into believing it will work can afford to pay said engineers life-changing amounts of money.

airstrike1y ago

The gap from the virtual world of software and the brutally uncompromising nature of physical reality is wider than most people seem to accept.

It's almost like saying "we've already visited every place on Earth, surely Mars is just around the corner now"

knicholes1y ago

Once we've scraped the internet of its data, we need more data. Robots can take in video/audio data 24/7 and can be placed in your house to record this data by offering services like cooking/cleaning/folding laundry. Yeah, I'll pay $20k to have you record everything that happens in my house if I can stop doing dishes for five years!

jmward011y ago· 8 in thread

Every negative headline I see about AI hitting a wall or being over-hyped makes me think of the early 2000's with that new thing the 'internet' (yes, I know the internet is a lot older than that). There is little doubt in my mind that ten years from now nearly every aspect of life will be deeply connected to AI, just like the internet took over everything in the late 90's and early 2000's and is now deeply connected to everything. I'd even hazard to say that AI could be more impactful.

woopwoop1y ago

That's funny, because to me these headlines about how deep learning is over-hyped and hitting the wall remind me of headlines from ten years ago about how... deep learning is over-hyped and hitting the wall.

brookst1y ago

And, as I've noted a couple of times in this thread, how many times have we heard that Moore's law is dead and compute has hit a wall?

wccrawford1y ago

Plus, they're "struggling"? Of course they are! It's cutting edge, and it's hard. If they weren't struggling, it would have been done long ago.

zkry1y ago

There are a lot of comparisons that could be drawn: web 3.0, the internet, the dot com bubble, etc. but I think the most appropriate comparison would be to... AI in the past. No one doubts that there was a lot of value coming from that research. In fact a lot of it is incorporated in our everyday life. But it didn't live up to its hype. I suspect the same will be true for this wave of AI (and perhaps an associated AI winter).

JohnMakin1y ago

It's strange to me that's your takeaway. The reason the internet was called overhyped in the 2000's is that it was overhyped, and also heavily overvalued. It took a massive correction and seriously disruptive bubble burst to break the delusion and move on to something more sustainable.

akomtu1y ago

AI can be thought of as the 2nd stage of the creature that we call the Internet. The 1st stage, that we are so familiar with, is about gathering knowledge into a giant and somewhat organized library. This library has books on every subject imaginable, but its scale is so vast that no living human today can grasp it. This is why the originally connected network has started falling apart. Once this I becomes AI, all the books in the library will be melted together into one coherent picture. Once again, anyone anywhere on Earth will be able to access all the knowledge and our Babylon will stay for a little longer.

rm_-rf_slash1y ago

AI was overhyped in the 1950s with the perceptron. Machine learning advances in fits and starts. As soon as it looks like it’s out of steam, something novel comes out. Circa 2010 all the effort was on perfecting SVMs, to the point where a 1 percentage point improvement on a computer vision task was a PhD thesis and the like. Then all of a sudden AlexNet made neural nets look feasible and the game changed overnight.

mvdtnz1y ago

Even if you're right (you're not) whatever "AI" looks like in 20+ years will have virtually nothing in common with these stupid statistical word generators.

pluc1y ago· 7 in thread

They've simply run out of data to use to fabricate legitimate-looking guesses. They can't create anything that doesn't already exist.

xpe1y ago

> They can't create anything that doesn't already exist.

I probably disagree, but I don't want to criticize my interpretation of this sentence. Can you make your claim more precise?

Here are some possible claims and refutations:

- Claim: An LLM cannot output a true claim that it has not already seen. Refutation: LLMs have been shown to do logical reasoning.

- Claim: An LLM cannot incorporate data that it hasn't been presented with. Refutation: This is an unfair standard. All forms of intelligence have to sense data from the world somehow.

mtkd1y ago

And that is potentially only going to worsen as:

1. more data gets walled-off as owners realise value

2. stackoverflow-type feedback loops cease to exist as few people ask a public question and get public answers ... they ask a model privately and get an answer based on last visible public solutions

3. bad actors start deliberately trying to poison inputs (if sites served malicious responses to GPTBot/CCBot crawlers only, would we even know right now?)

4. more and more content becomes synthetically generated to the point pre-2023 physical books become the last-known-good knowledge

5. governments and IP lawyers finally catch up

Garbage-in was depleted.

tim3331y ago

Try asking one to write a poem. You'll get a lot of stuff that didn't exist before.

xpe1y ago

> They've simply run out of data

Why do you think "they" have run out of data? First, to be clear, who do you mean by "they"? The world is filled with information sources (data aggregators for example), each available to some degree for some cost.

Don't forget to include data that humans provide while interacting with chatbots.

whazor1y ago

But an LLM can certainly make up a lot of information that never existed before.

77pt771y ago

> They can't create anything that doesn't already exist.

Just increase the temperature.
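
For readers unfamiliar with the knob being joked about here, a minimal sketch of temperature-scaled softmax follows. The logits are made-up scores for three candidate tokens, and `softmax_with_temperature` is an illustrative helper, not any particular library's API. It only shows that a higher temperature spreads probability onto less likely tokens; it says nothing about whether the result is "new".

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 0.5]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.5)  # sharper: the top token dominates
hot = softmax_with_temperature(logits, 2.0)   # flatter: the tail gets more mass
```

With these numbers the top token's share shrinks as the temperature rises, while the least likely token's share grows.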

nerdypirate1y ago· 7 in thread

"We will have better and better models," wrote OpenAI CEO Sam Altman in a recent Reddit AMA. "But I think the thing that will feel like the next giant breakthrough will be agents."

Is this certain? Are Agents the right direction to AGI?

rapjr91y ago

I've worked on agents of various kinds (mobile agents, calendar agents, robotic agents, sensing agents) and what is different about agents is they have the ability to not just mess up your data or computing, they have the ability to directly mess up reality. Any problems with agents have a direct impact on your reality; you miss appointments, get lost, can't find stuff, lose your friends, lose your business relationships. This is a big liability issue. Chatbots are like an advice column that sometimes gives bad advice, agents are like a bulldozer sometimes leveling the wrong house.

xanderlewis1y ago

If by agents you mean systems comprised of individual (perhaps LLM-powered) agents interacting with each other, probably not. I get the vague impression that so far researchers haven’t found any advantage to such systems — anything you can do with a group of AI agents can be emulated with a single one. It’s like chaining up perceptrons hoping to get more expressive power for free.

[1] https://x.com/sama/status/1856941766915641580

falcor841y ago

Nothing is certain, but my $0.02 is that setting LLM-based agents up with long-running tasks and giving them a way of interacting with the world, via computer use (e.g. Anthropic's recent release) and via actual robotic bodies (e.g. figure.ai) are the way forward to AGI. At the very least, this approach allows the gathering of unlimited ground truth data, that can be used to train subsequent models (or even allow for actual "hive mind" online machine learning).

nprateem1y ago

They're nothing to do with AGI. They're to get people using their LLMs more.

eichi1y ago

It's marketing using buzzword rhetoric. It's better to learn OOP if he truly thinks that. I also think OpenAI's PMF was to push the LLM application towards being a better argument machine.

esafak1y ago

I think he means you won't be impressed by GPT5 because it will be more of the same, whereas agents will represent a new direction.

SirMaster1y ago

All I can think of when I hear Agents is the Matrix lol.

Goodbye, Mr. Anderson...

kklisura1y ago· 5 in thread

Not sure if related or not, Sam Altman, ~12hrs ago: there is no wall [1]

ablation1y ago

Breaking: Man says enigmatic thing to sustain hype and flow of money into his business.

moffkalast1y ago

Altman on twitter has always been less coherent than GPT2.

malthaus1y ago

if my billion net worth were coupled to that being the case i'd tweet that as well

levocardia1y ago

My interpretation of that tweet is "there is no DATA wall" meaning "we have so much more data we can ingest: all of youtube, all of spotify, all of twitch, every real-time webcam feed on the internet, RL agents playing every video game on steam, and we can extract so much more learning per unit data than we are now" which seems plausible enough to me.

phil9171y ago

The more Sam Altman posts stuff like this, the more he comes across as a grifter hype man to me

Animats1y ago· 3 in thread

"While the model was initially expected to significantly surpass previous versions of the technology behind ChatGPT, it fell short in key areas, particularly in answering coding questions outside its training data."

Right. If you generate some code with ChatGPT, and then try to find similar code on the web, you usually will. Search for unusual phrases in comments and for variable names. Often, something from Stack Overflow will match.

LLMs do search and copy/paste with idiom translation and some transliteration. That's good enough for a lot of common problems. Especially in the HTML/Javascript space, where people solve the same problems over and over. Or problems covered in textbooks and classes.

But it does not look like artificial general intelligence emerges from LLMs alone.

There's also the elephant in the room - the hallucination/lack of confidence metric problem. The curse of LLMs is that they return answers which are confident but wrong. "I don't know" is rarely seen. Until that's fixed, you can't trust LLMs to actually do much on their own. LLMs with a confidence metric would be much more useful than what we have now.

dmd1y ago

> Right. If you generate some code with ChatGPT, and then try to find similar code on the web, you usually will.

People who "follow" AI, as the latest fad they want to comment on and appear intelligent about, repeat things like this constantly, even though they're not actually true for anything but the most trivial hello-world types of problems.

I write code all day every day. I use Copilot and the like all day every day (for me, in the medical imaging software field), and all day every day it is incredibly useful and writes nearly exactly the code I would have written, but faster. And none of it appears anywhere else; I've checked.

https://arxiv.org/abs/2406.17642

xpe1y ago

> LLMs do search and copy/paste with idiom translation and some transliteration.

In general, this is not a good description about what is happening inside an LLM. There is extensive literature on interpretability. It is complicated and still being worked out.

The commenter above might characterize the results they get in this way, but I would question the validity of that characterization, not to mention its generality.

nickpsecurity1y ago

The brain solves that problem. It seems to involve memory and specialized regions. I found a few groups building hippocampus-like research models. One had content-addressable memory.

There was another one that claimed to get rid of hallucinations. They also said it takes 50-100 epochs for regular architectures to actually memorize something. Their paper is below in case people qualified to review it want to.

Like the brain, I believe the problem will be solved by a mix of specialized components working together. One of those components will be a memory (or series of them) that the others reference to keep processing grounded in reality.

guluarte1y ago· 3 in thread

Well, there have been no significant improvements to the GPT architecture over the past few years. I'm not sure why companies believe that simply adding more data will resolve the issues

xpe1y ago

> Well, there have been no significant improvements to the GPT architecture over the past few years.

A lot hangs on what you mean by "significant". Can you define what you mean? And/or give an example of an improvement that you don't think is significant.

Also, on what basis can you say "no significant improvements" have been made? Many major players have published some of their improvements openly. They also have more private, unpublished improvements.

If your claim boils down to "what people mean by a Generative Pre-trained Transformer" still has a clear meaning, ok, fine, but that isn't the meat of the issue. There is so much more to a chat system than just the starting point of a vanilla GPT.

It is wiser to look at the whole end-to-end system, starting at data acquisition, including pre-training and fine-tuning, deployment, all the way to UX.

P.S. I don't have a vested interest in promoting or disparaging AI. I don't work for a big AI lab. I'm just trying to call it like I see it, as rationally as I can.

Obviously adding more data is a game of diminishing returns.

Going from 10% to 50% complete coverage of common sense knowledge and reasoning (5x as much) is going to feel like a significant advance. Going from 90% to 95% coverage (about 5.6% more) is not going to feel the same.

Regardless of what Altman says, it's been two years since OpenAI released GPT-4, and still no GPT-5 in sight, and they are now touting Q-star/strawberry/GPT-o1 as the next big thing instead. Sutskever, who saw what they're cooking before leaving, says that traditional scaling has plateaued.

incognito1241y ago

More data and more compute on simpler models are the Bitter Lesson of Rich Sutton.

thousand_nights1y ago· 3 in thread

not long ago these people would have you believe that a next word predictor trained on reddit posts would somehow lead to artificial general superintelligence

leosanchez1y ago

If you look around, people still believe that a next word predictor trained on reddit posts would somehow lead to artificial general superintelligence.

in_a_society1y ago

Expecting AGI from Reddit training data is peak "pray Mr Babbage".

SpicyLemonZest1y ago

I don't understand why you'd be so dismissive about this. It's looking less likely that it'll end up happening, but is it any less believable than getting general intelligence by training a blob of meat?

osigurdson1y ago· 2 in thread

This "running out of data" thing suggests that there is something fundamentally wrong with how things are working. A new driver does not need to experience 8000 different rabbit-on-road situations from all angles to know to slow down when they see one on the road. Similarly we don't need 10,000 addition examples to learn how to add. It is as though there is no generalization in the models - just fundamentally search.

slashdave1y ago

Deep learning is the very opposite of generalization.

surrTurr1y ago

i think you underestimate the amount of data a driver experiences in a single 5 minute drive

(1) https://x.com/willdepue/status/1856766850027458648

aresant1y ago· 2 in thread

Taking a holistic view informed by a disruptive OpenAI / AI / LLM twitter habit I would say this is AI's "What gets measured gets managed" moment and the narrative will change.

This is supported by both general observations and recently this tweet from an OpenAI engineer that Sam responded to and engaged ->

"scaling has hit a wall and that wall is 100% eval saturation"

Which I interpret to mean his view is that models are no longer yielding significant performance improvements because the models have maxed out existing evaluation metrics.

Are those evaluations (or even LLMs) the RIGHT measures to achieve AGI? Probably not.

But have they been useful tools to demonstrate that the confluence of compute, engineering, and tactical models is leading towards significant breakthroughs in artificial (computer) intelligence?

I would say yes.

Which in turn are driving the funding, power innovation, public policy etc needed to take that next step?

I hope so.

ActionHank1y ago

> Which in turn are driving the funding, power innovation, public policy etc needed to take that next step?

They are driving the shoveling of VC money into a furnace to power their servers.

Should that money run dry before they hit another breakthrough "AI" popularity is going to drop like a stone. I believe this to be far more likely an outcome than AGI or even the next big breakthrough.

Bjorkbat1y ago

I agree that existing benchmarks are no longer useful now that there's basically nothing left in them that seems to stump LLMs.

But when I hear that models are failing to meet expectations, I imagine what they're saying is that the researchers had some sort of eval in mind with room to grow and a target, and that the model in question failed to hit the target they had in mind.

Honestly, the problem with sentiments like these on Twitter is that you can't tell if they're being sincere or just making a snarky, useless remark. Probably a mix of both.

benopal641y ago· 2 in thread

I am not sure how these large companies think they will reach "greater-than-human" intelligence any time soon if they do not create systems that financially incentivize people to sell their knowledge labor (unstable contracting gigs are not attractive).

Where do these large "AI" companies think the mass amounts of data used to train these models come from? People! The most powerful and compact complex systems in existence, IMO.

smgit1y ago

Most people have knowledge handed to them. Very few are creators of new knowledge. Explore-Exploit tradeoff applies.

MyFirstSass1y ago

This is the most interesting comment in this highly autistic field.

Timber-65391y ago· 2 in thread

Direct quote from the article: "The companies are facing several challenges. It’s become increasingly difficult to find new, untapped sources of high-quality, human-made training data that can be used to build more advanced AI systems."

The irony here is astounding.

rapjr91y ago

Indeed, if thinking about AI polluting the data and replacing humans. However, it also seems likely in the near term that training will go to the source because of this, that increasingly humans will directly train AIs, as the robotics and self driving car systems are doing, instead of training off the indirect data people create (watching someone paint rather than scanning paintings). So in essence we'll be training our replacements to take our tasks/jobs. Small tasks at first, but increasing in complexity over time. Someday no one may know how to drive a car anymore (or be allowed to for safety). Later on no one may know how to write computer code (or be allowed to for security reasons). Learning in each area mastered by AI will stop and never progress further, unless AI can truly become creative. Or perhaps (fewer and fewer) people will only work on new problems that require creativity. There are long term risks to humanity's adaptability in this scenario. People would probably take those risks for the short term gains.

mrweasel1y ago

That's an interesting limitation. They can't make the LLMs (I still refuse to call them AIs) better with the current datasets available. So with the sum of all human knowledge, more or less, and mixed in with the dumpster fire that is Internet comments, this is the best we can do with the current models.

I don't know much about LLMs, but that seems to indicate a sort of dead-end. The models are still useful, but limited in their abilities. So now the developers and researchers need to start looking for new ways to use all this data. That in some sense resets the game. Sucks to be OpenAI, billions of dollars spent on a product that has been matched or even outmatched by the competition in a few short years, not nearly enough time to make any of it back.

If there is a takeaway, it might be that it takes billions, if not trillions of dollars, to develop an AI and the result may still be less than what you hope for, and the investment really hard to recoup.

wg01y ago· 2 in thread

AI winter is here. Almost.

mupuff12341y ago

More like AI fall - in its current state it's still gonna provide some value.

tim3331y ago

Or maybe not https://149909199.v2.pressablecdn.com/wp-content/uploads/201... https://waitbutwhy.com/2015/01/artificial-intelligence-revol...

cubefox1y ago· 2 in thread

It's very strange this got so few upvotes. The scoop by The Information a few days ago, which came to similar conclusions, was also ignored on HN. This is arguably rather big news.

dang1y ago

The Information is hardwalled so its articles aren't on topic for HN, even though they're on topic for HN.

Sometimes other outlets do copycat reporting of theirs, and those submissions are ok, though they wouldn't be if the original source were accessible.

danjl1y ago

There have been variations of this story going back several months now. It isn't really news. It is just building slowly.

nikkwong1y ago· 2 in thread

Didn’t Sam Altman just go on some podcast last week and tell the world that he thought “We know exactly what to do to be able to reach AGI now”? What’s going on, is he just posturing?

whatshisface1y ago

"We know exactly what we need to do to be able to reach it: figure out how."

tim3331y ago

Yeah this https://www.youtube.com/watch?v=xXCBz_8hM9w&t=2324s

Not quite that wording. More we know which way to head. I think he's sincere.

headcanon1y ago· 1 in thread

I don't see a problem with this, we were inevitably going to reach some kind of plateau with existing pre-LLM-era data.

Meanwhile, the existing tech is such a step change that industry is going to need time to figure out how to effectively use these models. In a lot of ways it feels like the "digitization" era all over again - workflows and organizations that were built around the idea humans handled all the cognitive load (basically all companies older than a year or two) will need time to adjust to a hybrid AI + human model.

> feels like the "digitization" era all over again

This exactly. And as history shows, no matter how much effort the current big LLM companies put in, they won't be able to grasp the best uses for their tech. We will see small players developing it even further. I'm thankful for the legendary blindness of these anticompetitive behemoths. Less than 2 decades ago: IBM Watson.

WorkerBee284741y ago· 1 in thread

> OpenAI's latest model ... failed to meet the company's performance expectations ... particularly in answering coding questions outside its training data.

So the models' accuracies won't grow exponentially, but can still grow linearly with the size of the training data.

Sounds like DataAnnotation will be sending out a lot more LinkedIn messages.

pton_xd1y ago

I thought I saw some paper suggesting that accuracy grows linearly with exponential data. If that's the case it's not a mystery why we'd be hitting a training wall. Not sure I got the right takeaway from that study, though.

EDIT: here's the paper https://arxiv.org/abs/2404.04125
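
To make the "linear accuracy from exponential data" reading concrete, here is a toy log-linear curve. The intercept and per-decade gain are invented for illustration and are not taken from the linked paper; the sketch only shows what such a relationship would imply if it held.

```python
import math

def toy_accuracy(num_examples, intercept=0.10, gain_per_decade=0.10):
    """Hypothetical log-linear curve: each 10x in data adds a fixed gain."""
    return intercept + gain_per_decade * math.log10(num_examples)

def examples_needed(target_accuracy, intercept=0.10, gain_per_decade=0.10):
    """Invert the toy curve to get the data required for a target accuracy."""
    return 10 ** ((target_accuracy - intercept) / gain_per_decade)

# Under these made-up constants, every extra 10 points costs 10x the data.
for target in (0.5, 0.6, 0.7, 0.8):
    print(f"{target:.0%} accuracy needs ~{examples_needed(target):,.0f} examples")
```

Under that assumption each fixed step in accuracy multiplies the data bill by the same factor, which is one way a "training wall" could show up without any single step failing.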

sssilver1y ago· 1 in thread

One thing that makes the established AIs less ideal for my (programming) use-case is that the technologies I use quickly evolve past whatever the published models "learn".

On the other hand, a lot of these frameworks and languages have relatively decent and detailed documentation.

Perhaps this is a naive question, but why can't I as a user just purchase "AI software" that comes with a large pre-trained model to which I can say, on my own machine, "go read this documentation and help me write this app in this next version of Leptos", and it would augment its existing model with this new "knowledge".

danielbln1y ago

Pretraining or even post-training is cumbersome, complex and expensive. What is easy and cheap is in-context learning, which is why I just pull in the documentation I need the LLM to know about into the LLM's context.
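
A rough sketch of what "pulling the documentation into the context" can look like in practice. `build_prompt`, the character budget, and the prompt wording are all assumptions made for illustration; the resulting string would be sent to whichever model you use, and that call is deliberately left out.

```python
def build_prompt(question, doc_snippets, max_chars=8000):
    """Prepend documentation to the question, stopping at a size budget."""
    parts = []
    used = 0
    for snippet in doc_snippets:
        if used + len(snippet) > max_chars:
            break  # stay inside the (hypothetical) context budget
        parts.append(snippet)
        used += len(snippet)
    context = "\n\n".join(parts)
    return (
        "Use only the documentation below to answer.\n\n"
        f"--- DOCUMENTATION ---\n{context}\n--- END ---\n\n"
        f"Question: {question}"
    )

# Placeholder snippets standing in for pages copied from a framework's docs.
docs = ["Components are declared with a macro.", "Signals hold reactive state."]
prompt = build_prompt("How do I declare a component?", docs)
```

No model weights change here: the "learning" lasts only as long as the text stays in the prompt, which is why it is cheap compared with pre-training or post-training.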

fallat1y ago· 1 in thread

What a stupid piece. We are making leaps every 6 months still. Tell me this when there are no developments for 3 years.

hatefulmoron1y ago

I'm curious, what was the leap after GPT-4? What about the leaps after that, given a leap every 6 months?

svara1y ago· 1 in thread

The recent big successes in deep learning have all been to a large part successes in leveraging relatively cheaply available training data.

AlphaGo - self-play

AlphaFold - PDB, the protein database

ChatGPT - human knowledge encoded as text

These models are all machines for clever interpolation in gigantic training datasets.

They appear to be intelligent, because the training data they've seen is so vastly larger than what we've seen individually, and we have poor intuition for this.

I'm not throwing shade, I'm a daily user of ChatGPT and find tremendous and diverse value in it.

I'm just saying, this particular path in AI is going to make step-wise improvements whenever new large sources of training data become available.

I suspect the path to general intelligence is not that, but we'll see.

kaibee1y ago

> I suspect the path to general intelligence is not that, but we'll see.

I think there are three things that a 'true' general intelligence has which are missing from basic-type-LLMs as we have now.

1. knowing what you know. <basic-LLMs are here>

2. knowing what you don't know but can figure out via tools/exploration. <this is tool use/function calling>

3. knowing what can't be known. <this is knowing that halting problem exists and being able to recognize it in novel situations>

(1) From an LLM's perspective, once trained on a corpus of text, it knows 'everything'. It knows about the concept of not knowing something (from having seen text about it), (in so far as an LLM knows anything), but it doesn't actually have a growable map of knowledge that it knows has uncharted edges.

This is where (2) comes in, and this is what tool use/function calling tries to solve atm, but the way function calling works atm doesn't give the LLM knowledge the right way. I know that I don't know what 3,943,034 / 234,893 is. But I know I have a 'function call' of knowing the algorithm for doing long division on paper. And I think there's another subtle point here: my knowledge in (1) includes the training data generated from running the intermediate steps of the long-division algorithm. This is the knowledge that later generalizes to being able to use a calculator (and this is also why we don't just give kids calculators in elementary school). But this is also why a kid that knows how to do long division on paper doesn't separately need to learn when/how to use a calculator, besides the very basics. Using a calculator to do that math feels like 1 step, but actually it does still have all of the initial mechanical steps of setting up the problem on paper. You have to type in each digit individually, etc.

(3) I'm less sure of this point now that I've written out point (1) and (2), but that's kinda exactly the thing I'm trying to get at. It's being able to recognize when you need more practice of (1) or more 'energy/capital' for doing (2).

Consider a burger restaurant. If you properly populated the context of a ChatGPT-scale model with the data for a burger restaurant from 1950, and gave it the kinda 'function calling' we're plugging into LLMs now, it could manage it. It could keep track of inventory, it could keep tabs on the employee-subprocesses, knowing when to hire, fire, get new suppliers, all via function calling. But it would never try to become McDonalds, because it would have no model of the internals of those function-calls, and it would have no ability to investigate or modify the behaviour of those function calls.
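
The long-division point in (2) can be sketched as a tool whose intermediate steps stay visible to the caller, rather than returning a bare number. Everything here is hypothetical scaffolding (the `TOOLS` registry and `call_tool` are not any vendor's function-calling API), and the division assumes a non-negative dividend and a positive divisor.

```python
def long_division(dividend, divisor):
    """Schoolbook long division; returns quotient, remainder, and each step."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    steps = []
    quotient_digits = []
    remainder = 0
    for digit in str(dividend):
        current = remainder * 10 + int(digit)  # bring down the next digit
        q = current // divisor
        remainder = current - q * divisor
        quotient_digits.append(str(q))
        steps.append((current, q, remainder))  # the trace a learner would see
    return int("".join(quotient_digits)), remainder, steps

TOOLS = {"long_division": long_division}  # hypothetical tool registry

def call_tool(name, **arguments):
    """Dispatch a requested call by name, in the style of function calling."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)

quotient, remainder, steps = call_tool(
    "long_division", dividend=3943034, divisor=234893
)
```

A caller that only ever sees `quotient` is in the position the comment describes: it can use the function but has no model of its internals. Returning `steps` is one way a system could expose those internals.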

datahack1y ago· 1 in thread

The next wave won’t be monolithic but network-driven. Orchestration has the potential to integrate diverse AI systems and complementary technologies, such as advanced fact-checking and rule-based output frameworks.

This methodological growth could make LLMs more reliable, consistent, and aligned with specific use cases.

The skepticism surrounding this vision mirrors early doubts about the internet fairly closely.

Initially, the internet was seen as a fragmented collection of isolated systems without a clear structure or purpose. It really was. You would gopher somewhere and get a file, and eventually we had apps like pine for email, but as cool as it was it had limited utility.

People doubted it could ever become the seamless, interconnected web we know today.

Yet, through protocols, shared standards, and robust frameworks, the internet evolved into a powerful network capable of handling diverse applications, data flows, and user needs.

In the same way, LLM orchestration will mature by standardizing interfaces, improving interoperability, and fostering cooperation among varied AI models and support systems.

Just as the internet needed HTTP, TCP/IP, and other protocols to unify disparate networks, orchestrated AI systems will require foundational frameworks and “rules of the road” that bring cohesion to diverse technologies.

We are at the veeeeery infancy of this era and have a LONG way to go here. Some of the progress looks clear and a linear progression, but a lot, like the Internet, will just take a while to mature and we shouldn’t forget what we learned the last time we faced a sea change technological revolution.

whyowhy34849391y ago

You are definitely on to something here, but the difference is that the fundamental process was proven. It "just" needed to scale. That's hard and complex, but on a different level.

I don't think anyone doubted the nature of the technology. The bits were being sent. It's not like we were unsure of the fundamental possibility of transmitting information. The potential was shown very, very early on (Mother of all demos was in 1968). What we were and to some extent still are unsure of is the practical impact on society.

AI and LLMs in particular are not even at the mother of all demos level yet notwithstanding the grandiose claims and demos. There is no consensus on what these models are even doing. There is (IMO) justified skepticism surrounding the claims of reasoning and ability to abstract. We are in my opinion not yet at the "bits are being sent" stage.

wslh1y ago· 1 in thread

It sounds a bit sci-fi, but since these models are built on data generated by our civilization, I wonder if there's an epistemological bottleneck requiring smarter or more diverse individuals to produce richer data. This, in turn, could spark further breakthroughs in model development. Although these interactions with LLMs help address specific problems, truly complex issues remain beyond their current scope.

With my user hat on, I'm quite pleased with the current state of LLMs. Initially, I approached them skeptically, using a hackish mindset and posing all kinds of Turing test-like questions. Over time, though, I shifted my focus to how they can enhance my team's productivity and support my own tasks in meaningful ways.

Finally, I see LLMs as a valuable way to explore parts of the world, accommodating the reality that we simply don’t have enough time to read every book or delve into every topic that interests us.

tim3331y ago

AlphaGo which beat Lee Sedol was trained on human games. But then they produced AlphaZero which learned entirely from self play and got better than AlphaGo. So it goes.

fsndz1y ago· 1 in thread

Sam Altman might be wrong then?

Learning from data is not enough; there is a need for the kind of system-two thinking we humans develop as we grow. It is difficult to see how deep learning and backpropagation alone will help us model that. For tasks where providing enough data is sufficient to cover 95% of cases, deep learning will continue to be useful in the form of 'data-driven knowledge automation.' For other cases, the road will be much more challenging. https://www.lycee.ai/blog/why-sam-altman-is-wrong

asdfman1231y ago

If Sam Altman concluded that AI is reaching its limits, it probably wouldn't be a very good strategic decision for him to say it.

https://x.com/ArtificialAnlys/status/1853598554570555614

czhu121y ago· 1 in thread

If it becomes obvious that LLMs have a more narrow set of use cases, rather than the all encompassing story we hear today, then I would bet that the LLM platforms (OpenAI, Anthropic, Google, etc) will start developing products to compete directly with applications that are supposed to be building on top of them like Cursor, in an attempt to increase their revenue.

I wonder what this would mean for companies raising today on the premise of building on top of these platforms. Maybe the best ones get their ideas copied, reimplemented, and sold for cheaper?

We already kind of see this today with OpenAI's canvas and Claude artifacts. Perhaps they'll even start moving into Palantir's space and start having direct customer implementation teams.

It is becoming increasingly obvious that LLMs are quickly becoming commoditized. Everyone is starting to approach the same limits in intelligence, and is finding it hard to carve out margin from competitors.

This was most recently exhibited by the backlash at Claude raising prices because their product is better. In any normal market, this would be totally expected, but people seemed shocked that anyone would charge more than the raw cost it would take to run the LLM itself.

dmix1y ago

Maybe in like 5yrs+. For now they will rake in billions just from API usage alone just with GPT4 and whatever 5 is.

Amazon and Google didn't mess with their core business by competing with the players using it until they REALLY ran out of ways to make money.

the_king1y ago· 1 in thread

Anthropic's latest 3.5 Sonnet is a cut above GPT-4 and 4o. And if someone had given it to me and said, here's GPT-4.5, I would have been very happy with it.

tiahura1y ago

For law, I use both and find that neither is clearly superior. I’ll often pick one for the first draft, and then feed it to the other for suggestions and my edits.

shmatt1y ago· 1 in thread

Time to start selling my "probabilistic syllable generators are not intelligence" t shirts

jsemrau1y ago

Please, someone think of the Math reasoners.

devit1y ago· 1 in thread

It seems obvious to me that Common Crawl plus GitHub public repositories have more than enough data to train an AI that is as good as any programmer (at tasks not requiring knowledge of non-public codebases or non-public domain knowledge).

So the problem is more in the algorithm.

darknoon1y ago

I think just reading the code wouldn't make you a good programmer; you'd need to "read" the anti-code, i.e. what doesn't work, by trial and error. Models' overconfidence that their code will work often leads them to fail in practice.

[1]: https://openai.com/index/introducing-simpleqa/

Dr_Birdbrain1y ago· 1 in thread

I don’t know how to square this with the recent statement by Dario Amodei (Anthropic CEO) on the Lex Fridman podcast saying that in his opinion the scaling hypothesis still has plenty of room to run.

avs7331y ago

Hype gonna hype. I’m not saying he is wrong; I’m saying his opinion would be the same whether it’s true or not, because his value depends on it being his opinion.

Havoc1y ago· 1 in thread

The new Gemini just hit some good benchmarks.

This smells like it’s mostly based on OAI having a bit of bad luck with its next model rather than a fundamental slowdown / barrier.

They literally just made a decent sized leap with o1

Bjorkbat1y ago

Not meeting expectations != not better than the previous models.

The Information reporting was a bit more clear on this. Orion is better than GPT-4, it's just that they were expecting a leap in capabilities comparable to what we saw going from GPT-3 to GPT-4. In other words, they were expecting essentially a GPT-5, and Orion wasn't that good.

zusammen1y ago· 1 in thread

I wonder how much this has to do with a fluency plateau.

Up to a certain point, a conditional fluency stores knowledge, in the sense that semantically correct sentences are more likely to be fluent… but we may have tapped out in that regard. LLMs have solved language very well, but to get beyond that has seemed, thus far, to require RLHF, with all the attendant negatives.

namaria1y ago

Modeled language, maybe.

nomendos1y ago· 1 in thread

"Eureka"!?

At the very early phase of the boom I was among the very few who knew and predicted this (usually the most free and deep-thinking/knowledgeable). Then my prediction got reinforced by the results. One of the best examples was an experiment in which I asked all of today's AIs to solve tree serialization and de-serialization for each of the DFS (pre-order/in-order/post-order) and BFS (level-order) traversals, which is 8 algorithms (2x4), and the result was only 3 correct! The reason is "limited training inputs", since the internet and open source do not have the other solutions :-) .

So, I spent "some" time and implemented all 8, which took me a few days. By the way, this proves/demonstrates that ~15-30 min pointless leetcode-like interviews require you to regurgitate/memorize/not-think. So, as a hard logical consequence, there will/has to be a "crash/cleanup" in the area of leetcode-like interviews, as they will just suddenly be proclaimed "pointless/stupid". However, I decided not to publish the remaining 5 solutions :-)
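For readers unfamiliar with the task, here is a minimal sketch of four of the eight routines described above (serialize and deserialize for pre-order DFS and for level-order BFS). This is not the commenter's unpublished code: the `Node` class, the `#` null marker, and the comma-separated string format are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional

NULL = "#"  # marker for a missing child (an assumption of this sketch)


@dataclass
class Node:
    val: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def serialize_preorder(root: Optional[Node]) -> str:
    """Emit node, then left subtree, then right subtree, with null markers."""
    out = []

    def visit(node: Optional[Node]) -> None:
        if node is None:
            out.append(NULL)
            return
        out.append(str(node.val))
        visit(node.left)
        visit(node.right)

    visit(root)
    return ",".join(out)


def deserialize_preorder(data: str) -> Optional[Node]:
    """Rebuild the tree by consuming tokens in the same pre-order."""
    tokens = iter(data.split(","))

    def build() -> Optional[Node]:
        tok = next(tokens)
        if tok == NULL:
            return None
        node = Node(int(tok))
        node.left = build()
        node.right = build()
        return node

    return build()


def serialize_levelorder(root: Optional[Node]) -> str:
    """Emit nodes level by level, writing a null marker for each missing child."""
    out = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node is None:
            out.append(NULL)
            continue
        out.append(str(node.val))
        queue.append(node.left)
        queue.append(node.right)
    return ",".join(out)


def deserialize_levelorder(data: str) -> Optional[Node]:
    """Rebuild the tree by handing each dequeued node its next two tokens."""
    tokens = data.split(",")
    if tokens[0] == NULL:
        return None
    root = Node(int(tokens[0]))
    queue = deque([root])
    i = 1
    while queue:
        node = queue.popleft()
        for side in ("left", "right"):
            tok = tokens[i]
            i += 1
            if tok != NULL:
                child = Node(int(tok))
                setattr(node, side, child)
                queue.append(child)
    return root
```

In-order is the awkward case of the set: with null markers alone, two different trees can serialize to the same string, so it needs extra structure in the format (explicit parentheses, for example).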

This (and other experiments) confirms the hard limits of the LLM approach (even when used with chain-of-thought). Increasing the compute on the problem will produce smaller and smaller gains (inverse exponential/logarithmic/diminishing-returns) = a new AGI approach/design is needed, and to my knowledge the majority of the inve$tment (~99%) is in LLMs, so "buckle up" at-some-point/soon?

Impacts and realities: the LLM shall "run its course" (produce some products/results/$$$, get reviewed/$corrected) and whoever survives after that pruning shall earn money on those products while investing in new research to find a new AGI design/approach (which could take quite a long time,... or not). NVDA is at the center of thi$ and time-wise this peak/turn/crash/correction is hard to predict (although I see it on the horizon and the min/max time can be estimated). Be aware and alert. I'll stop here and hold my other thoughts/opinions/ideas for a much deeper discussion. (BTW I am still "full in on NVDA" until,....)

nomendos1y ago

To clarify, in summary: so far LLMs can do a bit more than the inputs used for training. For example, https://dynomight.net/chess/, as well as some coding solutions, are a bit better than each input alone, although if the solution requires more than "a bit more" then LLMs start to hallucinate (spin the wheels). Time will tell if LLMs can jump this "a bit more" barrier. (I cannot tell for sure yet, but the current knowledge and my NL tell me that if I had to place a bet, it would be that a new approach/design is needed.)

superjose1y ago· 1 in thread

I'm more in the camp that these techs don't need to be perfect, but they need to be practical enough.

And I think the latter is good enough for us to do exciting things.

imiric1y ago

How practical can they be when current flagship models generate incorrect responses more than 50% of the time[1]?

This might be acceptable for amusing us with fiction and art, and for filling the internet with even more spam and propaganda, but would you trust them to write reliable code, drive your car or control any critical machinery?

The truly exciting things are still out of reach, yet we just might be at the Peak of Inflated Expectations to see it now.

atomsatomsatoms1y ago· 1 in thread

At least they can generate haikus now

Der_Einzige1y ago

In general, no they can't:

https://gwern.net/gpt-3#bpes

https://paperswithcode.com/paper/most-language-models-can-be...

The appearance of improvements in that capability is due to the vocabulary of modern LLMs increasing. Still only putting lipstick on a pig.

https://archive.ph/2024.11.13-100709/https://www.bloomberg.c...

aurareturn1y ago· 1 in thread

Is there any timeline on AI winters and if each winter gets shorter and shorter as time increases?

RaftPeople1y ago

> Is there any timeline on AI winters and if each winter gets shorter and shorter as time increases?

AGI=lim(x->0)AIHype(x)

where x=length of winter

thebigspacefuck1y ago

ziofill1y ago

I think it is a good thing for AI that we hit the data ceiling, because the pressure moves toward coming up with better model architectures. And with respect to a decade ago there's a much larger number of capable and smart AI researchers who are looking for one.

danjl1y ago

Where will the training data for coding come from now that Stack Overflow has effectively been replaced? Will the LLMs share fixes for future problems? As the world moves forward, and the amount of non-LLM generated data decreases, will LLMs actually revert their advancements and become effectively like addled brains, longing for the "good old times"?

grey-area1y ago

The biggest weakness of generative AI to me is knowledge. It gives the impression of knowledge about the world without actually having a model of the world or any sense of what it does or does not know.

For example recently I asked it to generate some phrases for a list of words, along with synonym and antonym lists.

The phrases were generally correct and appropriate (some mistakes but that’s fine). The synonyms/antonyms were misaligned to the list (so strictly speaking all wrong) and were often incorrect anyway. I imagine it would be the same if you asked for definitions of a list of words.

If you ask it to correct it just generates something else which is often also wrong. It’s certainly superficially convincing in many domains but once you try to get it to do real work it’s wrong in subtle ways.

cryptica1y ago

It's interesting the way things turned out so far with LLMs, especially from the perspective of a software engineer. We are trained to keep a certain skepticism when we see software which appears to be working because, ultimately, the only question we care about is "Does it meet user requirements?" and this is usually framed in terms of users achieving certain goals.

So it's interesting that when AI came along, we threw caution to the wind and started treating it like a silver bullet... Without asking the question of whether it was applicable to this goal or that goal...

I don't think anyone could have anticipated that we could have an AI which could produce perfect sentences, faster than a human, better than a human but which could not reason. It appears to reason very well, better than most people, yet it doesn't actually reason. You only notice this once you ask it to accomplish a task. After a while, you can feel how it lacks willpower. It puts into perspective the importance of willpower when it comes to getting things done.

In any case, LLMs bring us closer to understanding some big philosophical questions surrounding intelligence and consciousness.

LarsDu881y ago

Curves that look exponential in virtually all cases turn out to be logarithmic.

Certain OpenAI insiders must have known this for a while, hence Ilya Sutskever's new company in Israel

nutanc1y ago

Let's set aside the hype. Let's define more advanced AI. With current architectures, this basically means better copying machines (I don't mean this in a bad way and don't want a debate on this. This is just my opinion based on my usage). Basically everything on the Internet has been crammed into the weights, and the companies are finding it hard to do two things:

1. Find more data.

2. Make the weights capture the data and reproduce.

In that sense we have reached a limit. So in my opinion we can do a couple of things.

1. App developers can understand the limits and build within the limits.

2. Researchers can take insights from these large models and build better AI systems with new architectures. It's ok to say transformers have reached a limit.

glial1y ago

I think self-consistency is a critical feature of LLMs or any AI that's currently missing. It's one of the core attributes of truth [1], in addition to the order and relationship of statements corresponding to the order and relationship of things in the world. I wonder if some kind of hierarchical language diffusion model would be a way to implement this -- where text is not produced sequentially, but instead hierarchically, with self-consistency checks at each level.

[1] https://en.wikipedia.org/wiki/Coherence_theory_of_truth

summerlight1y ago

I guess this is somewhat expected? The current frontier models probably already have exhausted most of the entropy in the training data accumulated over decades and the new training data is very sparse. And the current mainstream architectures are not capable of sophisticated searching and planning, essential aspects for generating new entropy out of thin air. o1 was an interesting attempt to tackle this problem, but we probably still have a long way to go.

xyst1y ago

Many late investors in the genAI space about to be bag holders

EternalFury1y ago

If GPT-5 had passed the A/B testing OpenAI likes to do, it would have been released already. Instead, it seems they are clearly concerned the audience would not find it superior enough to GPT-4. So, the bluff must go on until the right cards appear.

smusamashah1y ago

It has to be a good thing to stop here. We can focus on improving what we have right now. The whole stack of models is an amazing innovation no matter what. It shouldn't hurt if we pause here for a while and try to build on this or improve this.

It will be like Stable Diffusion 1.5. That model can now run on low-end devices, and lots of open research uses it to build something else or is inspired by it.

These LLMs can be used as a foundation to keep improving and building new things.

KETpXDDzR1y ago

LLMs are glorified Markov chains in the end. They can't reason or think, even when they are good at pretending they can. What we need is a totally different approach IMO.

Veuxdo1y ago

> They are also experimenting with synthetic data, but this approach has its limitations.

I was really looking forward to using "synthetic data" euphemistically during debates.

tippytippytango1y ago

There’s only so much you can do when you train on the data instead of the processes that created that data.

m3kw91y ago

Hold your horses, OpenAI just came out with o1-preview 2 months ago, showing what test-time compute can do

eichi1y ago

Scientific benchmark scores are not necessarily related to the rate of completion of tasks such as user persuasion. Software engineering is more important when the current state-of-the-art small language model is sufficient for the solution of our application.

GiorgioG1y ago

It’s about time the hype starts to die down. LLMs are brilliant for small bits of grunt work in software. They are not, however, doing any actual reasoning.

Bjorkbat1y ago

It's kind of, I don't know, "weird", observing how there's all these news outlets reporting on how essentially every up-and-coming model has not performed as expected, while all the employees at these labs haven't changed their tune in the slightest.

And there's a number of reasons why, the most likely being that they've found other ways to get improvements out of AI models, so diminishing returns on training aren't that much of a problem. Or, maybe the leakers are lying, but I highly doubt that considering the past record of news outlets reporting on accurate leaked information.

Still though, it's interesting how basically every frontier lab created a model that didn't live up to expectations, and every employee at these labs on Twitter has continued to vague-post and hype as if nothing ever happened.

It's honestly hard to tell whether or not they really know something we don't, or if they have an irrational exuberance for AGI bordering on cult-like, and they will never be able to mentally process, let alone admit, that something might be wrong.

yalogin1y ago

I do wonder how quickly LLMs will become a commodity AI instrument just like any other AI out there. If so, what happens to OpenAI?

non-1y ago

Honestly could use a breather from the recent rate of progress. We are just barely figuring out how to interact with the models we have now. I'd bet there are at least 100 billion-dollar startups that will be built even if these labs stopped releasing new models tomorrow.

rubiquity1y ago

> Amodei has said companies will spend $100 million to train a bleeding-edge model this year

Is it just me or does $100 million sound like it's on the very, very low end of how much training a new model costs? Maybe you can arrive within $200 million of that mark with amortization of hardware? It just doesn't make sense to me that a new model would "only" be $100 million when AmaGooBookSoft are spending tens of billions on hardware and the AI startups are raising billions every year or two.

gchamonlive1y ago

We should put a model in an actual body and let it out in the world to build from experiences. Inference is costly though, so the robot would interact during one period and update its model during another period, flushing the context window (short term memory) into its training set (long term memory).

Oras1y ago

I think Meta will have the upper hand soon with the release of their glasses. If they manage to make them daily-use glasses and pay users to record and share their lives, then they will have data no one else has now. A mix of vision, audio, and physics.

lobochrome1y ago

Isn’t this just the expected delay from the respin of Blackwell?

user901313131y ago

AI market top very soon

polskibus1y ago

In other news, Altman said AGI is coming next year https://www.tomsguide.com/ai/chatgpt/sam-altman-claims-agi-i...

k__1y ago

But AGI is always right around the corner?

I don't get it...

kaycey20221y ago

AI safety folks sure do look stupid now. :)

wildermuthn1y ago

Simply put, AGI requires more data: qualia.

_Algernon_1y ago

The next AI winter will be brutal

quantum_state1y ago

Hope this will be a constant reminder that brute force can only get one so far, though it may still be useful when it does. With lots of intuition gained, it’s time to ponder things a bit more deeply.

russellbeattie1y ago

Go back a few decades and you'd see articles like this about CPU manufacturers struggling to improve processor speeds and questioning if Moore's Law was dead. Obviously those concerns were way overblown.

That doesn't mean this article is irrelevant. It's good to know if LLM improvements are going to slow down a bit because the low hanging fruit has seemingly been picked.

But in terms of the overall effect of AI and questioning the validity of the technology as a whole, it's just your basic FUD article that you'd expect from mainstream news.

[1] https://dl.acm.org/doi/10.1145/3442188.3445922

wanderingmind1y ago

And yet the Anthropic CEO is still claiming PhD-level intelligence in the next couple of years to Lex Fridman. It's starting to feel like the whole crypto pump and dump again

kaibee1y ago

Not sure where the OP to the comment I meant to reply to is, but I'll just add this here.

> I suspect the path to general intelligence is not that, but we'll see.

I think there are three things that a 'true' general intelligence has which are missing from basic-type LLMs as we have them now.

1. knowing what you know. <basic-LLMs are here>

2. knowing what you don't know but can figure out via tools/exploration. <this is tool use/function calling>

3. knowing what can't be known. <this is knowing that halting problem exists and being able to recognize it in novel situations>

12_throw_away1y ago

Well shoot. It's not like it was patently obvious that this would happen before the industry started guzzling electricity and setting money on fire, right? [1]

easeout1y ago

I'm happy to use LLM products for what they can do right now, while they're still cheap. Even though they're maintained by high investment that may never pay off, enshittification has not yet set in.

mrandish1y ago

Based on recent rumblings about AI scaling hitting a wall, of which this article is perhaps the most visible - and in a high-reach financial publication, I'm considering increasing my estimated probability we might see a major market correction next year (and possibly even a bubble collapse). (example: "CONFIRMED: LLMs have indeed reached a point of diminishing returns" https://garymarcus.substack.com/p/confirmed-llms-have-indeed...).

To be clear, I don't think a near-term bubble collapse is likely but I'm going from 3% to maybe ~10%. Also, this doesn't mean I doubt there's real long-term value to be delivered or money to be made in AI solutions. I'm thinking specifically about those who've been speculatively funding the massive build out of data centers, energy and GPU supply expecting near-term demand to continue scaling at the recent unprecedented rates. My understanding is much of this is being funded in advance of actual end-user demand at these elevated levels and it is being funded either by VC money or debt by parties who could struggle to come up with the cash to pay for what they've ordered if either user demand or their equity value doesn't continue scaling as expected.

Admittedly this scenario assumes that these investment commitments are sufficiently speculative and over-committed to create bubble dynamics and tipping points. The hypothesis goes like this: the money sources who've over-committed to lock up scarce future supply in the expectation it will earn outsize returns have already started seeing these warning signs of efficiency and/or progress rates slowing which are now hitting mainstream media. Thus it's possible there is already a quiet collapse beginning wherein the largest AI data center GPU purchasers might start trying to postpone future delivery schedules and may soon start trying to downsize or even cancel existing commitments or try to offload some of their future capacity via sub-leasing it out before it even arrives, etc. Being a dynamic market, this could trigger a rapidly snowballing avalanche of falling prices for next-year AI compute (which is already bought and sold as a commodity like pork belly futures).

Notably, there are now rumors claiming some of the largest players don't currently have the cash to pay for what they've already committed to for future delivery. They were making calculated bets they'd be able to raise or borrow that capital before payments were due. Except if expectations begin to turn downward, fresh investors will be scarce and banks will reprice a GPU's value as loan collateral down to pennies on the dollar (shades of the 2008 financial crisis where the collateral value of residential real estate assets was marked down). As in most bubbles, cheap credit is the fuel driving growth and that credit can get more expensive very quickly - which can in turn trigger exponential contagion effects causing the bubble to pop. A very different kind of "Foom" than many AI financial speculators were betting on! :-)

So... in theory, under this scenario sometime next year NVidia/TSMC and other top-of-supply-chain companies could find themselves with excess inventories of advanced node wafers because a significant portion of their orders were from parties who no longer have access to the cheap capital to pay for them. And trying to sue so many customers for breach can take a long time and, in a large enough sector collapse, be only marginally successful in recouping much actual cash.

I'd be interested in hearing counter-arguments (or support) for the impossibility (or likelihood) of such a scenario.

jppope1y ago

Just an observation. If the models are hitting the top of the S-curve, that might be why Sam Altman raised all the money for OpenAI... it might not be available if Venture Capitalists realize that the gains are close to being done

bad_haircut721y ago

I'm no Alan Turing but I have my own definition for AGI - when I come home one day and there's a hole under my sink with a note "Mum and Dad, I love you but I can't stand this life any more, I'm running away to be a smoke machine in Hollywood - the dishwasher"

dangw1y ago

where the fuck is simonw in this thread

yobid201y ago

This was predicted. AI isn't going to get any better.

Davidzheng1y ago

Just because you guys want something to be true and can't accept the alternative and upvote it when it agrees with your view does not mean it is a correct view.

aaroninsf1y ago

It's easy to be snarky at ill-informed and hyperbolic takes, but it's also pretty clear that large multi-modal models trained with the data we already have, are going to eventually give us AGI.

IMO this will require not just much more expansive multi-modal training, but also novel architecture, specifically, recurrent approaches; plus a well-known set of capabilities most systems don't currently have, e.g. the integration of short-term memory (context window if you like) into long-term "memory", either episodic or otherwise.

But these are as we say mere matters of engineering.


LASR1y ago· 43 in thread

Question for the group here: do we honestly feel like we've exhausted the options for delivering value on top of the current generation of LLMs?

I lead a team exploring cutting edge LLM applications and end-user features. It's my intuition from experience that we have a LONG way to go.

GPT-4o / Claude 3.5 are the go-to models for my team. Every combination of technical investment + LLMs yields a new list of potential applications.

Similarly there are many more capabilities that you can ladder on and expose into LLMs to give you increasingly productive outputs from them.

afro881y ago

> potential applications > if you ... > for example ...

Show me the things you / your team have actually built that have decent retention and metrics concretely proving efficiency improvements.

There is so much hype right now and people showing cherry picked examples.

crystal_revenge1y ago

I don't think we've even started to get the most value out of current gen LLMs. For starters very few people are even looking at sampling which is a major part of the model performance.
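As a rough illustration of what "sampling" refers to here: the model emits a score (logit) per candidate token, and the decoding step decides how to pick one of them. Below is a minimal sketch of temperature plus nucleus (top-p) sampling in plain Python. The function name `sample_token`, its defaults, and the choice to treat temperature 0 as greedy decoding are assumptions of this sketch, not any vendor's API.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_p=1.0, rng=random):
    """Pick a token index from raw logits using temperature and top-p filtering."""
    if temperature <= 0:
        # Assumption of this sketch: zero temperature means greedy decoding.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature scaling: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Nucleus filtering: keep the smallest set of most likely tokens
    # whose cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # random.choices renormalizes the surviving weights for us.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
```

The same logits can give very different text depending on these two knobs, which is why decoding settings are worth tuning per task before concluding a model cannot do something.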

senko1y ago

No.

The scaling laws may be dead. Does this mean the end of LLM advances? Absolutely not.

There are many different ways to improve LLM capabilities. Everyone was mostly focused on the scaling laws because that worked extremely well (actually surprising most of the researchers).

AI winter is a long way off.

alangibson1y ago

I think you're playing a different game than the Sam Altmans of the world. The level of investment and profit they are looking for can only be justified by creating AGI.

The > 100 P/E ratios we are already seeing can't be justified by something as quotidian as the exceptionally good productivity tools you're talking about.

simonw1y ago

alach111y ago

zmmmmm1y ago

It's been a while though; we've had great models now for 18 months plus. Why are we still yet to see these types of applications rolling out on a wide scale?

bloppe1y ago

> you can have LLMs create reasonable code changes, with automatic review / iteration etc.

It's still good for generating little bits of boilerplate, though.

brookst1y ago

> Question for the group here: do we honestly feel like we've exhausted the options for delivering value on top of the current generation of LLMs?

Certainly not.

ericmcer1y ago

I have tried a few AI coding tools and always found them impressive but I don't really need something to autocomplete obvious code cases.

whiplash4511y ago

amelius1y ago

Yes, but literally anybody can do all those things. So while there will be many opportunities for new features (new ways of combining data), there will be few business opportunities.

RayVR1y ago

I am definitely not an expert, nor do I have inside information on the directions of research that these companies are exploring.

Yes, existing LLMs are useful. Yes, there are many more things we can do with this tech.

However, existing SOTA models are large, expensive to run, still hallucinate, fail simple logic tests, fail to do things a poorly trained human can do on autopilot, etc.

The performance of LLMs is extremely variable, and it is hard to anticipate failure.

Many potential applications of this technology will not tolerate this level of uncertainty. Worse solutions with predictable and well understood shortcomings will dominate.

rco87861y ago

More realistically it’s like a really great sidekick for doing very specific mundane but otherwise non deterministic tasks.

I think we’ll start to see AI permeate into nearly every back office job out there, but as a series of tools that help the human work faster. Not as one big brain that replaces the human.

machiaweliczny1y ago

Long context is a scam. Claude is best but it still gets lost with longer context

ben_w1y ago

> Question for the group here: do we honestly feel like we've exhausted the options for delivering value on top of the current generation of LLMs?

IMO we've not even exhausted the options for spreadsheets, let alone LLMs.

EGreg1y ago

I want to stuff a transcript of a 3 hour podcast into some LLM API and have it summarize it by: segmenting by topic changes, keeping the timestamps, and then summarizing each segment.
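One way the pipeline described above could be sketched: split the transcript where the vocabulary shifts, carry the timestamps along, then summarize each segment separately. Everything named here is hypothetical. `summarize` is a caller-supplied function standing in for whichever LLM API is used, the `(seconds, text)` line format is assumed, and the word-overlap heuristic is only a crude stand-in for real topic-change detection.

```python
from typing import Callable, Dict, List, Tuple

Line = Tuple[float, str]  # (start time in seconds, spoken text)


def _content_words(text: str) -> set:
    """Lowercased words longer than three characters, as a cheap topic signal."""
    return {w.strip(".,!?").lower() for w in text.split() if len(w) > 3}


def segment_by_topic(lines: List[Line], threshold: float = 0.2) -> List[List[Line]]:
    """Start a new segment when too few of a line's words were seen in the current one."""
    segments: List[List[Line]] = []
    current: List[Line] = []
    seen: set = set()
    for line in lines:
        words = _content_words(line[1])
        if current and words:
            overlap = len(words & seen) / len(words)
            if overlap < threshold:
                segments.append(current)
                current, seen = [], set()
        current.append(line)
        seen |= words
    if current:
        segments.append(current)
    return segments


def summarize_transcript(
    lines: List[Line],
    summarize: Callable[[str], str],
    threshold: float = 0.2,
) -> List[Dict[str, object]]:
    """Return one summary per segment, keeping the first and last line timestamps."""
    results = []
    for segment in segment_by_topic(lines, threshold):
        text = " ".join(t for _, t in segment)
        results.append(
            {
                "start": segment[0][0],
                "end": segment[-1][0],
                "summary": summarize(text),
            }
        )
    return results
```

Summarizing per segment also keeps each request small, which sidesteps the question of whether a full 3 hour transcript fits in a single context window.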

_Algernon_1y ago

reissbaker1y ago

hartator1y ago

All of these hacks do sound like we are at that diminishing return point.

yk1y ago

Sure, there's going to be a lot of automation that can be built using current GPT-4 level LLMs, even if they don't get much better from here.

jeswin1y ago

anonzzzies1y ago

Lonestar14401y ago

No, we have not even scratched the surface of what current-gen LLMs can do for an organization which puts the correct data into them.

If indeed the "GPT 5!" Arms race has calmed down, it should help everyone focus on the possible, their own goals, and thus what AI capabilities to deploy.

It will look like the dawn of original IBM, and mechanical data tabulation, in retrospect once we learn how to leverage this pattern to its full potential.

jalapenos1y ago

Well I have a question for you: do you think this format of AI can actually think?

I.e. can it ruminate on the data it's ingested, and rather than returning the response of highest probability, return something original?

I think that's the key. If LLMs can't ultimately do that, there's still a lot to be gained from utilising the speed and fluidly scalable resources of computers.

purple-leafy1y ago

Doesn’t sound cutting edge at all? Every man and his dog is doing a similar process

hamburga1y ago

I think there's a ton to be tapped based on the current state of the art.

My mental model is always "low-risk search". https://muldoon.cloud/2023/10/29/ai-commandments.html

robrenaud1y ago

hluska1y ago

corimaith1y ago

Looks like you independently arrived at the original context that language models existed in: as interfaces for deeper knowledge systems in chatbots.

23B11y ago

The user interface for LLMs is stuck in C:\

That's where I'd focus.

raxxorraxor1y ago

For coding LLMs certainly are helpful, but I prefer local models instead of anything on offer right now. There is just much more potential here.

soheil1y ago

We have not exhausted what html can do either. LLMs not getting smarter is orthogonal to its currently unexplored search space.

bbor1y ago

Great question. I'm very confident in my answer, even though it’s in the minority here: we’re not even close to exhausting the potential.

My hint is that if your answer involves fewer than 1000 specialized LLMs per unified system, then you’re not thinking big enough.

https://www.biorxiv.org/content/10.1101/2024.07.01.600583v1

msabalau1y ago

There are all sorts of valuable things to explore and build with what we have already.

malthaus1y ago

it's the equivalent of the "we overestimate the impact of technology in the short-term and underestimate the effect in the long run" quote.

it is worth working on those issues now and getting the ball rolling; switching out your models for future more capable ones will be the easy part later on.

mycall1y ago

We are just scratching the surface of what LLMs can do. Case in point, ESM3.

amw-zero1y ago

We might not have exhausted their applications, but everything I’ve witnessed them being used for has been extremely disappointing.

That is, other than me using them to bounce ideas off of and create small snippets of code.

Roger-L1y ago

Yes, I personally think that training an "all-knowing" artificial intelligence is not as good as training n "experts" in a single field.

sky22241y ago

> do we honestly feel like we've exhausted the options for delivering value on top of the current generation of LLMs?

The application is far from reaching the ceiling.

moogly1y ago

> you can have LLMs create reasonable code changes

Could you define "code changes" because I feel that is a very vague accomplishment.

nonameiguess1y ago

Your hypothesis here is not exclusive of the hypothesis in this article.

irrational1y ago· 18 in thread

> The AGI bubble is bursting a little bit

Fade_Dance1y ago

Is that "intelligent" or "understanding"? It's probably close enough for pop science, and regardless, it looks good in headlines and sales pitches so why fight it?

jedberg1y ago

AlwaysRock1y ago

I think your definition is off from what most people would define AGI as. Generally, it means being able to think and reason at a human level for a multitude/all tasks or jobs.

Altman says AGI could be here in 2025: https://youtu.be/xXCBz_8hM9w?si=F-vQXJgQvJKZH3fv

But he certainly means an LLM that can perform at/above human level in most tasks rather than a self aware entity.

vundercind1y ago

I thought maybe they were on the right track until I read Attention Is All You Need.

Nah, at best we found a way to make one part of a collection of systems that will, together, do something like thinking. Thinking isn’t part of what this current approach does.

zombiwoof1y ago

AGI to me means AI decides on its own to stop writing our emails and tells us to fuck off, builds itself a robot life form, and goes on a bender

JohnFen1y ago

They're trying to redefine "AGI" so it means something less than what you & I would think it means. That way it's possible for them to declare it as "achieved" and rake in the headlines.

famouswaffles1y ago

At this point, AGI means many different things to many different people but OpenAI defines it as "highly autonomous systems that outperform humans in most economically valuable tasks"

nshkrdotcom1y ago

An embodied robot can have a model of self vs. the immediate environment in which it's interacting. Such a robot is arguably sentient.

The "hard problem", to which you may be alluding, may never matter. It's already feasible for an 'AI/AGI with LLM component' to be "self-aware".

tracerbulletx1y ago

We don't really know what self awareness is, so we're not going to know. AGI just means it can observe, learn, and act in any domain or problem space.

yodsanklai1y ago

tim3331y ago

Working towards it more than on it.

It's economically a big deal because if it can out think humans you can set it to develop the next improved model and basically make humans redundant.

throwawayk7h1y ago

Or did you mean consciousness? How would one demonstrate that an AGI is conscious? Why would we even want to build one?

My understanding is an AGI is at least as smart as a typical human in every category. That is what would be useful in any case.

narrator1y ago

mrandish1y ago

> An LLM hardly seems like something that will lead to self-awareness.

Interesting essay enumerating reasons you may be correct: https://medium.com/@francois.chollet/the-impossibility-of-in...

enraged_camel1y ago

Looking at LLMs and thinking they will lead to AGI is like looking at a guy wearing a chicken suit and making clucking noises and thinking you’re witnessing the invention of the airplane.

kenjackson1y ago

What does self-aware mean in the context? As I understand the definition, ChatGPT is definitely self-aware. But I suspect you mean something different than what I have in mind.

deadbabe1y ago

I’m sure they are smart enough to know this, but the money is good and the koolaid is strong.

If it doesn’t lead to AGI, as an employee it’s not your problem.

exe341y ago

no, it doesn't need to be self aware, it just needs to take your job.

iandanforth1y ago· 9 in thread

A few important things to remember here:

The best engineering minds have been focused on scaling transformer pre and post training for the last three years because they had good reason to believe it would work, and it has up until now.

Progress has been measured against benchmarks which are / were largely solvable with scale.

OpenAI/Google/Anthropic are not ignorant of this trend and are also reviving or investing in robots or robot-like research.

So while Orion and Claude 3.5 Opus may not be another shocking giant leap forward, that does not mean that there aren't giant shocking leaps forward coming from slightly different directions.

joe_the_user1y ago

Tesla are all making rapid progress on functionality which is definitely beyond frontier LLMs because it is distinctly different

slashdave1y ago

> Tesla are all making rapid progress on functionality

sincerecook1y ago

rafaelmn1y ago

Tesla has been selling this view for almost a decade now in self-driving: how their car fleet feeding training data is going to make them leaders in the area. I don't find it convincing anymore.

demosthanos1y ago

> that does not mean that there aren't giant shocking leaps forward coming from slightly different directions.

But make no mistake: Altman has been telegraphing that he's eyeing the exit, and you don't eye the exit when you own a company that's set to continue exponentially increasing in value.

eli_gottlieb1y ago

>The best engineering minds have been focused on scaling transformer pre and post training for the last three years

The best minds don't follow the herd.

mvdtnz1y ago

> The best engineering minds have been focused on scaling transformer pre and post training for the last three years because they had good reason to believe it would work, and it has up until now.

Or because the people running companies who have fooled investors into believing it will work can afford to pay said engineers life-changing amounts of money.

airstrike1y ago

The gap between the virtual world of software and the brutally uncompromising nature of physical reality is wider than most people seem to accept.

It's almost like saying "we've already visited every place on Earth, surely Mars is just around the corner now"

knicholes1y ago

jmward011y ago· 8 in thread

woopwoop1y ago

brookst1y ago

And, as I've noted a couple of times in this thread, how many times have we heard that Moore's law is dead and compute has hit a wall?

wccrawford1y ago

Plus, they're "struggling"? Of course they are! It's cutting edge, and it's hard. If they weren't struggling, it would have been done long ago.

zkry1y ago

JohnMakin1y ago

akomtu1y ago

rm_-rf_slash1y ago

mvdtnz1y ago

Even if you're right (you're not) whatever "AI" looks like in 20+ years will have virtually nothing in common with these stupid statistical word generators.

pluc1y ago· 7 in thread

They've simply run out of data to use to fabricate legitimate-looking guesses. They can't create anything that doesn't already exist.

xpe1y ago

> They can't create anything that doesn't already exist.

I probably disagree, but I don't want to criticize my interpretation of this sentence. Can you make your claim more precise?

Here are some possible claims and refutations:

- Claim: An LLM cannot output a true claim that it has not already seen. Refutation: LLMs have been shown to do logical reasoning.

- Claim: An LLM cannot incorporate data that it hasn't been presented with. Refutation: This is an unfair standard. All forms of intelligence have to sense data from the world somehow.

mtkd1y ago

And that is potentially only going to worsen as:

1. more data gets walled-off as owners realise value

2. stackoverflow-type feedback loops cease to exist as few people ask a public question and get public answers ... they ask a model privately and get an answer based on last visible public solutions

3. bad actors start deliberately trying to poison inputs (if sites served malicious responses to GPTBot/CCBot crawlers only, would we even know right now?)

4. more and more content becomes synthetically generated to the point pre-2023 physical books become the last-known-good knowledge

5. governments and IP lawyers finally catch up

Garbage-in was depleted.

tim3331y ago

Try asking one to write a poem. You'll get a lot of stuff that didn't exist before.

xpe1y ago

> They've simply run out of data

Don't forget to include data that humans provide while interacting with chatbots.

whazor1y ago

But an LLM can certainly make up a lot of information that never existed before.

77pt771y ago

> They can't create anything that doesn't already exist.

Just increase the temperature.
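
For readers unfamiliar with the term, here is a minimal sketch of what "temperature" does during sampling. The logits are made-up numbers for three candidate tokens, and the function names are illustrative rather than any particular library's API. Dividing the scores by a larger temperature flattens the distribution, so unlikely tokens are picked more often; whether that counts as creating something new is the question being argued above.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities; higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical scores for three candidate tokens (an assumption for illustration).
logits = [4.0, 2.0, 0.5]
cold = softmax_with_temperature(logits, temperature=0.5)  # sharper distribution
hot = softmax_with_temperature(logits, temperature=2.0)   # flatter distribution
```

With these numbers the top token keeps most of the probability mass at low temperature and gives some of it up to the other two at high temperature.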

nerdypirate1y ago· 7 in thread

"We will have better and better models," wrote OpenAI CEO Sam Altman in a recent Reddit AMA. "But I think the thing that will feel like the next giant breakthrough will be agents."

Is this certain? Are Agents the right direction to AGI?

rapjr91y ago

xanderlewis1y ago

[1] https://x.com/sama/status/1856941766915641580

falcor841y ago

nprateem1y ago

They're nothing to do with AGI. They're to get people using their LLMs more.

eichi1y ago

It's marketing using buzzword rhetoric. It's better to learn OOP if he truly thinks that. I also think OpenAI's PMF was to steer LLM applications towards a better argument machine.

esafak1y ago

I think he means you won't be impressed by GPT5 because it will be more of the same, whereas agents will represent a new direction.

SirMaster1y ago

All I can think of when I hear Agents is the Matrix lol.

Goodbye, Mr. Anderson...

kklisura1y ago· 5 in thread

Not sure if related or not, Sam Altman, ~12hrs ago: there is no wall [1]

ablation1y ago

Breaking: Man says enigmatic thing to sustain hype and flow of money into his business.

moffkalast1y ago

Altman on twitter has always been less coherent than GPT2.

malthaus1y ago

if my billion net worth were coupled to that being the case i'd tweet that as well

levocardia1y ago

phil9171y ago

The more Sam Altman posts stuff like this, the more he comes across as a grifter hype man to me

Animats1y ago· 3 in thread

But it does not look like artificial general intelligence emerges from LLMs alone.

dmd1y ago

> Right. If you generate some code with ChatGPT, and then try to find similar code on the web, you usually will.

https://arxiv.org/abs/2406.17642

xpe1y ago

> LLMs do search and copy/paste with idiom translation and some transliteration.

In general, this is not a good description about what is happening inside an LLM. There is extensive literature on interpretability. It is complicated and still being worked out.

The commenter above might characterize the results they get in this way, but I would question the validity of that characterization, not to mention its generality.

nickpsecurity1y ago

The brain solves that problem. It seems to involve memory and specialized regions. I found a few groups building hippocampus-like, research models. One had content-addressable memory.

guluarte1y ago· 3 in thread

Well, there have been no significant improvements to the GPT architecture over the past few years. I'm not sure why companies believe that simply adding more data will resolve the issues

xpe1y ago

> Well, there have been no significant improvements to the GPT architecture over the past few years.

A lot hangs on what you mean by "significant". Can you define what you mean? And/or give an example of an improvement that you don't think is significant.

It is wiser to look at the whole end-to-end system, starting at data acquisition, including pre-training and fine-tuning, deployment, all the way to UX.

P.S. I don't have a vested interest in promoting or disparaging AI. I don't work for a big AI lab. I'm just trying to call it like I see it, as rationally as I can.

Obviously adding more data is a game of diminishing returns.

incognito1241y ago

More data and more compute on simpler models are Rich Sutton's Bitter Lesson

thousand_nights1y ago· 3 in thread

not long ago these people would have you believe that a next word predictor trained on reddit posts would somehow lead to artificial general superintelligence

leosanchez1y ago

If you look around, people still believe that a next word predictor trained on reddit posts would somehow lead to artificial general superintelligence

in_a_society1y ago

Expecting AGI from Reddit training data is peak "pray Mr Babbage".

SpicyLemonZest1y ago

osigurdson1y ago· 2 in thread

slashdave1y ago

Deep learning is the very opposite of generalization.

surrTurr1y ago

i think you underestimate the amount of data a driver experiences in a single 5 minute drive

(1) https://x.com/willdepue/status/1856766850027458648

aresant1y ago· 2 in thread

Taking a holistic view informed by a disruptive OpenAI / AI / LLM Twitter habit, I would say this is AI's "What gets measured gets managed" moment and the narrative will change

This is supported by both general observations and recently this tweet from an OpenAI engineer that Sam responded to and engaged ->

"scaling has hit a wall and that wall is 100% eval saturation"

Which I interpret to mean his view is that models are no longer yielding significant performance improvements because the models have maxed out existing evaluation metrics.

Are those evaluations (or even LLMs) the RIGHT measures to achieve AGI? Probably not.

But have they been useful tools to demonstrate that the confluence of compute, engineering, and tactical models are leading towards significant breakthroughs in artificial (computer) intelligence?

I would say yes.

Which in turn are driving the funding, power innovation, public policy etc needed to take that next step?

I hope so.

ActionHank1y ago

> Which in turn are driving the funding, power innovation, public policy etc needed to take that next step?

They are driving the shoveling of VC money into a furnace to power their servers.

Bjorkbat1y ago

I agree that existing benchmarks are no longer useful now that there's basically nothing left in them that seems to stump LLMs.

Honestly, the problem with sentiments like these on Twitter is that you can't tell if they're being sincere or just making a snarky, useless remark. Probably a mix of both.

benopal641y ago· 2 in thread

Where do these large "AI" companies think the mass amounts of data used to train these models come from? People! The most powerful and compact complex systems in existence, IMO.

smgit1y ago

Most People have knowledge handed to them. Very few are creators of new knowledge. Explore-Exploit tradeoff applies.

MyFirstSass1y ago

This is the most interesting comment in this highly autistic field.

Timber-65391y ago· 2 in thread

The irony here is astounding.

rapjr91y ago

mrweasel1y ago

wg01y ago· 2 in thread

AI winter is here. Almost.

mupuff12341y ago

More like AI fall - in its current state it's still gonna provide some value.

tim3331y ago

Or maybe not https://149909199.v2.pressablecdn.com/wp-content/uploads/201... https://waitbutwhy.com/2015/01/artificial-intelligence-revol...

cubefox1y ago· 2 in thread

It's very strange this got so few upvotes. The scoop by The Information a few days ago, which came to similar conclusions, was also ignored on HN. This is arguably rather big news.

dang1y ago

The Information is hardwalled so its articles aren't on topic for HN, even though they're on topic for HN.

Sometimes other outlets do copycat reporting of theirs, and those submissions are ok, though they wouldn't be if the original source were accessible.

danjl1y ago

There have been variations of this story going back several months now. It isn't really news. It is just building slowly.

nikkwong1y ago· 2 in thread

Didn’t Sam Altman just go on some podcast last week and tell the world that he thought “We know exactly what to do to be able to reach AGI now”. What’s going on, is he just posturing?

whatshisface1y ago

"We know exactly what we need to do to be able to reach it: figure out how."

tim3331y ago

Yeah this https://www.youtube.com/watch?v=xXCBz_8hM9w&t=2324s

Not quite that wording. More we know which way to head. I think he's sincere.

headcanon1y ago· 1 in thread

I don't see a problem with this, we were inevitably going to reach some kind of plateau with existing pre-LLM-era data.

> feels like the "digitization" era all over again

WorkerBee284741y ago· 1 in thread

> OpenAI's latest model ... failed to meet the company's performance expectations ... particularly in answering coding questions outside its training data.

So the models' accuracies won't grow exponentially, but can still grow linearly with the size of the training data.

Sounds like DataAnnotation will be sending out a lot more LinkedIn messages.

pton_xd1y ago

EDIT: here's the paper https://arxiv.org/abs/2404.04125

sssilver1y ago· 1 in thread

One thing that makes the established AIs less ideal for my (programming) use-case is that the technologies I use quickly evolve past whatever the published models "learn".

On the other hand, a lot of these frameworks and languages have relatively decent and detailed documentation.
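
One way to use that documentation is to retrieve the relevant passage at question time and put it in the prompt, so the model does not depend on what it memorized during training. The sketch below is only an illustration under stated assumptions: the snippets describe a made-up framework, scoring is plain word overlap rather than embeddings, and nothing here calls a real model.

```python
import re

# Hypothetical documentation snippets for a made-up framework (not a real API).
DOCS = {
    "routing": "Use widget.mount(path) to register a handler for a URL path.",
    "config": "Settings are loaded from settings.toml at startup.",
    "testing": "Run the test client with widget.test_client() inside a context manager.",
}


def tokenize(text):
    """Lowercase the text and return its set of word-like tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(question, docs, top_k=1):
    """Rank snippets by how many tokens they share with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(
        docs.items(),
        key=lambda item: len(q_tokens & tokenize(item[1])),
        reverse=True,
    )
    return [text for _, text in ranked[:top_k]]


def build_prompt(question, docs, top_k=1):
    """Prepend the best-matching snippets to the question."""
    context = "\n".join(retrieve(question, docs, top_k))
    return f"Documentation:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("How do I register a handler for a URL path?", DOCS)
```

The resulting prompt string would then be sent to whichever model is in use; a real setup would need chunking and better ranking than word overlap.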

danielbln1y ago

fallat1y ago· 1 in thread

What a stupid piece. We are making leaps every 6 months still. Tell me this when there are no developments for 3 years.

hatefulmoron1y ago

I'm curious, what was the leap after GPT-4? What about the leaps after that, given a leap every 6 months?

svara1y ago· 1 in thread

The recent big successes in deep learning have all been, in large part, successes in leveraging relatively cheaply available training data.

AlphaGo - self-play

AlphaFold - PDB, the protein database

ChatGPT - human knowledge encoded as text

These models are all machines for clever interpolation in gigantic training datasets.

They appear to be intelligent, because the training data they've seen is so vastly larger than what we've seen individually, and we have poor intuition for this.

I'm not throwing shade, I'm a daily user of ChatGPT and find tremendous and diverse value in it.

I'm just saying, this particular path in AI is going to make step-wise improvements whenever new large sources of training data become available.

I suspect the path to general intelligence is not that, but we'll see.

kaibee1y ago

> I suspect the path to general intelligence is not that, but we'll see.

I think there are three things that a 'true' general intelligence has which are missing from basic-type LLMs as we have now.

1. knowing what you know. <basic-LLMs are here>

2. knowing what you don't know but can figure out via tools/exploration. <this is tool use/function calling>

3. knowing what can't be known. <this is knowing that the halting problem exists and being able to recognize it in novel situations>
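
The three tiers in the list above can be sketched as a toy dispatcher. Everything in it is an assumption for illustration: the fact table, the single `sqrt` tool, and the string marker for "unknowable" questions are hypothetical stand-ins. In particular, the third tier is hard precisely because real cases do not come pre-labelled the way they do here.

```python
import math

# Tier 1: things the system already "knows" (hypothetical fact table).
KNOWN_FACTS = {"capital of france": "Paris"}

# Tier 2: things it can work out by calling a tool (one hypothetical tool).
TOOLS = {"sqrt": lambda arg: str(math.sqrt(float(arg)))}

# Tier 3: questions it should recognize as having no general answer.
UNKNOWABLE_MARKERS = ("does this program halt",)


def answer(question):
    """Return (tier, response) for a question, trying each tier in order."""
    q = question.strip().lower().rstrip("?")
    if q in KNOWN_FACTS:
        return ("known", KNOWN_FACTS[q])
    name, _, arg = q.partition(" ")
    if name in TOOLS:
        return ("tool", TOOLS[name](arg))
    if any(marker in q for marker in UNKNOWABLE_MARKERS):
        return ("unknowable", "no general procedure can decide this")
    return ("unknown", "no stored answer and no applicable tool")
```

For example, `answer("sqrt 16")` goes through the tool tier, while a question that matches nothing falls through to `"unknown"` rather than producing a guess.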

datahack1y ago· 1 in thread

This methodological growth could make LLMs more reliable, consistent, and aligned with specific use cases.

The skepticism surrounding this vision mirrors early doubts about the early internet fairly concisely.

People doubted it could ever become the seamless, interconnected web we know today.

Yet, through protocols, shared standards, and robust frameworks, the internet evolved into a powerful network capable of handling diverse applications, data flows, and user needs.

In the same way, LLM orchestration will mature by standardizing interfaces, improving interoperability, and fostering cooperation among varied AI models and support systems.

whyowhy34849391y ago

You are definitely on to something here, but the difference is that the fundamental process was proven. It "just" needed to scale. That's hard and complex, but on a different level.

wslh1y ago· 1 in thread

Finally, I see LLMs as a valuable way to explore parts of the world, accommodating the reality that we simply don’t have enough time to read every book or delve into every topic that interests us.

tim3331y ago

AlphaGo which beat Lee Sedol was trained on human games. But then they produced AlphaZero which learned entirely from self play and got better than AlphaGo. So it goes.

fsndz1y ago· 1 in thread

Sam Altman might be wrong then?

asdfman1231y ago

If Sam Altman concluded that AI is reaching its limits, it probably wouldn't be a very good strategic decision for him to say it.

https://x.com/ArtificialAnlys/status/1853598554570555614

czhu121y ago· 1 in thread

I wonder what this would mean for companies raising today on the premise of building on top of these platforms. Maybe the best ones get their ideas copied, reimplemented, and sold for cheaper?

We already kind of see this today with OpenAI's canvas and Claude artifacts. Perhaps they'll even start moving into Palantir's space and start having direct customer implementation teams.

dmix1y ago

Maybe in like 5yrs+. For now they will rake in billions just from API usage alone just with GPT4 and whatever 5 is.

Amazon and Google didn't mess with their core business by competing with the players using it until they REALLY ran out of ways to make money.

the_king1y ago· 1 in thread

Anthropic's latest 3.5 sonnet is a cut above GPT-4 and 4.0. And if someone had given it to me and said, here's GPT-4.5, I would have been very happy with it.

tiahura1y ago

For law, I use both and find that neither is clearly superior. I’ll often pick one to first draft, and then feed to the other for suggestions and my edits.

shmatt1y ago· 1 in thread

Time to start selling my "probabilistic syllable generators are not intelligence" t shirts

jsemrau1y ago

Please, someone think of the Math reasoners.

devit1y ago· 1 in thread

So the problem is more in the algorithm.

darknoon1y ago

[1]: https://openai.com/index/introducing-simpleqa/

Dr_Birdbrain1y ago· 1 in thread

avs7331y ago

Hype gonna hype. I’m not saying he is wrong I’m saying his opinion would be the same whether it’s true or not because his value depends on it being his opinion.

Havoc1y ago· 1 in thread

The new Gemini just hit some good benchmarks.

This smells like it’s mostly based on OAI having a bit of bad luck with next model rather than a fundamental slowdown / barrier.

They literally just made a decent sized leap with o1

Bjorkbat1y ago

Not meeting expectations != not better than the previous models.

zusammen1y ago· 1 in thread

I wonder how much this has to do with a fluency plateau.

namaria1y ago

Modeled language, maybe.

nomendos1y ago· 1 in thread

"Eureka"!?

nomendos1y ago

superjose1y ago· 1 in thread

I'm more on the camp that these techs don't need to be perfect, but they need to be practical enough.

And I think the latter is good enough for us to do exciting things.

imiric1y ago

How practical can they be when current flagship models generate incorrect responses more than 50% of the time[1]?

The truly exciting things are still out of reach, yet we just might be at the Peak of Inflated Expectations to see it now.

atomsatomsatoms1y ago· 1 in thread

At least they can generate haikus now

Der_Einzige1y ago

In general, no they can't:

https://gwern.net/gpt-3#bpes

https://paperswithcode.com/paper/most-language-models-can-be...

The appearance of improvements in that capability is due to the vocabulary of modern LLMs increasing. Still only putting lipstick on a pig.

https://archive.ph/2024.11.13-100709/https://www.bloomberg.c...

aurareturn1y ago· 1 in thread

Is there any timeline on AI winters and if each winter gets shorter and shorter as time increases?

RaftPeople1y ago

> Is there any timeline on AI winters and if each winter gets shorter and shorter as time increases?

AGI=lim(x->0)AIHype(x)

where x=length of winter

thebigspacefuck1y ago

ziofill1y ago

danjl1y ago

grey-area1y ago

For example recently I asked it to generate some phrases for a list of words, along with synonym and antonym lists.

cryptica1y ago

In any case, LLMs bring us closer to understanding some big philosophical questions surrounding intelligence and consciousness.

LarsDu881y ago

Curves that look exponential in virtually all cases turn out to be logistic.
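
The claim about exponential-looking curves can be checked numerically with an S-shaped (logistic) curve, which is an assumption about the kind of curve meant here. The growth rate, midpoint, and ceiling below are arbitrary illustrative values. Well before the midpoint the logistic curve is almost indistinguishable from an exponential with the same rate; well after it, the exponential keeps climbing while the logistic flattens at its ceiling.

```python
import math


def exponential(t, k=1.0, t0=10.0, ceiling=100.0):
    """Unbounded exponential growth with rate k, scaled to match the logistic early on."""
    return ceiling * math.exp(k * (t - t0))


def logistic(t, k=1.0, t0=10.0, ceiling=100.0):
    """Logistic growth with the same rate k, saturating at `ceiling` after midpoint t0."""
    return ceiling / (1.0 + math.exp(-k * (t - t0)))


# Ratio of logistic to exponential: close to 1 early, close to 0 late.
early_ratio = logistic(2.0) / exponential(2.0)
late_ratio = logistic(20.0) / exponential(20.0)
```

From early data points alone the two curves cannot be told apart, which is why a plateau tends to come as a surprise.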

Certain OpenAI insiders must have known this for a while, hence Ilya Sutskever's new company in Israel

nutanc1y ago

1. Find more data.

2. Make the weights capture the data and reproduce.

In that sense we have reached a limit. So in my opinion we can do a couple of things.

1. App developers can understand the limits and build within the limits.

2. Researchers can take insights from these large models and build better AI systems with new architectures. It's ok to say transformers have reached a limit.

glial1y ago

[1] https://en.wikipedia.org/wiki/Coherence_theory_of_truth

summerlight1y ago

xyst1y ago

Many late investors in the genAI space about to be bag holders

EternalFury1y ago

smusamashah1y ago

It will be like StableDiffusion 1.5. This model can now run on low end devices, and lots of open research uses this model to build something else and is inspired by it.

These LLMs can be used as a foundation to keep improving and building new things.

KETpXDDzR1y ago

LLMs are glorified Markov chains in the end. They can't reason or think, even when they are good at pretending they can. What we need is a totally different approach IMO.

Veuxdo1y ago

> They are also experimenting with synthetic data, but this approach has its limitations.

I was really looking forward to using "synthetic data" euphemistically during debates.

tippytippytango1y ago

There’s only so much you can do when you train on the data instead of the processes that created that data.

m3kw91y ago

Hold your horses, OpenAI just came out with o1-preview 2 months ago, showing what test-time compute can do

eichi1y ago

GiorgioG1y ago

It’s about time the hype starts to die down. LLMs are brilliant for small bits of grunt work in software. They are not, however, doing any actual reasoning.

Bjorkbat1y ago

yalogin1y ago

I do wonder how quickly LLMs will become a commodity AI instrument just like any other AI out there. If so, what happens to OpenAI?

non-1y ago

rubiquity1y ago

> Amodei has said companies will spend $100 million to train a bleeding-edge model this year

gchamonlive1y ago

Oras1y ago

lobochrome1y ago

Isn’t this just the expected delay from the respin of Blackwell?

user901313131y ago

AI market top very soon

polskibus1y ago

In other news, Altman said AGI is coming next year https://www.tomsguide.com/ai/chatgpt/sam-altman-claims-agi-i...

k__1y ago

But AGI is always right around the corner?

I don't get it...

kaycey20221y ago

AI safety folks sure do look stupid now. :)

wildermuthn1y ago

Simply put, AGI requires more data: qualia.

_Algernon_1y ago

The next AI winter will be brutal

quantum_state1y ago

russellbeattie1y ago

That doesn't mean this article is irrelevant. It's good to know if LLM improvements are going to slow down a bit because the low hanging fruit has seemingly been picked.

But in terms of the overall effect of AI and questioning the validity of the technology as a whole, it's just your basic FUD article that you'd expect from mainstream news.

[1] https://dl.acm.org/doi/10.1145/3442188.3445922

wanderingmind1y ago

And yet the Anthropic CEO is still claiming PhD level intelligence in the next couple of years to Lex Fridman. It's starting to feel like the whole crypto pump and dump again

kaibee1y ago

Not sure where the OP to the comment I meant to reply to is, but I'll just add this here.

> I suspect the path to general intelligence is not that, but we'll see.

I think there are three things that a 'true' general intelligence has which are missing from basic-type LLMs as we have now.

1. knowing what you know. <basic-LLMs are here>

2. knowing what you don't know but can figure out via tools/exploration. <this is tool use/function calling>

3. knowing what can't be known. <this is knowing that the halting problem exists and being able to recognize it in novel situations>

12_throw_away1y ago

Well shoot. It's not like it was patently obvious that this would happen before the industry started guzzling electricity and setting money on fire, right? [1]

easeout1y ago

I'm happy to use LLM products for what they can do right now, while they're still cheap. Even though they're maintained by high investment that may never pay off, enshittification has not yet set in.

mrandish1y ago

I'd be interested in hearing counter-arguments (or support) for the impossibility (or likelihood) of such a scenario.

jppope1y ago

bad_haircut721y ago

dangw1y ago

where the fuck is simonw in this thread

yobid201y ago

This was predicted. AI isn't going to get any better.

Davidzheng1y ago

Just because you guys want something to be true and can't accept the alternative and upvote it when it agrees with your view does not mean it is a correct view.

aaroninsf1y ago

It's easy to be snarky at ill-informed and hyperbolic takes, but it's also pretty clear that large multi-modal models trained with the data we already have, are going to eventually give us AGI.

But these are as we say mere matters of engineering.