> For an airplane wing (airfoil), the top surface is curved and the bottom is flatter. When the wing moves forward:
> * Air over the top has to travel farther in the same amount of time -> it moves faster -> pressure on the top decreases.
> * Air underneath moves slower -> pressure underneath is higher
> * The pressure difference creates an upward force - lift
Isn't that explanation of why wings work completely wrong? There's nothing that forces the air to cover the top distance in the same time that it covers the bottom distance, and in fact it doesn't. https://www.cam.ac.uk/research/news/how-wings-really-work
Very strange to use a mistake as your first demo, especially while talking about how it's phd level.
Source: PhD on aircraft design
“This is why a flat surface like a sail is able to cause lift – here the distance on each side is the same but it is slightly curved when it is rigged and so it acts as an aerofoil. In other words, it’s the curvature that creates lift, not the distance.”
But like you say flat plates can generate lift at positive AoA, no curvature (camber) required. Can you confirm this is correct? Kinda going crazy because I'd very much expect a Cambridge aerodynamicist to get this 100% right.
I've always been under the impression that flat-plate airfoils can't generate lift without a positive angle-of-attack - where lift is generated through the separate mechanism of the air pushing against an angled plane? But a modern airfoil can, because of this effect.
And that if you flip them upside down, a flat plate is more efficient and requires less angle-of-attack than the standard airfoil shape because now the lift advantage is working to generate a downforce.
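For reference, classical thin-airfoil theory (an idealized small-angle, inviscid model; it is not something the linked article covers) puts rough numbers on this. The lift coefficient is about 2π times the angle of attack measured from the zero-lift angle. A flat plate has a zero-lift angle of 0, so it gives no lift when level and positive lift when tilted, while camber shifts the zero-lift angle negative. A minimal sketch, where the -2 degree zero-lift angle for the cambered section is a made-up illustrative value:

```python
import math


def thin_airfoil_cl(alpha_deg, zero_lift_alpha_deg=0.0):
    """Lift coefficient from thin-airfoil theory: cl = 2*pi*(alpha - alpha_L0), angles in radians."""
    return 2.0 * math.pi * math.radians(alpha_deg - zero_lift_alpha_deg)


# Flat plate: zero-lift angle is 0 degrees.
flat_plate_level = thin_airfoil_cl(0.0)
flat_plate_tilted = thin_airfoil_cl(5.0)

# Cambered section: -2 degrees is an assumed, illustrative zero-lift angle.
cambered_level = thin_airfoil_cl(0.0, zero_lift_alpha_deg=-2.0)

print(flat_plate_level, flat_plate_tilted, cambered_level)
```

Under this model both statements hold at once: a flat plate needs a positive angle of attack, and a cambered section still lifts at zero angle of attack. Neither depends on any equal-transit-time assumption.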
I just tried to search Google, but I'm finding all sorts of conflicting answers, with only a vague consensus that the AI-provided answer above is, in fact, correct. The shape of the wing causes pressure differences that generate lift in conjunction with multiple other effects that also generate lift by pushing or redirecting air downward.
There is no requirement for air to travel anywhere, let alone in any amount of time. So this part of the AI's response is completely wrong. "Same amount of time" as what? Air going underneath the wing? With an angle of attack the air under the wing is being deflected down, not magically meeting up with the air above the wing.
They spout common knowledge on a broad array of subjects and it's usually incorrect to anyone who has some knowledge on the subject.
> “What actually causes lift is introducing a shape into the airflow, which curves the streamlines and introduces pressure changes – lower pressure on the upper surface and higher pressure on the lower surface,” clarified Babinsky, from the Department of Engineering. “This is why a flat surface like a sail is able to cause lift – here the distance on each side is the same but it is slightly curved when it is rigged and so it acts as an aerofoil. In other words, it’s the curvature that creates lift, not the distance.”
The meta-point that "it's the curvature that creates the lift, not the distance" is incredibly subtle for a lay audience. So it may be completely wrong for you, but not for 99.9% of the population. The pressure differential is important, and the curvature does create lift, although not via speed differential.
I am far from an AI hypebeast, but this subthread feels like people reaching for a criticism.
That doesn't matter for lay audiences and doesn't really matter at all until we try to use them for technical things.
The video in the Cambridge link shows how the upper surface particles greatly overtake the lower surface flow. They do not rejoin, ever.
People seem to overcomplicate what LLMs are capable of, but at their core they are just really good word parsers.
Most of the PhDs I know are studying things that I guarantee GPT-5 doesn’t know about… because they’re researching novel stuff.
Also, LLMs don’t have much consistency with how well they’re able to apply the knowledge that they supposedly have. Hence the “lots of almost correct code” stereotype that’s been going around.
I was using the fancy new Claude model yesterday to debug some fast-check tests (quickcheck-inspired typescript lib). Claude could absolutely not wrap its head around the shrinking behavior, which rendered it useless for debugging
It’s very common to see AI evangelists taking its output at face value, particularly when it’s about something that they are not an expert in. I thought we’d start seeing less of this as people get burned by it, but it seems that we’re actually just seeing more of it as LLMs get better at sounding correct. Their ability to sound correct continues to increase faster than their ability to be correct.
Meanwhile the demo seems to suggest business as usual for AI hallucinations and deceptions.
This is the problem with AI in general.
When I ask it about things I already understand, it’s clearly wrong quite often.
When I ask it about something I don’t understand, I have no way to know if its response is right or wrong.
https://www.grc.nasa.gov/www/k-12/VirtualAero/BottleRocket/a...
In fact I'd classify it as downright strange.
An LLM doesn't know more than what's in the training data.
In Michael Crichton's The Great Train Robbery (published in 1975, about events that happened in 1855) the perpetrator, having been caught, explains to a baffled court that he was able to walk on top of a running train "because of the Bernoulli effect", that he misspells and completely misunderstands. I don't remember if this argument helps him get away with the crime? Maybe it does, I'm not sure.
This is another attempt at a Great Robbery.
It goes on:
> At this point, the prosecutor asked for further elucidation, which Pierce gave in garbled form. The summary of this portion of the trial, as reported in the Times, was garbled still further. The general idea was that Pierce--- by now almost revered in the press as a master criminal--- possessed some knowledge of a scientific principle that had aided him.
How apropos to modern science reporting and LLMs.
And I might be wrong but my understanding is that it's not wrong per se, it's just wildly incomplete. Which is kind of the same as wrong. But I believe the airfoil design does indeed have the effect described, which does contribute to lift somewhat, right? Or am I just a victim of the misconception?
This is an LLM. "Wrong" is not a concept that applies, as it requires understanding. The explanation is quite /probable/, as evidenced by the fact that they thought to use it as an example…
I asked ChatGPT for help with Wordle the other day, by asking for a 5-letter word that contained P, M, K and Y. It said:
> Yes, the word skimp contains the letters P, M, K, and Y
Would you say that wrong is not a concept that applies to this answer?
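The claim in that answer is mechanically checkable, which is what makes it a clean example. A minimal sketch (the helper names are made up for illustration):

```python
def contains_all(word, letters):
    """True if every letter in `letters` appears somewhere in `word` (case-insensitive)."""
    return set(letters.lower()) <= set(word.lower())


def missing_letters(word, letters):
    """Sorted list of the requested letters that `word` does not contain."""
    return sorted(set(letters.lower()) - set(word.lower()))


print(contains_all("skimp", "PMKY"))     # False
print(missing_letters("skimp", "PMKY"))  # ['y']
```

"skimp" has P, M and K but no Y, so the answer fails a check that takes two lines to write.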
A quite good example of AI limits
These are places where common lay discussions use language in ways that are wrong, or make simplifications that are reasonable but technically incorrect. They are especially common when something is so 'obvious' that experts don't explain it, so the most frequent version of the concept being explained is the simplified one.
These, in my testing, show up a lot in LLMs: technical things are wrong when the language of the most common explanations simplifies or obfuscates the precise truth. Often, it pretty much matches the level of knowledge of a college freshman/sophomore or slightly below, which is sort of the level of discussion of more technical topics on the internet.
>In fact, theory predicts – and experiments confirm – that the air traverses the top surface of a body experiencing lift in a shorter time than it traverses the bottom surface; the explanation based on equal transit time is false.
So the effect is greater than equal time transit.
I've seen the GPT-5 explanation in GCSE-level textbooks, but I thought it was supposed to be PhD level ;)
Common misconceptions should be expected when you train a model to act like the average of all humans.
https://jimruttshow.blubrry.net/the-jim-rutt-show-transcript...
> “What actually causes lift is introducing a shape into the airflow, which curves the streamlines and introduces pressure changes – lower pressure on the upper surface and higher pressure on the lower surface,” clarified Babinsky, from the Department of Engineering. “This is why a flat surface like a sail is able to cause lift – here the distance on each side is the same but it is slightly curved when it is rigged and so it acts as an aerofoil. In other words, it’s the curvature that creates lift, not the distance.”
So I'd characterize this answer as "correct, but incomplete" or "correct, but simplified". It's a case where a PhD in fluid dynamics might state the explanation one way to an expert audience, but another way to a room full of children.
The hilarious thing about this subthread is that it's already getting filled with hyper-technical but wrong alternative explanations by people eager to show that they know more than the robot.
Not much explanation yet why GPT-5 warrants a major version bump. As usual, the model (and potentially OpenAI as a whole) will depend on output vibe checks.
How is this sustainable.
Not that it makes it useless, just that we seem to not "be there" yet for the standard tasks software engineers do every day.
Exactly. Too many videos - too little real data / benchmarks on the page. Will wait for vibe check from simonw and others
https://openai.com/gpt-5/?video=1108156668
2:40 "I do like how the pelican's feet are on the pedals." "That's a rare detail that most of the other models I've tried this on have missed."
4:12 "The bicycle was flawless."
5:30 Re generating documentation: "It nailed it. It gave me the exact information I needed. It gave me full architectural overview. It was clearly very good at consuming a quarter million tokens of rust." "My trust issues are beginning to fall away"
Edit: ohh he has blog post now: https://news.ycombinator.com/item?id=44828264
People knew that GPT-5 wouldn’t be an AGI or even close to that. It’s just an updated version. GPT-N would become more or less like an annual release.
Pretty par for the course evals at launch setup.
https://chatgpt.com/share/6895d5da-8884-8003-bf9d-1e191b11d3...
GPT-5 pricing: $10/Mtok out
What am I missing?
(Not to undermine progress in the foundational model space, but there is a lack of appreciation for the democratization of domain specific models amongst HNers).
Meanwhile, Anthropic & Google have more room in their P/S ratios to continue to spend effort on logarithmic intelligence gains.
Doesn't mean we won't see more and more intelligent models out of OpenAI, especially in the o-series, but at some point you have to make payroll and reality hits.
Before the release of the model Sam Altman tweeted a picture of the Death Star appearing over the horizon of a planet.
We’re talking about less than a 10% performance gain, for a shitload of data, time, and money investment.
Maybe quantum compute would be significant enough of a computing leap to meaningfully move the needle again.
This is day one, so there is probably another 10-20% in optimizations that can be squeezed out of it in the coming months.
This gives them an out. "That was the old model, look how much better this one tests on our sycophancy test we just made up!!"
GPT5.5 will be a 10X compute jump.
4.5 was 10x over 4.
I feel it’s worthy of a major increment, even if benchmarks aren’t significantly improved.
He also said that AGI was coming early 2025.
People that can't stop drinking the kool aid are really becoming ridiculous.
Diminished returns.
... here's hoping it leads to progress.
They also announced gpt-5-pro but I haven't seen benchmarks on that yet.
One sector of the economy would cut down on investment spending, which can be easily offset by decreasing the interest rate.
But this is a short-term effect. What I'm worried about is a structural change of the labor market, which would be positive for most people, but probably negative for people like me.
I don't mind losing my programming job in exchange for being able to go to the pharmacy for my annual anti-cancer pill.
But, what happens when you lose that programming job and are forced to take a job at a ~50-70% pay reduction? How are you paying for that anti-cancer drug with a job with no to little health insurance?
I don't think it is "bad" to be sincerely worried that the current trajectory of AI progress represents this trade.
The likelihood of all that is incredibly slim. It's not 0% -- nothing ever really is -- but it is effectively so.
Especially with the economics of scientific research, the reproducibility crisis, and general anti-science meme spreading throughout the populace. The data, the information, isn't there. Even if it was, it'd be like Alzheimer's research: down the wrong road because of faked science.
There is no one coming to save humanity. There is only our hard work.
How exactly do you wish death comes to you?
Have you looked at how expensive prescription drug prices are without (sometimes WITH) insurance? If you are no longer employed, good luck paying for your magical pill.
There are some improvements in some benchmarks and nothing else worthy of note in coding. I only took a peek though, so I might be wrong.
But yeah, you are correct in that no matter what, we're going to be left holding the bag.
Livestream link: https://www.youtube.com/live/0Uu_VJeVVfo
Research blog post: https://openai.com/index/introducing-gpt-5/
Developer blog post: https://openai.com/index/introducing-gpt-5-for-developers
API Docs: https://platform.openai.com/docs/guides/latest-model
Note the free form function calling documentation: https://platform.openai.com/docs/guides/function-calling#con...
GPT5 prompting guide: https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_g...
GPT5 new params and tools: https://cookbook.openai.com/examples/gpt-5/gpt-5_new_params_...
GPT5 frontend cookbook: https://cookbook.openai.com/examples/gpt-5/gpt-5_frontend
Prompt migrator/optimizer: https://platform.openai.com/chat/edit?optimize=true
Enterprise blog post: https://openai.com/index/gpt-5-new-era-of-work
System Card: https://openai.com/index/gpt-5-system-card/
What would you say if you could talk to a future OpenAI model? https://progress.openai.com/
Coding examples: https://github.com/openai/gpt-5-coding-examples
Academic benchmark score improves only 5% but they make the bar 50% higher.
basically in my testing really felt that gpt5 was "using tools to think" rather than just "using tools". it gets very powerful when coding long horizon tasks (a separate post i'm publishing later).
to give one substantive example, in my developer beta (they will release the video in a bit) i put it to a task that claude code had been stuck on for the last week - same prompts - and it just added logging to instrument some of the failures that we were seeing and - from the logs that it added and asked me to rerun - figured out the solve.
> It’s actually worse at writing than GPT-4.5
Sounds like we need to wait a bit for the dust to settle before one can trust anything one hears/reads :)
It's hard to make a man understand something standing between them and their salary
>"While I never use AI for personal writing (because I have a strong belief in writing to think)"
The optimal AI productivity process is starting to look like:
AI Generates > Human Validates > Loop
Yet cognitive generation is how humans learn and develop cognitive strength, as well as how they maintain such strength.
Similar to how physical activity is how muscles/bone density/etc grow, and how body tissues maintain.
Physical technology freed us from hard physical labor that kept our bodies in shape -- at a cost of physical atrophy.
AI seems to have a similar effect for our minds. AI will accelerate our cognitive productivity, and allow for cognitive convenience -- at a cost of cognitive atrophy.
At present we must be intentional about building/maintaining physical strength (dedicated strength training, cardio, etc).
Soon we will need to be intentional about building/maintaining cognitive strength.
I suspect the workday/week of the future will be split on AI-on-a-leash work for optimal productivity, with carve-outs for dedicated AI-enhanced-learning solely for building/maintaining cognitive health (where productivity is not the goal, building/maintaining cognition is). Similar to how we carve out time for working out.
What are your thoughts on this? Based on what you wrote above, it seems you have similar feelings?
Is there a name for this theory?
If not can you coin one? You're great at that :)
Sorry, but this sounds like overly sensational marketing speak and just leaves a bad taste in the mouth for me.
Then I noticed the date on the comment: 2023.
Technically, every advancement in the space is “the closest to AGI that we’ve ever been”. It’s technically correct, since we’re not moving backward. It’s just not a very meaningful statement.
In the words of OpenAI: “AGI is defined as highly autonomous systems that outperform humans at most economically valuable work”
1) Internal Retrieval
2) Web Search
3) Code Interpreter
4) Actions
How did you come up with this idea?
edit:
livestream here: https://www.youtube.com/live/0Uu_VJeVVfo
The current situation is kind of like a grand prize where Zuck or similar will hand $1bn to anyone who cracks it. That's a huge incentive for people to have a go.
It's a perfect situation for Nvidia. You can see that after months of trying to squeeze out all % of marginal improvements, sama and co decided to brand this GPT-4.0.0.1 version as GPT-5. This is all happening on NVDA hardware, and they are gonna continue desperately iterating on tiny model efficiencies until all these valuation $$$ sweet sweet VC cash run out (most of it directly or indirectly going to NVDA).
To tell a made-up anecdote: A colleague told me how his professor friend was running statistical models over night because the code was extremely unoptimized and needed 6+ hours to compute. He helped streamline the code and took it down to 30 minutes, which meant the professor could run it before breakfast instead.
We are completely fine with giving a task to a Junior Dev for a couple of days and see what happens. Now we love the quick feedback of running Claude Max for a hundred bucks, but if we could run it for a buck over night? Would be quite fine for me as well.
So, if sama says this is going to be totally revolutionary for months, then uploads a Death Star reference the night before and then when they show it off the tech is not as good as proposed, laughter is the only logical conclusion.
Companies linking this to terminating us and getting rid of our jobs to please investors means we, whose uptake of this tech is required for their revenue goals, are skeptical about it and have a vested interest in it failing to meet expectations
This is not the happy path for gpt-5.
The table in the model card where every model in the current drop down somehow maps to one of the 6 variants of gpt-5 is not where most people thought we would be today.
The expectation was consolidation on a highly performant model, more multimodal improvements, etc.
This is not terrible, but I don't think anyone who's an "accelerationist" is looking at this as a win.
Update after some testing: This feels like gpt-4.1o and gpt-o4-pro got released and wrapped up under a single model identifier.
How are they mindblowing? This was all possible on Claude 6 months ago.
> Major progress on multiple fronts
You mean marginal, tiny fraction of % progress on a couple of fronts? Cause it sounds like we are not seeing the same presentation.
> Yet, I like what I'm seeing.
Most of us don't
> So -- they did not invent AGI yet.
I am all for constant improvements and iterations over time, but with this pace of marginal tweak-like changes, they are gonna reach AGI never. And yes, we are laughing because sama has been talking big on AGI for so long, and even with all the money and attention he can't get even remotely close to it. Same for Zuck's comment on superintelligence. These are just salesmen, and we are laughing at them when their big words don't match their tiny results. What's wrong with that?
But up until now, especially from Sam Altman, we've heard countless veiled suggestions that GPT-5 would achieve AGI. A lot of the pro-AI people have been talking shit for the better part of the last year saying "just wait for GPT-5, bro, we're gonna have AGI."
The frustration isn't the desire to achieve AGI, it's the never-ending gaslighting trying to convince people (really, investors) that there's more than meets the eye. That we're only ever one release away from AGI.
Instead: just be honest. If you're not there, you're not there. Investors who don't do any technical evals may be disappointed, but long-term, you'll have more than enough trust and goodwill from customers (big and small) if you don't BS them constantly.
It's not a "fix".
Meanwhile Sam Altman has been making the rounds fearmongering that AGI/ASI is right around the corner and that clearly is not the truth. It's fair to call them out on it.
HN is just for insecure, miserable shitheads.
When GPT-5 launches, several older models will be retired, including:
- GPT-4o
- GPT-4.1
- GPT-4.5
- GPT-4.1-mini
- o4-mini
- o4-mini-high
- o3
- o3-pro
If you open a conversation that used one of these models, ChatGPT will automatically switch it to the closest GPT-5 equivalent. Chats with 4o, 4.1, 4.5, 4.1-mini, o4-mini, or o4-mini-high will open in GPT-5, chats with o3 will open in GPT-5-Thinking, and chats with o3-Pro will open in GPT-5-Pro (available only on Pro and Team).
[0] https://help.openai.com/en/articles/11909943-gpt-5-in-chatgp...
So only for free/plus users (for now). I do wonder how long they will take to deprecate these models via the API, though...
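The ChatGPT-side mapping quoted above can be written down as a plain lookup table. This is a sketch of the help-article text only: the strings are the display names from the quote, not API model identifiers, and the function name is made up for illustration.

```python
# Retired ChatGPT model -> GPT-5 variant its chats reopen in, per the quoted help text.
# These are display names from the quote, NOT API model identifiers.
RETIRED_TO_GPT5 = {
    "GPT-4o": "GPT-5",
    "GPT-4.1": "GPT-5",
    "GPT-4.5": "GPT-5",
    "GPT-4.1-mini": "GPT-5",
    "o4-mini": "GPT-5",
    "o4-mini-high": "GPT-5",
    "o3": "GPT-5-Thinking",
    "o3-pro": "GPT-5-Pro",  # quoted as available only on Pro and Team
}


def replacement_for(model_name):
    """Return the GPT-5 variant a retired model's chats reopen in, or None if it is not listed."""
    return RETIRED_TO_GPT5.get(model_name)


print(replacement_for("o3"))       # GPT-5-Thinking
print(replacement_for("GPT-4.5"))  # GPT-5
```

Six of the eight retired models collapse into the same default, which is the part people relying on a specific model's behaviour are objecting to.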
Smaller base models + more RL. Technically better at the verticals that are making money, but worse on subjective preference.
They'll probably try to prompt engineer back in some of the "vibes", hence the personalities. But also maybe they decided people spending $20 a month to hammer 4o all day as a friend (no judgement, really) are ok to tick off for now... and judging by Reddit, they are very ticked off.
The only way to get access to other models right now (for me at least) is via the iPhone app, for now.
3.5 Turbo has been deprecated for a long time but is still running
If you are building on models that could disappear tomorrow when a company needs to juice the launch of a new model (or increase prices), you are introducing avoidable risk.
Doesn't matter at all if the newer model is earth-shatteringly good (and this one doesn't seem to be): If I can't reliably access the models I've built my tooling on top of... I'm very unhappy.
If this note is just intended for the GUI chat interface they provide - Fine. I don't love it, but I get it.
But if the older models start disappearing from the paid API surfaces (ex - I can no longer get to a precise snapshot through something like "gpt-4o-2024-08-06" or "gpt-3.5-turbo-1106") then this is a great reason to abandon OpenAI entirely as a platform.
Just right next paragraph...
I'm not saying I'd do it that way myself, but it explains why they don't see it as too bold.
> "[GPT-5] can write an entire computer program from scratch, to help you with whatever you'd like. And we think this idea of software on demand is going to be one of the defining characteristics of the GPT-5 era."
But then again, all of this is a hype machine cranked up till the next one needs cranking.
It does feel like we're marching toward a day when "software on tap" is a practical or even mundane fact of life.
But, despite the utility of today's frontier models, it also feels to me like we're very far from that day. Put another way: my first computer was a C64; I don't expect I'll be alive to see the day.
Then again, maybe GPT-5 will make me a believer. My attitude toward AI marketing is that it's 100% hype until proven otherwise -- for instance, proven to be only 87% hype. :-)
I’m not sure this will be game changing vs existing offerings
GPT-5 doesn't seem to get you there tho ...
(Disclaimer: But I am 100% sure it will happen eventually)
"Fast fashion" is not a good thing for the world, the environment, the fashion industry, and arguably not a good thing for the consumers buying it. Oh but it is good for the fast fashion companies.
The best part is, this is not even the real definition of "AGI" yet (whatever that means at this point).
More like 10% of the capability that was promised and already the flow of capital from the inflated salaries of the past decade are going to the top AI researchers.
Having both eliminates a feedback loop and the LLM enables you to get shit done fast.
Before it was:
- 100 o3 per week
- 100 o4-mini-high per day
- 300 o4-mini per day
- 50 4.5 per week
[0] https://help.openai.com/en/articles/11909943-gpt-5-in-chatgp...
I'm guessing they'll just announce massive tier generosity later considering how GPT-5 input tokens are half the price of 4.1 on the API. It's probably a way to keep the servers from being overloaded and to encourage people to buy Plus while the hype is hot.
The expensive thing: 100 per week -> 200 per week
This is...the opposite of a nerf? The numbers goed up? (We can quibble about the daily vs hourly difference, but certainly for me the weekly cap was the only thing that mattered.)
The better analogue is "Imagine in the 70's being able to teletype into an insanely expensive compute infrastructure and have reasonable timesharing capabilities of a limited resource across multiple users."
Unix. I'm describing the motivation for Unix there.
We already look back on earlier times with constraints that were appropriate.
Presumably compute will get cheaper, we'll build more datacenters, maybe we'll even power them in a way that doesn't destroy our planet, and GPT questions will become too cheap to meter. Just give it some time.
They've mentioned improvements in that aspect a few times now, and if it actually materializes, that would be a big leap forward for most users even if underneath GPT-4 was also technically able to do the same things if prompted just the right way.
The jump from 3 to 4 was huge. There was an expectation for similar outputs here.
Making it cheaper is a good goal - certainly - but they needed a huge marketing win too.
But it's only an incremental improvement over the existing o line. So people feel like the improvement from the current OpenAI SoTA isn't there to justify a whole bump. They probably should have just called o1 GPT-5 last year.
They're all working on subjective improvements, but for example, none of them would develop and deploy a sampler that makes models 50% worse at coding but 50% less likely to use purple prose.
(And unlike the early days where better coding meant better everything, more of the gains are coming from very specific post-training that transfers less, and even harms performance there)
For example: You could ban em dash tokens entirely, but there are places like dialogue where you want them. You can write a sampler that only allows em dashes between quotation marks.
That's a highly contrived example because em dashes are useful in other places, but samplers in general can be as complex as your performance goals will allow (they are on the hot path for token generation)
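To make the contrived example concrete, here is a toy version of that sampler. It is a sketch under heavy assumptions: the vocabulary is three hand-picked strings, "inside quotes" is approximated by counting straight double quotes, and selection is greedy. A real sampler would work on tokenizer IDs and logit tensors.

```python
import math

EM_DASH = "\u2014"


def inside_quotes(text_so_far):
    """Approximation: an odd number of straight double quotes means a quote is open."""
    return text_so_far.count('"') % 2 == 1


def mask_em_dash(logits, vocab, text_so_far):
    """Return a copy of the logits with em-dash tokens disabled outside quotation marks."""
    if inside_quotes(text_so_far):
        return list(logits)
    return [-math.inf if EM_DASH in tok else score for tok, score in zip(vocab, logits)]


def greedy_pick(logits, vocab, text_so_far):
    """Pick the highest-scoring token after applying the em-dash mask."""
    masked = mask_em_dash(logits, vocab, text_so_far)
    best = max(range(len(vocab)), key=lambda i: masked[i])
    return vocab[best]


# Toy vocabulary and scores; the em dash is the model's top choice in both cases.
vocab = [EM_DASH, ",", " and"]
logits = [2.0, 1.0, 0.5]

in_dialogue = greedy_pick(logits, vocab, 'He said "wait')  # em dash allowed
in_narration = greedy_pick(logits, vocab, "He paused")      # falls back to ","
```

The mask runs once per generated token, which is why anything more elaborate has to stay cheap.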
Swapping samplers could be a thing, but you need more than that in the end. Even the idea of the model accepting loosely worded prompts for writing is a bit shaky: I see a lot of gains by breaking down the writing task into very specific well-defined parts during post-training.
It's ok to let an LLM go from loose prompts to that format for UX, but during training you'll do a lot better than trying to learn on every way someone can ask for a piece of writing
I won't argue that I always use it in a stylistically appropriate fashion, but I may have to move away from it. I am NOT beating the actually-an-AI allegations.
That said, yeah the equal time thing never made any sense.