OpenAI's new GPT-5 models announced early by GitHub (opens in new tab)

(theverge.com)

79 pointsbkolobara7mo ago72 comments

72 comments

> It handles complex coding tasks with minimal prompting...

I find it interesting how marketers are trying to make minimal prompting a good thing, a direction to optimize. Even if i talk to a senior engineer, i'm trying to be specific as possible to avoid ambiguities etc. Pushing the models to just do what they think its best is a weird direction. There are so many subtle things/understandings of the architecture that are just in my head or a colleagues head. Meanwhile, i found that a very good workflow is asking claude code to come back with clarifying questions and then a plan, before just starting to execute.

KronisLV7mo ago

> Meanwhile, i found that a very good workflow is asking claude code to come back with clarifying questions and then a plan, before just starting to execute.

RooCode supports various modes https://docs.roocode.com/basic-usage/using-modes

For example, you can first use the Ask mode to explore the codebase and answer your questions, as well as ask you its own about what you want to do. Then, you can switch over to the Code mode to do the actual implementation, or the model itself will ask you to switch to it in other modes, because it's not allowed to change files in the Ask mode.

I think that approach works pretty well, especially when you document what needs to be done in a separate Markdown file or something along the lines of it, that can be then referenced if you have to clean the context, like a new refactoring task for what's been implemented.

> I find it interesting how marketers are trying to make minimal prompting a good thing, a direction to optimize.

This seems like a good thing, though. You're still allowed to be as specific as you want to, but the baseline is a bit better.

ls-a7mo ago

This works well with managers. They think if the task title on jira is a one liner, then it's that simple to implement.

consp7mo ago

Usually it's exactly the opposite, more often due to missing containment and requirements so you get a vague oneliner. Infinity hours it is ...

1 more reply

igleria7mo ago

> I find it interesting how marketers are trying to make minimal prompting a good thing

They do that because IMHO the average person seems to prefer something to be easy, rather than correct.

Ancapistani7mo ago

> Even if i talk to a senior engineer, i'm trying to be specific as possible to avoid ambiguities etc.

Sure - but you're being specific about the acceptance criteria, not the technical implementation details, right?

That's where the models I've been using are at the moment in terms of capability; they're like junior engineers. They know how to write good quality code. If I tell them exactly what to write, they can one-shot most tasks. Otherwise, there's a good chance the output will be spaghetti.

> There are so many subtle things/understandings of the architecture that are just in my head or a colleagues head.

My primary agentic code generation tool at the moment is OpenHands (app.all-hands.dev). Every time it makes an architectural decision I disagree with, I add a "microagent" (long-term context, analogous to CLAUDE.md or Devin's "Knowledge Base").

If that new microagent works as expected, I incorporate it into either my global or organization-level configs.

The result is that it gets more and more aligned with the way I prefer to do things over time.

nojito7mo ago

Because people are overprompting and creating crazy elaborate harnesses. My prompts are maybe 1 - 2 sentences.

There is a definite skill gap between folks who are using these tools effectively and those who do not.

c0487mo ago

This is why I don't listen at all to the fearmongers that say programmers will disappear. At most, our jobs will slightly change.

There will always be people that describe a problem, and you'll always need people actually figuring out what's actually wrong.

croes7mo ago

The problem isn’t the AI but the management that believes the PR. It doesn’t matter if AI can replace developers but if the management thinks it can.

1 more reply

ACCount367mo ago

What makes you look at existing AI systems and then say "oh, this totally isn't capable of describing a problem or figuring out what's actually wrong"? Let alone "this wouldn't EVER be capable of that"?

3 more replies

Ratelman7mo ago

Interesting/unfortunate/expected that GPT-5 isn't touted as AGI or some other outlandish claim. It's just improved reasoning etc. I know it's not the actual announcement and it's just a single page accidentally released, but it at least seems more grounded...? Have to wait and see what the actual announcement entails.

throwaway5257537mo ago

At this point it's pretty obvious that the easy scaling gains have been made already and AI labs are scrounging for tricks to milk out extra performance from their huge matrix product blobs:

-Reasoning, which is just very long inference coupled with RL

-Tool use aka an LLM with glue code to call programs based on its output

-"Agents" aka LLMs with tools in a loop

Those are pretty neat tricks, and not at all trivial to get actionable results from (from an engineering point of view), mind you. But the days of the qualitative intelligence leaps from GPT-2 to 3, or 3 to 4, are over. Sure, benchmarks do get saturated, but at incredible cost and forcing AI researchers to make up new "dimensions of scaling" as the ones they were previously banking on stalled. And meanwhile it's all your basic next token prediction blob running it all, just with a few optimizing tricks.

My hunch is that there won't be a wondorous life turning AGI (poorly defined anyway), just consolidating existing gains (distillation, small language models, MoE, quality datasets, etc.) and finding new dimensions and sources of data (biological data and 'sense-data' for robotics come to mind).

binary1327mo ago

This is the worst they’ll ever be! It’s not just going to be an ever slower asymptotic improvement that never quite manages to reach escape velocity but keeps costing orders of magnitude more to research, train, and operate….

nialv77mo ago

I wonder whether the markets will crash if gpt5 flops. Because it might be the model that cements the idea that, yes, we have hit a wall.

qsort7mo ago

I'm the first to call out ridiculous behavior by AI companies but short of something massively below expectations this can't be bad for openai. GPT-5 is going to be positioned as a product for the general public first and foremost. Not everyone cares about coding benchmarks.

nialv77mo ago

llama 4 basically (arguably) destroyed Meta's LLM lab, and it wasn't even that bad of a model.

1 more reply

benterix7mo ago

> massively below expectations

Well, the problem is that the expectations are already massive, mostly thanks to sama's strategy of attracting VC.

ben_w7mo ago

OpenAI's announcements are generally a lot more grounded than the hype surrounding them and their stuff.

e.g. if you look at Altman's blog of "superintelligence in a few thousand days", what he actually wrote doesn't even disagreeing with LeCun (famously a nay-sayer) about the timeline.

naveen997mo ago

Few thousands days is decades.

1 more reply

Imustaskforhelp7mo ago

Yeah, I guess it wouldn't be that big but it will have a lot of hype around it.

I doubt it can even beat opus 4.1

bkolobaraOP7mo ago

The actual announcement (now deleted on GitHub's blog): https://archive.is/IoMEg

billytrend7mo ago

Did they photoshop the screenshot from https://github.blog/changelog/2025-05-19-github-models-built... ? Other than the model id, it’s identical.

ukblewis7mo ago

I get that it looks suspicious, but here’s the archive link: https://archive.is/2025.08.07-035308/https://github.blog/cha...

netown7mo ago

> GPT-5 will have "enhanced agentic capabilities” and can handle “complex coding tasks with minimal prompting.”

this seems to be directly targeted at anthropic/claude, wonder if it leads anywhere or if claude keeps it's mystical advantage (especially with new claude models coming out this week as well).

> GPT-5 will have four model variants, according to GitHub...

i also find it interesting that the primary model is the logic-focused one (likely very long and deep reasoning), whereas the conversational mainstream model is now a variant. seems like a fundamental shift in how they want these tools to be used, as opposed to today's primary 4o and the more logical GPT-4.1, o4-mini, and o3.

ed_mercer7mo ago

Damn interns!

therodeoen7mo ago

they are comparing it to llama 4 and cohere v2 in the image…

nxobject7mo ago

Is the announcement implying that "mainline" GPT-5 is now a reasoning model?

> gpt-5: Designed for logic and multi-step tasks.

hobofan7mo ago

That doesn't imply reasoning and more likely expresses that it's focused on better tool calling capabilities.

blixt7mo ago

I think the promise back when all the separate reasoning / multimodal models were out was that GPT-5 would be the model to bring it all together (which mostly comes down to audio/video I think since o3/o4 do images really well).

om87mo ago

Of course it is. GPT-5 is one of the most anticipated things in AI right now. To live up to the hype, it needs to be a reasoning model.

fnord777mo ago

sama posted a picture of the death star yesterday

j / k navigate · click thread line to collapse

72 comments

deepdarkforest7mo ago

> It handles complex coding tasks with minimal prompting...

KronisLV7mo ago

> Meanwhile, i found that a very good workflow is asking claude code to come back with clarifying questions and then a plan, before just starting to execute.

RooCode supports various modes https://docs.roocode.com/basic-usage/using-modes

> I find it interesting how marketers are trying to make minimal prompting a good thing, a direction to optimize.

This seems like a good thing, though. You're still allowed to be as specific as you want to, but the baseline is a bit better.

ls-a7mo ago

This works well with managers. They think if the task title on jira is a one liner, then it's that simple to implement.

consp7mo ago

Usually it's exactly the opposite, more often due to missing containment and requirements so you get a vague oneliner. Infinity hours it is ...

1 more reply

igleria7mo ago

> I find it interesting how marketers are trying to make minimal prompting a good thing

They do that because IMHO the average person seems to prefer something to be easy, rather than correct.

Ancapistani7mo ago

> Even if i talk to a senior engineer, i'm trying to be specific as possible to avoid ambiguities etc.

Sure - but you're being specific about the acceptance criteria, not the technical implementation details, right?

> There are so many subtle things/understandings of the architecture that are just in my head or a colleagues head.

If that new microagent works as expected, I incorporate it into either my global or organization-level configs.

The result is that it gets more and more aligned with the way I prefer to do things over time.

nojito7mo ago

Because people are overprompting and creating crazy elaborate harnesses. My prompts are maybe 1 - 2 sentences.

There is a definite skill gap between folks who are using these tools effectively and those who do not.

c0487mo ago

This is why I don't listen at all to the fearmongers that say programmers will disappear. At most, our jobs will slightly change.

There will always be people that describe a problem, and you'll always need people actually figuring out what's actually wrong.

croes7mo ago

The problem isn’t the AI but the management that believes the PR. It doesn’t matter if AI can replace developers but if the management thinks it can.

1 more reply

ACCount367mo ago

3 more replies

Ratelman7mo ago

throwaway5257537mo ago

At this point it's pretty obvious that the easy scaling gains have been made already and AI labs are scrounging for tricks to milk out extra performance from their huge matrix product blobs:

-Reasoning, which is just very long inference coupled with RL

-Tool use aka an LLM with glue code to call programs based on its output

-"Agents" aka LLMs with tools in a loop

binary1327mo ago

nialv77mo ago

I wonder whether the markets will crash if gpt5 flops. Because it might be the model that cements the idea that, yes, we have hit a wall.

qsort7mo ago

nialv77mo ago

llama 4 basically (arguably) destroyed Meta's LLM lab, and it wasn't even that bad of a model.

1 more reply

benterix7mo ago

> massively below expectations

Well, the problem is that the expectations are already massive, mostly thanks to sama's strategy of attracting VC.

ben_w7mo ago

OpenAI's announcements are generally a lot more grounded than the hype surrounding them and their stuff.

e.g. if you look at Altman's blog of "superintelligence in a few thousand days", what he actually wrote doesn't even disagreeing with LeCun (famously a nay-sayer) about the timeline.

naveen997mo ago

Few thousands days is decades.

1 more reply

Imustaskforhelp7mo ago

Yeah, I guess it wouldn't be that big but it will have a lot of hype around it.

I doubt it can even beat opus 4.1

bkolobaraOP7mo ago

The actual announcement (now deleted on GitHub's blog): https://archive.is/IoMEg

billytrend7mo ago

Did they photoshop the screenshot from https://github.blog/changelog/2025-05-19-github-models-built... ? Other than the model id, it’s identical.

ukblewis7mo ago

I get that it looks suspicious, but here’s the archive link: https://archive.is/2025.08.07-035308/https://github.blog/cha...

netown7mo ago

> GPT-5 will have "enhanced agentic capabilities” and can handle “complex coding tasks with minimal prompting.”

this seems to be directly targeted at anthropic/claude, wonder if it leads anywhere or if claude keeps it's mystical advantage (especially with new claude models coming out this week as well).

> GPT-5 will have four model variants, according to GitHub...

ed_mercer7mo ago

Damn interns!

therodeoen7mo ago

they are comparing it to llama 4 and cohere v2 in the image…

nxobject7mo ago

Is the announcement implying that "mainline" GPT-5 is now a reasoning model?

> gpt-5: Designed for logic and multi-step tasks.

hobofan7mo ago

That doesn't imply reasoning and more likely expresses that it's focused on better tool calling capabilities.

blixt7mo ago

om87mo ago

Of course it is. GPT-5 is one of the most anticipated things in AI right now. To live up to the hype, it needs to be a reasoning model.

fnord777mo ago

sama posted a picture of the death star yesterday

j / k navigate · click thread line to collapse