Perhaps you remember that language models were completely useless at coding some years ago, and now they can do quite a lot of things, even if they are not perfect. That is progress, and that does give reason to extrapolate.
Unless of course you mean something very special with "solving programming".
IMO, they're still useless today, with the only progress being that they can produce a more convincing facade of usefulness. I wouldn't call that very meaningful progress.
Clearly, statistical models trained on this HN thread would output that sequence of tokens with high probability. Are you suggesting that a statement being probable in a text corpus is not a legitimate source of truth? Can you generalize that a little bit?
But for small personal projects? Yes, helpful.
x10 of zero is still zero, I guess.
LLMs can only give you code that somebody has written before. This is inherent. That's useful for a bunch of stuff, but that bunch won't change if OpenAI decides to spend the GDP of Germany training one instead of that of Costa Rica.
This is trivial to prove to be false.
Invent a programming language that does not exist. Describe its semantics to an LLM. Ask it to write a program to solve a problem in that language. It will not always work, but it will work often enough to demonstrate that they are very much capable of writing code that has never been written before.
The first time I tried this was with GPT3.5, and I had it write code in an unholy combination of Ruby and INTERCAL, and it had no problems doing that.
Similarly, giving it the grammar of a hypothetical language and asking it to generate valid text in that language, one that has never existed before, also works reasonably well.
This notion that LLMs only spit out things that have been written before might have been reasonable to believe a few years ago, but it hasn't been a reasonable position to hold for a long time at this point.
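For anyone who wants to try the experiment themselves, here's a hedged sketch of the kind of prompt involved; the mini-language ("Blub-77"), its rules, and the task are all made up on the spot, which is exactly the point:

    # Build a prompt describing a language that did not exist before this moment.
    # How you send it to a model (chat UI, API call, whatever) is up to you.
    SPEC = """
    Blub-77 is a stack language. Its only tokens are:
      push <n>          -- push the integer n
      dup               -- duplicate the top of the stack
      emit              -- pop the top of the stack and print it
      loop <k> { ... }  -- run the enclosed program k times
    """

    TASK = "Write a Blub-77 program that prints the number 3 four times."

    prompt = SPEC + "\n" + TASK
    print(prompt)

A correct answer (e.g. push 3, then a loop that dups and emits) can't simply be retrieved from the training data, because the language was defined in the prompt.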
This premise is false. It is fundamentally equivalent to the claim that a language model trained on the dataset ["ABA", "ABB"] would be unable, given the input "B", to generate the string "BAB" or "BAA".
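To make the toy concrete: even a character-level bigram model, about the simplest statistical model of text there is (and nothing like how an actual LLM is trained; this is purely illustrative), assigns nonzero probability to strings it never saw, just by recombining transitions it did see:

    from collections import defaultdict

    # Maximum-likelihood bigram model over the toy corpus.
    corpus = ["ABA", "ABB"]
    counts = defaultdict(lambda: defaultdict(int))
    for word in corpus:
        for a, b in zip(word, word[1:]):
            counts[a][b] += 1

    def prob(string):
        # Probability of each next character given the previous one.
        p = 1.0
        for a, b in zip(string, string[1:]):
            total = sum(counts[a].values())
            p *= counts[a][b] / total if total else 0.0
        return p

    print(prob("ABA"))  # 0.5 -- seen verbatim in training
    print(prob("BAB"))  # 0.5 -- never seen, yet probability > 0
    print(prob("BAA"))  # 0.0 under pure maximum likelihood; smoothing the counts would make this nonzero too

"BAB" never occurs in the data, but B->A and A->B both do, so the model composes them.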
Your claim here is slightly different.
You're claiming that if a token isn't in the support, it can't be output [1]. But we can easily get around this by adding minimal support for all tokens, so that a previously unseen token (the "C" in the example below) can appear, at least in theory. Such support addition shows up all the time in the AI literature [2].
[1]: https://en.wikipedia.org/wiki/Support_(mathematics)
[2]: In some regimes, like game-theoretic learning, support is baked explicitly into the solving algorithms during the learning stage. In others, like reinforcement learning, it's accomplished by making the policy a function of two objectives: an exploration objective and an exploitation objective. The fact that cross-pollination already occurs between LLMs in the pre-trained unsupervised regime and LLMs in the post-training fine-tuning regime (via forms of reinforcement learning) should make anyone versed in the ML literature hesitate to claim that such support addition is unreasonable.
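To give a feel for what "adding support" means in the simplest possible setting (this is just Laplace smoothing on a count model, an illustrative stand-in for the fancier mechanisms above):

    VOCAB = ["A", "B", "C"]

    def smoothed_prob(counts, context, token, alpha=1.0):
        # Add alpha pseudo-counts for every vocabulary token, so no token in
        # the vocabulary ever has exactly zero probability in any context.
        seen = counts.get(context, {})
        total = sum(seen.get(t, 0) for t in VOCAB)
        return (seen.get(token, 0) + alpha) / (total + alpha * len(VOCAB))

    counts = {"A": {"A": 5, "B": 3}}          # "C" never observed after "A"
    print(smoothed_prob(counts, "A", "C"))    # ~0.09: "C" is now in the support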
Edit:
Got downvoted, so I figure maybe people don't understand. Here is the simple counterexample. Consider an evaluator that gives rewards: F("AAC") = 1, all other inputs = 0. Consider a tokenization that defines "A", "B", "C" as tokens, but a training dataset from which the letter C is excluded but the item "AAA" is present.
After training, "AAA" exists in the output space of the language model, but "AAC" does not. Without support, without exploration, if you train the language model against the reinforcement learning reward model F, you may never gain the ability to output "C"; but with support, the sequence "AAC" can be generated and earn a reward. Now actually do this. You get a new language model. Since "AAC" was rewarded, it is now within the space of the LLM's outputs. Yet it doesn't appear in the training dataset, and there are many reward models F for which no person ever had to output the string "AAC" for the reward model to reward it.
It follows that "C" can appear even though "C" does not appear in the training data.
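Here is a toy, hedged version of that counterexample in code. A tabular three-token policy and a crude reinforce-every-rewarded-token update stand in for an actual LLM and its fine-tuning pipeline; the epsilon mix with a uniform distribution is the support/exploration being added:

    import random
    from collections import defaultdict

    VOCAB = ["A", "B", "C"]
    EPSILON = 0.1        # exploration mix: the added support
    LEARNING_RATE = 0.5

    def reward(seq):
        # The reward model F: only "AAC" pays off.
        return 1.0 if seq == "AAC" else 0.0

    # "Pretrained" preferences: the data contained "AAA" but never "C",
    # so the starting policy puts no weight on "C" at all.
    weights = defaultdict(float, {"A": 5.0, "B": 1.0, "C": 0.0})

    def sample_token():
        # Mix the learned policy with a uniform draw over the vocabulary.
        if random.random() < EPSILON:
            return random.choice(VOCAB)
        total = sum(weights[t] for t in VOCAB)
        r = random.random() * total
        for t in VOCAB:
            r -= weights[t]
            if r <= 0:
                return t
        return VOCAB[-1]

    for _ in range(2000):
        seq = "".join(sample_token() for _ in range(3))
        if reward(seq) > 0:
            # Reinforce every token of a rewarded sequence.
            for t in seq:
                weights[t] += LEARNING_RATE

    print(dict(weights))  # "C" now carries weight, despite never appearing in the data

Without the epsilon mix, "C" can never be sampled, never rewarded, and never learned; with it, "AAC" eventually gets drawn, rewarded, and pulled into the model's output distribution. Same data, different support.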
And secondly, what you say is false (at least if taken literally). I can create a new programming language, give its definition in the prompt, ask the model to code something in my language, and get something out. It might even work.
I literally just pointed out the same thing without having seen your comment.
Second this. I've done this several times, and it can handle it well. Already GPT3.5 could easily reason about hypothetical languages given a grammar or a loose description.
I find it absolutely bizarre that people still hold on to this notion that these models can't do anything new, because it seems implausible that they have actually tried, given how well it works.
A lot of that is because we use libraries for the 'done frequently before' code. I don't generate a database driver for my webapp with an LLM.
But how much of enterprise programming is 'get some data from a database, show it on a Web page (or gui), store some data in the database', with variants?
It makes sense that we have libraries for abstracting away some common things. But it also makes sense that we can't abstract away everything we do multiple times, because at some point it just becomes so abstract that it's easier to write it yourself than to try to configure some library. That doesn't mean it's not a variant of something done before.
Lots of programming doesn't have one specific right answer, but a bunch of possible right answers with different trade-offs. The programmer's job isn't necessarily just to get working code. I don't think we are at the point where LLMs can see the forest for the trees, so to speak.
Set rules on what’s valid, which most languages already do; omit generation of known code; generate everything else
The computer does the work, programmers don’t have to think it up.
A typed-language example to explain: generate valid func sigs
func f(int1, int2) return int{}
If that's the only func sig in our starting set, then it's obvious what counts as new: relative to our tiny starter set, func f(int1, int2, int3) return int{} is novel
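A hedged sketch of that idea (the single type, the arity cap, and the "known" set are made-up illustration, not a real tool): enumerate every signature the rules allow, subtract the ones already in the starting set, and whatever remains is novel by construction:

    from itertools import product

    TYPES = ["int"]            # keep the toy language to one type
    MAX_ARITY = 3
    known = {("int", "int")}   # the one starting sig: func f(int1, int2) return int{}

    for arity in range(1, MAX_ARITY + 1):
        for params in product(TYPES, repeat=arity):
            if params in known:
                continue       # omit generation of known code
            args = ", ".join(f"{t}{i + 1}" for i, t in enumerate(params))
            print(f"func f({args}) return int{{}}")

    # prints, among others: func f(int1, int2, int3) return int{}

The rules define the space, the known set gets subtracted, the computer does the rest.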
This Redis post is about fixing a prior decision of a random programmer. A linguistic decision.
That's why LLMs seem worse than programmers: we make linguistic decisions that fit social idioms.
If we just want to generate all the code this model has never seen before, we don't need a programmer. If we need to abide by the flexible, language-like laws, that's what a programmer is for: composing not just code, but code that complies with ground truth.
That antirez is good at Redis is a bias since he has context unseen by the LLM. Curious how well antirez would do with an entirely machine generated Redis-clone that was merely guided by experts. Would his intuition for Redis’ implementation be useful to a completely unknown implementation?
He’d make a lot of newb errors and need mentorship, I’m guessing.
Read the article; his younger self failed to see logic needed now. Add that onion peel. No such thing as perfect clairvoyance.
Even Yann LeCun’s energy based models driving robots have the same experience problem.
Make a computer that can observe all of the past and future.
Without perfect knowledge our robots will fail to predict some composition of space time before they can adapt.
So there's no probe we can launch that can survive forever, in general, on nothing more than our best guess at launch time.
More people need to study physical experiments and physics and not the semantic rigor of academia. No matter how many ideas we imagine there is no violating physics.
Pop culture seems to have people feeling starship Enterprise is just about to launch from dry dock.
Programming has become vastly more efficient in terms of programmer effort over decades, but making some aspects of the job more efficient just means all your effort is spent on what didn't improve.
No, I don't remember that. They are doing similar things now to what they did 3 yrs ago. They were still a decent rubber duck 3 yrs ago.