Programming with natural language is going to work (2010)

(writings.stephenwolfram.com)

95 points · mrleinad · 3y ago · 233 comments

119 comments · 9 top-level

armchairhacker3y ago· 64 in thread

Remember the phrase "developers spend 10% of the time writing code and 90% of the time debugging?"

Even if it isn't 90%, most developers like writing code more than debugging, so most would prefer to automate the latter.

AI translating natural language into code probably isn't as important as AI generating bug-free code and/or debugging its code. Even GPT-4 struggles with this: sometimes you point out a bug and it works, but sometimes it just can't find the issue and starts hallucinating even more as it gets confused.

Everyone's trying to train GPT models to write code, but maybe we should be training them how to use a debugger. Though it's a lot harder to map text generation to debugging...

Also, it's a bit ironic how one way to prevent bugs is using stronger type systems and formal methods. But, AI is particularly bad at formal methods. But maybe with a system like MCTS combined with much faster generation...

blibble3y ago

I have worked on one 30 year old system that was written by people that knew what they were doing

it was then maintained by at least 20 different people that had no idea what they were doing

very little of the altered logic makes any sense and 95% of the time consists of trying to reverse engineer what they were trying to do

then fix it up without breaking other parts of the codebase as many of the logic bugs interact with each other and end up cancelling out

with the garbage generated by LLMs... I suspect all development is going to turn into this

pydry3y ago

You mean one day rich executives are going to be in desperate need of and highly reliant upon skilled developers who can understand and maintain their mission critical systems?

Sounds awful. I was hoping for the future where we would all be out of a job.

thfuran3y ago

I was hoping for the future where I could get paid to work on something slightly less unpleasant than my most byzantine nightmare.

blibble3y ago

those changes were probably the most stressful thing I've ever done, with several billion dollars a day reliant on them, with absolutely no rollback possible

we had two new starters and one long term employee quit after being asked to work on that codebase

(and ended up having to do it myself at extremely short notice... worked though)

bryanrasmussen3y ago

check the wages for Cobol developers in your area recently?

on edit: https://www.zippia.com/cobol-programmer-jobs/salary/#

Guthur3y ago

My thought exactly, compounded by the fact that we will now have far fewer practiced coders, as they all generate the banal code they should be learning with.

nonethewiser3y ago

Or maybe it can help you understand the code

blibble3y ago

9 times out of 10 there is nothing to understand

the logic is simply broken

the_other3y ago

> Everyone's trying to train GPT models to write code, but maybe we should be training them how to use a debugger. Though it's a lot harder to map text generation to debugging...

AI lay-person, web developer here.

This hits at something that's been nagging at me for a few weeks. LLMs get labelled "intelligent", but they're "just" spitting out word/symbol patterns. They don't comprehend the meaning of the patterns (as best I have come to understand it from reading news).

Deeper than this, "intelligence" isn't rooted in words. It's rooted in experience, logic and their interaction. The words are a side-effect; they're a way to label the experiences.

We're not training LLMs with experiences, only a reference to experience had by others, which may or may not map to the experience of the interlocutor. The LLM has no experience and limited/no logic to wrestle those experiences into wisdom.

Debugging is a "pursuit of wisdom" activity. Statistics, which underpin LLMs, often identify truths (we wouldn't use them in science if they didn't), but they also hide truths. You need to clean data, take unusual slices of it, recombine it, and otherwise process it in order to uncover the true truths. I'm not sure LLMs can do that yet (if at all), and strongly doubt they can without _experience_ (sensory input, intention, failure etc).

Version4673y ago

> They don't comprehend the meaning of the patterns (as best I have come to understand it from reading news).

It is not obvious to me that this is true.

The argument against this is that a model has to develop an abstract understanding of the world in order to get better at predicting the next token.

You'll find different opinions as to whether this happens with today's LLMs, but it's definitely not a clear-cut issue.

ankurdhama3y ago

> The argument against this is that a model has to develop an abstract understanding of the world in order to get better at predicting the next token.

I think they only develop abstract understanding about the relationships of the words/group of words.

Ygg23y ago

> LLMs get labelled "intelligent", but they're "just" spitting out word/symbol patterns.

I think the word you are looking for is knowledgeable. They have a huge knowledge repository and can remix it in whatever way they please, but I don't think that if you gave them a novel problem (outside their realm of knowledge) they could solve it.

gaogao3y ago

The unicorn example from MSFT's Sparks of AGI paper, https://arxiv.org/abs/2303.12712, probably counts as a novel problem.

As another example, ImageBind, https://ai.facebook.com/blog/imagebind-six-modalities-bindin..., learned modality translations explicitly not in its training set such as text -> sound.

mattlutze3y ago

Most software devs would prefer a debugger, but most non-software devs would probably prefer a virtual developer.

If a technical product manager were able to generate the code needed for a feature using some combination of LLMs and CoPilots (even for proofs of concept, for example), the business would need to hire fewer developers and would save significant operating costs.

Extending it to your everyday person. Aunt Jane bought a new washing machine and wants to add it into her smart house. Either she's gotta call the niece to hook it all up, write the integration to Home Assistant, write automations for her iOS devices, etc, or, she asks her favorite LLM / Co-Pilot for help and it step-by-step walks her through it directly, with code, with individual troubleshooting.

There's business in both use cases, but the total addressable market is quite a bit bigger for one of them.

noirscape3y ago

Aunt Jane likely didn't even want or might not even have a smart house in the first place. That's perhaps the thing you're missing here - your average person doesn't give a shit if it's "smart", they just want things to work and the only difference for them between "it's smart" and "it's dumb" is if the remote they're fiddling with is their phone, an on-device button or a regular remote (as well as how likely it is that they get spied on externally).

Your average person's ideal amount of exposure to code is zero (as in, most would prefer to not even think about code). If the device isn't automated from the getgo in an easy way, then Aunt Jane has no smart device and no matter how easy it is to write the connection, she's not going to have it unless her niece shows up.

You're severely overestimating the desire for an everyday person to automate their home. Most people are perfectly fine manually flicking light switches and unless you live on a very specific schedule, often automation ends up falling short of practical reality, so even the majority of people who "automate" just end up moving the light switch to their phone.

ohwellhere3y ago

I think there are more roles out there that can benefit than people realize.

I know a cartographer who makes maps in some specialized software and then also has to occasionally interact with some official federally managed database. This person is not a software engineer and the database interaction has been historically painful in the office. This year ChatGPT just helped whip up some Python scripts.

Two hypotheses: 1) the “in house tooling” market is huge and will make great use of this tech; 2) that will contribute to pushing the floor lower and lower where more and more roles and people will take advantage of it.

auggierose3y ago

How about the washing machine just plugs itself seamlessly into her smart house, without Aunt Jane having to do anything other than a quick identity check on her phone? Probably needs to be an Apple washing machine, though.

crabmusket3y ago

You mean some kind of interoperability standard?? We'll invent AGI before that becomes a reality!

brokenkebaby3y ago

...with expensive quickly deteriorating cable

discreteevent3y ago

>it step-by-step walks her through it directly, with code

This is exactly where it completely breaks down. The AI might as well show her some hieroglyphics.

GuB-423y ago

I think the idea of replacing devs with LLMs is missing the point of what a developer does.

Yes, today, developers are people who know programming languages, that's something non-developers don't know, so teach an AI to code and you can spare a developer, right?

But no, what developers really do is tell the computer precisely what it should do; a programming language is simply the best way we have found to do it. Using GUIs, natural languages, etc... may feel less obscure to the layman, but the problem doesn't go away. In fact, it is often made worse because of the fuzziness of natural languages. There is a reason developers have been using programming languages for so long despite countless attempts to find alternatives: they simply haven't been matched for the task of telling a computer what to do precisely. And if something better is found, then developers will use that, and besides some syntactic technicalities, the job will be essentially the same.

cookieperson3y ago

So you postulate that the installation of every home appliance will have instructions for how to do so with 100% accuracy in Chatgpt? I don't know about you, but having read many installation instructions... I wouldn't trust most humans to get them right, and definitely not an LLM. Half of them refer to screws that don't exist...

bjourne3y ago

Spot the bug in the following C function for reading files:

    bool
    files_read(const char *path, char **data, size_t *size) {
        FILE *f = fopen(path, "rb");
        if (!f) {
            return false;
        }
        fseek(f, 0, SEEK_END);
        size_t n = (size_t)ftell(f);
        rewind(f);
        if (size) {
            *size = n;
        }
        if (!data) {
            goto ok;
        }
        *data = (char*)malloc(sizeof(char)*(n + 1));
        if (fread(*data, 1, n, f) != n) {
            free(*data);
            return false;
        }
        *data[n] = '\0';
     ok:
        fclose(f);
        return true;
    }

ChatGPT solves this task instantly. Many, even experienced C developers, would have trouble finding the bug. And the lack of error checking on malloc is not a bug.

golergka3y ago

Just for reference, I copied this snippet into GPT-4, and it generated this answer in 1.5 seconds:

> The issue in this code is related to memory access. The line *data[n] = '\0'; should be (*data)[n] = '\0';.

And then went on to generate corrected code and this explanation:

> The original expression, *data[n], first applies the array subscript operation, which is equivalent to *(data + n), and then tries to dereference the resulting pointer, which is not the intention here. This will likely lead to a segmentation fault because data is a pointer to pointer, and data[n] will try to access memory that has not been allocated.

> The corrected expression, (*data)[n], first dereferences data to get the allocated char array, and then applies the array subscript operation to set the null terminator at the correct location.

ChatGTP3y ago

Don't you think this type of bug has been covered at least 50x in online forums? I'd be very surprised if it hasn't.

I'm not trying to say this isn't impressive, but it seems fairly obvious and more like something a linter / analyzer would pick up?

thfuran3y ago

What does it do if you correct the bug and then ask it what the bug is?

Joker_vD3y ago

Not going to talk about the actual bug(s), but... why is it trying to NUL-terminate the buffer full of binary ("rb") data? Also, sizeof(char)? That's 1 by definition.

Actually, no, let's talk about the bug. What's with people's obsession of eliminating local temporaries and working straight with pointer-to-result?

        char *buff = malloc(n + 1);
        if (fread(buff, 1, n, f) != n) {
            free(buff);
            fclose(f);
            return false;
        }
        buff[n] = '\0';
        *data = buff;
      ok:
        fclose(f);
        return true;

This way, data won't contain a freed pointer when reading fails.

bjourne3y ago

"rb" because you don't want Windows to open the file in translated mode and munge its contents. Null-termination so that the same code can deal with both binary and text files. And I prefer all mallocs to follow the pattern malloc(sizeof(tp)*n) because it's less hassle than thinking about which types' sizes may vary. This rabbit hole is endless and we haven't even started covering my use of fseek/ftell/rewind!

pfdietz3y ago

You also forgot to close f when returning false.

raincole3y ago

> Also, it's a bit ironic how one way to prevent bugs is using stronger type systems and formal methods. But, AI is particularly bad at formal methods.

It kinda works tho. In my experience Copilot works much better with C# than Python, simply because I can write the signature of a function and then it generates the body.

(I know Python has type annotations too, but Copilot just isn't as smart as with C#. Perhaps because there isn't enough training data in typed Python?)

moffkalast3y ago

Hmm, so the endgame would be the most strongly defined and strict language so LLMs can immediately see and fix mistakes, but automate the tedium of writing it by having them as an intermediary?

raincole3y ago

The endgame is either the elimination of jobs or the elimination of human beings.

But yes, what you described matches my experience. I'd say Copilot benefits those "tedious but relatively strict" languages the most.

Turskarama3y ago

Training an AI to write unit tests (and refactor code to be more testable) would be a real game changer.

riffraff3y ago

Copilot does a decent job with "dumb" unit tests, e.g. "when the condition X is Y then Z".

But it obviously doesn't have knowledge of your domain, and I don't think any LLM could do a perfect job without that.

protortyp3y ago

I already use gpt4 for this. Works quite well if you show it a couple of reference unit tests.

toolslive3y ago

you can quite easily use a GPT to generate test examples. For example, "generate me 1000 customer support mails about ..."

byby3y ago

I don't think you get it. The game has already changed. And nobody trained the AI to write unit tests or refactor code. It could do it as emergent behavior.

It happened so fast that so many people are in denial and many people aren't even asking the obvious questions. Simply ASK chatGPT to refactor your code to be more unit testable and IT Can do it.

Of course it clearly has your typical LLM problems but it is roughly 50% there. The game has changed. If AI gets to 100%, not only will the game be changed, but humans will no longer be part of the game.

See what I asked chatGPT:

   ME:
   def (x: int):
      for i in range(x):
         print(i)

   Can you refactor the code so it can be more unit testable?


   CHATGPT:
   Certainly! To make the code more unit testable, we can refactor it into a function that returns a list of integers instead of directly printing them. Here's the refactored code:

   python

   def generate_number_list(x: int) -> List[int]:
       number_list = []
       for i in range(x):
           number_list.append(i)
       return number_list

   In this refactored code, we've created a function called generate_number_list that takes an integer x as input. Instead of printing each number, we use a number_list variable to store the generated numbers by appending them in the loop. Finally, we return the number_list.

   By returning the list instead of printing the numbers directly, we make the code more testable because we can now write unit tests that assert the correctness of the generated list.

It boggles my mind that I see threads of people saying "what if we train LLMs to do this or to do that?" or whatever. Guys, just ask the LLM to do what you want in plain English. It can do it.
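
A minimal sketch of what "more unit testable" buys here, assuming the refactored generate_number_list quoted above. The test function and its cases are hypothetical, chosen only for illustration, and the typing import is added because the quoted transcript omits it.

```python
from typing import List


def generate_number_list(x: int) -> List[int]:
    """Refactored function from the transcript: return the numbers instead of printing them."""
    number_list = []
    for i in range(x):
        number_list.append(i)
    return number_list


def test_generate_number_list() -> None:
    # Hypothetical test cases, not part of the original exchange.
    assert generate_number_list(0) == []
    assert generate_number_list(3) == [0, 1, 2]
    # A negative input yields an empty list because range(-1) is empty.
    assert generate_number_list(-1) == []


test_generate_number_list()
```

Because the function now returns a value, a test can compare it against an expected list; the original printing version would need its output captured first.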

barrkel3y ago

This is a fine, absolutely trivial, example. But LLMs are simply not all that.

IME GPT-4 can't write a bug-free 10 line shell script. It's particularly poor at inferring unstated requirements - or the need to elicit the same.

There's a general problem with LLMs: they're too eager to please. It shows up as confirmation bias. Embed a perspective in your prompt, and LLMs continue in the same vein.

You can, with careful prompting, try to provoke and prod the text generation into a more correct shape, but often it feels to me more like a game than productivity. I have to know the answer already to know how to ask the right questions and make the right corrections. So it feels like I'm supervising a child, and that I should be amazed it can do anything at all. And it is amazing; but for productivity outside tightly constrained environments (e.g. converting freeform dialogue into filling out a bureaucratic form - I think this is a close to ideal use case), I struggle to see it scaling up much, from what I've seen so far.

For creativity - e.g. making up a story for a child - it's not bad. One of my favourite use cases, after discovering how bad it is at writing code.

mjburgess3y ago

That isn't the game.

The game is designing software to requirements. It's writing literature for a new era. It's creating X for A audience with N vague unspecified needs -- where X is a complex product made of many parts, involving many people, with shifting and changing problems/solutions/requirements.

The game was never writing the stack overflow answer -- that was already written.

blibble3y ago

if it actually understood what it was doing it would tell you that that logic doesn't need a test as python has the range(x) functionality built-in

instead it generates a load of redundant boilerplate

if I saw a developer check that in I'd think they were incompetent

Marazan3y ago

You do get that that code is garbage and if any dev tried to check that code in they would get laughed out of the room right?

It is pure pablum. It is almost the perfect example of LLMs producing fluent vapid bilge.

cookieperson3y ago

So your current stance is: LLMs can't do everything yet, but don't bother thinking about extending its capabilities, just ask it, it can do everything? Fascinating...

Turskarama3y ago

Yeah, it does great on little toy examples.

What I would like to do is feed in my entire 50k line program and get something out.

toolslive3y ago

  >    ME: 
  >        f(x) = exp(cos(x))
  >
  >        what is f(0) ?
  >
  >    GPT-3.5:
  >        f(0) = 1

(note, it most often generates the correct result, but I've seen it do the above too)
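
For reference, the expected value can be checked directly: cos(0) = 1, so f(0) = exp(1) = e, roughly 2.718, not 1. A small check using only Python's standard math module:

```python
import math


def f(x: float) -> float:
    """The function from the prompt above: f(x) = exp(cos(x))."""
    return math.exp(math.cos(x))


value = f(0.0)
# cos(0) is 1, so the result should match Euler's number e.
print(value, math.isclose(value, math.e))
```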

crabmusket3y ago

This is a classic case of understanding programming as an activity that is primarily about text production. This is what Naur was arguing against in Programming As Theory Building [1].

Taking Naur's viewpoint, coding AI will be useful only to the extent that it assists programmers to build and employ their theory of the program. Or to itself develop a useful theory of a specific program, not just programs in general.

[1] https://gist.github.com/onlurking/fc5c81d18cfce9ff81bc968a7f...

Ygg23y ago

Honestly ChatGPT-4 sucks at writing code (if similar code isn't in its corpus of knowledge).

I asked it to write a YAML parser in pure Rust. The first thing it did was write this:

    pub enum YamlNode {
       Scalar(String),
       Sequence(Vec<YamlNode>),
       Map(HashMap<String, YamlNode>),
    }

This is what I expect a freshman in college to write as their first Yaml parser.

Literally unusable as a starting point.

I did try to correct it but it tripped up and somehow managed to fuck that up too.

hderms3y ago

Why does it suck, because it's doing a lot of allocations?

Ygg23y ago

No. You have a fair point, but not my biggest gripe. It's much more fundamental to what YAML is.

YAML allows YamlNodes as keys in a map (aka complex keys). E.g.

    ? [ a, b]
    : "complex key, scalar value"

You asked it for a YAML parser and it gave you a shitty JSON parser.

Also this is the most banal thing about YAML, something that should be obvious just browsing the examples. And I gave it a test suite and a link to the YAML spec.

Most work in YAML is around super obscure corner cases.
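
To illustrate the complex-key point, here is a minimal sketch in Python rather than Rust, with hypothetical class names (Scalar, Sequence, Mapping). It is not a parser, only one possible node model in which any node, not just a string, can be a mapping key.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Scalar:
    value: str


@dataclass(frozen=True)
class Sequence:
    items: Tuple["Node", ...]


@dataclass(frozen=True)
class Mapping:
    # Stored as (key, value) pairs so the mapping stays hashable
    # and any Node can serve as a key.
    pairs: Tuple[Tuple["Node", "Node"], ...]

    def get(self, key: "Node") -> "Node":
        for candidate, value in self.pairs:
            if candidate == key:
                return value
        raise KeyError(key)


Node = Union[Scalar, Sequence, Mapping]

# Models the example above:  ? [ a, b]  : "complex key, scalar value"
complex_key = Sequence((Scalar("a"), Scalar("b")))
doc = Mapping(((complex_key, Scalar("complex key, scalar value")),))
print(doc.get(Sequence((Scalar("a"), Scalar("b")))).value)
```

A map keyed only by strings, as in the generated enum, has no way to represent this document.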

pornel3y ago

Rust is 90% arguing with the compiler and 10% debugging. And yesterday GPT-4 solved a WTF lifetime issue for me pulling a trick I wouldn't have thought about.

Ygg23y ago

Probably because its corpus of knowledge includes someone solving it like that.

pixl973y ago

I'm not really a programmer, but when I write code for stuff I need this is how I operate. The novel idea I want to solve is the end goal, not the intermediate code that I'm gluing together that other people have generally created.

ChatGTP3y ago

What was the WTF lifetime issue?

pornel3y ago

I wanted to compare function pointers, but that's tricky due to unique function types, disallowed non-primitive casts, plus apparently not allowed for types with higher-ranked lifetimes (for<'a> fn(&'a)). GPT-4 came up with using a type alias (type FnComparable<'a>) that added necessary coercions and gave them equal lifetime.

api3y ago

Stronger type systems really do prevent bugs. In Rust I probably spend 20% of the time at most debugging. Of course you pay for that by the fact that it takes longer up front to write Rust because you have to think about it more, but overall I think it’s a net win.

You also get the bonus of nasty bugs like crash bugs and exploitable memory problems becoming almost entirely a thing of the past. You can ship new software you know is very reliable.

AnimalMuppet3y ago

I think that training GPT to use a debugger is going to be hard. For many debug problems, the amount of context you have to keep is going to be really hard for GPT.

> But, AI is particularly bad at formal methods.

AI? Or GPT? Earlier, reasoning-based attempts at AI seem like they would be a natural fit for formal methods.

ly3xqhl8g93y ago

Train them how to hallucinate the debugger and the world. Forget about running the debugger only for your program, simulate the entire stack, networking and databases included, for a true timeless debugger.

bravura3y ago

I’ve said it before and I’ll say it again. A time traveling retroactive debugger integrated with an LLM for interrogating what happened and why, would be awesome.

Hirrolot3y ago

I'm imagining an AI based on a deductive system rather than sequential text generation. This is roughly how "strong type systems" work, and so it might be simpler to map this model to formal methods.

By the way, if you spend 90% of time debugging your code, I think that's really sad. Either the programming language sucks, or the codebase sucks, or both. I probably spend 10-20% max of my time on debugging.

hnfong3y ago

The "strong type systems" work by having you design the types correctly in the first place. If designed correctly they work wonders, and 90+% of the time if the code compiles it's probably correct. That's a big "if" though, and most of your time writing "strong type systems" is coming up with the correct type structure.

The one thing worse than debugging code is debugging "types". Those 30-page C++ compiler errors are definitely worse than whatever I had to do with (for example) Python.

byby3y ago

>Everyone's trying to train GPT models to write code, but maybe we should be training them how to use a debugger. Though it's a lot harder to map text generation to debugging...

We can actually try this now. Literally tell the LLM what you want to do and work with it. See how far you can take it. You will of course be asking the LLM for debug line inputs and you will be providing it with outputs as you run the debugger yourself.

antihipocrat3y ago

Or just learn to write it ourselves? If it takes the same amount of time to work with the LLM as coding it from scratch then I'd prefer to improve my coding ability while I do the work.

Kostchei3y ago

With no experience in java, no coding for 30 years since pascal in high school, no previous use of git or github, and no hands-on experience of the azure stack... I stood up 4 static web apps that do things I want in my hobby in 4 weeks. The first one took 12 hours, including being shown git, installing npm, etc. The last one took me 40 minutes. They do things for me in D&D that I have wanted for 20 years, and now that capability is accessible. The whole monster manual is ingested into a level-, terrain- and faction-based encounter system that gives ranges and features for the encounter, i.e. a battle map. It scales encounters to suit the party at any level, themed to the terrain and dominant faction. The best thing about an MMO, but for 5th ed D&D.

Did I learn a bit of java and css and git?- sure, but I was up and running in about 4 hours with a mvp for my 1st one. There is NO way I could "learn" that in that timeframe. I just asked chatGPT 4 how to do it, and it told me. When I didn't know how to commit, it told me (actually I didn't even know the concept). It held my hand every step of the way.

I didn't need to learn something first, I just did it. And I have started doing it at work. "hmm 4 GB of fortinet logs in 20 files of gzip on mac.. how do I find a host name in that? - chatgpt.. oh- 1 line of zgrep.. never heard of it- hey it works.."
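
For comparison, the same kind of search can be sketched with Python's standard gzip module. This is an illustration only, not what ChatGPT produced; the function name, file name, and sample log lines are made up.

```python
import gzip
import os
import tempfile
from pathlib import Path
from typing import Iterator, Tuple


def find_in_gzip_logs(directory: str, needle: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (file name, line number, line) for every line containing needle."""
    for path in sorted(Path(directory).glob("*.gz")):
        # "rt" decompresses on the fly and decodes to text.
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if needle in line:
                    yield path.name, line_number, line.rstrip("\n")


# Self-contained demo with an invented two-line log file.
demo_dir = tempfile.mkdtemp()
with gzip.open(os.path.join(demo_dir, "fw.log.gz"), "wt", encoding="utf-8") as out:
    out.write("host=alpha action=accept\nhost=beta action=deny\n")

hits = list(find_in_gzip_logs(demo_dir, "host=beta"))
print(hits)
```

The files are streamed line by line, so the logs never have to be unpacked to disk first.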

admittedly, I am bathed in tech, been hanging around folks talking about projects for years. But NOW I can execute- the problem? When it hits about 500 lines of java- maybe 10 functions, it is too big to drop into the prompt to debug and I don't know enough to fix myself. Solution, make smaller apps, get them working, create data files to reference in json, chain them together. eh, not perfect, but good enough for hobby.

Beware- fools like me who know nothing will be bringing code to production near you soon. Cool that you like to learn stuff, but syntax bores the crud out of me, each to their own, I'm just going to make. I find it more satisfying. Terrifying that code born like mine will end up in someone's prod, but it will.

byby3y ago

I'm just saying the tech is already here. The core engine can do it.

Before you go on and write such a system it's better to test if the LLM can do debugging to an efficacy level that we require. I don't think anyone has tried this yet and we do know LLMs have certain issues.

But make no mistake, the possibility that an LLM knows how to debug programs is actually quite high. If it can do this: https://www.engraved.blog/building-a-virtual-machine-inside/ it can likely debug a program, but I can't say definitively because I'm too lazy to try.

Forgotthepass83y ago

I fiddled around with some things on the weekend (i am not a programmer, i actually hate it so using LLMs is great for me - us EEs always write awful code) to automatically create a debug file of any output that gets a traceback and create a standard report using pdb, inspect, etc (never used them before) regarding the functions, parameters and variables, current state etc etc.
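
A rough sketch of such a traceback report, using only the standard traceback module rather than pdb or inspect. The names crash_report and divide are hypothetical, and this is one possible shape for the idea, not the script described above.

```python
import traceback
from typing import Any, Callable, Dict, List


def crash_report(func: Callable[..., Any], *args: Any, **kwargs: Any) -> List[Dict[str, Any]]:
    """Run func; on an exception return one entry per stack frame, else an empty list."""
    try:
        func(*args, **kwargs)
    except Exception as exc:
        report = []
        # Skip the first traceback entry, which is this wrapper's own frame.
        inner = exc.__traceback__.tb_next if exc.__traceback__ else None
        for frame, lineno in traceback.walk_tb(inner):
            report.append({
                "function": frame.f_code.co_name,
                "line": lineno,
                "locals": {name: repr(value) for name, value in dict(frame.f_locals).items()},
            })
        return report
    return []


def divide(numerator: float, denominator: float) -> float:
    return numerator / denominator


report = crash_report(divide, 1, 0)
for entry in report:
    print(entry["function"], entry["locals"])
```

The resulting list of dictionaries could be written to a file or pasted back into a prompt.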

Though i was surprised i can't easily run pdb instance via a python program, still have to use stdin/out apparently.

Next i want to implement automerge (or semiautomerge) between different outputs which e.g. contain variants of the same function to automatically resolve issues spawned from the model forgetting. That's so annoying

I also suspect a lot of issues are due to the training data being on old SW. I think we can automatically remap this with whitelisted functions and parameters (i recall inspect can do this), blacklisted ones from old versions NOT present in the current one, and maybe a transformation between the two -- or automatically regenerate if it's wrong, maybe with a modification to the prompt.
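
One way the whitelist check might look with inspect.signature, as a sketch under assumptions: resize is a hypothetical stand-in for a library function in its installed version, and unsupported_keywords is an invented helper name.

```python
import inspect
from typing import Any, Callable, Dict, List


def unsupported_keywords(func: Callable[..., Any], keywords: Dict[str, Any]) -> List[str]:
    """Return the keyword names that func's current signature does not accept."""
    parameters = inspect.signature(func).parameters
    # A **kwargs parameter accepts any name, so nothing can be ruled out.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in parameters.values()):
        return []
    accepted = {
        name
        for name, p in parameters.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    return sorted(name for name in keywords if name not in accepted)


def resize(image: str, width: int, height: int) -> str:
    """Hypothetical stand-in for the currently installed library function."""
    return f"{image}@{width}x{height}"


# A generated call that uses "size", a parameter this version does not have.
print(unsupported_keywords(resize, {"width": 10, "size": (10, 10)}))  # ['size']
```

A non-empty result could trigger the regeneration step without ever running the generated code.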

Also talking to it in other languages generates massively different code (i used deepl), so i had the crazy idea of spawning Docker containers and just letting this automatic/semiautomatic troubleshooting, plus parallel generation of lots of functions using wildly different inputs (and models), brute force the problem of having to code

I do need to look into a nice terminal interface for N-way merges and parallel gen monitoring.

The most useful thing for me was making some vim keybinds and scripts to automatically grab code blocks, run them and quickly regenerate. You can literally just tell it "DF" and it fixes a pandas issue sometimes

The holy grail will probably be local fine tunes/LoRAs for specific issues or libraries, since it only costs a few $ for one. Sign me up for an expert plotly AI in a box for neat plots please

Edit : i also have literally no idea what I'm doing either, but linting and analyzing generated Code blocks could help expedite this whole process as well. And in principle you don't even have to run it if you know the type is wrong or something.

I don't know what this is called but computer science is ostensibly mathematics so i assume/hope there is some rigor here

omginternets3y ago· 23 in thread

Sure, if the natural language we're talking about is formalized to the point of basically being math. Let's not confuse that with "everyone will be able to program", though.

speed_spread3y ago

It'll be the same way everybody can program with Excel right now. Things will work fine, until they don't anymore, at which point a professional will be brought in to help fix "the bug". Obviously, that bug will be the obvious result of the whole thing being a horrible mess and will be unfixable without a proper rewrite.

_Maybe_ the AI will be able to help with the rewrite, generating test cases and translating business logic, which would be nice.

qumpis3y ago

I think many people give up too quickly (as could be perceived by those professionals who had to sweat when learning how to put things back together when they break). But a capable language model might serve as a debugger: "you're stuck? About to call a professional? No worries, explain to me what you see and let me ask you questions in order to find a fix together"

noisy_boy3y ago

I think it'll evolve into the AI Assistant being an always-on/always-available app/service/client (ala Siri/Alexa) coupled with a much more powerful online service. Advantage is that once the client is installed, it has access to practically everything the user is doing (which is what the tech companies dream of) and can actually ingest the inputs without the user having to summarize/translate into a text box (problem with Excel? AI assistant can read the excel sheet that has the issue so you can just ask questions).

I also think that the main two reasons behind Alexa/Siri not seeing very high adoption rates are misinterpretation of the voice commands + quality of results. With LLMs, at least the first issue should improve dramatically and if the popularity of ChatGPT is anything to go by, the second issue should also see improvements.

cableshaft3y ago

I'd much rather program with the help of ChatGPT 4, as it is right now, over programming with Excel (I've done both). And at least on a smaller scale, the code generated by ChatGPT hasn't been a mess at all.

Sometimes it's incorrect or insufficient and it can't handle the more complex solution I'm asking of it, but the code isn't messy and doesn't require me to rewrite it from scratch, just fix what's incorrect.

6gvONxR4sf7o3y ago

I'd love if the ambiguities could be a dialogue of question/answer, rather than being fully specified ahead of time like we generally have programming today. It seems much more efficient.

dools3y ago

That's what programming with ChatGPT is like.

2 more replies

quickthrower23y ago

Sounds like agile (small a) as opposed to waterfall.

grt_thr3y ago

The ambiguity is exponential. I wish that the people hyping LLMs would read the older literature on sentence parsing.

The only reason why people are so impressed is that ChatGPT sometimes gives better results than Google. Which just ought to tell you how bad Google has gotten.

wolfgang423y ago

Well, programmers provide a natural language interface and somehow we usually manage the ambiguity and complexity OK.

In my experience, a lot of support requests for bespoke/in-house software go like this:

> User: Why is my wibble being quarked? This shouldn’t be happening!

> Dev: Wibble ID, please?

> User: ID 234567. This is terrible!

> Dev: [rummages in git blame] Well, this wibble is frobnicated, and three years ago [links to Slack thread] you said that all frobnicated wibbles should be automatically quarked.

> User: Yes, but that was before we automated the Acme account. We never frobnicate their wibbles!

> Dev: ...so, is there a way for me to tell if a client wants their wibbles unfrobnicated, or should I hard-code an exception for Acme?

(And then, six months later: “Why are none of Acme’s wibbles being frobnicated automatically?”)

If you could introduce an AI assistant that could answer these questions instantly (instead of starting with a support ticket), it’d cut the feedback loop from hours or days down to seconds, and the users (who are generally pretty smart in their field, even if my frustration is showing above) would have a much better resource for understanding the black box they’ve been given and why it works the way it does.

2 more replies

6gvONxR4sf7o3y ago

But that's exactly the point. The game of 20 questions is exponential as well. To uniquely identify a thing, the more specific you go, the level of precision needed to be unambiguous blows up. However, as a dialogue, you don't have to fully spec out every branch of the tree ahead of time. They ask a yes or no question and the ambiguity decreases exponentially for every question asked.

By having a dialogue, you can resolve only the ambiguities pertinent to the specific question at hand.
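As a rough sketch of that halving argument, assuming an idealized dialogue in which every yes/no answer splits the remaining candidates evenly (the function names below are made up for illustration):

```python
def questions_needed(num_candidates: int) -> int:
    """Yes/no questions needed to single out one item,
    assuming each answer halves the remaining candidates."""
    if num_candidates < 1:
        raise ValueError("need at least one candidate")
    # Integer form of ceil(log2(n)), avoiding float rounding.
    return (num_candidates - 1).bit_length()


def identify(secret, candidates):
    """Find `secret` in a sorted sequence by repeatedly asking
    'is it in the lower half?'. Returns (item, questions_asked)."""
    lo, hi = 0, len(candidates)
    asked = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        asked += 1
        if secret < candidates[mid]:  # answer: "yes, lower half"
            hi = mid
        else:                         # answer: "no, upper half"
            lo = mid
    return candidates[lo], asked


if __name__ == "__main__":
    print(questions_needed(1_000_000))          # about twenty questions
    print(identify(234567, range(1_000_000)))
```

Only the branches the answers actually lead down are ever explored; the rest of the tree never has to be specified.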

qumpis3y ago

There's no need to detail how individual bricks of a house will be laid out when discussing the overall plan of it. Current LLMs, from my experience, don't branch out too well when facing ambiguity, but rather pick the most likely answer consistent with the history. But it's imaginable that these concerns will be addressed once systems start maximizing the returns over whole conversations and not just individual interactions.

1 more reply

cookieperson3y ago

Google search has eroded a lot over the years. I think some of that is scale, but also adversarial SEO listings, etc. The biggest jump in its degradation I've noticed happened around the time ChatGPT launched after its beta. Now every major search engine has an LLM product... Has had me wondering...

Paul-Craft3y ago

Yeah. I actually mis-parsed the headline myself, before noting the date. I was thinking "is going to work" meant it would be happening in more and more workplaces.

bee_rider3y ago

I guess this will open up programming to anyone who can handle the math required to describe the behavior of a program. Unfortunately it is not even clear to me that this population is anywhere near as large as the existing population of programmers.

shrimp_emoji3y ago

Isn't ChatGPT (or some now not that far fetched futuristic version that's superior) a 5GL? You ask it for a program in English, and it codes it for you. You've just coded in natural language.

goatlover3y ago

You still have to make sure the code runs correctly and incorporate it into the rest of the code base. You also need to have some idea of what sort of code you're asking for, particularly if it has to fit with existing code.

captnObvious3y ago

Check out chatgpt code interpreter. One step down…

quickthrower23y ago

It can do that if constrained. There was a show HN of an app generation tool that was quite impressive: perhaps because it only created web apps and didn’t worry about stylistic design.

takeda3y ago

I would love to see somebody writing a compiler that way.

n8cpdx3y ago

https://medium.com/@byanofsky/chatgpt-helped-me-build-a-comp...

ilaksh3y ago

The original version of my site aidev.codes was a bit like that. The concept was to keep refining the spec and each time it would regenerate starting with whatever section had been modified. There was even a way to clone an "app spec" like with Codepens.

People generally didn't seem very interested. Also the bigger problem was that the gpt-3.5 models really wouldn't return the same thing consistently so a minor change could throw other things off.

Also a conversation feels like a much more intuitive way of doing it. So I switched to that.

geysersam3y ago

I don't see why you need a language "formalized to the point of math" to instruct a computer to follow simple instructions (Write a crud API for this and this, it should have these endpoints etc.)

Of course there will be ambiguity, but you could say anything higher level than assembly is also ambiguous, but that's not usually a problem.

I think it's reasonable to expect that more people will be able to program if this becomes reality. Just as going from assembly to c allowed more people to program.

cableshaft3y ago

Eh, I was using it to help generate animated backgrounds with various custom geometry and animations tonight for a game of mine, and it performed admirably but it was struggling a bit.

I wouldn't mind if it had a bit more formalization to the point of math to it for those bits. Was still good enough that I got some usable options from it, though.

lcuff3y ago· 9 in thread

I suspect as things evolve with the Large Language Models, there will be integration with existing computer languages and frameworks. That the ability to say "Create a web site using language X with framework Y" will become a reality. This ability to get Hello World done in 5 minutes instead of 1/2 a day to walk through an intro book/tutorial, well, that's a win.

Then, down the road, each AI might have a preferred/default language, technology and framework. Quite possibly newly created. This has a parallel to intermediate representations (the earliest of which I'm aware of is p-code in UCSD Pascal). But it is also analogous to compiling C to assembler, and then machine code. Similarly with Java. Wolfram is correct when he talks about needing it to be a representation that the creator can inspect and verify. Not necessarily the least bit easy with a complex project. It will be an engineering journey, but it does spark in me the hope that English (or anyone's native language) becomes the high level language of choice for guiding machines in tasks. Mathematica? Humph. Stephen Wolfram is very pleased with things he's thought of or perhaps synthesized, but I'm gonna say, I think Mathematica is not the generic solution we will want.

nordsieck3y ago

> I suspect as things evolve with the Large Language Models, there will be integration with existing computer languages and frameworks. That the ability to say "Create a web site using language X with framework Y" will become a reality.

So, I think a key challenge is that modern programming languages do 2 things simultaneously.

1. They provide direct instruction to computers.

2. They document precise human intent.

Even if the need for part 1 goes away, part 2 will always be with us. I would expect that if natural language programming becomes a thing, that a dialect forms - akin to legalese - that embodies best practices for precisely documenting human intent.

galleywest2003y ago

> I would expect that if natural language programming becomes a thing, that a dialect forms - akin to legalese - that embodies best practices for precisely documenting human intent.

Sort of like how the SQL syntax seems designed for business admin folks to use, but us programmers ended up using it.

2 more replies

tbrownaw3y ago

> This ability to get Hello World done in 5 minutes instead of 1/2 a day to walk through an intro book/tutorial, well, that's a win.

So a bit like `create-react-app` or `dotnet new` or such, but now powered by AI?*

lcuff3y ago

Well, I haven't used either of those, so I can't comment on a comparison ... but the point is less 'powered by an ai' and more the fact that I'm using English, not the jargon of create-react-app or dotnet new or the rails version or the django version ... Right now, my experience programming with an AI is that I'm much better off asking for subroutines to do this and that, not ask for a complete architecture. Mostly for debugging reasons: When it doesn't work, I want the debugging to be manageable. Getting the AI to debug its own code has not been a productive path for me, but hopefully soon.

dboreham3y ago

> This ability to get Hello World done in 5 minutes...

This has existed since at least 1992 (Visual C++) and probably before that (Borland Pascal?).

lcuff3y ago

Well, I'm thinking more of a web site Hello World than a C++ Hello World. The original point of Hello World (per Kernighan and Ritchie) was to ensure that all the tools were installed (the compiler itself wasn't a given) and in your path and configured correctly. So yeah, I would say even in 1988 (when I started programming in C) there's a way in which Hello.c was 5 minutes. But when I started with Rails (years later), it was more like 1/2 a day, and the Hello World notion had expanded in scope by quite a bit. I didn't spell out the _kind_ of Hello World I had in mind. My bad.

riceart3y ago

Web hello world in rails is literally a single command and seconds.

1 more reply

65103y ago

Front end js is like the nest we've been building for it - not knowing why.

byby3y ago

>Create a web site using language X with framework Y

Not sure if you played with chatGPT, but you can ALREADY ask it to do this and it will give you something that works 50% of the time.

dboreham3y ago· 5 in thread

We've been doing this for 200+ years. All the programming languages we use, and mathematical notation, are based in some way upon natural language concepts (yes even FP languages). So really we're talking about a continuum. And also worth noting that human languages evolve over time. So both programming languages will evolve to be more natural, and natural language will evolve to be more machine-parsable.

gwoolhurme3y ago

Yeah. The first time I saw Haskell written out, it felt like a blending of formal math and programming. I was really excited at how it looks like it could have come from my discrete math book. I don't think programming in english is a good idea, but like you said, we will probably get something that looks more like formal languages, because that's what we've always done.

grt_thr3y ago

Mathematics as shown in textbooks is not rigorous. I don't understand why so many people fetishize something they saw at university.

Formalised mathematics is incomprehensible to humans and orders of magnitude longer than anything you can see in a textbook or mathematical paper outside automated theorem proving.

2 more replies

behnamoh3y ago

> So both programming languages will evolve to be more natural, and natural language will evolve to be more machine-parsable.

Hard disagree. Is Rust similar to natural language? Has English become more similar to <pick the easiest programming language>?

gwoolhurme3y ago

Just to play with the logic a bit. I could see natural language coming up with formal templating to get something closer to a programming language. Not that how we speak will change, but more like how the way a haiku works is formalized. Similarly to how every CS1 class in university has people who do the "write the steps in making a sandwich" exercise. That exercise is actually easier in a programming language, because of the nature of what a programming language is: it's instructions in steps at the end of the day. So perhaps something is formalized that way? Also most modern languages have been slowly adding in things that are easier to read from a natural language perspective vs. classical C-style syntax. Rust is very C-like but still has plenty of FP-like functions built in too.

EamonnMR3y ago

The way you're using brackets sure resembles Backus-Naur form.

dusted3y ago· 4 in thread

One way to view a programming language, that differs from the traditional math-oriented perspective, is that a programming language is a subset of a natural language which has become sufficiently unambiguous.

Ambiguity is in my opinion the biggest reason why "plain {insert natural language} programming won't be a thing".. The major challenge in normal commercial software development is not writing working code, it's aligning expectations and identifying assumptions and eliminating ambiguities.. At that point, the coding part is more or less mapping the description to whatever particular words the implementation language is using.

fsflover3y ago

> Ambiguity is in my opinion the biggest reason why "plain {insert natural language} programming won't be a thing"

How about Ithkuil? https://news.ycombinator.com/item?id=36022731

classified3y ago

You're joking. Even the inventor doesn't speak it fluently.

fsflover3y ago

I wasn't talking about fluency, just about possibility to use it for programming.

thomastjeffery3y ago

Programming languages don't entirely remove ambiguity: they isolate themselves from it.

Everything in a program is unambiguous, but no useful program stays entirely within its own black box.

Ambiguity still exists between programs. This ambiguity is the source of incompatibility.

dang3y ago· 4 in thread

Discussed at the time:

Generating code from natural language is closer than you think - https://news.ycombinator.com/item?id=1912530 - Nov 2010 (80 comments)

rexsteroxxy3y ago

From the top comment:

> I teach computer science and have a particular fondness for introductory CS. The reason Stephen Wolfram is wrong, wrong, wrong about this is that people that have never been taught programming can't express themselves precisely enough in their native language, either; and even among those of us that have been programming for decades, when we express ourselves in natural language we can be very precise but it takes a lot more work and becomes a lot more unwieldy than just writing out our instructions in [pseudo]code.

I used to think the same thing, but what changed my mind was the noticeable increase in precision that came with ChatGPT 4. Before I felt like a monkey - now I actually get things done the way "I intended".

cookieperson3y ago

So you can go to ChatGPT and say, "I want to make a new search engine that is better than my competitors in performance, resource consumption, and cost. It must be the best in class for relevant searches, using a new hyper efficient search algorithm and data structures. It must have a pay to use API so I can sell ad space. It needs to have a marketable name with best selling branding. I also would like this search engine to have email, an online multiuser Microsoft Word clone with the fastest live editing feature. This platform must be written in Terraform, HTML, Go, and Node.js using trusted dependencies and be completely secure using best practices. I also need end to end tests for the deployment of this product, dashboards to monitor its uptime, performance, revenue, etc in AWS. My initial budget is 100k. Given the technical realities of my request also provide a summary for any limitations that arose, and technical specifications of this product."

Dang dude, why aren't you a billionaire competing with Google, bing, etc tomorrow?

WastingMyTime893y ago

Talk of moving the goal post.

People are having a discussion about the ability of LLM to generate correct code from imprecise natural language descriptions and how good GPT4 was at that and your rebuke is that it can't do cutting edge algorithm research and an unnecessary ad hominem.

1 more reply

andromaton3y ago

From the second comment: "I would be fascinated to see several hundred years down the road how natural languages and computer languages have commingled and evolved into something new. I'd be inclined to believe that bringing natural language to computers won't just be a one-way street."

Order update: Your wish is arriving hundreds of years early.

heikkilevanto3y ago· 1 in thread

A long time ago they tried to develop a system where the users could explain things in "plain" English instead of the difficult codes used so far. It was called COBOL.

As has already been pointed out, 90% of time is spent on debugging and modifying existing code, not writing new stuff. And of the 10% of coding, 20% is writing the happy path, and 80% is spent in handling errors, corner cases, input validation, and inconsistent domain rules.

Maybe we can have AI tools to help with all this, but there is still a long way to go. And when we get there, it will still take professional developers to use those tools, and to understand all the special cases.

moffkalast3y ago

> It was called cobol.

Taking something that's lower level than C and replacing characters with words is not exactly what one'd call a plain English interface, it's just being obnoxiously verbose.

There's a stark difference between trying to somehow half-assedly hardcode this into a language, and having a language that is designed for debugging only, fairly strict in handling corner cases, and then having a natural language interface on top of it so nobody actually has to write the cancer that it likely ends up being.

pyrale3y ago

It seems like this article is an advertisement for Wolfram Alpha's then-new feature. It offers little to support the headline thesis, aside from showing off these features.

makz3y ago

This is not new. There is something called rhetorical algebra.

Rhetorical Algebra

It was developed by the ancient Babylonians: equations were written out in words, a form that remained in use up to the 16th century.

Example: x + 5 = 8 is written as "The thing plus five equals eight".

armchairhacker3y ago· 64 in thread

Remember the phrase "developers spend 10% of the time writing code and 90% of the time debugging?"

Even if it isn't 90%, most developers like writing code more than debugging, so most would prefer to automate the latter.

Everyone's trying to train GPT models to write code, but maybe we should be training them how to use a debugger. Though it's a lot harder to map text generation to debugging...

blibble3y ago

I have worked on one 30 year old system that was written by people that knew what they were doing

it was then maintained by at least 20 different people that had no idea what they were doing

very little of the altered logic makes any sense and 95% of the time consists of trying to reverse engineer what they were trying to do

then fix it up without breaking other parts of the codebase as many of the logic bugs interact with each other and end up cancelling out

with the garbage generated by LLMs... I suspect all development is going to turn into this

pydry3y ago

You mean one day rich executives are going to be in desperate need of and highly reliant upon skilled developers who can understand and maintain their mission critical systems?

Sounds awful. I was hoping for the future where we would all be out of a job.

thfuran3y ago

I was hoping for the future where I could get paid to work on something slightly less unpleasant than my most byzantine nightmare.

blibble3y ago

those changes were probably the most stressful thing I've ever done, with several billion dollars a day reliant on them, with absolutely no rollback possible

we had two new starters and one long term employee quit after being asked to work on that codebase

(and ended up having to do it myself at extremely short notice... worked though)

bryanrasmussen3y ago

check the wages for Cobol developers in your area recently?

on edit: https://www.zippia.com/cobol-programmer-jobs/salary/#

1 more reply

Guthur3y ago

My thought exactly, compounded by the fact that we will now have far fewer practiced coders, as they all generate the banal code they should be learning with.

nonethewiser3y ago

Or maybe it can help you understand the code

blibble3y ago

9 times out of 10 there is nothing to understand

the logic is simply broken

the_other3y ago

> Everyone's trying to train GPT models to write code, but maybe we should be training them how to use a debugger. Though it's a lot harder to map text generation to debugging...

AI lay-person, web developer here.

Deeper than this, "intelligence" isn't rooted in words. It's rooted in experience, logic and their interaction. The words are a side-effect; they're a way to label the experiences.

Version4673y ago

> They don't comprehend the meaning of the patterns (as best I have come to understand it from reading news).

It is not obvious to me that this is true.

The argument against this is that a model has to develop an abstract understanding of the world in order to get better at predicting the next token.

You'll find different opinions as to whether this happens with today's LLMs, but it's definitely not a clear-cut issue.

ankurdhama3y ago

> The argument against this is that a model has to develop an abstract understanding of the world in order to get better at predicting the next token.

I think they only develop abstract understanding about the relationships of the words/group of words.

Ygg23y ago

> LLMs get labelled "intelligent", but they're "just" spitting out word/symbol patterns.

gaogao3y ago

The unicorn example from MSFT's Sparks of AGI paper, https://arxiv.org/abs/2303.12712, probably counts as a novel problem.

As another example, ImageBind, https://ai.facebook.com/blog/imagebind-six-modalities-bindin..., learned modality translations explicitly not in its training set such as text -> sound.

mattlutze3y ago

Most software devs would prefer a debugger, but most non-software devs would probably prefer a virtual developer.

There's business in both use cases, but the total addressable market is quite a bit bigger for one of them.

noirscape3y ago

ohwellhere3y ago

I think there are more roles out there that can benefit than people realize.

auggierose3y ago

crabmusket3y ago

You mean some kind of interoperability standard?? We'll invent AGI before that becomes a reality!

brokenkebaby3y ago

...with expensive quickly deteriorating cable

1 more reply

discreteevent3y ago

>it step-by-step walks her through it directly, with code

This is exactly where it completely breaks down. The AI might as well show her some hieroglyphics.

GuB-423y ago

I think the idea of replacing devs with LLMs is missing the point of what a developer does.

Yes, today, developers are people who know programming languages, that's something non-developers don't know, so teach an AI to code and you can spare a developer, right?

cookieperson3y ago

bjourne3y ago

Spot the bug in the following C function for reading files:

    bool
    files_read(const char *path, char **data, size_t *size) {
        FILE *f = fopen(path, "rb");
        if (!f) {
            return false;
        }
        fseek(f, 0, SEEK_END);
        size_t n = (size_t)ftell(f);
        rewind(f);
        if (size) {
            *size = n;
        }
        if (!data) {
            goto ok;
        }
        *data = (char*)malloc(sizeof(char)*(n + 1));
        if (fread(*data, 1, n, f) != n) {
            free(*data);
            return false;
        }
        *data[n] = '\0';
     ok:
        fclose(f);
        return true;
    }

ChatGPT solves this task instantly. Many, even experienced C developers, would have trouble finding the bug. And the lack of error checking on malloc is not a bug.

golergka3y ago

Just for reference, I copied this snippet into GPT-4 and it generated this answer in 1.5 seconds:

> The issue in this code is related to memory access. The line *data[n] = '\0'; should be (*data)[n] = '\0';.

And then went on to generate corrected code and this explanation:

> The corrected expression, (*data)[n], first dereferences data to get the allocated char array, and then applies the array subscript operation to set the null terminator at the correct location.

ChatGTP3y ago

Just wondering: do you think this type of bug hasn't been covered at least 50x in many online forums? I'd be very surprised.

I'm not trying to say this isn't impressive, but it seems fairly obvious and more like something a linter / analyzer would pick up?

2 more replies

thfuran3y ago

What does it do if you correct the bug and then ask it what the bug is?

Joker_vD3y ago

Not going to talk about the actual bug(s), but... why is it trying to NUL-terminate the buffer full of binary ("rb") data? Also, sizeof(char)? That's 1 by definition.

Actually, no, let's talk about the bug. What's with people's obsession of eliminating local temporaries and working straight with pointer-to-result?

        char *buff = malloc(n + 1);
        if (fread(buff, 1, n, f) != n) {
            free(buff);
            fclose(f);
            return false;
        }
        buff[n] = '\0';
        *data = buff;
      ok:
        fclose(f);
        return true;

This way, data won't contain a freed pointer when reading fails.

bjourne3y ago

1 more reply

pfdietz3y ago

You also forgot to close f when returning false.

raincole3y ago

> Also, it's a bit ironic how one way to prevent bugs is using stronger type systems and formal methods. But, AI is particularly bad at formal methods.

It kinda works tho. Anecdotally, Copilot works much better with C# than Python simply because I can write the signature of a function and it generates the content.

(I know Python has type annotations too, but Copilot just isn't as smart as with C#. Perhaps because there isn't enough training data in typed Python?)

moffkalast3y ago

Hmm, so the endgame would be the most strongly defined and strict language so LLMs can immediately see and fix mistakes, but automate the tedium of writing it by having them as an intermediary?

raincole3y ago

The endgame is either the elimination of jobs or the elimination of human beings.

But yes, what you described matches my experience. I'd say Copilot benefits those "tedious but relatively strict" languages the most.

Turskarama3y ago

Training an AI to write unit tests (and refactor code to be more testable) would be a real game changer.

riffraff3y ago

Copilot does a decent job with "dumb" unit tests, e.g. "when the condition X is Y then Z".

But it obviously doesn't have knowledge of your domain, and I don't think any LLM could do a perfect job without that.

protortyp3y ago

I already use gpt4 for this. Works quite well if you show it a couple of reference unit tests.

toolslive3y ago

you can quite easily use a GPT to generate test examples. For example, "generate me 1000 customer support mails about ..."

byby3y ago

I don't think you get it. The game has already changed. And nobody trained the AI to write unit tests or refactor code. It could do it as emergent behavior.

It happened so fast that so many people are in denial and many people aren't even asking the obvious questions. Simply ASK chatGPT to refactor your code to be more unit testable and IT Can do it.

See what I asked chatGPT:

   ME:
   def (x: int):
      for i in range(x):
         print(i)

   Can you refactor the code so it can be more unit testable?


   CHATGPT:
   Certainly! To make the code more unit testable, we can refactor it into a function that returns a list of integers instead of directly printing them. Here's the refactored code:

   python

   def generate_number_list(x: int) -> List[int]:
       number_list = []
       for i in range(x):
           number_list.append(i)
       return number_list

   In this refactored code, we've created a function called generate_number_list that takes an integer x as input. Instead of printing each number, we use a number_list variable to store the generated numbers by appending them in the loop. Finally, we return the number_list.

   By returning the list instead of printing the numbers directly, we make the code more testable because we can now write unit tests that assert the correctness of the generated list.

It boggles my mind that I see threads of people saying "what if we train LLMs to do this or to do that?" or whatever. Guys, just ask the LLM to do what you want in plain English. It can do it.
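For what it's worth, here is a minimal sketch of the kind of unit test the refactor makes possible. It assumes the quoted function is pasted in as-is, plus the `from typing import List` import that the quoted answer omits; the test function names are invented for illustration.

```python
from typing import List


def generate_number_list(x: int) -> List[int]:
    # Same logic as the quoted ChatGPT answer: collect instead of print.
    number_list = []
    for i in range(x):
        number_list.append(i)
    return number_list


def test_counts_up_from_zero():
    assert generate_number_list(3) == [0, 1, 2]


def test_zero_yields_empty_list():
    assert generate_number_list(0) == []


def test_matches_builtin_range():
    # The function is equivalent to list(range(x)).
    assert generate_number_list(5) == list(range(5))


if __name__ == "__main__":
    test_counts_up_from_zero()
    test_zero_yields_empty_list()
    test_matches_builtin_range()
    print("all tests passed")
```

These are plain functions with assert statements, so they can be called directly or collected by a test runner.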

barrkel3y ago

This is a fine, absolutely trivial, example. But LLMs are simply not all that.

IME GPT-4 can't write a bug-free 10 line shell script. It's particularly poor at inferring unstated requirements - or the need to elicit the same.

There's a general problem with LLMs: they're too eager to please. It shows up as confirmation bias. Embed a perspective in your prompt, and LLMs continue in the same vein.

For creativity - e.g. making up a story for a child - it's not bad. One of my favourite use cases, after discovering how bad it is at writing code.

1 more reply

mjburgess3y ago

That isn't the game.

The game was never writing the stack overflow answer -- that was already written.

2 more replies

blibble3y ago

if it actually understood what it was doing it would tell you that that logic doesn't need a test as Python has the range(x) functionality built-in

instead it generates a load of redundant boilerplate

if I saw a developer check that in I'd think they were incompetent

2 more replies

Marazan3y ago

You do get that that code is garbage and if any dev tried to check that code in they would get laughed out of the room right?

It is pure pablum. It is almost the perfect example of LLMs producing fluent vapid bilge.

1 more reply

cookieperson3y ago

So your current stance is: LLMs can't do everything yet, but don't bother thinking about extending their capabilities, just ask and it can do everything? Fascinating...

1 more reply

Turskarama3y ago

Yeah, it does great on little toy examples.

What I would like to do is feed in my entire 50k line program and get something out.

2 more replies

toolslive3y ago

  >    ME: 
  >        f(x) = exp(cos(x))
  >
  >        what is f(0) ?
  >
  >    GPT-3.5:
  >        f(0) = 1

(note, it most often generates the correct result, but I've seen it do the above too)
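For reference, cos(0) = 1, so f(0) = exp(1) = e, roughly 2.718, not 1. A quick check in Python, using only the standard math module:

```python
import math


def f(x: float) -> float:
    # f(x) = exp(cos(x)), as in the prompt above.
    return math.exp(math.cos(x))


if __name__ == "__main__":
    print(f(0))                         # roughly 2.718
    print(math.isclose(f(0), math.e))   # f(0) equals e up to float precision
```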

2 more replies

crabmusket3y ago

This is a classic case of understanding programming as an activity that is primarily about text production. This is what Naur was arguing against in Programming As Theory Building [1].

[1] https://gist.github.com/onlurking/fc5c81d18cfce9ff81bc968a7f...

Ygg23y ago

Honestly ChatGPT-4 sucks at writing code (if similar code isn't in its corpus of knowledge).

I asked it to write a YAML parser in pure Rust. The first thing it did was write this:

    pub enum YamlNode {
       Scalar(String),
       Sequence(Vec<YamlNode>),
       Map(HashMap<String, YamlNode>),
    }

This is what I expect a freshman in college to write as their first Yaml parser.

Literally unusable as a starting point.

I did try to correct it but it tripped up and somehow managed to fuck that up too.

hderms3y ago

Why does it suck, because it's doing a lot of allocations?

Ygg23y ago

No. You have a fair point, but not my biggest gripe. It's much more fundamental to what YAML is.

YAML allows YamlNodes as keys in a map (aka complex keys). E.g.

    ? [ a, b]
    : "complex key, scalar value"

You asked it for a YAML parser and it gave you a shitty JSON parser.

Also, this is the most banal thing about YAML, something that should be obvious just from browsing the examples. And I gave it the test suite and a link to the YAML spec.

Most work in YAML is around super obscure corner cases.
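
To make the complaint concrete, here is a rough sketch of a node type whose mapping keys are themselves nodes. It is in Python rather than Rust, the names (`Scalar`, `Sequence`, `Mapping`, `lookup`) are made up for illustration, and it only models the data shape, not a parser:

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Scalar:
    value: str


@dataclass(frozen=True)
class Sequence:
    items: Tuple["Node", ...]


@dataclass(frozen=True)
class Mapping:
    # Key/value pairs instead of a dict keyed by str, so any node can be a key.
    pairs: Tuple[Tuple["Node", "Node"], ...]


Node = Union[Scalar, Sequence, Mapping]


def lookup(mapping: Mapping, key: Node) -> Optional[Node]:
    """Return the value stored under `key`, comparing nodes structurally."""
    for k, v in mapping.pairs:
        if k == key:
            return v
    return None


# Hand-built equivalent of:
#   ? [ a, b]
#   : "complex key, scalar value"
complex_key = Sequence((Scalar("a"), Scalar("b")))
doc = Mapping(((complex_key, Scalar("complex key, scalar value")),))

print(lookup(doc, Sequence((Scalar("a"), Scalar("b")))))
```

A map keyed by plain strings has no way to hold this document, since the key here is a sequence rather than a string.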

pornel3y ago

Rust is 90% arguing with the compiler and 10% debugging. And yesterday GPT-4 solved a WTF lifetime issue for me pulling a trick I wouldn't have thought about.

Ygg23y ago

Probably because its corpus of knowledge includes someone solving it like that.

pixl973y ago

ChatGTP3y ago

What was the WTF lifetime issue?

pornel3y ago

api3y ago

You also get the bonus of nasty bugs like crash bugs and exploitable memory problems becoming almost entirely a thing of the past. You can ship new software you know is very reliable.

AnimalMuppet3y ago

I think that training GPT to use a debugger is going to be hard. For many debug problems, the amount of context you have to keep is going to be really hard for GPT.

> But, AI is particularly bad at formal methods.

AI? Or GPT? Earlier, reasoning-based attempts at AI seem like they would be a natural fit for formal methods.

ly3xqhl8g93y ago

bravura3y ago

I’ve said it before and I’ll say it again. A time traveling retroactive debugger integrated with an LLM for interrogating what happened and why, would be awesome.

Hirrolot3y ago

I'm imagining an AI based on a deductive system rather than sequential text generation. This is roughly how "strong type systems" work, and so it might be simpler to map this model to formal methods.

hnfong3y ago

The one thing worse than debugging code is debugging "types". Those 30-page C++ compiler errors are definitely worse than whatever I had to do with (for example) Python.

byby3y ago

>Everyone's trying to train GPT models to write code, but maybe we should be training them how to use a debugger. Though its a lot harder to map text generation to debugging...

antihipocrat3y ago

Or just learn to write it ourselves? If it takes the same amount of time to work with the LLM as coding it from scratch then I'd prefer to improve my coding ability while I do the work.

Kostchei3y ago

byby3y ago

I'm just saying the tech is already here. The core engine can do it.

Forgotthepass83y ago

Though I was surprised I can't easily run a pdb instance from a Python program; still have to use stdin/stdout, apparently.
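
For what it's worth, `pdb.Pdb` does accept `stdin` and `stdout` arguments, so a script can feed it commands and capture the transcript without a real terminal. A minimal sketch, assuming a scripted (not interactive) session; `run_under_pdb` and `running_total` are made-up names:

```python
import io
import pdb


def running_total(values):
    total = 0
    for v in values:
        total += v
    return total


def run_under_pdb(func, commands, *args):
    """Run func(*args) under pdb, feeding it the given debugger commands.

    Returns the function's result and everything pdb wrote to its stdout.
    """
    fake_stdin = io.StringIO("\n".join(commands) + "\n")
    fake_stdout = io.StringIO()
    # nosigint/readrc keep the session independent of signals and ~/.pdbrc
    debugger = pdb.Pdb(stdin=fake_stdin, stdout=fake_stdout,
                       nosigint=True, readrc=False)
    result = debugger.runcall(func, *args)
    return result, fake_stdout.getvalue()


# Print the argument at the first stop, then let the function run to the end.
result, transcript = run_under_pdb(running_total, ["p values", "continue"], [1, 2, 3])
print(result)
print(transcript)
```

It is still stdin/stdout underneath, just with in-memory streams, so something like an LLM loop would have to parse the transcript text to decide on the next command.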

I do need to look into a nice terminal interface for N-way merges and parallel gen monitoring.

The holy grail will probably be local fine tunes/LoRAs for specific issues or libraries, since it only costs a few $ for one. Sign me up for an expert plotly AI in a box for neat plots please

I don't know what this is called, but computer science is ostensibly mathematics, so I assume/hope there is some rigor here

omginternets3y ago· 23 in thread

Sure, if the natural language we're talking about is formalized to the point of basically being math. Let's not confuse that with "everyone will be able to program", though.

speed_spread3y ago

_Maybe_ the AI will be able to help with the rewrite, generating test cases and translating business logic, which would be nice.

qumpis3y ago

noisy_boy3y ago

cableshaft3y ago

6gvONxR4sf7o3y ago

I'd love if the ambiguities could be a dialogue of question/answer, rather than being fully specified ahead of time like we generally have programming today. It seems much more efficient.

dools3y ago

That's what programming with ChatGPT is like.

quickthrower23y ago

Sounds like agile (small a) as opposed to waterfall.

grt_thr3y ago

The ambiguity is exponential. I wish the people hyping LLMs had read the older literature on sentence parsing.

The only reason people are so impressed is that ChatGPT sometimes gives better results than Google. Which just ought to tell you how bad Google has gotten.

wolfgang423y ago

Well, programmers provide a natural language interface and somehow we usually manage the ambiguity and complexity OK.

In my experience, a lot of support requests for bespoke/in-house software go like this:

> User: Why is my wibble being quarked? This shouldn’t be happening!

> Dev: Wibble ID, please?

> User: ID 234567. This is terrible!

> Dev: [rummages in git blame] Well, this wibble is frobnicated, and three years ago [links to Slack thread] you said that all frobnicated wibbles should be automatically quarked.

> User: Yes, but that was before we automated the Acme account. We never frobnicate their wibbles!

> Dev: ...so, is there a way for me to tell if a client wants their wibbles unfrobnicated, or should I hard-code an exception for Acme?

(And then, six months later: “Why are none of Acme’s wibbles being frobnicated automatically?”)

6gvONxR4sf7o3y ago

By having a dialogue, you can resolve only the ambiguities pertinent to the specific question at hand.

qumpis3y ago

cookieperson3y ago

Paul-Craft3y ago

Yeah. I actually mis-parsed the headline myself, before noting the date. I was thinking "is going to work" meant it would be happening in more and more workplaces.

bee_rider3y ago

shrimp_emoji3y ago

Isn't ChatGPT (or some now not that far fetched futuristic version that's superior) a 5GL? You ask it for a program in English, and it codes it for you. You've just coded in natural language.

goatlover3y ago

captnObvious3y ago

Check out chatgpt code interpreter. One step down…

quickthrower23y ago

It can do that if constrained. There was a show HN of an app generation tool that was quite impressive: perhaps because it only created web apps and didn’t worry about stylistic design.

takeda3y ago

I would love to see somebody writing a compiler that way.

n8cpdx3y ago

https://medium.com/@byanofsky/chatgpt-helped-me-build-a-comp...

ilaksh3y ago

People generally didn't seem very interested. Also the bigger problem was that the gpt-3.5 models really wouldn't return the same thing consistently so a minor change could throw other things off.

Also a conversation feels like a much more intuitive way of doing it. So I switched to that.

geysersam3y ago

I don't see why you need a language "formalized to the point of math" to instruct a computer to follow simple instructions (Write a crud API for this and this, it should have these endpoints etc.)

Of course there will be ambiguity, but you could say anything higher level than assembly is also ambiguous, but that's not usually a problem.

I think it's reasonable to expect that more people will be able to program if this becomes reality. Just as going from assembly to C allowed more people to program.

cableshaft3y ago

Eh, I was using it to help generate animated backgrounds with various custom geometry and animations tonight for a game of mine, and it performed admirably but it was struggling a bit.

I wouldn't mind if it had a bit more formalization to the point of math to it for those bits. Was still good enough that I got some usable options from it, though.

lcuff3y ago· 9 in thread

nordsieck3y ago

So, I think a key challenge is that modern programming languages do 2 things simultaneously.

1. They provide direct instruction to computers.

2. They document precise human intent.

galleywest2003y ago

> I would expect that if natural language programming becomes a thing, that a dialect forms - akin to legalese - that embodies best practices for precisely documenting human intent.

Sort of like how the SQL syntax seems designed for business admin folks to use, but us programmers ended up using it.

tbrownaw3y ago

> This ability to get Hello World done in 5 minutes instead of 1/2 a day to walk through a intro book/tutorial, well, that's a win.

So a bit like `create-react-app` or `dotnet new` or such, but now powered by AI?

lcuff3y ago

dboreham3y ago

> This ability to get Hello World done in 5 minutes...

This has existed since at least 1992 (Visual C++) and probably before that (Borland Pascal?).

lcuff3y ago

riceart3y ago

Web hello world in rails is literally a single command and seconds.

65103y ago

Front end js is like the nest we've been building for it - not knowing why.

byby3y ago

>Create a web site using language X with framework Y

Not sure if you've played with ChatGPT, but you can ALREADY ask it to do this and it will give you something that works 50% of the time.

dboreham3y ago· 5 in thread

gwoolhurme3y ago

grt_thr3y ago

Mathematics as shown in textbooks is not rigorous. I don't understand why so many people fetishize something they saw at university.

Formalised mathematics is incomprehensible to humans and orders of magnitude longer than anything you'd see in textbooks or mathematical papers outside automated theorem proving.

behnamoh3y ago

> So both programming languages will evolve to be more natural, and natural language will evolve to be more machine-parsable.

Hard disagree. Is Rust similar to natural language? Has English become more similar to <pick the easiest programming language>?

gwoolhurme3y ago

EamonnMR3y ago

The way you're using brackets sure resembles Backus-Naur form.

dusted3y ago· 4 in thread

fsflover3y ago

> Ambiguity is in my opinion the biggest reason why "plain {insert natural language} programming won't be a thing"

How about Ithkuil? https://news.ycombinator.com/item?id=36022731

classified3y ago

You're joking. Even the inventor doesn't speak it fluently.

fsflover3y ago

I wasn't talking about fluency, just about the possibility of using it for programming.

thomastjeffery3y ago

Programming languages don't entirely remove ambiguity: they isolate themselves from it.

Everything in a program is unambiguous, but no useful program stays entirely within its own black box.

Ambiguity still exists between programs. This ambiguity is the source of incompatibility.

dang3y ago· 4 in thread

Discussed at the time:

Generating code from natural language is closer than you think - https://news.ycombinator.com/item?id=1912530 - Nov 2010 (80 comments)

rexsteroxxy3y ago

From the top comment:

cookieperson3y ago

Dang dude, why aren't you a billionaire competing with Google, bing, etc tomorrow?

WastingMyTime893y ago

Talk of moving the goal post.

andromaton3y ago

Order update: Your wish is arriving hundreds of years early.

heikkilevanto3y ago· 1 in thread

A long time ago they tried to develop a system where the users could explain things in "plain" English instead of the difficult codes used so far. It was called cobol.

moffkalast3y ago

> It was called cobol.

Taking something that's lower level than C and replacing characters with words is not exactly what one'd call a plain English interface, it's just being obnoxiously verbose.

pyrale3y ago

It seems like this article is an advertisement for Wolfram Alpha's then-new feature. It offers little to support the headline thesis, aside from showing these features.

makz3y ago

This is not new. There is something called rhetorical algebra.

Rhetorical Algebra

It was developed by the ancient Babylonians: equations were written out in words, a practice that remained in use up to the 16th century.

Example: x + 5 = 8 is written as "The thing plus five equal to eight".
