When AI promises speed but delivers debugging hell

(nsavage.substack.com)

202 points · nsavage · 1y ago · 253 comments

173 comments · 34 top-level

namaria1y ago· 41 in thread

Coding is trying to order bytes into doing arbitrary stuff that is useful because of some transient conjunction of factors in the real world.

We have developed programming languages because coding in machine language is horrible, and over the decades we've refined them into tools people can use fluently and just directly think in code when they have to make a computer system behave in a certain way.

Only someone who has never built anything of significant complexity and utility can think that putting natural language encoding between you and the bytes is a net positive.

qwertox1y ago

The hype is around having AI replace your typing, that is, to code for you.

The hype should not be around replacing the typing, but in assisting your thoughts.

When you code, there's the dialog in your brain which thinks about the code and also creates the questions which you know you must answer in order to then transition to the dialog with the machine, that is, to type code.

And in this first part LLMs can be extremely useful, which will come to the point where you select a line, then explain your intent, and while the AI retrieves documentation and possible solutions, you can reason about the problem and then pick and choose from what the AI has collected for you.

> Only someone who has never built anything of significant complexity and utility can think that putting natural language encoding between you and the bytes is a net positive.

The question is then if assistants are "sitting next to you", like a secretary and a mentor, or if they are sitting between you and the editor, as the thing you need to control.

An assistant can be a really effective refinement in the programming process. It can even go so far as to motivate you, instead of you constantly getting demotivated by hitting the wall of "not another problem that I need to solve before I can really continue" (which happens all too often).

barrell1y ago

Personally, I haven’t found LLMs to be helpful for the internal dialogue. Even with a lot of exposition, code samples, and documentation, they always provide either obvious solutions (store items in a vector!), pointless modifications (use map and filter instead of reduce!), or just make up APIs that don’t exist.

I think it’s really good with the 101-level academics side. Learning the basics of anything through a conversational manner can be massively helpful.

As soon as your situation exceeds textbook level, I’ve found them to always be a waste of my time, and nothing I’ve seen as of late makes me think they’re trending in a direction to be helpful in this scenario

spacemanspiff011y ago

Where I find coding assistants the most useful _is_ in writing code that I already want to write.

Ala - I need to write this unit test, it has these checks, it validates these methods.

Or write a log message for me about what error got encountered here. Those are annoying to write out, but often the llm has enough context that I just start to write and it completes it appropriately.

All of these are things I can easily do myself, are easy to validate correctness, but if I were to write them would consume my limited mental energy for the day.

skydhash1y ago

Not everyone is the same, but I have a different view of programming than what you describe. Most tasks involve thinking about the domain and what implementation techniques to use while trying to reduce the technical debt in the project. For the domain, I talk to people, rely on past experience (mine or others), or do research. For implementation techniques I look at other people’s code, read books, or ask someone more experienced. Both are heavily influenced by the context, aka what already exists in the project and the constraints that I have to deal with. I heavily distrust LLMs because they cannot assimilate the context like an experienced person and provide me direction based on experience. Why experience? Because the problem and the constraints always exist in the real world.

api1y ago

The hype in some areas is around replacing coders, which is a fantasy without orders of magnitude better systems.

mmusson1y ago

I think there is a skill issue. Just like in any other pursuit, some people are going to be better at using AI productively. It is a tool. You are still responsible for the quality of the resulting code whatever the mix of human and tool generated.

ozim1y ago

Hard disagree.

An assistant should not help you think; any AI agent/tool should do what you want with a minimal amount of explanation.

The only way I accept the current hype is if I am able to type in "make a Twitter clone" and it does the implementation, I can run it, I write "make it red, silver and yellow color themed" and it does just that. I am the one doing the thinking here - I don't care about technical details. That should be the state of the art.

I can write my own Twitter clone and if I have to write prompt after prompt it is going to take me more time and more typing so it is useless.

A person that cannot write their own Twitter clone is not going to prompt their way out to having working and deployed Twitter clone.

insign_bit1y ago

This reminds me of something Dijkstra wrote almost 50 years ago in On the foolishness of "natural language programming" [1]:

> When all is said and told, the "naturalness" with which we use our native tongues boils down to the ease with which we can use them for making statements the nonsense of which is not obvious.

Although this is obviously not about LLMs, it's astonishing how many parallels can be drawn to today's usage of AI systems.

1: https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...

bwfan1231y ago

Wow, thanks for sharing this beautiful essay by a legend that captures the essence of the llm debate.

That, the use of formal languages - although an evil (since it is not a natural human language) is essential to avoid nonsense (ie hallucinations). While intuition/natural language is more imaginative, formalism (ie narrow interfaces) is a forcing function to make things work.

According to Dijkstra, the use of natural languages regressed civilization by thousands of years (because of the nonsense and imprecision)! So, expect thousands of years of LLM hell if we adopt them to replace our formal languages.

Indeed, math and other symbolism/formalism is a crowning achievement of humans.

bambax1y ago

I use AI (Sonnet) to write relatively complex SQL queries, but those always need a review before implementation. It's sometimes brilliant, and at the same time can suggest wrapping a single, atomic query in a transaction, for no reason. Just asking "why?" will result in profuse apologies and thank yous, but it never explains what caused the mistake in the first place.
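To make the "single, atomic query" point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table and its values are hypothetical, and the behavior shown is SQLite's default handling of a failed statement; other database engines may differ. The connection runs in autocommit mode, so no explicit transaction wraps the `UPDATE`, yet a failure on one row still leaves every row untouched.

```python
import sqlite3

# Autocommit mode: the driver issues no implicit BEGIN/COMMIT around statements.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (id, balance) VALUES (?, ?)",
    [(1, 100), (2, 50), (3, 10)],
)


def balances(connection):
    """Return all balances ordered by account id."""
    rows = connection.execute("SELECT balance FROM accounts ORDER BY id")
    return [row[0] for row in rows]


# One UPDATE touching three rows. Row 3 would drop below zero and violate the
# CHECK constraint, so SQLite backs out the whole statement, including the
# changes it had already made to rows 1 and 2.
try:
    conn.execute("UPDATE accounts SET balance = balance - 20")
    update_failed = False
except sqlite3.IntegrityError:
    update_failed = True

after_failed_update = balances(conn)
print(update_failed, after_failed_update)
```

An explicit transaction becomes useful once several statements have to succeed or fail together, which is a different situation from the single query described here.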

Trusting an AI to code an app from start to finish seems crazy to me but hey... if some people can pull it off, good for them I guess.

roland351y ago

That's the type of thing AI is great for! I find AI is pretty decent at generating data related code with Pandas, and since I only rarely use Pandas it saves me a ton of time relearning everything.

Where AI starts breaking down is in effectively incorporating a new feature into a complicated existing codebase. That is where we engineers can continue to hold an advantage.

csomar1y ago

This massively underestimates what current LLMs can do. Yesterday, I was able to create a 600-line script in 20 minutes or so that essentially sets up Cloudflare Worker bindings (KV, Queues, Hyperdrive, etc...). The complexity is very low and debugging is easy. Reading this infra code is fast. However, if I were to do this manually, it would have taken me a full day reading through the docs and trying the implementation back and forth for each binding I am connecting to.

Claude 3.5 did it from the first shot.

kuschku1y ago

And 2 months later I get asked to debug code like yours when it doesn't work for a customer and have to spend days or weeks digging into your code before I notice the LLM took some shortcuts that work most of the time, but are ever so slightly broken in edge cases, followed by me having to rebuild it all from scratch.

I literally just spent a full week on such a project. Respectfully, fuck people who don't read the docs/spec.

jason_zig1y ago

You still have to become a domain expert to debug it though?

danny_codes1y ago

Agree it's good for boilerplate, provided the thing you want to do is extremely basic / just setup. Once you need something slightly more complex it seems to break down rather quickly.

Claude 3.5 is pretty good at giving you the right hints though. If you aren't familiar with a library it's definitely faster than grepping through docs. If you are an expert in a library then it's pretty useless.

petercooper1y ago

> Only someone who has never built anything of significant complexity and utility can think that putting natural language encoding between you and the bytes is a net positive.

AppleScript might convince me of your argument.. ;-) But, seriously, we've been putting abstractions between "us and the bytes" ever since Fortran and COBOL appeared (and indeed, earlier). We can argue about the quality and expressiveness of those abstractions, and there are a lot of arguments against natural languages in this task, but the broad idea of putting things in between developers and machines is sound so it's worth continuing to explore IMHO.

klabb31y ago

All those layers are stable and deterministic. You can open it up and check, in most cases, how the upper layer calls the lower layer, should it be necessary.

If you have an LLM between you and the code, there is not even such a thing as ”source code”, only a history of prompts. You can’t check in your prompts in git, and re-generate the same code later.

In fact, it’s more like the antithesis of the reproducible builds movement. It’s introducing a proprietary networked high latency chaos agent into the critical path.

petercooper1y ago

I accept your points today, but I'm optimistic this problem will be resolved in the mid-term. I just feel some of the arguments smell similar to concerns levelled at the earliest compiler developers about the perils of abstraction. Now we gleefully stack layer upon layer with most developers being unable to grok more than a layer below their choice of abstraction (if they're good!)

I am extremely LLM-optimistic though and largely in favor of abstractions, so that fuels my viewpoint. I still remember my dad, an embedded developer in the 90s-00s, ranting about how many people were starting to use 'inefficient and unpredictable' C compilers rather than whatever assembly he was using. I reckon he'd be appalled to learn that now even assembly isn't always a reliable model of what's really happening on the CPU thanks to microcode and optimizations.. ;-)

ConspiracyFact1y ago

> It’s introducing a proprietary networked high latency chaos agent into the critical path.

Well (and amusingly) said. The same (or at least a very similar) problem exists at the other end of the pipeline, i.e., whenever a user has to use a natural-language interface to get software to do something they want. Are we really going to tell our AI assistants to take complex actions on our behalf in the real world, and then just sit back? Are we really going to do this when money is involved?

withinboredom1y ago

sure; but these languages we speak to computers in are deliberately non-ambiguous and lack nuance. Natural language is both ambiguous and nuanced.

"Have a nice day!" can mean many things (an insult, or sincere, for example).

s_dev1y ago

AI is a tool like any other. Autocomplete on steroids -- markov chains taken to the extreme.
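For readers unfamiliar with the reference, a Markov chain completer can be sketched in a few lines. This is a toy first-order (bigram) model over a made-up corpus, shown only to illustrate the "autocomplete" end of the comparison; it is not a description of how LLMs are built, and all names here are illustrative.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def complete(model, start_word, length):
    """Extend start_word by repeatedly picking the most frequent follower."""
    output = [start_word]
    for _ in range(length):
        candidates = model.get(output[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)


# Hypothetical training text, chosen so the two steps below have no ties.
corpus = "the cat sat on the mat and the cat sat down"
model = train_bigram_model(corpus)
suggestion = complete(model, "the", 2)
print(suggestion)  # the cat sat
```

The model only ever looks at the previous word, which is the main sense in which "taken to the extreme" is doing a lot of work in the comparison.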

We already put natural language between us and the bytes. Hence why most keywords and variable names (a hard part of computer science) are in simple English and it is considered a net positive.

namaria1y ago

The memory and compute requirements to develop and run these models make no sense if the marginal improvement in autocomplete is the big end result. They only make sense in a world where machine can derive intent from natural language and actually conform to what people mean when they ask for something. This is clearly a fantastical result that LLMs are very short of.

nailer1y ago

It’s interesting. I would’ve agreed that ‘deriving intent from natural language is something that LLMs have fallen far short of’ maybe a month ago.

Since then, I spent a week trying to get Cursor to work, and after dealing with all the bugs, and restarting the composer each time with a new prompt, was able to get what I would consider a quality output for a moderately complex app (a parimutuel betting market).

The issue isn’t that LLMs are terrible, it’s that software like Cursor is buggy and poorly written.

It should know that I don’t want to use code from an old version of the library I am using because the new library I am using is already in my project’s dependencies.

It should let me set up preferences for different programming languages. And preferences for all programming languages.

So when I give it a prompt, it looks at the dependencies and language rules I already have set up, adds those to the prompt and produces the quality output I’m seeing now without me having to manually specify all those things.

Short version: LLMs rule; the software is just shitty.

llm_trw1y ago

>The memory and compute requirements to develop and run these models make no sense

There is the story that von Neumann flew off the handle the first time he saw an assembler.

>>How dare you waste compute cycles on this frivolity? Just use machine code like everyone else.

JTyQZSnP3cQGa8B1y ago

> AI is a tool like any other. Autocomplete on steroids

No, AI is a shitty tool that has yet to prove its utility. Autocomplete works by analyzing the official API and interface, it's completely different than AI which hallucinates meaning between words and also stuff that it was fed before it met you.

> variable names (a hard part of computer science)

Naming is for software engineering, not CS. One more confusion by people who want to sell us AI at all cost.

criley21y ago

At some point, you become the luddite. Maybe you have no experience with modern AI dev tools, maybe you work in a language that is underrepresented in models meaning off the shelf tools don't work well, or maybe you're just an old curmudgeon who will die on a hill.

But modern AI tools are far beyond "auto complete". (I actually turn off those in-line completions, I feel they ruin flowstate). The tools now are fully prompted, with multi-file editing, with full codebase context, with web/search and doc integration, and for "on the rails" development are producing high quality code for "easier" tasks.

These modern models and tools can solve nearly every single leet code problem faster than you. They can do every single Advent of Code problem likely 10X-100X faster than you can.

In my professional, high standards, very legal and contract driven web app world, AI tools are still very useful for doing "on the rails" development. Is it architecting entire systems? No of course not (yet). Is it emulating existing patterns and extending them for new functionality 10X faster than a Jr or Mid? Yes it is. Is it writing nearly perfect automated tests based on examples? Yes it is. Is it scaffolding new ideas and putting down a great starting point? Yep. And it's even able to iterate on featurework pretty well, and much faster than Jr/Mid.

The kind of work I'd give to a Jr/Mid and expect to take 2-3 days before they need serious feedback up and down the change, these AI are doing in about 30 seconds, maybe 90 seconds if you need to iterate a few times on the prompt.

I get that "AI" is a buzzword that is pumping valuations and making business people see $$$.

But coding assistants are not that. For many programmers, they are quickly becoming valuable tools that do in fact speed up development.

dagw1y ago

> Autocomplete works by analyzing the official API and interface, it's completely different than AI

You can (and should) give the AI access to your existing codebase and any relevant documentation to use as context if you want good results. If you give the AI zero context for the problem it is trying to solve, of course it will struggle. If you give it all the necessary context, it will do much better.

I've found that just uploading the documentation of the API or library you are working with before asking the AI questions about it makes a huge difference in the quality of its output.

sixstringtheory1y ago

>> variable names (a hard part of computer science)

>Naming is for software engineering, not CS.

I figured they were referencing the “two hard problems of computer science”, those two being naming things, cache invalidation and off by one errors.

Everybody knows the hardest problems in software engineering are assembling promo packets and building consensus on number of spaces per indent.

lordmoma1y ago

AI is good with Rust programming, because the compiler will keep it from hallucinating!

aleph_minus_one1y ago

> Hence why most keywords and variable names (a hard part of computer science) are in simple English and it is considered a net positive.

As I'm not a native English speaker, I disagree. I learned programming long before I got decent in English, and even today I just consider the English keywords in programming languages to be some "abstract mathematical concept" that by mere coincidence is named after some real, existing English word. Even today, being somewhat decent in English, I still think this way when I see program code.

I actually would insist that this is a much more useful way to think about good programming, since this way you have no difficulties to ask yourself all the time whether it would make sense to replace some "English-named" concept by something more useful, but which has no analogue in the English language (or any other natural language).

zahlman1y ago

>Hence why most keywords and variable names (a hard part of computer science) are in simple English

"Natural language" is about far more than individual words.

d_tr1y ago

Bad take. Identifiers are just labels.

nailer1y ago

I don’t think anyone disagrees that identifiers are labels. If you’re claiming that these labels are unimportant, I’d be interested in why you think this.

bccdee1y ago

Exactly. The formalism of computer code will always create some amount of boilerplate—there's no perfect language—but in my (admittedly limited) experience, an LLM is a middleman which distances you unacceptably from your own code. Whether you review the code or you write the code, the intellectual effort of deciding which approach is best and understanding the solution still needs to be undertaken. All it's saving you are the keystrokes, at which point it's glorified intellisense.

seanmcdirmid1y ago

Note that IntelliSense/code completion was never about saving keystrokes, so I assume LLMs shouldn’t be either. Code completion has always been about getting a list of things you can do on a particular type, saving you from having to remember everything in your head (and APIs have gotten more numerous as a result). LLM code assistance that I’ve seen is poorly designed in that it usually gives you the one most likely choice (so it saves keystrokes) and doesn’t allow you to browse through a bunch of likely possibilities.
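The "list of things you can do on a particular type" idea can be sketched with plain Python introspection. This is a simplified illustration, not how any particular IDE implements completion; the helper name `completion_candidates` is made up for the example.

```python
def completion_candidates(obj, prefix=""):
    """List the public attribute names of obj that start with prefix.

    The candidates come from the object's actual interface, so nothing in
    the list can be a name that does not exist on the type.
    """
    return sorted(
        name
        for name in dir(obj)
        if not name.startswith("_") and name.startswith(prefix)
    )


# What can I call on a string that starts with "st"?
text_candidates = completion_candidates("hello", "st")
print(text_candidates)  # ['startswith', 'strip']
```

The result is a browsable list rather than a single guess, which is the design difference the comment is pointing at.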

myth20181y ago

I agree, and it's sad that, despite all those pitfalls, some companies and CEOs will keep pushing the idea that human programmers can be replaced by AI.

But well I guess there's a bright side to see here: those LLMs applied to software development might become the new Genexus and there are gonna be plenty of open positions for humans to rewrite entire systems in a not so far future.

jstummbillig1y ago

That's obviously wrong, as demonstrated by the engineers invested in building tools to enable that.

gessha1y ago

A lot of engineers invested in building crypto stuff and we didn’t go far in personal banking. Hype-driven development is not guaranteed to succeed.

jstummbillig1y ago

That was not what was argued, and not what I am arguing. OP claimed that "only someone who has never built anything of significant complexity and utility can think that putting natural language encoding between you and the bytes is a net positive."

Unless OP is also willing to claim that all the people who are working on LLM dev tools are frauds, and act against their better knowledge, OP's claim is obviously false. The entire premise that the people who build these tools operate under is that natural language "between you and bytes" can be a net positive.

trinix9121y ago

Which has been proved again and again to only get you 70% there [1].

[1] https://addyo.substack.com/p/the-70-problem-hard-truths-abou...

dartos1y ago

Popularity doesn’t demonstrate anything

andix1y ago· 27 in thread

My social feeds are full of tech bros who keep telling people AI codes everything for them. AI obviously has some impressive coding skills, but for me it never really worked well.

So is this just an illusion they create, or is it really possible to build software with AI, at least at a mediocre level?

I'm looking for open source projects that were built mostly with AI, but so far I couldn't find any bigger projects that were built with AI coding tools.

dagw1y ago

AI isn't great at creating software, but it is great at writing functions. I often ask AI to "write a function that takes A, looks up B in a SQL database, and returns C, or write a function that implements the FooBar algorithm in C++" and on the whole that works pretty well. Asking it to write documentation for those functions also works really well. Asking it to write unit tests for those functions works pretty well (although you have to be extra careful, because sometimes the tests are wrong).
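As a rough idea of the scale of task being described, here is a hand-written sketch of a "takes A, looks up B, returns C" function with a small unit test. The `products` table, the function name, and the tax calculation are all hypothetical, invented purely for illustration.

```python
import sqlite3


def lookup_price_with_tax(conn, product_name, tax_rate):
    """Take a product name (A), look up its price (B), return price with tax (C).

    Returns None when the product is not in the table.
    """
    row = conn.execute(
        "SELECT price_cents FROM products WHERE name = ?", (product_name,)
    ).fetchone()
    if row is None:
        return None
    return round(row[0] * (1 + tax_rate))


def make_test_db():
    """Build a throwaway in-memory database with one known product."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (name TEXT PRIMARY KEY, price_cents INTEGER)"
    )
    conn.execute("INSERT INTO products VALUES ('widget', 1000)")
    return conn


def test_lookup_price_with_tax():
    conn = make_test_db()
    # Known product: 1000 cents plus 25% tax.
    assert lookup_price_with_tax(conn, "widget", 0.25) == 1250
    # Unknown product: the function reports absence instead of raising.
    assert lookup_price_with_tax(conn, "missing", 0.25) is None


test_lookup_price_with_tax()
```

Note that the test checks values computed by hand rather than values copied from the function's own output, which is one way to guard against the "sometimes the tests are wrong" caveat.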

What you have to do, and what AI cannot do well, is to decide where in the codebase to put those functions, and decide how to structure the code around those functions. You have to decide how and when and why to call each of those functions.

javier21y ago

When I have to be that specific with it, it would be faster for me to just write it directly in my normal IDE with great auto complete

dagw1y ago

> it would be faster for me to just write it directly in my normal IDE

Then you are a much better developer than me (which you may very well be). I'd like to think I'm pretty good, and I've many times spent hours trying to think through complex SQL queries or getting all the details right in some tricky equation or algorithm. Writing the same code with an AI often takes 2-20 minutes.

If it's faster for me, it might not be faster for everybody, but it is probably faster for many people.

ExtremisAndy1y ago

"AI isn't great at creating software, but it is great at writing functions."

This 100%. In my experience (ChatGPT - paid account), it often causes more problems than it solves when I ask it to do anything complex, but for writing functions that I describe in simple English, much like your example here, it has been overall pretty amazing. Also, I love asking it to generate tests for the function it writes (or that I write!). That has also been a huge timesaver for me. I find testing to be so boring and yet it's obviously essential, so it's nice to offload (some of) that to an LLM!

pydry1y ago

It manages simple functions but I've tried to get it to do complex ones (e.g. a parser with a bunch of edge cases) and it totally shit the bed.

For the simpler cases I think prompting still took about as long as just writing the damn thing myself if I was familiar with the language.

The coding I have found it useful for is small, self contained, well defined scripts in bash where the tedious part is reminding myself of all of the command switches and the funky syntax.

rco87861y ago

Current gen AI can spit out some very, very basic web sites (I won't even elevate to the word "app") with some handholding and poking and prodding to get it to correct its own mistakes.

There is no one out there building real, marketable production apps where AI "codes everything for them". At least not yet, but even in the future it seems infeasible because of context. I think even the most pro-AI people out there are vastly underestimating the amount of context that humans have and need to manage in order to build fully fledged software.

It is pretty great as a ridealong pair programmer though. I've been using Cursor as my IDE and can't imagine going back to a non-AI coding experience.

wanderingbort1y ago

I think it’s selection bias. Marketers are going to post the proof-of-concept that it works (if only in a small isolated scenario), algorithms are going to emphasize the more amazing “toys” this produces, over the boring rebuttals. In the end, you will see hundreds of examples where it worked and not the thousands where it produced buggy or dangerous code.

That attention does not map well to the important, hard, and more valuable parts of development.

Anecdotally, I still find it to be useful and it’s improving. I do think it’s going to have a huge impact in time.

Hype is part of the industry and it can be distracting to users, developers, and investors BUT it can also be useful (and I don’t know how to replace it) so, we live with it.

williamcotton1y ago

https://github.com/williamcotton/webdsl

Made almost entirely with Cursor and Claude 3.5 Sonnet.

11k lines of C and counting.

gray_-_wolf1y ago

What I find interesting is that the project claims MIT license, but if it is "almost entirely" AI generated, I am not sure it even is copyrightable. So either the licensing terms deserve some large disclaimers, or it is not "almost entirely" made with AI. Based on the name I assume it is your project, could you shed some light on which of those two options is correct?

thomasfromcdnjs1y ago

I used Windsurf mostly on a feature to build out user authentication and then another tool to generate the PR documentation entirely.

https://github.com/jsonresume/jsonresume.org/pull/176

Meets my good enough standards for sure

imtringued1y ago

I like the concept, but honestly I don't see myself writing an entire webapp with this.

Here is some feedback:

There are a bunch of libraries that need to expose an http API. There is a niche for providing an embeddable http server that comes batteries included with all the features such as rate limiting, authentication, access control, etc. Things that constantly have to be reimplemented from scratch, but would not warrant adding a large framework by themselves.

That's where I think the idea of a "WebDSL" would shine the most.

williamcotton1y ago

Thank you for the feedback.

I also don't see myself writing an entire webapp with this either - perhaps small sites or simple API endpoints?

I was mainly scratching an itch I've had for a couple of years. I also really like tuning C code just for the fun of it!

BeefWellington1y ago

It's funny that this is MIT licensed, expecting credit for uncopyrightable work.

andix1y ago

Thanks, that's exactly what I'm looking for.

javier21y ago

Same experience. It has become pretty good at writing creative SQL queries though. It's actually rather good at that.

When I am working on something niche, it does not help either. I have tried to make it build modern UI applications for myself using modern Java, but it just can't. It hallucinates libs and functions that do not exist, and I can't really get it to produce what I want. I have had better experiences with languages that are simpler and more predictable (Go), and languages with huge amounts of learning material available (Typescript / React). But I have been trying to build open source UI apps in JavaFX and GTK, and it just cannot help me when I am stuck.

mft_1y ago

I experimented with Cursor over christmas, with writing a simple-ish Swift/SwiftUI app on iOS as the challenge. I can code fairly well in Python, moderately in JS, and almost not at all in Swift. I was using Cursor on a Mac, in parallel to XCode.

Basically, it worked, but not without issues:

- The biggest issue was debugging: because the bugs appeared in XCode, not Cursor, it either meant laboriously describing/transcribing errors into Cursor, or manually fixing them.

- The 'parallel' work between Cursor and XCode was clunky, especially when Cursor created new files. It took a while to figure out a halfway-decent workflow.

- At one point something screwed up somewhere deep in the confusing depths of XCode, and the app refused to compile altogether. Neither Cursor nor I could figure it out, but a new project with the files transferred over worked just fine.

But... after a few short hours' chatting, learning, and fixing, I had a functional app. It wasn't free of frustrations, and it's pretty far from the level where a non-coder could do the same, but it impressed me that it's already at the level where it's a decent multiplier of someone's abilities.

jstummbillig1y ago

aider (AI assistant, that will do coding for/with you, depending on how you use it) has one of the more illuminating pieces of information on this. Here is a graph of percentage contribution of aider to aider development itself over time:

https://aider.chat/HISTORY.html

cejast1y ago

I don't think it is an illusion. It can remove a lot of barriers to entry for some people, and this is probably what you're seeing in the anecdata.

For example, my brother. He is what I'd refer to as 'tech-aligned' - he can and has written code before, but does not do it for a living and only ever wrote basic Python scripts every now and then to help with his actual work.

LLMs have enabled him to build out web apps in perhaps 1/5 of the time it would have taken him if he tried to learn and build them out from scratch. I don't think he would have even attempted it without an LLM.

Now it doesn't 'code everything' - he still has to massage the output to get what he wants, and there is still a learning curve to climb. But the spring-board that LLMs can give people, particularly those who don't have much experience in software development, should not be underestimated.

andix1y ago

There is a big gap between being able to create a somehow working application and shipping a product to a customer.

Those claims are about being able to create a profitable product with 10x efficiency.

HL33tibCe71y ago

Current-gen AI can write obvious code well, but fails at anything that involves complexity or subtlety in my experience

jgilias1y ago

I think it’s that AI unlocks the ability to code something up and test an idea for people who’re technical enough to get it working, but not really developers themselves. It’s not (yet at least) a substitute for a good dev team that knows what they’re doing.

But this is still huge, and shouldn’t be disregarded.

loveparade1y ago

In my experience, AI is good at building stuff in two scenarios:

- You have zero engineering background and you use an LLM to build an MVP from scratch. As long as the MVP is sufficiently simple there is plenty of training data for an LLM to do well. E.g. some kind of React website with a simple REST API backend. This works as long as the app is simple enough, but it'll start breaking down as the app becomes more complex and requires domain-specific business knowledge or more sophisticated engineering techniques. Because you don't understand what the LLM is doing, you can't debug or extend any of it.

- You are an experienced developer and know EXACTLY what you want. You then use an LLM to write all the boilerplate for you. I was surprised at how much of my daily engineering work is actually just boilerplate. Using an LLM has made me significantly more productive. This only works if you know what you're doing, can spot mistakes immediately, and can describe in detail HOW an LLM should be doing the task.

For use cases in the middle, LLMs kind of suck.

So I think the comparison to a (very) junior engineer is quite apt. If the task is simple you can just let them do it. If the task is hard or requires a lot of context, you need to give them step by step instructions on how to go about it, and that requires that you know how to do it yourself.

bdangubic1y ago

these are exactly my experiences as well. senior devs on my team are rocking and rolling with the AI. my junior devs have all but given up using it even after numerous retros etc…

frodo8sam1y ago

For me ai has been pretty useful. Difference is I'm not a software engineer, I just write scripts to help me do my job. If I wrote bigger applications I doubt llms could help me.

andix1y ago

AI is awesome for small coding tasks, with a defined scope. I don't write shell (or powershell) scripts anymore, AI does it now for me.

But once a project has more than 20 source files, most AI tools seem to be unable to grasp the context of the project. In my experience AI is really bad at multi-threaded code and distributed systems. It seems to be unable to build its "mental model" for those kinds of problems.

Earw0rm1y ago

This. It's good at CMake, and mostly good at dealing with COM boilerplate (although hallucinations are still a problem).

But threading and asynchronous code are implicit - there's a lot going on that you can't see on the page, you need to think about what the system is actually doing rather than simply the words to make it do the thing.

dahousecat1y ago

They are useful for small tasks like refactoring a method however big the whole project is

CharlieDigital1y ago· 26 in thread

I have a non-technical friend who in the last two months has bootstrapped a SaaS startup using nothing but AI. He's got just over a handful of paying customers at this point on a monthly subscription[0].

I asked him to show me his process[1] after trying my hand (20 year, principal) and noticed a big difference in how we used AI: I instruct the AI how to code, he asks the AI to fix problems. In other words, I have a tendency to look at the code and ask the AI to fix it in more specific and direct ways that I want it fixed. On the other hand, if something doesn't work, my friend will copy/paste the error to the AI directly out of the dev tools console and ask the AI to fix the error. The two approaches are totally different.

My lesson here is that you're not meant to debug AI generated code; hand the error off to the AI and let it fix itself. I think if you're debugging AI generated code, you're doing AI generated code wrong. If you're an experienced dev picking up AI coding, I think you need to shift your mindset entirely. Ideally, someone out there will just create a closed loop where the AI can fix itself when it finds an error (integrate some browser and autonomous test loop into Cursor, for example, and let it fix its own errors).

Conclusion: if you're going to use AI to code, commit to it and use AI to fix the errors as well. Use AI for every aspect of it.

[0] Yes, I'm sure there are security holes and code issues galore, but those can always be fixed later when he's proven the business model.

[1] Yes, I have told him that he should create a YT channel or stream on Twitch because the content itself is super interesting how well he's been able to use AI.

abossy1y ago

In my experience, AI isn't very good at debugging AI-generated code. If it fails to make the right insight, it loops continuously until it's completely off the rails. I'm surprised your friend hasn't fully gotten stuck with this, as it seems like a huge risk for his startup.

CharlieDigital1y ago

Having had an inside view of a YC startup that went from seed to C, I can tell you that code quality means a lot less than one would think when it comes to the early days of a startup.

The biggest risk to a startup is that you get the business model wrong or you don't ship code, even if the code is buggy and messy.

InvOfSmallC1y ago

I don't know which specific LLM your friend used, but pasting the error to the LLM usually ends in an endless loop where it tells you to do the same thing over and over again, or the solution doesn't really work or generates another error.

So maybe he was lucky or he is using a very good LLM I'm not aware of.

CharlieDigital1y ago

Claude Sonnet. If your choices are to pay out of pocket for an offshore contractor and wait for weeks or pay $20/mo. for an LLM, it's pretty clear that even if you have to sit there for a few days until you get what you want, using the LLM is the better bet if you're non-technical. In either case, the code would be of questionable quality and a non-technical person would not be able to tell the difference anyways. I see it as a wash.

arend3211y ago

This probably only works if you glue a bunch of high level, popular APIs together. It might work, but will be fragile and expensive.

dartos1y ago

> fragile and expensive

Unfortunately, that’s the most common kind of software in the saas industry anyway.

CharlieDigital1y ago

Most SaaS apps today can be done by gluing together popular APIs (e.g. Stripe, Shopify, etc.).

No better or worse than hiring cheap offshore contractors to do the same, IMO.

bendauphinee1y ago

As an experienced developer, that’s also how I use it. What I’m finding is that it generally rabbit holes as I give it new errors that its previous fix has produced.

However, usually after three or four of those kinds of fixes, I can walk it back to the starting point before the initial error, and I now know how to prompt it to produce correct code, because I now have a better mental model of how the thing is supposed to work.

This has been super helpful in my process of learning new things, as well as relearning things I haven’t worked with in a while.

jfengel1y ago

In my experience, fixing security issues after the fact is extremely challenging. A secure system has a very different architecture, with security as its fundamental function and business logic almost an afterthought.

It's not impossible to fix later. But it's often more effective to scrap and rewrite. Hopefully your proven business model has yielded enough money for that, before someone else has pwned it.

Timber-65391y ago

One can only imagine how many corners your friend had to cut to get to the product you call finished.

CharlieDigital1y ago

He's got paying customers organic inbound by word of mouth only; there must be some value there.

kuschku1y ago

> word of mouth

> must be some value

https://en.wikipedia.org/wiki/Parasite_(2019_film)#Plot

skeeterbug1y ago

I wonder what this codebase will look like after a year or so of doing this.

extesy1y ago

Bugs will escalate from syntax errors to business logic errors ("one customer was charged twice"). There won't be anything to copy/paste, no AI will be able to fix these errors and no human will touch this codebase with a long pole.

iamflimflam11y ago

Have you seen the job market at the moment? Humans will do a lot of things to keep a roof over their heads.

jiehong1y ago

I’ve seen some users do that and get stuck in a loop where the AI is saying "ah, this error is because", and not fixing it properly, or fixing it and adding a different issue by modifying an unrelated part of the code at the same time. Next, the code is fixed, but only by bringing back the issue that was fixed earlier.

I saw that with people asking for VBA code to be generated while trying to automate part of their email and Excel work.

CharlieDigital1y ago

It's possible, but his choices are 1) hire someone else, 2) just sit there and prompt again until it's fixed. Since he's bootstrapping this with < $50/mo, the choice is simple.

Also, it may be the case that the corpus of training data with VBA is not as good as it is with React these days.

plagiarist1y ago

I have tried that and the AI gets stuck just attempting whatever and writing more code that won't even compile. I have had more success trying to get it to follow steps or examples.

Maybe the language your friend is using has more examples for training, or perhaps the dynamism of some languages gets it to runtime errors that have better details it can work with.

CharlieDigital1y ago

React and JS so I think it has some benefits since 1) it has a large corpus of recent training data, 2) the browser gives pretty good errors.

I also tried it and the biggest issue I ran into is that I'm very specific about what I want. I wanted to use `nanostores` for state and routing. Problem is that the LLM keeps using code from `react-router` instead of `@nanostores/router`. As soon as I point it out, the LLM fixes it, but the first pass code generation is almost always wrong, even using an instruction file (as documented in both Cursor and GH Copilot).

That's when I realized that we are using the AI in two totally different ways: he simply doesn't care about the implementation, prop drilling, any of the technical details. None of that matters to him except that when "this button is clicked, that action happens". So however complex or inefficient or imperfect the code is, he doesn't care whereas I still have a tendency to read the code and try to ask the AI to do it in specific ways.

arrowsmith1y ago

> integrate some browser and autonomous test loop into Cursor

Doesn't this exist yet? It's such an obvious idea I'd be astonished if no-one has done it.

CharlieDigital1y ago

They exist in separate pieces; I've not seen it integrated into one loop yet.

Code gen -> show the AI an example of how it's supposed to work -> error -> code gen -> AI tries it again by itself -> Code gen

imtringued1y ago

This only works if there is an error message. Do you instruct the AI to fill in the code with asserts and not implemented exceptions?

CharlieDigital1y ago

I was only on a session with him for like 15 minutes and he showed me his prompt history. Basically when he hits an error, he will paste the error and give a simple instruction like "I'm getting this error when I click this button: <ERROR_HERE>" and then repeat until it's fixed. Nothing special; imagine a non-technical PM giving directions to a junior dev except this junior dev codes nearly instantaneously.

Madmallard1y ago

Yeah, this works for CRUD apps with conventional methods for accounts, email, and payments, and not at all for anything complex, especially if it isn’t a super commonly used language or framework. Try coding a single game with AI that isn’t something done 10000 times already. It actually is impossible.

llamaimperative1y ago

The vast majority of code in the world is the former though

sarchertech1y ago

The last 20 years of programming tells me that this isn’t the case.

This is only the case for new projects which don’t yet have users. Add users to even the simplest project and it evolves into a special snowflake with never before seen edge cases.

That’s why low code solutions are great for prototyping but eventually always explode into a nightmare of complexity.

fcatalan1y ago· 12 in thread

This has been my experience with a recent try to guide the LLM to a complete implementation of a small internal tool. I had in an hour what would have taken me 4 or 5 to write. But after that, it was an endless loop of the LLM adding logging code to find some bug and failing to fix it, only to add more logging code and ineffectual changes and so on. The problem is that even after it's lost at sea, it's still answering in a completely confident and self assured tone, so when you decide to take matters in your hands you might be too far gone from sanity and have an unfixable mess in your hands. I guess I can go back to where it strayed and retake it from there, but by now the experiment seems to be a failure.

almog1y ago

Back in early 2023 I tried to write a tool to do my taxes based on my broker's CSV files. Since I wasn't familiar with how the data was structured, I let the LLM lead me while building this in incremental steps. The result was not just buggy, it simply failed to detect the relationships in the data (multiple somewhat implicitly embedded tables that needed to be joined). Even after I pointed this out, it failed to handle it, getting stuck in the same kind of loop you described.

To this day, no LLM that I tried passed this task of leading the development while detecting the underlying structure of the data.

morsecodist1y ago

At least in my experience, as soon as something goes a little wrong it just gets worse from there. The more of its confusion and contradictory information is in the chat history, the worse it gets. It also has to make changes to the code, so you accumulate these spurious changes and the problem gets more confusing. I've had some luck starting over with a new chat asking what is wrong, but if that doesn't work I just assume I'm on my own.

diggan1y ago

I've found that quality degrades really quickly after just the first reply, for some reason. They all seem heavily biased towards one-shot correct answers, and as you say, they go down the wrong path really quickly if you even get the first message slightly wrong.

I tend to restart chats from the beginning pretty much all the time, because of this.

iamflimflam11y ago

I’ve also found this to be the case. Starting a new chat or Cursor Composer session puts things back on the right track. Also, prompting is really important. A lot of people just seem to think they have some kind of oracle - “fix the bug” - how is anything supposed to work from that?

KronisLV1y ago

> But after that, it was an endless loop of the LLM adding logging code to find some bug and failing to fix it, only to add more logging code and ineffectual changes and so on. The problem is that even after it's lost at sea, it's still answering in a completely confident and self assured tone, so when you decide to take matters in your hands you might be too far gone from sanity and have an unfixable mess in your hands.

I wonder how much better or worse things would get, if we took the human factor out of the loop. Give the LLM the ability to run tests and see the results, then iterate on its own output and branch off with different approaches, gradually increase the temperature etc.
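The loop described above (run the tests, feed the output back, try again with a gradually higher temperature) can be sketched roughly as follows. This is a minimal illustration, not any existing tool's implementation: `propose_fix` is a hypothetical callback standing in for whatever LLM call edits the code, and the attempt limit and temperature schedule are arbitrary assumptions.

```python
import subprocess
from typing import Callable, Optional


def run_tests(command: list[str]) -> tuple[bool, str]:
    """Run the test command and return (passed, combined output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def iterate_until_green(
    propose_fix: Callable[[str, float], None],  # hypothetical: asks an LLM to edit the code
    test_command: list[str],
    max_attempts: int = 5,
    start_temperature: float = 0.0,
    temperature_step: float = 0.2,
) -> Optional[int]:
    """Return how many fixes were applied before the tests passed, or None on giving up."""
    temperature = start_temperature
    for fixes_applied in range(max_attempts):
        passed, output = run_tests(test_command)
        if passed:
            return fixes_applied
        # Hand the failure output to the (hypothetical) model and let it change the code.
        propose_fix(output, temperature)
        # Raise the temperature a little so later attempts explore other approaches.
        temperature = min(1.0, temperature + temperature_step)
    # Check the result of the final fix before giving up.
    passed, _ = run_tests(test_command)
    return max_attempts if passed else None
```

The hard cap on attempts is the important design choice here: without it, the loop reproduces the "endless logging and ineffectual changes" failure mode, only unattended.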

Maybe it’d turn out that you need 10 LLMs running in parallel for an hour to fix something, or perhaps even 100 would never stumble upon a solution for a particular type of problem. And even then I wonder whether it’d get better if you fed it your entire codebase or the codebases of the entire libraries or frameworks that you use (though at that point you’re either training it yourself or are selectively finding and feeding the correct bits not to exceed the context).

renewedrebecca1y ago

But why? What is there to be gained in all of this work around the inherent limitations of this technology?

KronisLV1y ago

Exploration of what’s possible and what’s not, identifying whether the weaknesses can or cannot be addressed.

A bit like traditional autocomplete can help streamline familiarising oneself with various libraries, a clear step ahead when compared to just needing to dig through documentation as much.

Maybe there’s a class of code problems that LLMs can be decent at solving, given the ability to iterate, verify solutions and what works or doesn’t, perhaps with 10x more compute than is utilized in the typical chat mode of interaction though.

whattheheckheck1y ago

Get more people into computer science. Knuth said early on in his career he thought he needed to make the computer faster or cheaper, but really it was about getting more users. Anyone can program. Or try to, then learn about computer science.

williamcotton1y ago

Part of the skill in using these tools is recognizing when it spins off the rails and backtracking immediately. Most of the time something can be gleaned from that wrong approach which can then guide further attempts.

joshstrange1y ago

This is my experience with Aider. When I first started using it, I turned off the auto git commits, but I’ve since turned them back on because they serve as perfect rollback points. My personal style is only commit once I have a feature fully working but with Aider it's best to have it commit after each exchange.

I've gone 2-6 steps down a path before realizing this isn't going to work or the LLM is stuck in a loop. I just hard reset back to the first commit in that chain and either approach the task differently or skip it if it wasn't really that important.

fragmede1y ago

you're not graded on getting the LLM to output perfect code, the point is to get the code in git and PR'd. If your LLM tooling doesn't automatically commit to git so you can trivially go back to "where it strayed" you need to find a better tool. (My current favorite is aider)

It's a tool not a person. When was the last time you got mad at a hammer for being smug?

sebastiennight1y ago

To drive the point home, hammers are quite smug.

Mine always thinks it nailed it on the first try, and it's pretty hard-headed when you point out mistakes.

If you can't work around those limitations, you're screwed.

alwinaugustin1y ago· 7 in thread

The current state of so-called AI does not provide much meaningful assistance in software development beyond basic tasks such as explaining workflows, breaking down thought processes, and performing simple conversions. I believe that generative AI, in its current form, is not true artificial intelligence. Rather, it is a sophisticated prediction engine that lacks genuine reasoning or understanding.

True AI should be capable of comprehending problems and devising its own solutions, rather than merely generating statistically likely outputs. Until AI reaches that level of cognitive ability, its applications in the real world remain limited, and much of what we see today is largely hype.

Tokenization and embeddings merely help models predict the most probable next token, a process that is executed at scale using vast computational resources. This is not intelligence but large-scale probabilistic prediction. The terminology used in computer science, especially in recent years, can often be misleading.

lubujackson1y ago

I think this comment underlines the biggest difference between people that say AI is a transformative tool and people that say it is nowhere close to working as expected.

I never expect some magic "understanding" to ever arrive, but doing remedial pattern matching is already a hugely valuable power that frees up humans to do more interesting work. This is how I use current AI - spitting out 5-line functions that I could spend 5 minutes writing, that it can do in 3 seconds, and that take me 10 seconds to review. Like "check for circular references" or "use Django ORM to write a query for all categories that have this flag for users that have this permission".
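To make the scale of such a task concrete, here is a minimal sketch of what a "check for circular references" helper might look like. The representation (a dict mapping each item to the items it references) and the function name are assumptions made for illustration; the comment does not say what data structure was involved.

```python
def has_circular_reference(graph: dict[str, list[str]]) -> bool:
    """Return True if following references from any node leads back to that node."""
    visiting: set[str] = set()  # nodes on the current depth-first path
    done: set[str] = set()      # nodes already fully explored

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, []))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)
```

It is short enough to review in seconds, which is the point being made. The recursive form would hit Python's recursion limit on very deep reference chains, so a reviewer would still want to know roughly how large the real data is.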

It doesn't "write the app" or solve difficult problems for me (unless it is some configuration issue). I can paste in an error code and save myself a few minutes of manual debugging. If I add a new parameter to a function it prefills the correct type definition and things like that. These are all micro-improvements but add up to a lot of saved time. Some people have success with editing across files but I rarely even try that - it excels at solving discrete, repeatable bits of work with tidy solutions so I use it for that.

Until AI can return "I don't know" or, better, "did you want it this way or that way?" it will be severely limited. Yes, it acts like a junior dev in some ways, but a junior dev that never asks any questions, which is not the junior dev you ever want to give important work.

notjoemama1y ago

Do we really want this? As soon as possible, employers will fire software engineers and replace them with AI. I’m positive they will not care about what AI can do, only how many salaries they can eliminate and still achieve the same results. You and I will not be the inheritors of AI.

jfengel1y ago

I think that by the time AI can genuinely replace software engineers, a lot else in society will change.

It's hard to predict what it will look like. I could write both utopian and dystopian narratives and I can pretty much guarantee they'll both be wrong. Not "in the middle" but something unexpected, the way nobody predicted cat videos or doomscrolling.

But you are almost certainly right that we will not be the inheritors.

fellowmartian1y ago

Yes, because employers will also be replaced by AI. Technology penetration won’t stop at some arbitrary boundary, it will go all the way through to its logical conclusion. We have a chance at a qualitatively better world, but we’ll need to act and push for new economic systems - when the time comes.

pavel_lishin1y ago

Maybe I read too much science fiction, but my first thought when speaking about "true AI" isn't the worry that a lot of us will get fired, it's the worry that we'll have created an army of digital slaves.

karamanolev1y ago

"Army of digital slaves" doesn't really sound that bad when I think about it. As long as it's your army and not your adversary's army... In what ways do you think "an army of digital slaves" is bad?

lazide1y ago

“and still achieve the same results”.

That is the part that won’t actually happen, at least pretty quickly.

zug_zug1y ago· 5 in thread

This roughly mirrors my experience so far. Mind you I'm an extremely qualified engineer who has worked at FAANG.

Except I'd add that as one gets experience working with the AI I can only assume they'd get much better at making it go smoothly. For example, I wouldn't manually rewrite localhost, I'd tell the AI "Why is localhost everywhere? Will this work if I deploy to a droplet?" and it will fix it for you.

Also I just paste error-messages directly into the AI and it usually knows how to fix them.

Sometimes it's net positive, sometimes it's net-negative due to creating a mess that's really hard to get out of or debug. But I imagine it's only a matter of time until the scopes in which it's cost-effective go up.

I don't like that AI is a threat of huge monopolistic and job-reducing potential, but I don't think downplaying it is a long-term strategy to combat that.

skydhash1y ago

> For example, I wouldn't manually rewrite localhost, I'd tell the AI "Why is localhost everywhere? Will this work if I deploy to a droplet?" and it will fix it for you.

The solution is multi occur (emacs), quickfix list (vim), or any editors that have whole project find and replace.
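For readers without one of those editors at hand, the same project-wide search can be approximated in a few lines. This is a rough sketch under stated assumptions: the function name, the default needle, and the list of file suffixes are all made up for illustration, and it only reports matches rather than replacing them.

```python
from pathlib import Path


def find_occurrences(root, needle="localhost", suffixes=(".py", ".js", ".ts", ".json", ".yml")):
    """List (file, line number, line text) for every line under root that contains needle."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only look at regular files with one of the (assumed) source suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip files that are unreadable or not valid UTF-8
        for number, line in enumerate(lines, start=1):
            if needle in line:
                hits.append((str(path), number, line.strip()))
    return hits
```

Listing the matches first, and deciding per match whether the value should come from configuration, keeps the change deterministic and reviewable.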

MaKey1y ago

Which will also be much faster because you don't have to worry about sanitizing your code before sending it to an LLM or that the LLM made a mistake somewhere along the way.

GeoAtreides1y ago

> I'm an extremely qualified engineer who has worked at FAANG.

> I just paste error-messages directly into the AI

...

scarface_741y ago

I find it funny that commenters on HN actually think their having past or current experience working at a FAANG is some sort of signal for two reasons.

On HN especially, that’s really nothing novel, many of us have (including me) and the only thing that it takes to get into one as a software engineer is memorizing the solution to coding problems.

When I’m hiring - mostly for green field initiatives - coming from BigTech is usually a negative signal for me.

zug_zug1y ago

I'm not sure what your point is here...

Where the author went wrong in this post is that he tried to interpret an error ("I was asking claude to solve the wrong problem"), was wrong, and then wasted a lot of his own time.

I really think it's best practice when describing a problem to anybody that you start with what you observe and then if you want to hint your suspicions you call those out afterward as such. If you're very confident the LLM is going down a wrong path, you can ask it things like "How would I test the theory that environment variables aren't set in my docker container?"
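A theory like that can often be tested directly rather than debated. As a minimal sketch, a script along these lines could be run inside the container to report which expected variables are missing; the function name is hypothetical and the list of required variable names is something you would supply yourself.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names from `required` that are unset or empty in the environment."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

An empty result rules the theory out, and a non-empty one is a concrete observation to paste into the next prompt instead of a guess.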

BoredPositron1y ago· 5 in thread

Can anyone explain why everyone is so hyper focused on speed? 500 images per second, 100 minutes of video in 30 minutes, a thousand lines of code per hour. Who is going to consume all that?

martin-t1y ago

Most of what generative models produce is shit so they have to produce a lot in hopes _some_ of it is OK-ish.

It's also about responsiveness. LLMs produce junior-level quality of code at a rate of hundreds of lines per minute. I need it to produce enough to spot where it's completely wrong as quickly as possible so I can change the prompt.

It's like an edit-compile-run cycle which you also need to be fast or you lose attention.

I was tempted to say it's another _step_ in the edit-compile-run but often the code is so bad I don't even bother compiling.

deergomoo1y ago

I'm firmly of the belief that most software would benefit immensely from us all slowing the hell down and putting more thought into what we build. But it would appear stability and a focus on core strengths doesn't sell nearly as well as endless new features for the marketing sheet added as quickly as possible.

geor9e1y ago

"Who is going to consume 1000 lines of code per hour?" he types into his mass-manufactured thinking machine running an advanced operating system, before clicking reply, sending it across a global mesh of said devices.

osmsucks1y ago

Other machines.

JTyQZSnP3cQGa8B1y ago

The images are almost good but still in the uncanny valley. The code is almost good but full of bad practices and hidden bugs and undefined behavior. Since most AI grifters are neither coders nor artists, all they can do is produce more more more capitalism-style.

stuaxo1y ago· 4 in thread

Oh look, a load of future work to fix these.

Why is this just like the last cost cutting exercise where the cheapest people in India produced a lot of "interesting" code.

SunlitCat1y ago

Way back in 2000 (or even before that, can't remember!) I wanted to get into winsock programming. I found a page where someone from India explained that with examples.

The variables, functions and so on had names like:

a aa aaa b bb bbb

It helped me to grasp the basic concept, but was kinda hard to follow, tho. :D

nailer1y ago

You can update requirements, educate developers, and fix bad code with an LLM many orders of magnitude faster than you can with Wipro.

llm_trw1y ago

Because ignoring a heroic effort from all the women in India the number of Indian developers does not double every 4 years.

The number of flops a gpu can output on the other hand does.

scarface_741y ago

See also: almost every bespoke internal app written in FoxPro, VB, Excel with VBScript, etc

owenthejumper1y ago· 4 in thread

I have had great experience with Claude for coding, but you really need to be a programmer yourself, to be able to divide the problems into manageable chunks.

bboygravity1y ago

Same here, I really don't get all the "it's totally useless for programming" posts on here.

It makes me think many people haven't taken the time to actually learn to use the tool.

It just feels like they tried Copilot or ChatGPT for 5 minutes last year and concluded that all LLMs are useless and will be useless forever.

It makes me wonder if those people know that Claude 3.5 sonnet projects and/or Cursor with Claude exist?

Do they not appreciate some help to document their code? Do they never need to write or quickly understand scripts or code in one of the 100's of languages/stacks they're not too familiar with that they might encounter in the wild? How to get out of yet another git mess? Build a proof of concept in an hour that would've taken you days? A refresher on how to set up x toolchain to get started asap (the nr 1 hardest thing in programming :p) etc etc.

chillingeffect1y ago

Same here. I see these tools as teaching me patiently and challenging me (unwittingly) in areas where i'm out of my depth. When i'm lucky they will do simpler stuff for me, but for $40/month, I don't feel entitled to a SaaS-unicorn-terraformer.

MaKey1y ago

> Do they not appreciate some help to document their code?

How does an LLM help there? What the code does should be obvious by looking at it, WHY it was written that way is the interesting question. Answering it often requires more context and domain knowledge.

> Do they never need to write or quickly understand scripts or code in one of the 100's of languages/stacks they're not too familiar with that they might encounter in the wild?

I'd rather take the time to do it myself because if I'm not familiar with a language/stack I won't be able to spot mistakes made by the LLM as easily.

> How to get out of yet another git mess?

Learn to solve the git issue and apply the knowledge in the future so you don't rely on yet another tool.

> Build a proof of concept in an hour that would've taken you days?

I question the premise.

> A refresher on how to set up x toolchain to get started asap (the nr 1 hardest thing in programming :p) etc etc.

How often do you do that? I think it's worth spending the time to do it yourself so you get an understanding of what exactly you're doing there. When you're done you can document the process and come back to it next time.

bboygravity1y ago

What you're basically saying here is: you should just learn more and know more faster.

And what I'm saying is: that's exactly what LLMs are super useful for.

To answer your last question: about every 6 months or so. I'm a freelancer, I do a new project for a new client every 6 months on average. All of their toolchains, build systems, OS of choice for the dev machine, OS of choice for the SoC, documentation methods, PCB design tools, version management systems, release systems, testing frameworks are completely different per client and change constantly (even within the same company) depending on department and moment in time.

sega_sai1y ago· 4 in thread

It seems there is a battle of two opposite view-points. One is that LLMs are just dumb autocompletes with no ability to understand anything. Another is that LLMs can already right now be substitutes for programmers. I personally think it is neither, but for experts who know what they are doing it is a massive time saver. I.e. in cases you know what code you want to write, but it's tedious, LLMs can do for you. Also LLMs are great in cases where you are less familiar with a new API, language, but have generally good understanding of programming.

Despite my broadly positive view on usefulness of LLMs, I do not think they are good enough (yet) to build a full system from scratch without an expert supervisor. This should not IMO be used as a 'proof' they are dumb autocompleters.

deergomoo1y ago

> I.e. in cases you know what code you want to write, but it's tedious, LLMs can do for you

I feel like I'm living on another planet when I see this point. I have almost never in my career encountered the situation where actually typing out the code is the time consuming part. The time consuming part is knowing what code you want to write, running it in a variety of circumstances to gain confidence that it's correct, and iterating when it isn't.

Please don't think I'm saying you're wrong by the way—if anything this just shows how diverse programming can be as a career. But I see this point raised a lot and it doesn't match my experience at all.

deadbabe1y ago

Experts who know what they are doing have long had alternatives beyond LLMs to make their work faster.

They have open source libraries, stack overflow, tutorials, documentation, simple code generator tools and snippets.

The speed up we’re seeing is from LLMs basically caching all those things into a huge mathematical model and retrieving information in summarized form ready for consumption.

And while speed is always nice, LLMs are expensive, require maintenance themselves to maintain relevant context, are still error prone, and terrible at true innovation.

In a few years we’ll be talking about the big “AI crash” and “what went wrong” when it has been obvious to experts all along. Winter is coming.

sega_sai1y ago

I am sorry, but the comparison of 'stack overflow' and tutorials to LLMs is bizarre. The amount of time to get to the answer from LLMs is drastically shorter. And claiming that they only 'cache things' is just wrong. They are certainly capable of correctly answering things that were not directly in their training set.

deadbabe1y ago

Do you have any examples of a question you could ask to an AI right now that you couldn’t find from a basic search on stack overflow and Google? Didn’t think so.

fredgrott1y ago· 2 in thread

the real revolution will be when an AI tool can just be powered by our laptop to use our own codebase as the input....

Until then it's just nonsense pretending to be something else...

fragmede1y ago

So, an M3 MacBook with 64GiB of RAM running Deepseek R1 Zero in ollama prompted via aider?

esafak1y ago

Coding assistants do use your code base.

emporas1y ago· 1 in thread

> LLMs are useless if you don’t understand the context

> AI can be worse than useless when you don't understand the underlying technologies

I made a saying about this some weeks ago: "A.I. can make the road for you, but you have to know where you are going". In Greek it sounds a little bit better.

Also code is the truth, but it is not the only truth. The underlying computer, the network infrastructure and other things have an effect on the code. So, there could be a saying in addition to the first: "A.I. can make the road for you, but you have to test the road".

fenomas1y ago

I put it: "copilot doesn't save me much thinking, but it saves a ton of typing".

yodsanklai1y ago· 1 in thread

Maybe AI will shine when working with strongly typed languages. Most errors can be caught at compile time avoiding debugging hell.

bandrami1y ago

There's not enough of a corpus out there for the LLMs to snarf up

HL33tibCe71y ago

This. Great, AI can produce code. But it produces code without inducing understanding of the code in the person who wrote (or rather supervised the production of) it, which is half the point.

At some point AI will probably be good enough that this won’t matter. But it feels like we’re still a long way off that.

cduzz1y ago

Programs are communication between 2 loosely coupled audiences -- the humans who have to maintain / modify the code and the computer that gets to run the code.

Human language, used to convey ideas to other humans, is imprecise. It's fine that it's imprecise because the media (humans) have both good error correction and a reasonable set of global defaults.

Computer languages require enormous precision because they're some mechanical translation to a set of machine code runtime.

Perhaps you can train an LLM on lots of code, and it'll find semantic relationships between some clever code it's been trained on and your specific request. Perhaps not, and it'll just give a dumb answer or an incorrect answer (ideally some code copilot will actually try running the candidate answer code against your specific ask?) -- but once the answer gets complex you run into the "it's much harder to debug code than write it, so don't write code that's almost too complex for you to understand" problem.

At work, I constantly have to remind people "don't use math data structures for identities" "but int is smaller" "Are you ever going to want the 95th percentile customerID?" "no that's silly" "then it isn't a number". Or I get to constantly remind people "a string with lots of curly braces and quotes isn't necessarily json; if you're not using a serialized API and just sending bytes to stdout someone else has to parse it" "but I'm using a logging library" "does anything else ever send stuff to stdout while your logging library is running?" "oh yes, we're going to open a ticket to debug that." So I'm not optimistic that running code written by a machine is long-term viable.
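The "then it isn't a number" point can be sketched in a few lines of Python. This is only an illustration under assumptions: `CustomerId` and `same_customer` are hypothetical names, not anything from the codebase described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerId:
    """Opaque identifier (hypothetical): supports equality and hashing, no arithmetic."""
    value: str


def same_customer(a: CustomerId, b: CustomerId) -> bool:
    # Comparing identities is meaningful; adding or averaging them is not.
    return a == b


ids = [CustomerId("00417"), CustomerId("00418")]

try:
    total = ids[0] + ids[1]  # type: ignore[operator]
except TypeError:
    total = None  # arithmetic on an identity is rejected at runtime
```

Keeping the value as a string also preserves things like leading zeros, which an `int` would silently drop.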

That said -- there are situations where machine generated code works -- I think it's been a long time since anyone manually drew masks for etching dies when making CPUs.

jbirer1y ago

Anyone who has ever worked with VCs or shareholders before knows that, if you tell them the reality and limitations of something, they will either fire you or ignore what you say. They have been desperate to remove the leverage programmers have due to their skill and replace us with AI that they don't have to pay salaries to. All we can do at this point is just take VC money promising them exactly what they want to hear, that they will be able to replace us with a NLP model. Sometimes you just can't save people but you can profit from their voluntary fall from the cliff?

keyle1y ago

If you don't know why it works when it works, you won't know why it doesn't work when it doesn't work.

The key issue here was staying on top of the AI's help.

Use AI wisely: as an assistant, not as a drunken lead developer.

rchowe1y ago

I played with OpenHands for a few days (using gpt-4o since I already had an OpenAI account). I found it to be decent at writing new code, but then it had a hard time making changes when there was a lot of repetitive code (in a TypeScript / React project that I had it create with vite).

One of the interesting things about OpenHands is that you can see what the AI is doing in the terminal window where you launched it. Since it can't really load the whole codebase into its context window, it does a lot of grepping files, showing 10 lines on either side of the match, and then doing a search and replace based on this. This is pretty similar to what a human might do: attempt to identify the relevant function and change it.
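A rough Python sketch of that grep-then-replace loop, for illustration only. This is not OpenHands' actual implementation; `grep_with_context` and `replace_in_line` are made-up helper names and the sample source is invented.

```python
import re


def grep_with_context(lines, pattern, context=10):
    """Return (match_index, window_start, window_lines) for each matching line."""
    regex = re.compile(pattern)
    hits = []
    for i, line in enumerate(lines):
        if regex.search(line):
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            hits.append((i, start, lines[start:end]))
    return hits


def replace_in_line(lines, index, old, new):
    """Return a copy of lines with old replaced by new on one line only."""
    updated = list(lines)
    updated[index] = updated[index].replace(old, new)
    return updated


# Invented sample file contents, one string per line.
source = [
    "function total(items) {",
    "  let sum = 0;",
    "  for (const item of items) sum += item.price;",
    "  return sum;",
    "}",
]

hits = grep_with_context(source, r"item\.price", context=1)
match_index, window_start, window = hits[0]
patched = replace_in_line(source, match_index, "item.price", "item.price * item.qty")
```

The window is all the tool "sees", which is probably why lots of near-identical repetitive code trips it up: several windows look the same.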

I think I might have better luck with a simpler project, e.g. a Sinatra or Flask app where each route is relatively self-contained. I might give it or Cursor another try in the future when the tech has progressed a bit.

morsecodist1y ago

I appreciate posts that are about practical usage of AI and its strengths/weaknesses and the kind of conversation it generates. Conversations about AI are tough for me to navigate because there are camps of people that seem very invested in AI being either omniscient or completely useless. I regularly see people saying that AI is at the level that it can replace engineers or build whole apps. When I try this with state of the art models, I am seeing results that are nowhere close. That said, I still use AI every day during my development and I have a flow I think makes me way more productive. I want more conversations like this about the mechanics of using AI as it currently is, and honestly evaluating its strengths and weaknesses without getting into hypothetical debates about the future or whether or not the AI "understands".

cushychicken1y ago

One thing I think would have helped the author: write a spec first.

Seriously. It seems stupid. But AI works a lot better with a written spec.

The incredible thing is that the AI can actually be an excellent resource for writing the spec. And it will actually produce better code when you feed the spec back into said AI!

The current generation of AI seems to have fooled a lot of people into thinking that somehow you can jump straight to coding. (Well, you can, and it will probably work if you want to make something small or limited in scope.) Not so!

But, on the bright side, it’s just as good at design as code if you ask the right questions!

I say this having used 4 and 4o extensively in this manner. Just started using sonnet3.5 in this way in the last month or so, and it is amazing at this.

n_ary1y ago

The issue with AI is that it generates what it is trained on. Most publicly available coding content and examples are just docs or blogspam (geeksforgeeks/javapoint/whatever) where mostly surface-level code is peddled. Even many small-scale OSS projects do not follow best practices or have a good code base, just enough to get done whatever is needed. Now when you train AI on such data, it’ll excel at reproducing (statistically) the same kind of code.

Once the quality of training data improves(somehow getting access to high quality codebase behind corporate walls by promoting these assistants and ingesting the codebase), the output improves.

There was a popular saying, garbage in garbage out.

siva71y ago

It delivers debugging hell if you don't know what you do which is usually the case for inexperienced developers. It assists experienced developers very well who can sort through which parts of output are useful from the AI and which not so.

ibloomt1y ago

Heh, after decades of functional programmers being the "well, actually..." crowd at every conference, turns out they were right all along. Just for the wrong reasons!

The pitch:

- AI generates tons of plausible-looking garbage
- Static types catch garbage at compile time
- OCaml/F#/Haskell fans quietly sipping tea in the corner

The irony? We spent years debating static vs dynamic typing for human developers. But the killer use case may end up being catching AI hallucinations.

Finally, a business case for monads that doesn't require a PhD!

Time to dust off those Haskell books. Who knew safety could be so profitable? Plot twist: Category theory becomes a required interview question by 2025

peterkelly1y ago

I dream of a world in which more investment is put into creating better programming languages and runtime environments than trying to use LLMs as a way of coping with the complexities of current systems.

avidphantasm1y ago

I was recently experimenting with local-only LLM coding assistants in JetBrains products. They did speed things up a bit, but I quickly realized that they were essentially automating the creation of copy-paste errors, resulting in time lost to debugging errors I never would have introduced myself, so I stopped using them.

iamflimflam11y ago

It’s not mentioned anywhere in the post. But would be good to hear what the total time was including all the problems.

SunlitCat1y ago

Well, still trying to get into nvrhi, I went on to ask ChatGPT to write me an example program using it.

To make it short, it got better when I made a project, uploaded the headers and docs of it as project files and moved my chat into that project as well.

That said, AI can help you but needs a lot of support from you to do things somewhat right.

senko1y ago

Content marketing for a new text editor thinly disguised as AI rage-bait.

HN fell for it hard - 156 points, 180 comments (as of this writing).

Well done Nick! :) And congrats on launching Codescribble! Hope to see a "how my post on AI grew my userbase" followup in a few weeks!

3ptow1y ago

If you drop all pretenses and use a photocopier to steal code directly instead of performing an elaborate laundering step, you will not have these issues.

wlindley1y ago

Garbage in, garbage out. Code spewed by a random generator that has not the slightest understanding of what it is doing, whacked at by a hammer until it seems to be working.

What is this supposed to produce other than a mass of bugs and vulnerabilities? "A.I." is utter garbage and always will be, it is foolish to think otherwise.

nowittyusername1y ago

AI allows for more people to be more productive and therefore code more and produce more lines of code. That alone means more debugging needs to be done. When more people are doing anything, within that realm of action there is naturally more liability, simply because of a larger participation in those actions.

joshstrange1y ago

AI tools are just that, tools. I’ve said this since the very beginning of LLMs. I’ve yet to see anything change my mind. Aider/Devin/Copilot/Cursor/etc, all the different flavors of LLM tools are great but if you don’t know what you’re doing they are going to get stuck in a loop/corner/bad-path. Sometimes it takes 2-6+ exchanges before you realize it’s lost the thread which is why I love Aider’s “auto git commit” feature (defaults to on). You can always jump back X steps if you realize the LLM is lost.

You also have to get a good feel for when it’s best if you make a change vs the LLM. Aider doesn’t handle new files and moving around massive chunks super well. It can do it but if I want to rename something everywhere or break out components/types/etc into different files then I know I should be doing that in my IDE myself. Same for little syntax errors when a diff the LLM makes isn’t quite right.

I spent a few nights last week using LLMs to help build a chrome extension to match my Amazon transactions with my YNAB transactions for the purpose of updating the memo field in YNAB with the item names I bought from Amazon to speed up my categorization and serve as history of what I bought (previously I did this whole process manually). I think it really helped and made the whole process go much faster.

It really excels (for me) in UI. I’d like to think I’m pretty competent at writing code/logic but I’m not great at UI. In many projects I get bogged down when it comes to UI. If I get stuck coming up with a UI or I don’t like how something looks I can lose motivation to continue forward on it. With Aider I can ask for UI and while it might be abhorrent to a designer I think it looks pretty damn good (better than what I could do) and lets me focus on the logic. Aider also lets me try radical changes knowing I can easily reset back a few steps if it doesn’t work out.

I’ve said many times at work that a huge power of LLMs is taking something that would take 30-60min down to <5min, specifically around things like little scripts to investigate a problem or get more details. For example, I might have a log that I can see there is data in that I want to extract. I know I can write a chained/piped command of sed/awk/grep/cut/sort/uniq/etc but it’s going to take some trial and error as well as time. With an LLM I can bang out the full command in 1-3 exchanges.
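For a sense of what such a throwaway script looks like, here is a small Python stand-in for a grep/cut/sort/uniq -c pipeline. It is a sketch under assumptions: the log format, the `LOG` sample and the `count_error_codes` name are all invented for illustration, and it needs Python 3.8+ for the `:=` operator.

```python
import re
from collections import Counter

# Invented sample log; a real one would be read from a file.
LOG = """\
2024-01-05 10:00:01 ERROR code=E42 user=alice
2024-01-05 10:00:02 INFO  code=OK user=bob
2024-01-05 10:00:03 ERROR code=E17 user=carol
2024-01-05 10:00:04 ERROR code=E42 user=dave
"""


def count_error_codes(text):
    """Count code=... values on ERROR lines, most frequent first."""
    pattern = re.compile(r"\bERROR\b.*?\bcode=(\w+)")
    counts = Counter(
        m.group(1) for line in text.splitlines() if (m := pattern.search(line))
    )
    return counts.most_common()


for code, n in count_error_codes(LOG):
    print(n, code)
```

Nothing here is hard to write by hand; the point is that it is the kind of thing that costs trial and error on the regex and can be thrown away once the theory is checked.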

Same deal with visualizing some piece of data in the logs (note: yes, we use Prometheus/Grafana but not everything can go in there and for new bugs/issues in the field I’m normally dealing with something we haven’t seen before and thus haven’t setup monitoring/alerting on). I’ve had LLMs churn out simple HTML/JS/CSS files that I can feed data into “graph all instances of this happening if X > Y and time is between A and B, etc”.

Again, I can write this stuff from scratch but often don’t do it in practice because the ROI isn’t guaranteed. In the middle of a production issue do I want to waste 10-30+ min writing the script to see if I can prove a theory? No, it’s not worth it if it doesn’t pan out, but if I’m using an LLM and it takes me less than five minutes then I can throw a lot more stuff at the wall to see if it sticks.

yapyap1y ago

Never believe the snake oil sellers

macNchz1y ago

I’ve built and iterated a bunch of web applications with Claude in the past year—I think the author’s experience here was similar to some of my first tries, where I nearly just decided not to bother any further, but I’ve since come to see it as a massive accelerant as I’ve gotten used to the strengths and weaknesses. Quick thoughts on that:

1. It’s fun to use it to try unfamiliar languages and frameworks, but that exponentially increases the chance you get firmly stuck in a corner like OP’s deployment issue, where the AI can no longer figure it out and you find yourself needing to learn everything on the fly. I use a Django/Vue/Docker template repo that I’ve deployed many production apps from and know like the back of my hand, and I’m deeply familiar with each of the components of the stack.

2. Work in smaller chunks and keep it on a short leash. Agentic editors like Windsurf have a lot of promise but have the potential to make big sweeping messes in one go. I find the manual file context management of Aider to work pretty well. I think through the project structure I want and I ask it to implement it chunk by chunk—one or two moving pieces at a time. I work through it like I would pair programming with someone else at the keyboard: we take it step by step rather than giving a big upfront ask. This is still extremely fast because it’s less prone to big screwups. “Slow is smooth and smooth is fast.”

3. Don’t be afraid to undo everything it just did and re-prompt.

4. Use guidelines—I have had great success getting the AI to follow my desired patterns, e.g. how and where to make XHRs, by stubbing them in somewhere as an example or explicitly detailing them in a file.

5. Suggest the data structures and algorithms you want it to use. Design the software intentionally yourself. Tell it to make a module that does X with three classes that do A, B and C.

6. Let the AI do some gold plating: sometimes you gotta get in there and write the code yourself, but having an LLM assistant can help make it much more robust than I’d bother to in a PoC type project—thorough and friendly error handling, nice UI around data validation, extensive tests I’m less worried about maintaining, etc. There are lots of areas where I find myself able to do more and make better quality-oriented things even when I’m coding the core functionality myself.

7. Use frameworks and libraries the AI “knows” about. If your goal is speed, using something sufficiently mainstream that it has been trained on lots of examples helps a lot. That said, if something you’re using has had a major API change, you might struggle with it writing 1.0-style code even though you’re using 2.0.

8. Mix in other models. I’ve often had Claude back itself into a corner, only to loop in o1 via Aider’s architect mode and have it figure out the issue and tell Claude how to fix it.

9. Get a feel for what it’s good at in your domain—since I’m always ready to quickly roll back changes, I always go for the ambitious ask and see whether it can pull it off—sometimes it’s truly amazing in one shot! Other times it’s a mess and I undo it. Either way over time you get an intuition for when it will screw up. Just last week I was playing around with a project where I had a need to draw polygons over a photograph for debugging purposes. A nice to have on top of that was being able to add, delete, and drag to reshape them, but I never would have bothered coding it myself or pulling in a library just for that. I asked Claude for it, and got it in one shot.

namaria1y ago· 41 in thread

Coding is trying to order bytes into doing arbitrary stuff that is useful because of some transient conjunction of factors in the real world.

Only someone who has never built anything of significant complexity and utility can think that putting natural language encoding between you and the bytes is a net positive.

qwertox1y ago

The hype is around having AI replace your typing, that is, to code for you.

The hype should not be around replacing the typing, but in assisting your thoughts.

> Only someone who has never built anything of significant complexity and utility can think that putting natural language encoding between you and the bytes is a net positive.

The question is then if assistants are "sitting next to you", like a secretary and a mentor, or if they are sitting between you and the editor, as the thing you need to control.

barrell1y ago

I think it’s really good with the 101-level academics side. Learning the basics of anything through a conversational manner can be massively helpful.

2 more replies

spacemanspiff011y ago

Where I find coding assistants the most useful _is_ in writing code that I already want to write.

Ala - I need to write this unit test, it has these checks, it validates these methods.

All of these are things I can easily do myself, are easy to validate correctness, but if I were to write them would consume my limited mental energy for the day.

3 more replies

skydhash1y ago

1 more reply

api1y ago

The hype in some areas is around replacing coders, which is a fantasy without orders of magnitude better systems.

1 more reply

mmusson1y ago

ozim1y ago

Hard disagree.

Assistant should not help you thinking, any AI agent/tool should be doing what you want with minimal amount of explanation.

I can write my own Twitter clone and if I have to write prompt after prompt it is going to take me more time and more typing so it is useless.

A person that cannot write their own Twitter clone is not going to prompt their way out to having working and deployed Twitter clone.

insign_bit1y ago

This reminds me of something Dijkstra wrote almost 50 years ago in On the foolishness of "natural language programming" [1]:

> When all is said and told, the "naturalness" with which we use our native tongues boils down to the ease with which we can use them for making statements the nonsense of which is not obvious.

Although this is obviously not about LLMs, it's astonishing how many parallels can be drawn to today's usage of AI systems.

1: https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...

bwfan1231y ago

Wow, thanks for sharing this beautiful essay by a legend that captures the essence of the llm debate.

Indeed, math and other symbolism/formalism is a crowning achievement of humans.

bambax1y ago

Trusting an AI to code an app from start to finish seems crazy to me but hey... if some people can pull it off, good for them I guess.

roland351y ago

That's the type of thing AI is great for! I find AI is pretty decent at generating data-related code with Pandas, and since I only rarely use Pandas it saves me a ton of time relearning everything.

Where ai starts breaking down is how to effectively incorporate a new feature in a complicated existing codebase. That is where us engineers can continue to hold an advantage.

csomar1y ago

Claude 3.5 did it from the first shot.

kuschku1y ago

I literally just spent a full week on such a project. Respectfully, fuck people who don't read the docs/spec.

2 more replies

jason_zig1y ago

You still have to become a domain expert to debug it though?

1 more reply

danny_codes1y ago

Agree it's good for boilerplate, provided the thing you want to do is extremely basic / just setup. Once you need something slightly more complex it seems to break down rather quickly.

petercooper1y ago

> Only someone who has never built anything of significant complexity and utility can think that putting natural language encoding between you and the bytes is a net positive.

klabb31y ago

All those layers are stable and deterministic. You can open it up and check, in most cases, how the upper layer calls the lower layer, should it be necessary.

In fact, it’s more like the antithesis of the reproducible builds movement. It’s introducing a proprietary networked high latency chaos agent into the critical path.

petercooper1y ago

3 more replies

ConspiracyFact1y ago

> It’s introducing a proprietary networked high latency chaos agent into the critical path.

withinboredom1y ago

sure; but these languages we speak to computers in are deliberately non-ambiguous and lack nuance. Natural language is both ambiguous and nuanced.

"Have a nice day!" can mean many things (an insult, or sincere, for example).

2 more replies

s_dev1y ago

AI is a tool like any other. Autocomplete on steroids -- markov chains taken to the extreme.

We already put natural language between us and the bytes. Hence why most keywords and variable names (a hard part of computer science) are in simple English and it is considered a net positive.

namaria1y ago

nailer1y ago

It’s interesting. I would’ve agreed that ‘driving intent from natural language is something that LLMs have fallen far short of’ maybe a month ago.

The issue isn’t that LLMs are terrible, it’s that software like Cursor is buggy and poorly written.

It should know that I don’t want to use code from an old version of the library I am using because the new library I am using is already in my projects dependencies.

It should let me set up preferences for different programming languages. And preferences for all programming languages.

Short version: LLMs rule, the software is just shitty.

4 more replies

llm_trw1y ago

>The memory and compute requirements to develop and run these models make no sense

There is the story that von Neumann flew off the handle the first time he saw an assembler.

>>How dare you waste compute cycles on this frivolity? Just use machine code like everyone else.

1 more reply

JTyQZSnP3cQGa8B1y ago

> AI is a tool like any other. Autocomplete on steroids

> variable names (a hard part of computer science)

Naming is for software engineering, not CS. One more confusion by people who want to sell us AI at all cost.

criley21y ago

These modern models and tools can solve nearly every single leet code problem faster than you. They can do every single Advent of Code problem likely 10X-100X faster than you can.

I get that "AI" is a buzzword that is pumping valuations and making business people see $$$.

But coding assistants are not that. For many programmers, they are quickly becoming valuable tools that do in fact speed up development.

4 more replies

dagw1y ago

Autocomplete works by analyzing the official API and interface, it's completely different than AI

I've found that just uploading the documentation of the API or library you are working with before asking the AI questions about it makes a huge difference in the quality of its output.

sixstringtheory1y ago

>> variable names (a hard part of computer science)

>Naming is for software engineering, not CS.

I figured they were referencing the “two hard problems of computer science”, those two being naming things, cache invalidation and off by one errors.

Everybody knows the hardest problems in software engineering are assembling promo packets and building consensus on number of spaces per indent.

1 more reply

lordmoma1y ago

AI is good with Rust programming, because the compiler will keep it from hallucinating!

aleph_minus_one1y ago

> Hence why most keywords and variable names (a hard part of computer science) are in simple English and it is considered a net positive.

2 more replies

zahlman1y ago

>Hence why most keywords and variable names (a hard part of computer science) are in simple English

"Natural language" is about far more than individual words.

d_tr1y ago

Bad take. Identifiers are just labels.

nailer1y ago

I don’t think anyone disagrees that identifiers are labels. If you’re claiming that these labels are unimportant, I’d be interested in why you think this.

2 more replies

bccdee1y ago

seanmcdirmid1y ago

myth20181y ago

I agree, and it's sad that, despite all those pitfalls, some companies and CEOs will keep pushing the idea that human programmers can be replaced by AI.

jstummbillig1y ago

That's obviously wrong, as demonstrated by the engineers invested in building tools to enable that.

gessha1y ago

A lot of engineers invested in building crypto stuff and we didn’t go far in personal banking. Hype-driven development is not guaranteed to succeed.

jstummbillig1y ago

trinix9121y ago

Which has been proved again and again to only get you 70% there [1].

[1] https://addyo.substack.com/p/the-70-problem-hard-truths-abou...

dartos1y ago

Popularity doesn’t demonstrate anything

andix1y ago· 27 in thread

My social feeds are full of tech bros who keep telling people AI codes everything for them. AI obviously has some impressive coding skills, but for me it never really worked well.

So is this just an illusion they create, or is it really possible to build software with AI, at least at a mediocre level?

I'm looking for open source projects that were built mostly with AI, but so far I couldn't find any bigger projects that were built with AI coding tools.

dagw1y ago

javier21y ago

When I have to be that specific with it, it would be faster for me to just write it directly in my normal IDE with great auto complete

dagw1y ago

> it would be faster for me to just write it directly in my normal IDE

If it's faster for me, it might not be faster for everybody, but it is probably faster for many people.

3 more replies

ExtremisAndy1y ago

"AI isn't great at creating software, but it is great at writing functions."

pydry1y ago

It manages simple functions but I've tried to get it to do complex ones (e.g. a parser with a bunch of edge cases) and it totally shit the bed.

For the simpler cases I think prompting still took about as long as just writing the damn thing myself if I was familiar with the language.

The coding I have found it useful for is small, self contained, well defined scripts in bash where the tedious part is reminding myself of all of the command switches and the funky syntax.

rco87861y ago

Current gen AI can spit out some very, very basic web sites (I won't even elevate to the word "app") with some handholding and poking and prodding to get it to correct its own mistakes.

It is pretty great as a ridealong pair programmer though. I've been using Cursor as my IDE and can't imagine going back to a non-AI coding experience.

wanderingbort1y ago

That attention does not map well to the important, hard, and more valuable parts of development.

Anecdotally, I still find it to be useful and it’s improving. I do think it’s going to have a huge impact in time.

Hype is part of the industry and it can be distracting to users, developers, and investors BUT it can also be useful (and I don’t know how to replace it) so, we live with it.

williamcotton1y ago

https://github.com/williamcotton/webdsl

Made almost entirely with Cursor and Claude 3.5 Sonnet.

11k lines of C and counting.

gray_-_wolf1y ago

2 more replies

thomasfromcdnjs1y ago

I used Windsurf mostly on a feature to build out user authentication and then another tool to generate the PR documentation entirely.

https://github.com/jsonresume/jsonresume.org/pull/176

Meets my good enough standards for sure

imtringued1y ago

I like the concept, but honestly I don't see myself writing an entire webapp with this.

Here is some feedback:

That's where I think the idea of a "WebDSL" would shine the most.

williamcotton1y ago

Thank you for the feedback.

I also don't see myself writing an entire webapp with this either - perhaps small sites or simple API endpoints?

I was mainly scratching an itch I've had for a couple of years. I also really like tuning C code just for the fun of it!

BeefWellington1y ago

It's funny that this is MIT licensed, expecting credit for uncopyrightable work.

1 more reply

andix1y ago

Thanks, that's exactly what I'm looking for.

javier21y ago

Same experience. It has become pretty good at writing creative SQL queries though. It's actually rather good at that.

mft_1y ago

Basically, it worked, but not without issues:

- The biggest issue was debugging: because the bugs appeared in XCode, not Cursor, it either meant laboriously describing/transcribing errors into Cursor, or manually fixing them.

- The 'parallel' work between Cursor and XCode was clunky, especially when Cursor created new files. It took a while to figure out a halfway-decent workflow.

jstummbillig1y ago

https://aider.chat/HISTORY.html

cejast1y ago

I don't think it is an illusion. It can remove a lot of barriers to entry for some people, and this is probably what you're seeing in the anecdata.

andix1y ago

There is a big gap between being able to create a somehow working application and shipping a product to a customer.

Those claims are about being able to create a profitable product with 10x efficiency.

HL33tibCe71y ago

Current-gen AI can write obvious code well, but fails at anything that involves complexity or subtlety in my experience

jgilias1y ago

But this is still huge, and shouldn’t be disregarded.

loveparade1y ago

In my experience, AI is good at building stuff in two scenarios:

For use cases in the middle, LLMs kind of suck.

bdangubic1y ago

these are exactly my experiences as well. senior devs on my team are rocking and rolling with the AI. my junior devs have all but given up using it even after numerous retros etc…

frodo8sam1y ago

For me ai has been pretty useful. Difference is I'm not a software engineer, I just write scripts to help me do my job. If I wrote bigger applications I doubt llms could help me.

andix1y ago

AI is awesome for small coding tasks, with a defined scope. I don't write shell (or powershell) scripts anymore, AI does it now for me.

Earw0rm1y ago

This. It's good at CMake, and mostly good at dealing with COM boilerplate (although hallucinations are still a problem).

dahousecat1y ago

They are useful for small tasks like refactoring a method however big the whole project is

CharlieDigital1y ago· 26 in thread

Conclusion: if you're going to use AI to code, commit to it and use AI to fix the errors as well. Use AI for every aspect of it.

[0] Yes, I'm sure there are security holes and code issues galore, but those can always be fixed later when he's proven the business model.

[1] Yes, I have told him that he should create a YT channel or stream on Twitch because the content itself is super interesting how well he's been able to use AI.

abossy1y ago

CharlieDigital1y ago

Having had an inside view of a YC startup that went from seed to C, I can tell you that code quality means a lot less than one would think when it comes to the early days of a startup.

The biggest risk to a startup is that you get the business model wrong or you don't ship code, even if the code is buggy and messy.

InvOfSmallC1y ago

So maybe he was lucky or he is using a very good LLM I'm not aware of.

CharlieDigital1y ago

arend3211y ago

This probably only works if you glue a bunch of high level, popular APIs together. It might work, but will be fragile and expensive.

dartos1y ago

> fragile and expensive

Unfortunately, that’s the most common kind of software in the saas industry anyway.

CharlieDigital1y ago

Most SaaS apps today can be done by gluing together popular APIs (e.g. Stripe, Shopify, etc.).

No better or worse than hiring cheap offshore contractors to do the same, IMO.

bendauphinee1y ago

As an experienced developer, that’s also how I use it. What I’m finding is that it generally rabbit holes as I give it new errors that its previous fix has produced.

This has been super helpful in my process of learning new things, as well as relearning things I haven’t worked with in a while.

jfengel1y ago

It's not impossible to fix later. But it's often more effective to scrap and rewrite. Hopefully your proven business model has yielded enough money for that, before someone else has pwned it.

Timber-65391y ago

One can only imagine how many corners your friend had to cut to get to the product you call finished.

CharlieDigital1y ago

He's got paying customers, all organic inbound by word of mouth only; there must be some value there.

kuschku1y ago

> word of mouth

> must be some value

https://en.wikipedia.org/wiki/Parasite_(2019_film)#Plot

skeeterbug1y ago

I wonder what this codebase will look like after a year or so of doing this.

extesy1y ago

iamflimflam11y ago

Have you seen the job market at the moment? Humans will do a lot of things to keep a roof over their heads.

jiehong1y ago

I saw that with users asking for generated VBA code: people trying to automate part of their email and Excel work.

CharlieDigital1y ago

It's possible, but his choices are 1) hire someone else, 2) just sit there and prompt again until it's fixed. Since he's bootstrapping this with < $50/mo, the choice is simple.

Also, it may be the case that the corpus of training data with VBA is not as good as it is with React these days.

plagiarist1y ago

I have tried that and the AI gets stuck just attempting whatever and writing more code that won't even compile. I have had more success trying to get it to follow steps or examples.

Maybe the language your friend is using has more examples for training, or perhaps the dynamism of some languages gets it to runtime errors that have better details it can work with.

CharlieDigital1y ago

React and JS so I think it has some benefits since 1) it has a large corpus of recent training data, 2) the browser gives pretty good errors.

arrowsmith1y ago

> integrate some browser and autonomous test loop into Cursor

Doesn't this exist yet? It's such an obvious idea I'd be astonished if no-one has done it.

CharlieDigital1y ago

They exist in separate pieces; I've not seen it integrated into one loop yet.

Code gen -> show the AI an example of how it's supposed to work -> error -> code gen -> AI tries it again by itself -> Code gen
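A minimal sketch of that loop, for illustration only. The `generate` callable is a hypothetical stand-in for whatever model or tool produces the code, and `fake_model` is a stub invented here so the example runs by itself; nothing below is taken from Cursor or any real product. Note that the loop only reacts when running the code actually raises an error.

```python
from typing import Callable, Optional


def run_python(code: str) -> Optional[str]:
    """Execute code in a fresh namespace; return the error text, or None on success."""
    try:
        exec(compile(code, "<generated>", "exec"), {})
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def fix_loop(
    generate: Callable[[str, Optional[str]], str],
    run: Callable[[str], Optional[str]],
    task: str,
    max_rounds: int = 5,
) -> Optional[str]:
    """Generate code, run it, and feed any error back until it runs cleanly.

    Returns the first version that runs without raising, or None if
    max_rounds attempts all fail.
    """
    error: Optional[str] = None
    for _ in range(max_rounds):
        code = generate(task, error)  # hypothetical model call
        error = run(code)             # None means the code ran without raising
        if error is None:
            return code
    return None


def fake_model(task: str, error: Optional[str]) -> str:
    # Stub for a real model: the first attempt is buggy, the retry is fixed.
    return "result = 1 / 0" if error is None else "result = 1"


if __name__ == "__main__":
    print(fix_loop(fake_model, run_python, "set result to 1"))
```

The `max_rounds` cap is a design choice of this sketch: without it, a model that keeps producing the same broken code would loop forever.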

imtringued1y ago

This only works if there is an error message. Do you instruct the AI to fill in the code with asserts and not implemented exceptions?

CharlieDigital1y ago

Madmallard1y ago

llamaimperative1y ago

The vast majority of code in the world is the former though

sarchertech1y ago

The last 20 years of programming tells me that this isn’t the case.

This is only the case for new projects which don’t yet have users. Add users to even the simplest project and it evolves into a special snowflake with never before seen edge cases.

That’s why low code solutions are great for prototyping but eventually always explode into a nightmare of complexity.

fcatalan1y ago· 12 in thread

almog1y ago

To this day, no LLM that I tried passed this task of leading the development while detecting the underlying structure of the data.

morsecodist1y ago

diggan1y ago

I tend to restart chats from the beginning pretty much all the time, because of this.

iamflimflam11y ago

KronisLV1y ago

renewedrebecca1y ago

But why? What is there to be gained in all of this work around the inherent limitations of this technology?

KronisLV1y ago

Exploration of what’s possible and what’s not, identifying whether the weaknesses can or cannot be addressed.

A bit like how traditional autocomplete can help streamline familiarising oneself with various libraries: a clear step ahead compared to having to dig through documentation as much.

whattheheckheck1y ago

williamcotton1y ago

joshstrange1y ago

fragmede1y ago

It's a tool not a person. When was the last time you got mad at a hammer for being smug?

sebastiennight1y ago

To drive the point home, hammers are quite smug.

Mine always thinks it nailed it on the first try, and it's pretty hard-headed when you point out mistakes.

If you can't work around those limitations, you're screwed.

alwinaugustin1y ago· 7 in thread

lubujackson1y ago

I think this comment underlines the biggest difference between people that say AI is a transformative tool and people that say it is nowhere close to working as expected.

notjoemama1y ago

jfengel1y ago

I think that by the time AI can genuinely replace software engineers, a lot else in society will change.

But you are almost certainly right that we will not be the inheritors.

fellowmartian1y ago

pavel_lishin1y ago

karamanolev1y ago

"Army of digital slaves" doesn't really sound that bad when I think about it. As long as it's your army and not your adversary's army... In what ways do you think "an army of digital slaves" is bad?

lazide1y ago

“and still achieve the same results”.

That is the part that won’t actually happen, at least pretty quickly.

zug_zug1y ago· 5 in thread

This roughly mirrors my experience so far. Mind you I'm an extremely qualified engineer who has worked at FAANG.

Also I just paste error-messages directly into the AI and it usually knows how to fix them.

I don't like that AI is a threat of huge monopolistic and job-reducing potential, but I don't think downplaying it is a long-term strategy to combat that.

skydhash1y ago

> For example, I wouldn't manually rewrite localhost, I'd tell the AI "Why is localhost everywhere? Will this work if I deploy to a droplet?" and it will fix it for you.

The solution is multi occur (emacs), quickfix list (vim), or any editors that have whole project find and replace.

MaKey1y ago

Which will also be much faster because you don't have to worry about sanitizing your code before sending it to an LLM or that the LLM made a mistake somewhere along the way.
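For anyone without multi-occur or a quickfix list at hand, the whole-project search mentioned above can be sketched in a few lines. This is an illustrative script, not what any of those editors do internally; the function name `find_occurrences` and the default needle are made up for the example.

```python
from pathlib import Path
from typing import Iterator, Tuple


def find_occurrences(root: str, needle: str = "localhost") -> Iterator[Tuple[Path, int, str]]:
    """Yield (path, line_number, line) for every line under root containing needle."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                yield path, number, line.strip()


if __name__ == "__main__":
    # List every hard-coded "localhost" under the current directory.
    for path, number, line in find_occurrences("."):
        print(f"{path}:{number}: {line}")
```

It only lists the hits; deciding what each one should become (an environment variable, a config value) is still up to whoever reads the list.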

GeoAtreides1y ago

> I'm an extremely qualified engineer who has worked at FAANG.

> I just paste error-messages directly into the AI

...

scarface_741y ago

I find it funny that commenters on HN actually think their having past or current experience working at a FAANG is some sort of signal for two reasons.

On HN especially, that’s really nothing novel, many of us have (including me) and the only thing that it takes to get into one as a software engineer is memorizing the solution to coding problems.

When I’m hiring - mostly for green field initiatives - coming from BigTech is usually a negative signal for me.

zug_zug1y ago

I'm not sure what your point is here...

Where the author went wrong in this post is that he tried to interpret an error ("I was asking claude to solve the wrong problem"), was wrong, and then wasted a lot of his own time.

BoredPositron1y ago· 5 in thread

Can anyone explain why everyone is so hyper focused on speed? 500 images per second, 100 minutes of video in 30 minutes, a thousand lines of code per hour. Who is going to consume all that?

martin-t1y ago

Most of what generative models produce is shit so they have to produce a lot in hopes _some_ of it is OK-ish.

It's like an edit-compile-run cycle which you also need to be fast or you lose attention.

I was tempted to say it's another _step_ in the edit-compile-run but often the code is so bad I don't even bother compiling.

deergomoo1y ago

geor9e1y ago

osmsucks1y ago

Other machines.

JTyQZSnP3cQGa8B1y ago

stuaxo1y ago· 4 in thread

Oh look, a load of future work to fix these.

Why is this just like the last cost cutting exercise where the cheapest people in India produced a lot of "interesting" code.

SunlitCat1y ago

Way back in the 2000 (or even before that, can't remember!) I wanted to get into winsock programming. I found a page where someone from India explained that with examples.

The variables, functions and so on had names like:

a aa aaa b bb bbb

It helped me to grasp the basic concept, but was kinda hard to follow, tho. :D

nailer1y ago

You can update requirements, educate developers, and fix bad code with an LLM many orders of magnitude faster than you can with Wipro.

llm_trw1y ago

Because ignoring a heroic effort from all the women in India the number of Indian developers does not double every 4 years.

The number of flops a gpu can output on the other hand does.

scarface_741y ago

See also: almost every bespoke internal app written in FoxPro, VB, Excel with VBScript, etc

owenthejumper1y ago· 4 in thread

I have had great experience with Claude for coding, but you really need to be a programmer yourself, to be able to divide the problems into manageable chunks.

bboygravity1y ago

Same here, I really don't get all the "it's totally useless for programming" posts on here.

It makes me think many people haven't taken the time to actually learn to use the tool.

It just feels like they tried Copilot or ChatGPT for 5 minutes last year and concluded that all LLM's are useless and will be useless forever.

It makes me wonder if those people know that Claude 3.5 sonnet projects and/or Cursor with Claude exist?

chillingeffect1y ago

MaKey1y ago

> Do they not appreciate some help to document their code?

> Do they never need to write or quickly understand scripts or code in one of the 100's of languages/stacks they're not too familiar with that they might encounter in the wild?

I'd rather take the time to do it myself because if I'm not familiar with a language/stack I won't be able to spot mistakes made by the LLM as easily.

> How to get out of yet another git mess?

Learn to solve the git issue and apply the knowledge in the future so you don't rely on yet another tool.

> Build a proof of concept in an hour that would've taken you days?

I question the premise.

> A refresher on how to set up x toolchain to get started asap (the nr 1 hardest thing in programming :p) etc etc.

bboygravity1y ago

What you're basically saying here is: you should just learn more and know more faster.

And what I'm saying is: that's exactly what LLM's are super useful for.

sega_sai1y ago· 4 in thread

deergomoo1y ago

> I.e. in cases you know what code you want to write, but it's tedious, LLMs can do for you

deadbabe1y ago

Experts who know what they are doing have long had alternatives beyond LLMs to make their work faster.

They have open source libraries, stack overflow, tutorials, documentation, simple code generator tools and snippets.

The speed up we’re seeing is from LLMs basically caching all those things into a huge mathematical model and retrieving information in summarized form ready for consumption.

And while speed is always nice, LLMs are expensive, require maintenance themselves to maintain relevant context, are still error prone, and terrible at true innovation.

In a few years we’ll be talking about the big “AI crash” and “what went wrong” when it has been obvious to experts all along. Winter is coming.

sega_sai1y ago

deadbabe1y ago

Do you have any examples of a question you could ask to an AI right now that you couldn’t find from a basic search on stack overflow and Google? Didn’t think so.

fredgrott1y ago· 2 in thread

the real revolution will be when an AI tool can just be powered by our laptop to use our own codebase as the input....

Until then it's just nonsense pretending to be something else...

fragmede1y ago

So, an M3 MacBook with 64GiB of RAM running Deepseek R1 Zero in ollama prompted via aider?

esafak1y ago

Coding assistants do use your code base.

emporas1y ago· 1 in thread

> LLMs are useless if you don’t understand the context

> AI can be worse than useless when you don't understand the underlying technologies

I made a saying about this some weeks ago: "A.I. can make the road for you, but you have to know where you are going". In Greek it sounds a little bit better.

fenomas1y ago

I put it: "copilot doesn't save me much thinking, but it saves a ton of typing".

yodsanklai1y ago· 1 in thread

Maybe AI will shine when working with strongly typed languages. Most errors can be caught at compile time avoiding debugging hell.

bandrami1y ago

There's not enough of a corpus out there for the LLMs to snarf up

HL33tibCe71y ago

This. Great, AI can produce code. But it produces code without inducing understanding of the code in the person who wrote (or rather supervised the production of) it, which is half the point.

At some point AI will probably be good enough that this won’t matter. But it feels like we’re still a long way off that.

cduzz1y ago

Programs are communication between 2 loosely coupled audiences -- the humans who have to maintain / modify the code and the computer that gets to run the code.

Human language, used to convey ideas to other humans, is imprecise. It's fine that it's imprecise because the media (humans) have both good error correction and a reasonable set of global defaults.

Computer languages require enormous precision because they're some mechanical translation to a set of machine code runtime.

That said -- there are situations where machine generated code works -- I think it's been a long time since anyone manually drew masks for etching dies when making CPUs.

jbirer1y ago

keyle1y ago

If you don't know why it works when it works, you won't know why it doesn't work when it doesn't work.

The key issues here were staying on top of the AI's help.

Use AI wisely: as an assistant, not as a drunken lead developer.

rchowe1y ago

morsecodist1y ago

cushychicken1y ago

One thing I think would have helped the author: write a spec first.

Seriously. It seems stupid. But AI works a lot better with a written spec.

The incredible thing is that the AI can actually be an excellent resource for writing the spec. And it will actually produce better code when you feed the spec back into said AI!

But, on the bright side, it’s just as good at design as code if you ask the right questions!

I say this having used 4 and 4o extensively in this manner. Just started using sonnet3.5 in this way in the last month or so, and it is amazing at this.

n_ary1y ago

Once the quality of training data improves (somehow getting access to high quality codebases behind corporate walls by promoting these assistants and ingesting the codebases), the output improves.

There was a popular saying, garbage in garbage out.

siva71y ago

ibloomt1y ago

Heh, after decades of functional programmers being the "well, actually..." crowd at every conference, turns out they were right all along. Just for the wrong reasons!

The pitch:

- AI generates tons of plausible-looking garbage

- Static types catch garbage at compile time

- OCaml/F#/Haskell fans quietly sipping tea in the corner

The irony? We spent years debating static vs dynamic typing for human developers. But the killer use case may end up being catching AI hallucinations.

Finally, a business case for monads that doesn't require a PhD!

Time to dust off those Haskell books. Who knew safety could be so profitable? Plot twist: Category theory becomes a required interview question by 2025

peterkelly1y ago

avidphantasm1y ago

iamflimflam11y ago

It’s not mentioned anywhere in the post. But would be good to hear what the total time was including all the problems.

SunlitCat1y ago

Well, still trying to get into nvrhi, I went on to ask ChatGPT to write me an example program using it.

To make it short, it got better when I made a project, uploaded the headers and docs of it as project files and moved my chat into that project as well.

That said, AI can help you but needs a lot of support from you to do things somewhat right.

senko1y ago

Content marketing for a new text editor thinly disguised as AI rage-bait.

HN fell for it hard - 156 points, 180 comments (as of this writing).

Well done Nick! :) And congrats on launching Codescribble! Hope to see a "how my post on AI grew my userbase" followup in a few weeks!

3ptow1y ago

If you drop all pretenses and use a photocopier to steal code directly instead of performing an elaborate laundering step, you will not have these issues.

wlindley1y ago

Garbage in, garbage out. Code spewed by a random generator that has not the slightest understanding of what it is doing, whacked at by a hammer until it seems to be working.

What is this supposed to produce other than a mass of bugs and vulnerabilities? "A.I." is utter garbage and always will be, it is foolish to think otherwise.

nowittyusername1y ago

joshstrange1y ago

yapyap1y ago

Never believe the snake oil sellers

macNchz1y ago

3. Don’t be afraid to undo everything it just did and re-prompt.

5. Suggest the data structures and algorithms you want it to use. Design the software intentionally yourself. Tell it to make a module that does X with three classes that do A, B and C.

8. Mix in other models. I’ve often had Claude back itself into a corner, only to loop in o1 via Aider’s architect mode and have it figure out the issue and tell Claude how to fix it.
