0 points · doubled112 · 21d ago · 0 comments

> mixed-technique approach

I think my biggest annoyance with the way we rolled out AI is that nobody seemed to want to use it to augment already working solutions.

Just throw everything out and have an LLM do it instead.

36 comments · 10 top-level

NateEag21d ago· 11 in thread

I recently saw a Claude skill that used Claude, with no tools, as a spell checker.

I wanted to hurl my laptop out the window.

mattkrause21d ago

Isn't this pretty much why language models were invented?

Pasting something directly into the chat interface seems weird, but if you could somehow just see where P(token | context) falls off a cliff, that's a pretty good hint that your writing has a problem.
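A minimal sketch of the "falls off a cliff" idea, assuming a toy bigram model with add-one smoothing as a stand-in for a real language model. All names here (`train_bigram`, `surprisal`, `flag_cliffs`) and the threshold are hypothetical, chosen only for illustration; a real setup would read per-token log-probabilities from an actual model.

```python
import math
from collections import Counter

def train_bigram(corpus_tokens):
    """Count unigrams and adjacent-token pairs in a list of tokens."""
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    return unigrams, bigrams

def surprisal(tokens, unigrams, bigrams, vocab_size):
    """Return (token, -log2 P(token | previous token)) for each token after the first.

    Add-one smoothing gives unseen pairs a small nonzero probability.
    """
    scores = []
    for prev, tok in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, tok)] + 1) / (unigrams[prev] + vocab_size)
        scores.append((tok, -math.log2(p)))
    return scores

def flag_cliffs(scores, threshold):
    """Return the tokens whose surprisal exceeds the threshold."""
    return [tok for tok, s in scores if s > threshold]

corpus = "the cat sat on the mat the dog sat on the rug".split()
unigrams, bigrams = train_bigram(corpus)
scores = surprisal("the cat sat on teh mat".split(), unigrams, bigrams, len(set(corpus)))
print(flag_cliffs(scores, threshold=3.0))  # prints ['teh']
```

The threshold is arbitrary here; with a real model it would likely need tuning per domain, since rare but correct words also produce high surprisal.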

lambda21d ago

Yeah, but for this use case you don't need Claude. You probably want a tuned lightweight small model that can run locally.

Even Haiku is massive overkill for this use case.

kangalioo21d ago

What would be a better way to incorporate AI as a spell checker?

In comparison to non-AI traditional tools, AI has the advantage of "understanding" the text, reducing the number of "stupid" mis-corrections. And its spelling correctness is usually already impeccable, so what is there to gain by interfacing it with traditional solutions, and how can it be achieved?

Hizonner21d ago

> What would be a better way to incorporate AI as a spell checker?

Don't do a stupid thing like that in the first place.

> In comparison to non-AI traditional tools, AI has the advantage of "understanding" the text, reducing the number of "stupid" mis-corrections.

I doubt it, but if that's true, run a normal spell checker, and then give the output to your LLM to filter.

> what is there to gain by interfacing it with traditional solutions,

About a billionfold improvement in compute efficiency, and a lower error rate.

> and how can it be achieved?

10 seconds of actual thought.
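The "run a normal spell checker, then give the output to your LLM to filter" suggestion could be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary is a plain set of words, and `fake_llm_filter` is a hypothetical stand-in for a model call, not any real API.

```python
import re
from typing import Callable

def find_suspects(text: str, dictionary: set[str]) -> list[str]:
    """Traditional pass: every word missing from the dictionary is a suspect."""
    words = re.findall(r"[A-Za-z']+", text)
    return [w for w in words if w.lower() not in dictionary]

def spell_check(text: str, dictionary: set[str],
                keep: Callable[[str, str], bool]) -> list[str]:
    """Run the cheap dictionary pass first, then ask `keep` only about flagged words."""
    return [w for w in find_suspects(text, dictionary) if keep(w, text)]

def fake_llm_filter(word: str, context: str) -> bool:
    """Hypothetical stand-in for an LLM: treat capitalized words as names, not typos."""
    return not word[0].isupper()

dictionary = {"i", "asked", "to", "fix", "the", "spelling"}
text = "I asked Hizonner to fix teh speling"
print(spell_check(text, dictionary, fake_llm_filter))  # prints ['teh', 'speling']
```

The point of the ordering is that the expensive filter only ever sees the handful of words the dictionary pass flagged, rather than the whole document.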

trollbridge21d ago

AI can’t really spell check without risking changing the meaning of sentences. Spell checking was a solved problem before this.

jlarocco21d ago

>What would be a better way to incorporate AI as a spell checker?

You just don't need AI to do spell checking. It's a waste of energy, bandwidth and tokens. It's like Java Enterprise Fizz-Buzz - 1000x more complicated than it needs to be and complete overkill.

But at least you can tell your manager you're using AI!

saynay21d ago

AI certainly is the shiny new hammer, and it is tempting to see the world as nails.

Traditional methods might not be perfect, but they also easily fit in the memory of even low power devices. Perhaps it isn't a problem worth burning a dollar of tokens for every spelling mistake.

sarchertech21d ago

The fact that it produces correctly spelled words says nothing about its ability to find spelling mistakes or to correct them without errors like completely changing the word.

ceejayoz21d ago

I am skeptical that AI brings any benefit to spell checking at all.

gedy21d ago

I swear that so many AI use cases I see are: "I did not have the skill or realize that you can write a program for this obvious logic".

I guess that works if you aren't a programmer or don't want to hire somebody, but then wtf would I pay for your service or product?

julianlam21d ago

This type of laziness isn't novel.

Check out left pad or the two dozen other "utility" packages that could be done in a single line of code.

neutronicus21d ago· 10 in thread

I've been frustrated with Copilot in this regard.

I work on a large C++ codebase, with large files. Human developers jump around between files with the Visual Studio fuzzy search, set breakpoints to trace execution in the Debugger, use the IDE's refactoring tools.

Microsoft's answer to this was to just ... expose none of this to their Agent Mode!? Replace the working semantic autocomplete with fucking lies!?

Maybe it's changed, I haven't been paying that much attention after bouncing off of this. I've gotten mild acceleration from using gptel-mode in emacs, manually adding references to context, and having models do various mechanical transformations on code. And I've even had some limited success writing tools for it to do LSP lookups.

xnorswap21d ago

It frustrates me too, it really feels like the next breakthrough will be when someone gets agents working "natively" with LSP on large code-bases.

Anthropic added LSP support to claude-code, but the current implementation is worse than useless, because any changes aren't reflected fast enough, so it's constantly working on outdated views / compilation caches, and it gets in a right muddle between its "internal" state / understanding in context, the real-world file, and the LSP.

If it could just leverage LSP to apply refactorings it would be amazing, but it feels like the LSP can't keep up, and I don't know if that's an LSP problem or a claude problem.

So we binned the LSP plugin and we're back to watching a machine find/replace, because while waiting on that is slower than LSP, it's an "Action => Wait" which the tooling understands, while LSP is "Possibly Wait for LSP to catch up => Action" which it doesn't understand nearly as well.

I suspect the LSP plugins also need better skills that pair with them so it reaches for them more often.

It hurts my soul to see it reach for find/replace to rename a class, complete with mistakes made in complex solutions where you might have name clashes in different namespaces. Something the LSP handles without problem, but can trip up an LLM.

bee_rider21d ago

I wonder, is the problem here that LSP is updating too slow all the time? Or just that there’s a chance it will update very slow, and you never really know if you’ll hit that chance, so your model always has to do the “long time wait” just in case? It seems like it ought to be possible for LSP to report that it is still processing, in the latter case, somehow…

rurban21d ago

Oh-my-pi works nicely with LSP, better than the others.

hamburglar21d ago

I work in Unity and I got so frustrated with Claude constantly doing gross bash/grep/awk/sed nested loops that took forever that I finally described (and had Claude implement and install) a tool that could gather all this info from a Unity forest of scenes at once and answer all the questions Claude ever wanted to ask about a Unity project in a single pass that takes 50ms instead of ten 30-second iterations. It still took a lot of coaching to get it to actually use this tool, but it seems like I’ve convinced it.

wincy21d ago

Haha yep I’m experimenting with Unreal engine and Codex and it spent 10 minutes while I was AFK confidently trying to build a scene. I load it up and fall through the world. I say “can’t you write a tool to screenshot so you know you’ve done a reasonable job correctly?” and now it does that.

It reminds me of working with a junior dev and he was pushing his code to dev, then waiting for it to build for every update because he couldn’t get it to build locally. 5 minutes of my time fixing his config surely saved him hours over the project. He wasn’t a bad dev either!

You have to do a lot of the meta thinking for the agents, because they’ll take an “everything looks like a nail if you have a hammer” approach with their toolkit.

Writing an entire local generated asset pipeline using flux and hunyuan3D-2.1 was a really fun experience. I’ve done software for years but never game dev and it’s just so much fun even if it’s junky little games to impress my kids and get them involved in the creative process.

evntdrvn21d ago

if it helps, I've found that using context (Claude.md etc) is way less effective for this type of pattern compared to using PreToolHook to capture "bad patterns" and either transparently rewriting them to "do the right thing" if that is possible statically, or if not then rejecting the tool use with a message that tells the agent "how" to use the intended tooling itself.
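A rough sketch of the rewrite-or-reject pattern described above, written as a plain Python function rather than against any particular agent's hook API. The `Decision` type, the `pre_tool_hook` name, and the `unity-query` command are all hypothetical, invented here for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Decision:
    action: str                     # "allow", "rewrite", or "reject"
    command: Optional[str] = None   # command to run for "allow" / "rewrite"
    message: Optional[str] = None   # guidance for the agent on "reject"

def pre_tool_hook(command: str) -> Decision:
    """Inspect a shell command before the agent runs it."""
    # Statically rewritable: a plain recursive grep maps onto the preferred tool.
    m = re.fullmatch(r"grep -r (\S+) (\S+)", command)
    if m:
        return Decision("rewrite", command=f"unity-query find {m.group(1)}")
    # Not safely rewritable: reject and tell the agent how to do it instead.
    if re.search(r"\b(awk|sed)\b", command):
        return Decision(
            "reject",
            message="Do not parse scene files with awk/sed; run `unity-query` instead.",
        )
    return Decision("allow", command=command)

print(pre_tool_hook("grep -r Player Assets").command)  # prints unity-query find Player
```

Compared with instructions in a context file, a hook like this does not depend on the model remembering the rule, since the bad pattern is intercepted at the point of use.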

murphyslaw21d ago

Shouldn't it be possible to simply state in the contract to use that tool only? I've had good success with that in my coding.

vablings21d ago

tool_call is just a fancy wrapper to a black box that executes console commands. Said commands are now the actual backbone of all agentic AI. It feels like the Linux people are incredibly vindicated in the single responsibility principle

wincy21d ago

Codex did take control of chrome to run a skill I’d given it for a website without an API the other day. It can do it but it’s excruciatingly slow compared to the tool calls for sure.

selcuka21d ago

My pet peeve: Whenever I type

    MyModel.obj

and wait when working on a Django project, Copilot completes it with

    MyModel.objects.all().delete()

peteforde21d ago· 2 in thread

Hey man, speak for yourself.

It's never occurred to me to even try getting an LLM to design or lay out a circuit for me.

Instead, I have dozens or hundreds of chats in my history where I debate the merits of different parts for different tasks and scenarios, the nuances of decoupling strategies (package size vs deregulation), work out resistor network ratios from the reels I have on hand.

Then being able to feed an LLM a datasheet and have it write a custom driver against the registers I need so that it does exactly what I want without the cognitive overhead of a buggy package with someone else's strong opinions about how a part should be used is amazing.

Frontier models are incredibly good at electronics, and it's got nothing to do with what happens inside the EDA.
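As one small illustration of the "resistor network ratios from the reels I have on hand" kind of question, here is a sketch that brute-forces the best two-resistor voltage divider from a list of available values. The function name and the example values are hypothetical, and it ignores tolerance and load, which a real design would need to consider.

```python
from itertools import product

def best_divider(values_ohms, v_in, v_target):
    """Return (r_top, r_bottom, v_out) with v_out closest to v_target.

    Uses the unloaded divider formula v_out = v_in * r_bottom / (r_top + r_bottom).
    """
    best = None
    for r_top, r_bottom in product(values_ohms, repeat=2):
        v_out = v_in * r_bottom / (r_top + r_bottom)
        if best is None or abs(v_out - v_target) < abs(best[2] - v_target):
            best = (r_top, r_bottom, v_out)
    return best

r_top, r_bottom, v_out = best_divider([1000, 2200, 4700, 10000], v_in=5.0, v_target=3.3)
print(r_top, r_bottom, round(v_out, 3))  # prints 4700 10000 3.401
```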

nubinetwork21d ago

Design, no... but I've definitely thought about letting one route traces... while autorouters work, I was hoping Claude could do matched traces better. At the time, it didn't want to generate the kicad pcbnew file though. /shrug

peteforde21d ago

Everyone is different, but board layout is one area where I aggressively don't want any LLM input until such a time as it is as good at board layout as it is at refactoring code.

We're still a ways off from that, and that's likely because board layout requires a much more nuanced perspective of the enclosure shape, power requirements, heat dissipation, RF...

It's really not about placing ICs with caps nearby. I actually really enjoy that part anyhow. That's the fun part!

ahartmetz21d ago· 1 in thread

Something something bitter lesson blah blah

I think the bitter lesson is severely misapplied in the current situation: If progress from "just add more resources" is very slow, and a huge amount of money is at stake, continuous work on hand-engineering can give a continuous and very valuable competitive advantage.

The labs all seem to be going for AGI through bigger LLMs, and I am reasonably sure that it's not going to happen like that.

irthomasthomas21d ago

> The labs all seem to be going for AGI through bigger LLMs

I don't know if this is still the case. Labs like Anthropic and OpenAI are spending a huge amount of their time on custom model wrappers, something they used to leave to their customers.

PyWoody21d ago· 1 in thread

A few days ago someone on HN commented that a teammate uses Claude to search for text in files on their own computer. Buddy... There's "Command-line Tools Can Be 235x Faster Than Your Hadoop Cluster" and then there's "Command-line Tools Can Be ∞ Faster Than Your AI".

dylan60421d ago

As snark, I've been using the phrase "ask GPT about it" for things that clearly do not need an LLM to be involved. The other day, I was on a zoom call and said it, only to see the presenter actually doing it. I hope my unmuted laugh wasn't too distracting.

ACCount3721d ago· 1 in thread

Way too much engineering effort to make something that might get leapfrogged by the next gen LLM.

It's a tantalizing thing, but far too treacherous to actually go for it, most of the time.

intrasight21d ago

There are many domains where a hybrid of numeric and AI approaches would make sense. For example in those domains where there's already a rich practice of numeric tools such as with IC layout.

ajross21d ago

> nobody [wants to use AI] to augment already working solutions

Plenty of people do, but that only produces a blog post that will get you to the front page of HN. If you want VCs to drop $40M on your head, you need to pretend to reinvent the world.

Then, to further appease the rain gods, you need to sue the bloggers on the front page of HN who are challenging your world-changing narrative. Which will, heh, drop you on the front page of HN.

Our community is, literally, eating itself at this point. There was a time when we actually took "make something people want" literally. Now it's just part of the fiction.

gmueckl21d ago

If it's any consolation: once we've burnt the last crumb of coal, the last drop of oil and last bit of natural gas to fuel the AI overlords, that particular problem will take care of itself.

ChrisMarshallNY21d ago

> augment already working solutions

That's exactly how I use it, but I'm just a geezer on his own, writing free software for people that can't pay for it.

jlarocco21d ago

Annoying, but not surprising.

The future is using AI to do everything, and nobody gets funded saying they're taking a small step forward.
