
0 points · fc417fc802 · 1mo ago · 0 comments

Rephrase that in terms of the human mechanic and hopefully you can see the error of that reasoning. LLMs that perform tasks (as opposed to merely holding conversations) use tools just like we do. That's literally how we design them to operate.

In fact, the LLMs that everyone uses today typically have access to specialized, task-specific tooling. Obviously specialized tools aren't appropriate for a test that measures the ability to generalize, but generic tools are par for the course. Writing a bot to play a game for you would certainly serve to demonstrate an understanding of the task.


UltraSane 1mo ago

I'm pretty sure the LLM can use tools while doing arc-agi-3, but it has to have the same tools available all the time, not an incredibly elaborate custom harness.

fc417fc802 (OP) 1mo ago

To quote someone else from upthread, tool use requires a harness. Without one, an LLM as commonly understood is a bare model that receives inputs and directly produces outputs, the same as talking to an unaided person.

UltraSane 1mo ago

Then the LLM has to write the harness.

fc417fc802 (OP) 1mo ago

I'd like to suggest that prior to expressing disagreement you really ought to reread the comment you're replying to and make sure your understanding is correct.

Quoting this for the second time now - tool use requires a harness.

Without a harness, the LLM has no ability to interact with the world. It has no agency. It's just spitting out text (or whatever else) into the void. There are no programming tools, no filesystem, no shell, nothing.
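
To make the distinction concrete, here is a rough sketch of what a harness does. Everything in it is hypothetical and made up for illustration: `bare_model` is a canned stand-in for an LLM (text in, text out), the JSON tool-request format is an assumed convention rather than any real API, and `TOOLS` holds a single toy tool. The point is only that the model emits text, and some outer loop has to parse that text, run the tool, and feed the result back.

```python
import json


def bare_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: receives text, returns text, nothing else."""
    if "TOOL_RESULT" in prompt:
        return "The answer is 4."
    # The model cannot run anything itself. It can only *ask* for a tool
    # by emitting text in a format the harness has agreed to recognize.
    return json.dumps({"tool": "add", "args": [2, 2]})


# The tools live in the harness, not in the model.
TOOLS = {"add": lambda a, b: a + b}


def run_with_harness(prompt: str, max_steps: int = 5) -> str:
    """Call the model, execute any tool request it emits, feed the result back."""
    output = ""
    for _ in range(max_steps):
        output = bare_model(prompt)
        try:
            request = json.loads(output)
        except json.JSONDecodeError:
            return output  # plain text, so treat it as the final answer
        if not isinstance(request, dict) or request.get("tool") not in TOOLS:
            return output  # not a recognized tool request
        result = TOOLS[request["tool"]](*request["args"])
        prompt += f"\nTOOL_RESULT: {result}"
    return output
```

Calling `bare_model("What is 2 + 2?")` on its own just yields the JSON request string, which does nothing unless something executes it. Calling `run_with_harness("What is 2 + 2?")` runs the tool and lets the model finish with a final answer.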

