0 points · meh8881 · 3y ago · 0 comments

LLM stands for Large Language Model. Not Language Learning Model.

It absolutely categorizes those patterns as you can ask it for descriptions of things or request things via descriptions. It does identify the importance to humans. Because all of those things are also just patterns about patterns.

Everything an LLM does may be implicit, but it takes a tiny effort to engineer that implicit capability into something explicit.

3 comments · 1 top-level

thomastjeffery · 3y ago · 2 in thread

> LLM stands for Large Language Model. Not Language Learning Model.

I'm not sure how I missed that. Good to know. My point still stands: it is the text being modeled, not language.

> It absolutely categorizes those patterns as you can ask it for descriptions of things or request things via descriptions

When you "ask it for descriptions", you are presenting a text prompt (that contains your question) to be merged into the LLM's model. After merging, the LLM presents a continuation. The categories you are talking about are patterns of text. Those patterns are part of the model, and the LLM doesn't know them any more than it knows any other pattern.

The categories I am talking about would be higher-level: a metadata structure of explicitly named patterns. That does not exist. LLMs don't have any notion of "what" a specific text pattern is, or why humans care.

> It does identify the importance to humans.

Again: not it. Humans make that identification. The identification itself, as written in text somewhere in the training corpus, is made available as a pattern to the LLM. The answer you get is a continuation from the content of the model, not some objective thought that the LLM is doing.

> Everything an LLM does may be implicit, but it takes a tiny effort to engineer that implicit capability into something explicit.

Yes, and that engineering is a collection of human intentions. This changes nothing about the distinction I am making between an LLM's behavior and the human behavior that was encoded into the text it is modeling.

Most of the engineering is to construct the training corpus itself. Adding "weights" is effectively the same, except the model is being edited directly. None of these efforts change the fundamental structure and behavior that the LLM itself is made of.

meh8881 (OP) · 3y ago

So you agree that the LLM has patterns

And you agree that all those things are patterns

But you are arguing that they’re not… doing the patterns explicitly enough?

Please. The bar you’re describing is already something we can half-ass and something we will be able to do explicitly very soon. The LLM can write code. If we just let it delegate the execution of that code to something that isn’t an LLM, you can generate your explicit ontologies very easily. imo this is so obviously near term it’s uninteresting to fuss over it not being here in the present.

thomastjeffery · 3y ago

I'm talking about problem domain and approach.

"Language" is a subdomain of "written text", which is a subdomain of "all possible text permutations". LLMs model "written text".

Approach is either explicit or implicit. Constructed or inferred.

You may be familiar with parsing: that's the explicit approach. Parsers are made out of predetermined grammar patterns. They categorize text into a model (AST) that was predetermined by the grammar rules. This approach is essentially a function from "all possible permutations of text" to "a known language model". It also maps "written language" to "predictable machine instructions".

Parsing works for "context-free" (code) grammars, because the patterns of grammar are already known. Parsing fails at "context-dependent" (natural language) grammars because the patterns of grammar are ambiguous.
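To make the explicit approach concrete, here is a minimal sketch of a parser for a toy arithmetic grammar. The grammar, the tuple-shaped AST, and the names `tokenize` and `parse` are all assumptions chosen for illustration; the point is only that every category the parser can produce was written down in advance.

```python
import re

# Hypothetical toy grammar, fixed before any text is seen:
#   expr   := term (("+" | "-") term)*
#   term   := factor (("*" | "/") factor)*
#   factor := NUMBER | "(" expr ")"
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split text into number and operator tokens; reject anything else."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Map text to a nested-tuple AST, or raise SyntaxError."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        token = advance()
        if token is not None and token.isdigit():
            return int(token)
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            node = (advance(), node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            node = (advance(), node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree
```

Calling `parse("1 + 2 * 3")` yields a tree whose shape is dictated by the grammar rules, while text outside the grammar, such as `parse("1 + foo")`, is rejected with a `SyntaxError` rather than modeled.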

LLMs take the implicit approach: they start completely blind, and model every pattern they can find in the text. An LLM has no category for "language grammar pattern". It does not constrain itself to the domain of "language". This approach maps "the behavior of someone writing text" to "patterns".

The difference in problem domain introduces ambiguity: LLMs can't distinguish truth from lies.

The difference in approach removes intentional behavior: LLMs don't translate text into predictable machine behavior. LLMs model the patterns of behavior from the training corpus text, then model some more from a prompt, then show the resulting pattern.
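As a rough illustration of the implicit approach, here is a toy word-bigram model. It is not an LLM and is far simpler than one; the names `train` and `continue_text` and the sample corpus are assumptions made up for this sketch. It only shows the shape of the idea: count whatever patterns the text contains, then continue a prompt from those counts, with no named categories anywhere. Unlike the description above, the prompt here only selects a starting context and does not update the model.

```python
import random
from collections import Counter, defaultdict


def train(corpus):
    """Count which word follows which. No grammar rules, no categories."""
    follows = defaultdict(Counter)
    words = corpus.split()
    for current, following in zip(words, words[1:]):
        follows[current][following] += 1
    return follows


def continue_text(follows, prompt, length=5, seed=0):
    """Extend the prompt by sampling from the observed follow-up counts."""
    rng = random.Random(seed)
    words = prompt.split()
    if not words:
        return ""
    for _ in range(length):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the last word was never seen with a successor
        choices, counts = zip(*candidates.items())
        words.append(rng.choices(choices, weights=counts)[0])
    return " ".join(words)


# Hypothetical corpus containing one true and one false statement.
corpus = "the sky is blue . the sky is green ."
model = train(corpus)
print(model["is"])
print(continue_text(model, "the sky", length=2))
```

The counts stored under `model["is"]` give "blue" and "green" equal weight: the model records that both continuations occurred, and nothing in it marks one as true and the other as false.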

What an LLM does should not be expected to ever match what a parser does: they are completely different approaches working in completely different domains.
