LLMs simply process the input and generate outputs based on patterns seen during training.
Here's the process in brief:
Tokenization: The input text gets broken down into smaller chunks, or tokens. Tokens can range from a single character to a whole word.
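To make this concrete, here is a toy sketch of tokenization. It is not a real tokenizer (production models use learned schemes such as BPE); the vocabulary and the greedy longest-match rule here are invented purely for illustration:

```python
# Toy tokenizer: greedily match the longest known subword piece.
# VOCAB is a made-up vocabulary, not one from any real model.
VOCAB = ["un", "believ", "able", "token", "iz", "ation", " ", "!"]

def toy_tokenize(text):
    """Split text into pieces via greedy longest-match against VOCAB."""
    tokens = []
    i = 0
    while i < len(text):
        for piece in sorted(VOCAB, key=len, reverse=True):
            if text.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:
            tokens.append(text[i])  # unknown input falls back to characters
            i += 1
    return tokens

print(toy_tokenize("unbelievable tokenization!"))
# ['un', 'believ', 'able', ' ', 'token', 'iz', 'ation', '!']
```

Note how "unbelievable" splits into three subword pieces rather than one token per word or character; real tokenizers make the same kind of trade-off.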
Embedding: Tokens get translated into numerical vectors - this is how models can process them.
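The embedding step amounts to a lookup table from tokens to vectors. In this sketch the vectors are random and tiny (4 numbers); in a real model they have hundreds or thousands of dimensions and are learned during training:

```python
import random

# Toy embedding table: every token in a small vocabulary maps to a
# fixed-length vector. The values here are random placeholders; a
# trained model learns them so that related tokens get similar vectors.
random.seed(0)
DIM = 4
vocab = ["the", "cat", "sat"]
embedding = {tok: [random.uniform(-1, 1) for _ in range(DIM)] for tok in vocab}

def embed(tokens):
    """Look up the vector for each token."""
    return [embedding[t] for t in tokens]

vectors = embed(["the", "cat", "sat"])
print(len(vectors), len(vectors[0]))  # 3 tokens, 4 numbers each
```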
Processing: Each vector is then processed in the context of all the others. This is done via a type of neural network called a Transformer[0] network, which handles context particularly well.
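The core operation that lets a Transformer relate each token to the others is attention. The following is a minimal sketch of scaled dot-product attention for a single query vector, with hand-picked toy numbers (a real model has many attention heads and learned projection matrices):

```python
import math

def softmax(xs):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query:
    weight each value by how well its key matches the query."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Output is a weighted average of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# One token "attends" to three context tokens (toy numbers):
q = [1.0, 0.0]
ks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
vs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out = attention(q, ks, vs)
print(out)
```

The output lands between the value vectors, pulled toward the ones whose keys align with the query; stacking this operation in layers is how the network mixes context into every token's representation.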
Context Understanding: The model uses patterns learned from its training to predict the next word in a sentence. This isn't human-like understanding; the model estimates the statistical probability of a word following the preceding ones.
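"Statistical probability of a word following the preceding ones" can be shown with a deliberately crude stand-in: bigram counts from a tiny corpus. A real LLM conditions on the whole context with billions of parameters, but the end product is the same kind of object, a probability for each candidate next token:

```python
from collections import Counter

# Count which words follow "cat" in a tiny corpus, then normalize
# the counts into probabilities. This is a toy stand-in for what an
# LLM's final layer produces: a distribution over next tokens.
corpus = "the cat sat on the mat the cat ran".split()
following = Counter(corpus[i + 1] for i in range(len(corpus) - 1)
                    if corpus[i] == "cat")
total = sum(following.values())
probs = {word: count / total for word, count in following.items()}
print(probs)  # {'sat': 0.5, 'ran': 0.5}
```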
Generation: The model generates a response by continuously predicting the next word until a full response is formed or it reaches a certain limit.
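The steps above come together in a generation loop: predict, append, repeat, until a stop token appears or a length limit is hit. Here the "model" is just a hand-written lookup table standing in for a real LLM's learned next-token distribution:

```python
# Toy next-token "model": a fixed lookup table, not a real LLM.
NEXT = {
    "<start>": "the",
    "the": "cat",
    "cat": "sat",
    "sat": "down",
    "down": "<end>",
}

def generate(max_tokens=10):
    """Repeatedly predict the next token until <end> or the limit."""
    output, token = [], "<start>"
    for _ in range(max_tokens):
        token = NEXT[token]      # "predict" the next token
        if token == "<end>":     # stop token ends the response
            break
        output.append(token)
    return " ".join(output)

print(generate())  # → "the cat sat down"
```

Real models sample from a probability distribution at each step rather than following a fixed table, which is why the same prompt can yield different responses.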
[0]: https://huggingface.co/learn/nlp-course/chapter1/4