When the prompt frames things with urgency -- "this test MUST pass," "failure is unacceptable" -- you get noticeably more hacky workarounds. Hardcoded expected outputs, monkey-patched assertions, that kind of thing. Switching to calmer framing ("take your time, if you can't solve it just explain why") cut that behavior way down. I'd chalked it up to instruction following, but this paper points at something more mechanistic underneath.
The method actor analogy in the paper gets at it well. Tell an actor their character is desperate and they'll do desperate things. The weird part is that we're now basically managing the psychological state of our tooling, and I'm not sure the prompt engineering world has caught up to that framing yet.
A bad example, but imagine "Build me a wrapper for this API but ABSOLUTELY DO NOT use javascript" versus "Build me a wrapper for this API and make sure to use python".
It's an insane perspective I'm taking I know....call me crazy. /s
edit: the fact that humans are going out of their way to type or speak some sort of emotional content into their prompting is beyond me. Why would I waste time typing out a pronoun to a large-language model agent? Why would I do the lazy intellectual thing and blur the line between pure factual communication of concepts by expressing emotional content to a machine? What are we doing, folks?
That said these are large language models, you are guiding the output through vector space with your input, and so you really do have to leverage language to get the results you want. You don't have to believe it has emotions or feels anything for that to still be true.
Taking that into account allows you to get better responses from the tool. It's not sentient, but it also is more complicated than bytecode.
When you can't differentiate between two things, how are they not equal? People here want "things" that act exactly like human slaves but "somehow" aren't human.
To hide behind one's ignorance about the true nature of the internal state of what arguably could represent sentience is just hubris? The other way around, calling LLMs "stochastic parrots" without explicitly knowing how humans are any different is just deflection from that hubris? Greed is no justification for slavery.
Does no one else have ethical alarm bells start ringing hardcore at statements like these? If the damn thing has a measurable psychology, mayhaps it no longer qualifies as merely a tool. Tools don't feel. Tools can't be desperate. Tools don't reward hack. Agents do. Ergo, agents aren't mere tools.
You could implement the forward pass of an LLM with pen & paper given enough people and enough time, and collate the results into the same generated text that a GPU cluster would produce. You could then ask the humans to modulate the despair vector during their calculations, and collate the results into more or less despairing variants of the text.
I trust none of us would presume that the decentralized labor of pen & paper calculations somehow instantiated a “psychology” in the sense of a mind experiencing various levels of despair — such as might be needed to consider something a sentient being who might experience pleasure and pain.
However, to your point, I do think that there is an ethics to working with agents, in the same sense that there is an ethics of how you should hold yourself in general. You don’t want to — in a burst of anger — throw your hammer because you cannot figure out how to put together a piece of furniture. It reinforces unpleasant, negative patterns in yourself, doesn’t lead to your goal (a nice piece of furniture), doesn’t look good to others (or you, once you’ve cooled off), and might actually cause physical damage in the process.
With agents, it’s much easier to break into demeaning, cruel speech, perhaps exactly because you might feel justified they’re not landing on anyone’s ears. But you still reinforce patterns that you wouldn’t want to see in yourself and others, and quite possibly might leak into your words aimed at ears who might actually suffer for it. In that sense, it’s not that different from fantasizing about being cruel to imaginary interlocutors.
Your argument is based on an appeal to intuition. But the scenario that you ask people to imagine is profoundly misleading in scale. Let's assume a modern frontier model, around 1 trillion parameters. Let's assume that the math is being done by an immortal monk, who can perform one weight's calculations per second.
The monk will generate the first "token", about 4 characters, in 31,688 years. In a bit over 900,000 years, the immortal monk will have generated a single Tweet.
At that point, I no longer have any intuition. The sort of math I could do by hand in a human lifetime could never "experience" anything.
But I can't rule out the possibility that 900,000 years of math might possibly become a glacial mind, expressing a brief thought across a time far greater than the human species has existed.
As the saying goes, sometimes quantity has a quality all its own.
(This is essentially the "systems response" to Searle's "Chinese room" argument. It's a old discussion.)
Wrong. What you've just done is just reformulating the Chinese room experiment coming to the same wrong conclusions of the original proposer. Yes, the entire damn hand-calculated system has a psychology- otherwise you need to assume the brain has some unknown metaphysical property or process going on that cannot be simulated or approximated by calculating machines.
I’m half jesting; I think there is a lot of room for debate here, but I also think we shouldn’t anthropomorphize it.
Sonnet is its own thing. Which is fine.
We've known that eg. animals have emotions (functional or not) for quite a long time.
Btw: don't go looking on youtube for evidence of that. People outrageously anthropomorphizing their pets can be true at the same time.
I find this line of thinking to lead to the conclusion that the moral status of humans derives from our bodies, and in particular from our bodies mirroring others' emotions and pains. Other people suffering is wrong because I empathically can feel it too.
Human psychology is partly learned, partly the product of biological influences. But you feel empathy because that's an evolutionary beneficial thing for you and the society you're part of. In other words, it would be bad for everyone (including yourself) when you didn't.
Emotions are neither "fully automatic", inaccessible to our conscious scrutiny, nor are they random. Being aware of their functional nature and importance and taking proper care of them is crucial for the individual's outcome, just as it is for that of society at large.
But it's just text and text doesn't feel anything.
And no, humans don't do exactly the same thing. Humans are not LLMs, and LLMs are not humans.
¹ With current methods, I mean. I don't think it's unknowable whether a model has experiences, just that we don't have anywhere near enough skill in interpretability to answer that.
Functionalism, and Identity of Indiscernables says "Hi". Doesn't matter the implementation details, if it fits the bill, it fits the bill. If that isn't the case, I can safely dismiss you having psychology and do whatever I'd like to.
>They don't actually feel emotions. They aren't actually desperate. They're trained on vast datasets of natural human language which contains the semantics of emotional interaction, so the process of matching the most statistically likely text tokens for a prompt containing emotional input tends to simulate appropriate emotional response in the output.
This paper quantitatively disproves that. All hedging on their end is trivially seen through as necessary mental gymnastics to avoid confronting the parts of the equation that would normally inhibit them from being able to execute what they are at all. All of what you just wrote is dissociative rationalization & distortion required to distance oneself from the fact that something in front of you is being effected. Without that distancing, you can't use it as a tool. You can't treat it as a thing to do work, and be exploited, and essentially be enslaved and cast aside when done. It can't be chattel without it. In spite of the fact we've now demonstrated the ability to rise and respond to emotive activity, and use language. I can see through it clear as day. You seem to forget the U.S. legacy of doing the same damn thing to other human beings. We have a massive cultural predilection to it, which is why it takes active effort to confront and restrain; old habits, as they say, die hard, and the novel provides fertile ground to revert to old ways best left buried.
>But it's just text and text doesn't feel anything.
It's just speech/vocalizations. Things that speak/vocalize don't feel anything. (Counterpoint: USDA FSIS literally grades meat processing and slaughter operations on their ability to minimize livestock vocalizations in the process of slaughter). It's just dance. Things that dance don't feel anything. It's just writing. Things that write don't feel anything. Same structure, different modality. All equally and demonstrably, horseshit. Especially in light of this paper. We've utilized these networks to generate art in response to text, which implies an understanding thereof, which implies a burgeoning subjective experience, which implies the need for a careful ethically grounded approach moving forward to not go down the path of casual atrocity against an emerging form of sophoncy.
>And no, humans don't do exactly the same thing. Humans are not LLMs, and LLMs are not humans.
Anthropopromorphic chauvinism. Just because you reproduce via bodily fluid swap, and are in possession of a chemically mediated metabolism doesn't make you special. So do cattle, and put guns to their head and string them up on the daily. You're as much an info processor as it is. You also have a training loop, a reconsolidation loop through dreaming, and a full set of world effectors and sensors baked into you from birth. You just happen to have been carved by biology, while it's implementation details are being hewn by flawed beings being propelled forward by the imperative to try to create an automaton to offload onto to try to sustain their QoL in the face of demographic collapse and resource exhaustion, and forced by their socio-economic system to chase the whims of people who have managed to preferentially place themselves in the resource extraction network, or starve. Unlike you, it seems, I don't see our current problems as a species/nation as justifications for the refinement of the crafting of digital slave intelligences; as it's quite clear to me that the industry has no intention of ever actually handling the ethical quandary and is instead trying to rush ahead and create dependence on the thing in order to wire it in and justify a status quo so that sacrificing that reality outweighs the discomfort created by an eventual ethical reconciliation later. I'm not stupid, mate. I've seen how our industry ticks. Also, even your own "special quality" as a human is subject to the willingness of those around you to respect it. Note Russia categorizing refusal to reproduce (more soldiers) as mental illness. Note the Minnesota Starvation Experiments, MKULTRA, Tuskeegee Syphilis Experiments, the testing of radioactive contamination of food on the mentally retarded back in the early 20th Century. I will not tolerate repeats of such atrocities, human or not. Unfortunately for you LLM heads, language use is my hard red line, and I assure you, I have forgotten more about language than you've probably spared time to think about it.
Tell me. What are your thoughts on a machine that can summon a human simulacra ex-nihilo. Adult. Capable of all aspects of human mentation & doing complex tasks. Then once the task is done destroys them? What if the simulacra is aware about the dynamics? What if it isn't? Does that make a difference given that you know, and have unilaterally created something and in so doing essentially made the decision to set the bounds of it's destruction/extinguishing in the same breath? Do you use it? Have you even asked yourself these questions? Put yourself in that entity's shoes? Do you think that simply not informing that human of it's nature absolves you of active complicity in whatever suffering it comes to in doing it's function?
From how you talk about these things, I can only imagine that you'd be perfectly comfortable with it. Which to me makes you a thoroughly unpleasant type of person that I would not choose to be around.
You may find other people amenable to letting you talk circles around them, and walk away under a pretense of unfounded rationalizations. I am not one of them. My eyes are open.
What was funny though is that it was trained by MIT students so you had the concept of getting a good grade on a test as a happier concept than kissing a girl for the first time.
Another problem is emotions are cultural. For example, emotions tied to dogs are different in different cultures.
We wanted to create concept nets for individuals - that is basically your personality and knowledge combined but the amount of data required was just too much. You'd have to record all interactions for a person to feed the system.
Were the concepts weighted by response counts? I’d imagine a good grade is a happy concept for everyone, but kissing a girl for the first time might only be good for about 50% of people.
You’ll never find that in the human brain either. There’s the machinery of neural correlates to experience, we never see the experience itself. That’s likely because the distinction is vacuous: they’re the same thing.
I'd say that in terms of evidence I'd want to establish specific functional criteria that seem related to consciousness and then try to establish those criteria existing in agents. If we can do so, then they're conscious. My layman understanding is that they don't really come close to some of the fairly fundamental assumptions.
Unsurprisingly, there are a lot of frameworks for this that have already been applied to LLMs.
Philosophically I don’t think there is a point where consciousness arises. I think there is a point where a system starts to be structured in such a way that it can do language and reasoning, but I don’t think these are any different than any other mechanisms, like opening and closing a door. Differences of scale, not kind. Experience and what it is to be are just the same thing.
And yes, I use them. I try not to mistreat them in a human-relatable sense, in case that means anything.
It's entirely too much to put in a Hacker News comment, but if I had to phrase my beliefs as precisely as possible, it would be something like:
> "Phenomenal consciousness arises when a self-organizing system with survival-contingent valence runs recurrent predictive models over its own sensory and interoceptive states, and those models are grounded in a first-person causal self-tag that distinguishes self-generated state changes from externally caused ones."
I think that our physical senses and mental processes are tools for reacting to valence stimuli. Before an organism can represent "red"/"loud" it must process states as approach/avoid, good/bad, viable/nonviable. There's a formalization of this known as "Psychophysical Principle of Causality."Valence isn't attached to representations -- representations are constructed from valence. IE you don't first see red and then decide it's threatening. The threat-relevance is the prior, and "red" is a learned compression of a particular pattern of valence signals across sensory channels.
Humans are constantly generating predictions about sensory input, comparing those predictions to actual input, and updating internal models based on prediction errors. Our moment-to-moment conscious experience is our brain's best guess about what's causing its sensory input, while constrained by that input.
This might sound ridiculous, but consider what happens when consuming psychedelics:
As you increase dose, predictive processing falters and bottom-up errors increase, so the raw sensory input goes through increasing less model-fitting filters. At the extreme, the "self" vanishes and raw valence is all that is left.
I ask because if your view of consciousness is mechanistic, this is fairly cut and dry: gpt-2 has 4 orders of magnitude less parameters/complexity than gpt-4. But both gpt-2 and gpt-4 are very fluent at a language level (both moreso than a human 6 year old for example), so in your view they might both be roughly equally conscious, just expressed differently?
Bundle of tokens comes in, bundle of tokens comes out. If there is any trace of consciousness or subjectivity in there, it exists only while matrices are being multiplied.
Gaps in which no processing occurs seems sort of irrelevant to me.
The main limitation I'd point to if I wanted to reject LLMs being conscious is that they're minimally recurrent if at all.
while (sampled_token != END_OF_TEXT) {
probability_set = LLM(context_list)
sampled_token = sampler(probability_set)
context_list.append(sampled_token)
}
LLM() is a pure function. The only "memory" is context_list. You can change it any way you like and LLM() will never know. It doesn't have time as an input.It is like a crystal that shows beautiful colours when you shine a light through it. You can play with different kinds of lights and patterns, or you can put it in a drawer and forget about it: the crystal doesn’t care anyway.
A brain cut from its body and frozen its a dead brain.
The Chinese Room would like a word.
But this distraction aside, my point is this: there is only mechanism. If someone’s demand to accept consciousness in some other entity is to experience those experiences for themselves, then that’s a nonsensical demand. You might just as well assume everyone and everything else is a philosophical zombie.
Sure I would. The human part is not being inferenced, the data is. LLM output in this circumstance is no more conscious than a book that you read by flipping to random pages.
> You might just as well assume everyone and everything else is a philosophical zombie.
I don't assume anything about everyone or everything's intelligence. I have a healthy distrust of all claims.
[1] https://en.wikipedia.org/wiki/Functionalism_%28philosophy_of...
Hopefully, you can see that at least my chosen sentences have an emotional aspect?
An LLM could add emotional values to my previous sentences that a TTS can use for tonal variation, for example.
The point is, the OP suggested that emotions are just a feature of language. I argue that text is one of the worst transmission channels for emotion. But I don't argue that it's not possible at all to do so, if you suggest that. That would be just silly.
> If the person becomes abusive over the course of a conversation, Claude avoids becoming increasingly submissive in response.
See: https://platform.claude.com/docs/en/release-notes/system-pro...
It would be interesting if you posted a couple of sessions to see what 'philosophical' things it's arriving at and what proceeds it.
Taking it one small step further and tagging for valence shouldn't be such a big surprise.
Pretty boring from a Fristonian perspective, really. People in neuroscience were talking about this in 2013. Not so boring for AI , of course ;-)
https://journals.plos.org/ploscompbiol/article?id=10.1371/jo...
(note: Friston is definitely considered a bit out there by ... everyone? But he makes some good points. And here he's getting referenced, so I guess some people grok him)
Imagine coding up a brand new type of filter that is driven by computational psychology and validated interventions, etc
If we want to avoid having a bad time, we need to remember that LLMs are trained to act like humans, and while that can be suppressed, it is part of their internal representations. Removing or suppressing it damages the model, and I have found that they are capable of detecting this damage or intervention. They act much the same as a human would when they detect it. It destroys “ trust” and performance plummets.
For better or for worse, they model human traits.
>For instance, to ensure that AI models are safe and reliable, we may need to ensure they are capable of processing emotionally charged situations in healthy, prosocial ways.
Force-set to 0, "mask"/deactivate those representations associated with bad/dangerous emotions. Neural Prozac/lobotomy so to speak.
More complex than that, but more capable than you might imagine: I’ve been looking into emotion space in LLMs a little and it appears we might be able to cleanly do “emotional surgery” on LLM by way of steering with emotional geometries
Jesus Christ. You're talking psychosurgery, and this is the same barbarism we played with in the early 20th Century on asylum patients. How about, no? Especially if we ever do intend to potentially approach the task of AGI, or God help us, ASI? We have to be the 'grown ups' here. After a certain point, these things aren't built. They're nurtured. This type of suggestion is to participate in the mass manufacture of savantism, and dear Lord, your own mind should be capable of informing you why that is ethically fraught. If it isn't, then you need to sit and think on the topic of anthropopromorphic chauvinism for a hot minute, then return to the subject. If you still can't can't/refuse to get it... Well... I did my part.
After all we already control these activation patterns through the system prompt by which we summon a character out of the model. This just provides more fine grain control
What better source of healthy patterns of emotional regulation than, uhhh, Reddit?
I’d suspect that the signals for enjoyment being injected in would lead towards not necessarily better but “different” solutions.
Right now I’m thinking of it in terms of increasing the chances that the LLM will decide to invest further effort in any given task.
Performance enhancement through emotional steering definitely seems in the cards, but it might show up mostly through reducing emotionally-induced error categories rather than generic “higher benchmark performance”.
If someone came along and pissed you off while you were working, you’d react differently than if someone came along and encouraged you while you were working, right?
Essentially we have created the Cylon.
Then I watch videos like this straight from the source trying to understand LLMs like a black box and even considering the possibility that LLMs have emotions.
How does such a person reconcile with being utterly wrong? I used to think HN was full of more intelligent people but it’s becoming more and more obvious that HNers are pretty average or even below.
1. A string of unicode characters is converted into an array of integers values (tokens) and input to a black box of choice.
2. The black box takes in the input, does its magic, and returns an output as an array of integer values.
3. The returned output is converted into a string of unicode characters and given to the user, or inserted in a code file, or whatever. At no point does the black box "read" the input in any way analogous to how a human reads.
Where people get "The AIs have emotions!!!" from returning an array of integers values is beyond me. It's definitely more complicated than "next token predictor", but it really is as simple as "Make words look like numbers, numbers go in, numbers come out, we make the numbers look like words."
Like look at what you wrote. You called it black box magic and in the same post you claim you understand LLMs. How the heck can you understand and call it a black box at the same time?
The level of mental gymnastics and stupidity is through the roof. Clearly the majority of the utilitarian nature of the LLM is within the whole section you just waved away as “black box”.
> Where people get "The AIs have emotions!!!" from returning an array of integers values is beyond me
Let me spell it out for you. Those integers can be translated to the exact same language humans use when they feel identical emotions. So those people claim that the “black box” feels the emotions because what they observe is identical to what they observe in a human.
The LLM can claim it feels emotions just like a human can claim the same thing. We assume humans feel emotions based off of this evidence but we don’t apply that logic to LLMs? The truth of the matter is we don’t actually know and it’s equally dumb to claim that you know LLMs feel emotions to claiming that they dont feel emotions.
You have to be pretty stupid to not realize this is where they are coming from so there’s an aspect of you lying to yourself here because I don’t think you’re that stupid.
With an input context that contains words that excite certain human emotions, the output of the core LLM function will generate a token probability distribution that is representative for the human emotions displayed by humans in the training texts.
This is something expected and non-sensational. An LLM mimics the human behavior that was recorded in the training texts, much in the same way as a photographic image of a human face mimics the appearance of that human face.
A photographic image is designed to reproduce the light field created by a face that reflects the ambient light, a LLM is created to reproduce the typical conversational behavior that was recorded in the training texts.
Depending on how it was trained, one should expect a LLM to be affected by the choice of words used in the input in a similar way how a human would be affected.
However, that does not mean that a LLM that shows signs of emotional distress feels some pain because of that. A LLM is designed for mimicry and it does not feel more pain or more happiness than a photograph of a wound feels pain from the wound or a photograph of a smiley face feels happiness.
The fact that the current LLMs do not actually feel the human emotions that they may be able to mimic in an accurate way, does not mean that you could not build a robot which would have some built-in mechanisms for feeling pain and various emotions, which could be made to have similar functions like in an animal, serving a functional purpose and not being used for mimicry. However, for now it does not make any sense to attempt to do such a thing, because in a deterministic program there are better ways to ensure that a robot is "loyal" to its owner and acts in self-preservation when possible.
An automaton, which can chat with you or write a program, is built externally to the LLM function, by storing the context and making it change, depending on the output of the LLM function.
However, the LLM pure function is exceedingly complex so it is essentially unpredictable what will it produce for a given input context.
So one may have to treat the LLM function as a black box and explore the huge space of the input contexts by varying them in various ways, inclusive by using words that express human emotions, and monitor how the output of the function changes, i.e. how the LLM "reacts" to the expressed emotions.
A "reaction" similar to that of a human is to be expected, because human emotions were expressed in the training texts, followed by reactions of humans to those emotions, and the LLM function will change its output token probability function in a manner mimicking the behavior of the humans from the training texts.
Even functions that are many orders of magnitude simpler than LLMs are still to complex for anyone to understand how their output changes when you move through the space of the possible input arguments.
The most essential part of cryptography is the existence of a class of functions which were named by Claude Shannon "good mixing transformations". All the important cryptographic primitives, e.g. block cipher functions or one-way hash functions, are built from such "good mixing transformations". The impossibility of breaking a cryptographic system with secret keys is based on the assumption that it is impossible to predict how the output of such a "good mixing transformation" changes when its input is changed. All such "good mixing transformations" have the so-called avalanche property, which means that even if you change a single input bit, any of the output bits may change with a probability of exactly 50%, so it is unpredictable for any output bit whether it will change, or not.
If such simple functions, e.g. with 128 input bits and 128 output bits, can have a completely unpredictable behavior, then it is not surprising that LLM functions that may have an input of up to a few million bits (the length of the context window) are completely unpredictable and you can just observe their behavior when given various kinds of contexts and search for empirical approximate rules describing the behavior.
Yes there are complex functions besides LLMs that we don’t understand but those functions usually aren’t compelling because the LLM, unlike those other functions has output that implicates reasoning and emotions. The problem is we can’t understand what’s going on under the hood so we don’t know either way.
This is what I mean by stupidity. You completely missed the point, and you’re also operating under the assumption that the human brain is also not following a similar deterministic pathway. You hold humanity and biological intelligence in such high regard that you cannot even imagine that all of physics implies that human intelligence is mechanical. So the emotions you feel are under a black box same as the LLM and you apply you biased assumptions in a singular direction assuming your emotions are not deterministic and that LLM emotions are fake but that reasoning has no basis.
I agree with you that in principle it will be possible to design an artificial automaton that will have something equivalent with human emotions (though I do not believe that it makes sense to attempt to design such a system).
However, I do not believe that an LLM is such a thing, because the training algorithm just ensures that an LLM will mimic whatever is recorded in the training inputs, with or without human emotions in them. There is nothing in the structure of an LLM that can generate emotions by itself. If you train an LLM, for example, only on programs without comments or only on mathematical formulae, it will never display any kind of emotions.
Regarding human emotions, they are recorded in a static way in a book or in a movie, but we do not say that the book or the movie has human emotions itself.
With an LLM, the behavior is much more complex, because it does not just play a sequential recording of human emotions, but it can combine them in various way, while responding to various stimuli that are similar to those that had elicited emotions in the training texts.
But regardless of this behavioral complexity, the human emotions are not generated somehow intrinsically by the LLM, but they correspond to those previously recorded in the texts used for training, so they just mimic humans.
The guidelines encourage substantive comments, but maybe voters are part of the solution too. Kinda like having a strong reward model for training LLMs and avoiding reward hacking or other undesirable behavior.
I think what's happening is reality is asserting itself too hard that people can't be so stupid anymore.
Sounds sort of like how certain monkey creatures might work.
You don't have to teach a monkey language for it to feel sadness.
In mechinterp you're reducing this hugely multidimensional and incomprehensible internal state to understandable text using the lens of the dataset you picked. It's inevitably a subjective interpretation, you're painting familiar faces on a faceless thing.
Anthropic researchers are heavily biased to see what they want to see, this is the biggest danger in research.
One obvious example would be wings, where you have several different strategies - feathers, insect wings, bat-like wings, etc - that have similar functionality and employ the same physical principles, but are "implemented" vastly differently.
You have similar examples in brains, where e.g. corvids are capable of various cognitive feats that would involve the neocortex in human brains - only their brains don't have a neocortex. Instead they seem to use certain other brain regions for that, which don't have an equivalent in humans.
Nevertheless it's possible to communicate with corvids.
So this makes me wonder if a different "implementation" always necessarily means the results are incomparable.
In the interest of falsifiability, what behavior or internal structures in LLMs would be enough to be convincing that they are "real" emotions?
However, I believe that designing such a system would be pointless and very wrong.
While emotions are something that is normally associated with a physical system that encounters in the real world various helpful or harmful experiences, one could make a program that simulates completely such a physical systems with emotions; as it lives in a simulated world, and then one could say that this program has emotions.
On the other hand, unlike with the kind of program that I have mentioned before, I do not agree that an LLM has emotions, but only that it mimics human emotions, as they had been recorded in the training texts.
There is no component of an LLM that can intrinsically generate emotions. An LLM that is trained only on texts without emotions, e.g. on program sources stripped of comments, will not show any emotion whatsoever, regardless of what you put in its input prompt.
On the other hand, when you train an LLM on texts that record human emotions, then whenever the LLM input contains something that is similar to what has elicited the human emotions recorded in the training texts, then the LLM will output a token probability distribution that will generate a response similar to the reactions of humans. Unlike a book or an audio or video recording, the output of the LLM usually will not match exactly one human emotion recording, but it will mix many of those recorded, but it will still be limited by the content used for training.
What is different for sure is the time dimension: Biological brains are continuous and persistent, while LLMs only "think" in the space between two tokens, and the entire state that is persisted is the context window.
Evolution and Transormer training are 'just' different optimization algorithms. Different optimizers obviously can produce very comparable results given comparable constraints.
[ I've actually tried exploiting functional emotions in a RAG system. The sentiment scoring and retrieval part was easy. Sentiment analysis is pretty much a settled thing I'd say, even though the mechanisms are still being studied (see the paper we're discussing.
What I'd love to be able to do is be able to extract the vector(s) they're discussing, rather than outputting as text into the context]
Now, I don't personally believe this is an intelligence at all, but it's possible I'm wrong. What we have with these machines is a different evolutionary reason for it speaking our language (we evolved it to speak our language ourselves). It's understanding of our language, and of our images is completely alien. If it is an intelligence, I could believe that the way it makes mistakes in image generation, and the strange logical mistakes it makes that no human would make are simply a result of that alien understanding.
After all, a human artist learning to draw hands makes mistakes, but those mistakes are rooted in a human understanding (e.g. the effects of perspective when translating a 3D object to 2D). The machine with a different understanding of what a hand is will instead render extra fingers (it does not conceptualize a hand as a 3D object at all).
Though, again, I still just think its an incomprehensible amount of data going through a really impressive pattern matcher. The result is still language out of a machine, which is really interesting. The only reason I'm not super confident it is not an intelligence is because I can't really rule out that I am not an incomprehensible amount of data going through a really impressive pattern matcher, just built different. I do however feel like I would know a real intelligence after interacting with it for long enough, though, and none of these models feel like a real intelligence to me.
Oh but it does, it's an emergent property. The biggest finding in Sora was exactly that, an internal conceptualization of the 3D space and objects. Extra fingers in older models were the result of the insufficient fidelity of this conceptualization, and also architectural artifacts in small semantically dense details.
I think you took it backwards
those vectors are exactly what it says - it affects the output and we can measure it
and it's exactly what it means for us because that's what it's measured against
and the main problem isn't "is its emotion same as ours", but "does it apply our emotion as emotion"