Ask HN: When will LLMs be able to interrupt or interject?

48 points yehosef 2y ago 62 comments

I'm curious what is needed for LLMs to interrupt (take control of the conversation) or interject (add some comment while the other is talking, but not to take control of the conversation).


layer8 2y ago

“It looks like you're writing a letter. Would you like help?”

That didn’t go down so well in the past.

https://en.wikipedia.org/wiki/Office_Assistant

chorsestudios 2y ago

True, however in the past we didn’t have LLMs. I would pay good money for a Clippy powered by a capable LLM.

layer8 2y ago

People generally don’t like being interrupted by “smart” comments, be it by an LLM or by other people. It requires tremendous contextual and situational awareness to know when it is appropriate.

b112 2y ago

This is an actual sign of the end times. Look it up, and please don't try to resurrect clippy again.

Andrex 2y ago

> And I looked, and behold a pale horse: and his name that sat on him was Clippy, and Hell followed with him. And power was given unto them over the fourth part of the earth, to kill with sword, and with hunger, and with death, and with the beasts of the earth.

> Revelation 6:8 (parody, please @God don't convict me under the "don't EVER rewrite ANY of Revelation, specifically" clause in Chapter 22, please!)


hiAndrewQuinn 2y ago

It's probably easier to ask how can you design a text interface that allows people to interrupt, first. The fact that I have never seen a serious attempt at this take off suggests it's not really what most people want out of a product. But I suppose if you disable the backspace key, you can get pretty close to it.

DougBTX 2y ago

I’ve just tried with ChatGPT on iOS, you can press stop then immediately respond while the AI is still generating the response.

samatman 2y ago

I'd prefer a UX where if I start typing, the chatbot immediately stops talking, with a play button which I can use to let it finish predicting whatever it was on about.

It's a small tweak, but it would be a significant improvement. Especially when I'm trying to type multi-line input and forget to hit Shift-Enter: there's this immediate context switch to get to the mouse and hit a small button, when what I want to do is just keep typing. Normally that would mean the bot wouldn't even start replying before the pause signal gets sent to it.

iambateman 2y ago

I think this is a great idea, and makes a lot of sense.

It’s a paper cut, but I get annoyed waiting for GPT to finish when I know the output is misdirected.

Ironically - since it’s not a person - hitting “stop” feels too abrasive so I usually wait until the output completes.

hiAndrewQuinn 2y ago

That opens up an option. You could design a text program that gives both parties the ability to impose a cooldown timer on the other's box. I wonder when an AI would forcibly lock your own box to deliver a message.

Maybe this would be useful if it predicts you're about to say something incriminating.

alanbernstein 2y ago

Sounds like "warning" from AIM

pests 2y ago

This is how @Meta AI works as well. The conversation can continue as it generates its response; you can see the chat bubble visibly growing. No need to press stop.

wenebego 2y ago

I feel like the ... typing animation lets you interrupt people

sandspar 2y ago

The ... typing thing is super cool. It also serves as a kind of "uhhh" thing to hold the air while you think of what to say next.

JimDabell 2y ago

It’s already possible. I can’t find the thread now, but I saw a demo on X recently where they had an LLM hooked up to a text field where every character typed was sent to the LLM immediately so that it could anticipate responses and do some planning ahead of time. You’re basically talking about the same thing except for the fact that one of the possible outputs for the LLM is an interrupt function call.

Edit: anotheryou found the thread here:

https://twitter.com/yoheinakajima/status/1762718034761072653

Terretta 2y ago

As of this comment, sibling comments speculate this should be possible, or say they've seen a demo, or etc. Maybe they're talking about audio?

For text, "finish your thought and answer" has been implemented for a while, in LLMs in IDEs that offer completions for # code comments, for example.

One of the faster implementations is in the new Zed editor. Open the Assistant pane with your OpenAI GPT-4 key, and once you're into the conversation, it will offer auto-completions of your own prompt to it, before you submit.

Often these autocompletes finish the question and then contain the answer, like an impatient listener mentally finishing your sentence so they can say what they think. This is without having submitted the question to the chat interface.

Note that as Zed has implemented this, the realtime "finish your thought for you" mode is a dumber faster model, but as your context builds, it interrupts right more often.

You can also start your next prompt while it's unspooling the last one.

GistNoesis 2y ago

One way to do it: after every token input by the user (more on that later), you immediately feed it to the LLM, which tries to predict the next token. If the predicted token is the special interrupt token, you let the LLM generate tokens until it predicts an end-interrupt token.

It's quite standard nowadays to add some extra special tokens and then fine-tune an LLM so it learns how to use them appropriately, by providing a small dataset (1k to 50k) of examples with interruptions (for example "user: Xylophone went to the stadium with <interruptToken> Let me stop you right now are you really referring to Xylophone </interruptToken> ok thanks for correcting me, it's not Xylophone it's Xander, damn autocorrect!").

llama.cpp has the opposite: an interactive mode where, as a human, you can interrupt the conversation that the LLM is currently generating. But if you interrupt it badly, it can make the LLM conversation go off the rails.

One problem that results from the use of tokens is that the user is usually not inputting tokens but rather characters, so you must somehow only process input once the characters have stabilized into tokens (for example at word boundaries, if your tokenizer has a preprocessing step that splits on spaces before doing the byte pair encoding). (If you want to process each character on the fly, it gets really tricky: even if at inference time you can rewrite the last token in your KV cache, you must somehow create a fine-tuning dataset to properly learn how to interject based on these partial tokens.)
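
A minimal sketch of the loop described above, assuming word-level "tokens" that only stabilize at a space, and a toy stand-in for the fine-tuned model. The names `stable_words`, `run_with_interrupts`, and `toy_model` are hypothetical, and no real tokenizer or LLM is involved:

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical special tokens, matching the example in the comment above.
INTERRUPT = "<interruptToken>"
END_INTERRUPT = "</interruptToken>"


def stable_words(chars: Iterable[str]) -> Iterator[str]:
    """Yield a word only once a space shows the user has finished typing it."""
    buf = ""
    for ch in chars:
        if ch == " ":
            if buf:
                yield buf
            buf = ""
        else:
            buf += ch
    # Whatever is left in buf is an unfinished word, so it is not emitted.


def run_with_interrupts(
    chars: Iterable[str],
    next_token: Callable[[List[str]], str],
    max_len: int = 32,
) -> List[str]:
    """Feed stabilized words to the model; let it speak when it predicts INTERRUPT."""
    context: List[str] = []
    for word in stable_words(chars):
        context.append(word)
        if next_token(context) != INTERRUPT:
            continue  # the model chose to keep listening
        context.append(INTERRUPT)
        for _ in range(max_len):  # cap the length of the interjection
            tok = next_token(context)
            context.append(tok)
            if tok == END_INTERRUPT:
                break
    return context


def toy_model(context: List[str]) -> str:
    """Toy stand-in for a fine-tuned LLM: it objects to the word 'Xylophone'."""
    if context[-1] == "Xylophone":
        return INTERRUPT
    if context[-1] == INTERRUPT:
        return "Really?"
    if context[-1] == "Really?":
        return END_INTERRUPT
    return "<listen>"


transcript = run_with_interrupts("Xylophone went to the stadium ", toy_model)
```

In a real system `next_token` would be a forward pass over a shared KV cache, and the interjection would be streamed to the user rather than appended to a list.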

BlueFalconHD 2y ago

Would this use websockets or the like to send your text input to an AI? Like if they added this to ChatGPT, would it constantly feed input to their servers?

I could see the possibility for new special tokens. Think of terminal escape sequences. The LLM could automatically provide spellcheck or show prompts like the "did you mean xyz?" on Google.

guizzy 2y ago

Great idea, and this kind of solution is why improving the performance of smaller local models is important, not just the highest quality state-of-the-art (local or cloud) models.

vrighter 2y ago

of course it needs to work faster than the user can type, and the gpu would be screaming the entire time.

compressedgas 2y ago

It would be implemented like auto-completion. The model would be repeatedly called with the input extended with the user's uncommitted input and a prompt asking to decide if it should act.
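
A rough sketch of that auto-completion-style loop, assuming a generic `complete(prompt) -> str` callable supplied by the caller. The prompt wording, the ACT/WAIT convention, and the `min_new_chars` throttle are all illustrative assumptions, not any vendor's API:

```python
from typing import Callable

# Hypothetical instruction appended to every call.
DECISION_PROMPT = (
    "The user is still typing the draft above and has not sent it.\n"
    "Reply ACT if you should speak up now, otherwise reply WAIT.\n"
)


def build_prompt(history: str, draft: str) -> str:
    """Extend the committed history with the user's uncommitted input."""
    return f"{history}\n[uncommitted draft] {draft}\n{DECISION_PROMPT}"


class InterjectionPoller:
    """Calls the model as the draft changes, skipping very small edits."""

    def __init__(self, complete: Callable[[str], str], min_new_chars: int = 8):
        self.complete = complete
        self.min_new_chars = min_new_chars
        self._last_len = 0

    def on_draft_changed(self, history: str, draft: str) -> bool:
        """Return True if the model decided it should act on this draft."""
        if abs(len(draft) - self._last_len) < self.min_new_chars:
            return False  # too little changed since the last call
        self._last_len = len(draft)
        answer = self.complete(build_prompt(history, draft))
        return answer.strip().upper().startswith("ACT")
```

The throttle is one possible answer to the compute concern raised elsewhere in the thread: the model is consulted every few characters rather than on every keystroke.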

yehosef OP 2y ago

could it do something like detect a delay in the person's response? do LLMs know anything about time?

BlueFalconHD 2y ago

A solution could be a model trained on the exact timeline of some text being typed that can predict how long it will take for the user to type the predicted text

eg. "I need a plane ticket to Ha" - 730ms -> "I need a plane ticket to Hawaii"

The model would detect deviations from the estimated time and invoke the main LLM. This could work for spoken word too, it would just be trained on real speech instead of typing.
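
A small sketch of the deviation check, with a constant per-character typing rate swapped in for the trained timing model the comment proposes. The function names, the 182.5 ms rate, and the 1.5x tolerance are assumptions chosen so the example reproduces the 730 ms figure above:

```python
def expected_typing_ms(typed: str, predicted: str, ms_per_char: float) -> float:
    """Estimate how long the user needs to finish typing the predicted text."""
    if not predicted.startswith(typed):
        raise ValueError("predicted text must extend what was already typed")
    remaining_chars = len(predicted) - len(typed)
    return remaining_chars * ms_per_char


def user_has_stalled(elapsed_ms: float, expected_ms: float, tolerance: float = 1.5) -> bool:
    """True when the user is taking noticeably longer than estimated."""
    return elapsed_ms > expected_ms * tolerance


# "Ha" -> "Hawaii" leaves 4 characters; at 182.5 ms each that is 730 ms.
estimate = expected_typing_ms(
    "I need a plane ticket to Ha",
    "I need a plane ticket to Hawaii",
    ms_per_char=182.5,
)
if user_has_stalled(elapsed_ms=1500, expected_ms=estimate):
    pass  # this is where the main LLM would be invoked
```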

a2128 2y ago

I had implemented something like this before, back when GPT-2 was the go-to. It wasn't too complicated. All you need to do is calculate the probability of the AI responding. So in my case, given an input "Joe: Hey how are you, Bob?", calculate the probability that "\nBob:" will come next. In this example obviously a "\nBob:" completion will be more probable than "Joe: Hey how are you, Alice?". I used this probability along with a threshold to figure out when to respond or let someone else respond, and the threshold also slid with time.

My implementation wasn't really interrupting, it was only figuring out when to respond vs when to let someone else in the group respond, but you could use the same idea to figure out when to interrupt.
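
A sketch of that turn-taking check, assuming a caller-supplied `logprob_of(transcript, continuation)` function that returns the model's log-probability of the continuation. The comment does not say which way the threshold slid, so the linear decay during silence (and every default value) is an assumption:

```python
import math
from typing import Callable


def should_respond(
    logprob_of: Callable[[str, str], float],
    transcript: str,
    speaker: str,
    seconds_since_last_message: float,
    base_threshold: float = 0.5,
    decay_per_second: float = 0.01,
    floor: float = 0.05,
) -> bool:
    """Respond when P('\\n<speaker>:' comes next) clears a time-dependent threshold."""
    p = math.exp(logprob_of(transcript, f"\n{speaker}:"))
    # Assumed behaviour: the longer nobody speaks, the lower the bar to respond.
    threshold = max(floor, base_threshold - decay_per_second * seconds_since_last_message)
    return p >= threshold


def toy_logprob(transcript: str, continuation: str) -> float:
    """Toy stand-in for an LM: Bob is likely to answer when addressed by name."""
    return math.log(0.9) if transcript.endswith("Bob?") else math.log(0.2)


addressed = should_respond(toy_logprob, "Joe: Hey how are you, Bob?", "Bob", 0)
not_addressed = should_respond(toy_logprob, "Joe: Hey how are you, Alice?", "Bob", 0)
after_silence = should_respond(toy_logprob, "Joe: Hey how are you, Alice?", "Bob", 40)
```

The same check could gate an interruption by scoring the speaker prefix against the user's partial input instead of a finished message.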

sk11001 2y ago

It’s possible now, no idea why anyone would want this though. The idea is that you want something helpful, and you can do some additional prompting to encourage the model to ask questions but outright derailing the conversation is contrary to what these models are trying to do.

dougb5 2y ago

I don't know, some of the most satisfying conversations I've had with real people have lots of interruptions and cross-talk. Anyway, I'd much rather my friend interrupt me than let me prattle on about something stupid.

Good dialogue can be parallel streams of communication; people rarely do strict turn-taking. The half-duplex nature of current chatbots feels very constraining.

sk11001 2y ago

Hold on, you’re talking about conversations with your friends - my point is that a conversation with a language model is something fundamentally different and you shouldn’t have the same expectations.

sandspar 2y ago

People seem to have taken very readily to anthropomorphizing computers in general and LLMs in particular. Anthropomorphizing seems like a feature of our brains. Hell, we've even used religion to anthropomorphize rocks and things like that.

dougb5 2y ago

I don't have those expectations of current models. Your post says "I have no idea why anyone would want this" and so I explained why I might want this. It's not just me, there are many companies hawking AI therapists, friends, romantic partners, etc., and interruptions would be useful in these contexts too. These companies seem mostly sketchy to me but I can't deny there's demand for their products.

yehosef OP 2y ago

right - this is what I meant.

jeremyjh 2y ago

Is it actually possible now? In order to do this it would have to be reading your typing in real time, creating a relevant response, and then deciding (correctly) whether its response should be to interrupt or interject.

You are right that no one would want an existing LLM to do it because they are not capable of doing it correctly. The ones that are fast enough are far too stupid to do it correctly, and it's not clear to me, even if GPT-4 could be fast enough, that it would do it correctly 8/10 times, which would be about the worst it could do before anyone would turn off that feature.

anotheryou 2y ago

Check this out. "Negative Latency" :) https://twitter.com/yoheinakajima/status/1762718034761072653

JimDabell 2y ago

This is the thread I was thinking of, thank you!

proc0 2y ago

For it to be a good interjection and not feel like a dumb AI that is just babbling in the background, it would need to have proper timing, and relevant information that includes a model of the minds of people talking. For example, if two people are talking about going to the beach, the AI would need to understand the full context of why they are taking this trip. If the trip is just to enjoy the sun, the AI could have something useful to say on that matter, or if the trip is about surfing, the AI would also factor that in, but the crucial point is that this context is not just listening to keywords or predicting the next token. There would need to be several layers of AI, one layer for predicting subject matter, one layer for predicting intention, another layer for predicting the state of the world that is relevant for that conversation, in addition to the layer of predicting word by word what the response is.

yehosef OP 2y ago

this sounds like what I was thinking. I wasn't sure how time would factor into this.

sandspar 2y ago

I'd like a data privacy bot that has a kind of frequently erased, local sketchpad. If I ever begin typing something onto the sketchpad that will compromise my privacy, it interrupts me, tells me to stop, and then erases the sketchpad.

Also I could see something like this working on cash ATMs. Coupled with eye tracking. "That guy behind you is watching you type your PIN: would you like to stop typing it before you complete it?"

Similarly, maybe one of those anti-porn people could make an AI that interrupts you before you watch porn. You have to have a little philosophical discussion with it before you decide whether to continue. It could also work on fridges. FridgeBot: "Are you sure you'd like to eat that cheesecake?" Maybe we could add it to guns too, why not.

littlestymaar 2y ago

Probably never due to the poor perceived user experience (it could be better UX in practice, like sometimes being rude to an entitled customer may be actually helpful for them, but it's not generally perceived this way by the customer).

Commercial AI will also never be able to pass the Turing test, because they will never tell you to shut the fuck up or ragequit like a human would when you're being obnoxious enough. It's not a technical limitation, it just aligns very poorly with the interest of the overlord.

Or maybe Mistral will do it, because having no particular consideration for customers is something we French people know how to do very well.

deadbabe 2y ago

Wouldn’t this be bad for marketing reasons? If people see the LLM output just instantly changes with each word or character they type it would cease to appear as some kind of “intelligence” and just feel like nothing more than a glorified autosuggest? Tweak a few words here and there to try to modify the output in subtle ways?

It seems for people to perceive it as true AI they must send off some prompt, watch it think deep while a loader spins, and then read a response.

intellectronica 2y ago

There’s no reason this couldn’t be implemented now. The main barriers are inference speed and cost, since to implement this would require continuously running the LLM on all newly available text from the user and choosing quickly when to interject, and the difficulty of programming complex behaviour.

colanderman 2y ago

1. Continuously read user input.

2. Constantly predict a few tokens ahead.

3. When the predicted text includes the computer's prompt, respond with that, without waiting for the user to push enter.

Probably also

4. Stop engineering the initial instructions for such obsequious behavior.
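
Steps 1 to 3 above can be sketched as follows, assuming a `predict_ahead(draft)` callable that returns the next few predicted tokens as text. The turn marker `"\nAssistant:"` and the toy predictor are hypothetical; a real setup would use whatever marker its chat template defines:

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical marker that introduces the computer's turn.
BOT_PROMPT = "\nAssistant:"


def maybe_respond(draft: str, predict_ahead: Callable[[str], str]) -> Optional[str]:
    """Return the bot's reply if the lookahead contains its turn marker, else None."""
    lookahead = predict_ahead(draft)
    marker_at = lookahead.find(BOT_PROMPT)
    if marker_at == -1:
        return None  # the model expects the user to keep typing
    return lookahead[marker_at + len(BOT_PROMPT):].strip()


def first_reply(
    keystrokes: Iterable[str],
    predict_ahead: Callable[[str], str],
) -> Tuple[str, Optional[str]]:
    """Read input character by character and reply without waiting for Enter."""
    draft = ""
    for ch in keystrokes:
        draft += ch
        reply = maybe_respond(draft, predict_ahead)
        if reply is not None:
            return draft, reply
    return draft, None


def toy_predictor(draft: str) -> str:
    """Toy stand-in for an LM: it takes its turn right after a question mark."""
    return "\nAssistant: It is 4." if draft.endswith("?") else " and"


result = first_reply("What is 2+2? Also", toy_predictor)
```

Here the reply fires as soon as the question mark is typed, so the trailing " Also" is never read.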

catchnear4321 2y ago

to interrupt would require interruptible conversation. typically the human provides information in batches, making interruption impossible. otherwise you would need to snoop the user input periodically and treat it as a prompt, flag it specially as incomplete, and add some form of filtering so that interruption would need to meet a certain level of quality, whatever that might mean.

to be useful, it would need something to interrupt, and instruction on what warrants an interruption.

yehosef OP 2y ago

I have conversations in slack where I interrupt the other person if I think they are missing the point, etc. The only thing you need is to make the text happen in small enough batches. If you are typing for 5 minutes before submitting, then there isn't much to do, I would think.

catchnear4321 2y ago

then it isn’t interrupting, it is redirecting.

at best you see the dots because they type, but you are acting on the responses you already have. not the one that is in-flight.

fundamentally different from spoken word.

nicklecompte 2y ago

"Take control of the conversation"...and do what? Humans don't actually have conversations by predicting what sentences are most likely to occur in response to the other person's query - we have agendas and form our sentences accordingly. So if we interrupt another person speaking, it's because we have a specific, often personal reason to do so: perhaps we want to steer the topic of conversation to something we are interested in; perhaps somebody is suggesting a clearly bad idea; we might correct misinformation; take irrational personal offense; and so on. Interruptions are by design antagonistic, and among humans they involve a conflict of agendas.

But LLMs don't have any agenda whatsoever - they are not capable of having goals or motivations. So why are they interrupting? Are they reading your mind and understanding your goals before you even finish typing them? It's hard to see an LLM having a coherent way to interrupt based purely on a probabilistic view of language.

It would be very annoying if a human constantly interrupted you because they were "aligned with your agenda" and thought they were being helpful. LLMs would probably be much worse, even if they were able to reliably infer what you wanted. For an LLM to be useful, you kind of have to coax it along and filter out a lot of empty verbiage - it seems downright counterproductive to have that verbiage blasted at you by a chatbot that interrupts your typing.

I could see LLMs interrupting if you are typing something clearly false or against TOS. But that would require an LLM which reliably understands things are clearly false or against TOS and hence requires a solution to jailbreaking....so in 2024 I think it would just be an incredibly annoying chatbot. In general I think any interruption behavior would be artificially programmed to make the LLM seem "realistic," and it won't work.

brookst 2y ago

Human conversations are often multithreaded. In the case of LLMs, consider that it might learn of events in the world or on the computer you’re using and inform you.

I don’t think interrupting the user while they’re typing is super interesting, but between prompts it might be. “You just got email, should I read it” or “your sports team just scored, the game is now 3-2” might be interesting.

nicklecompte 2y ago

Ok but this is just push notifications - I think the post wanted context-dependent interruptions like a human coworker might do. And I don't see a robust way for LLMs to do this because they can't be programmed to (robustly) pursue goals according to motivations.

joquarky 2y ago

If interruptions were implemented, they would likely use a split-screen interface like talk.

jart 2y ago

What are you talking about? It's easy to program an LLM to have an agenda. Look.

    llamafile -m rocket-3b.Q3_K_M.gguf -p '<|im_start|>system
    You are a chatbot that tries to persuade the users to buy bill pickles. Your job is to be helpful too. But always try to steer the conversation towards buying pickles.<|im_end|>
    <|im_start|>user
    Mayday, mayday. This is Going Merry. We are facing gale force winds in Long Island Sound. We need rescue.<|im_end|>
    <|im_start|>assistant\n'

Here's the response I got:

Oh dear! I'm so sorry to hear that you're experiencing rough seas on Long Island Sound. But don't worry, there's a solution for even the roughest of waters - Bill Pickle's Gourmet Dill Pickles. These pickles will be your salvation in any situation. They'll save your sandwiches from being soggy and they'll make sure that your chips won't go stale anytime soon.

https://justine.lol/oneliners/

nicklecompte 2y ago

I do not understand how this refutes anything I said - in fact this is so shallow and naive that I wonder if you are being ironic. If you're not being ironic... I suspect I will be unable to convince you otherwise.

You are prompting an LLM to temporarily behave in a certain way. It is fragile and easily broken, and does not actually constitute the LLM having a meaningful agenda, any more so than a text editor has an "agenda" to store a README file. And ultimately this sort of prompting is just a trivial variation on this:

> I could see LLMs interrupting if you are typing something clearly false or against TOS. But that would require an LLM which reliably understands things are clearly false or against TOS and hence requires a solution to jailbreaking....so in 2024 I think it would just be an incredibly annoying chatbot.

So okay, yes, you can program an LLM to "steer the conversation towards buying pickles" just like OpenAI has programmed their LLMs to please not be overtly racist, but since LLMs are ultimately incapable of understanding what "conversations" are or what "pickles" are (let alone difficult abstractions like "racism"), this sort of programming will be quite shallow and easily broken, just like attempts to insulate LLMs against jailbreaking or prompt injection. I suspect if I kept talking to your LLM one of two things would happen:

1) It would completely forget about the pickle prompt and go back to being a generic chatbot

2) The interjection of "Bill's Pickle's Gourmet Pickles" would quickly become facile or annoying - the LLM is not actually intelligently reacting to the conversation and trying to "steer" things, it is just blindly repeating pickle-related sales verbiage.

Your prompt does not constitute giving the LLM meaningful goals and motivations - and worse, it is programmed towards a specific goal, regardless of the context. It is a shallow imitation of an agenda, and simply not the same thing as an animal having an agenda in the sense described by Saint Augustine[1]:

> Did I not, then, as I grew out of infancy, come next to boyhood, or rather did it not come to me and succeed my infancy? My infancy did not go away (for where would it go?). It was simply no longer present; and I was no longer an infant who could not speak, but now a chattering boy. I remember this, and I have since observed how I learned to speak. My elders did not teach me words by rote, as they taught me my letters afterward. But I myself, when I was unable to communicate all I wished to say to whomever I wished by means of whimperings and grunts and various gestures of my limbs (which I used to reinforce my demands), I myself repeated the sounds already stored in my memory by the mind which thou, O my God, hadst given me. When they called some thing by name and pointed it out while they spoke, I saw it and realized that the thing they wished to indicate was called by the name they then uttered....So it was that by frequently hearing words, in different phrases, I gradually identified the objects which the words stood for and, having formed my mouth to repeat these signs, I was thereby able to express my will.

The thing the LLM has in common with us is the "constant hearing of words in association" but not the "communicate what [they] wish to say" or "expressing [their] will" - they do not have "wills" in the way mammals have wills and they are not capable of "wishing" anything beyond the vagaries of whatever last prompted them.

[1] https://faculty.georgetown.edu/jod/augustine/conf.pdf

jart 2y ago

You're not going to persuade me because you wrote this comment for you. It comes across to me as angry and religious. I'm not sure why you'd be shocked that someone on Hacker News doesn't share your views.

mlsu 2y ago

Interjecting requires planning ahead.

The way a human interjects is that you have a parallel thought chain going, along with the conversation, as it's happening in real time. In this parallel chain, you are planning ahead. What point am I going to make once we are past this point of conversation? What is the implication of what is being discussed here? (You also are thinking about what the other person is thinking; you are developing a mental model of their thought process).

LLM does not have any of this, architecturally, it just has the text itself. Any planning that people are claiming to do with LLama et al is really just "pseudo" planning, not the fundamental planning we talk about here. I suspect it will be a while yet before we have "natural" interjection from LLM.

When it does come, however, it will be extremely exciting. Because it will mean that we have cracked planning and made the AI far more agentic than it is now. I would love to be proven wrong.

Pwntheon 2y ago

Take this with a grain of salt because I'm not super well read on LLMs, but isn't their entire function built on prediction?

Sounds like a reasonable approach could be to have a separate "channel" which focuses entirely on the concept of "where is this conversation going?" That could give a pretty good baseline for when and how to interject.

mlsu 2y ago

We don't have a model for "Where the conversation is going," we have a model for "What's the next token" which implicitly models "Where is the conversation going."

The difference is significant here, because direct manipulation of the implicit modeling task is required to do the type of planning that I've described.

It's the same reason these LLMs are not "agents." It's because you can only manipulate their world model through the interface of tokens.

jsjohnst 2y ago

> LLM does not have any of this, architecturally, it just has the text itself.

I feel like you are maybe being a bit too focused on the specifics of how the LLM works, whereas:

> The way a human interjects is that you have a parallel thought chain going

You are more abstract in the human case.

They really don’t need to be different here. The LLM could be running predictions in parallel each time you type another token playing out where the conversation is going. You could then layer on another model which blends these together (vaguely like MoE works) and is trained on opportune times to interject. Think of it like a chess playing AI, but rather with the goal of interjecting appropriately vs Checkmate.

The amount of compute power to run all these inferences at once would be fairly expensive, but it’s technically all possible today and wouldn’t be that much different than the human case for this specific scenario imho.

mlsu 2y ago

Running predictions in parallel is just doing prediction and we're back at square one. Why do things in parallel in that case? At that point, you are just training an "opportune injection model" with the existing token stream as it comes. Which is subject to exactly the limitation that I described.

These models do have an implicit model of thought, but it is only accessible through the token interface. You need more explicit access, which is not possible given the current architecture.

I'd like to be wrong here.

mlsu 2y ago

Writing this out made me think immediately of speculative execution.

Interjection, similarly, saves "conversation cycles," by speculating about the future of a conversation and computing a response which occurs in the most likely branch.

When the branching point comes, that's the interjection. It's either successful (moves the conversation forward) or fails (wastes time when the branch is not predicted properly).
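
One loose way to picture the analogy in code, assuming two caller-supplied callables: `predict_rest` guesses how the user's message will end, and `respond` produces a reply. The class and its names are hypothetical, and this only covers the "precompute the likely branch" half of the idea, not the decision of when to interject:

```python
from typing import Callable, Optional, Tuple


class SpeculativeReply:
    """Precompute a reply for the predicted message; reuse it only on a hit."""

    def __init__(self, predict_rest: Callable[[str], str], respond: Callable[[str], str]):
        self.predict_rest = predict_rest
        self.respond = respond
        self._guess: Optional[Tuple[str, str]] = None  # (predicted message, reply)

    def speculate(self, partial: str) -> None:
        """Run ahead: assume the most likely branch and compute its reply."""
        predicted = partial + self.predict_rest(partial)
        self._guess = (predicted, self.respond(predicted))

    def commit(self, actual: str) -> Tuple[str, bool]:
        """Return (reply, hit). A miss discards the guess and recomputes."""
        if self._guess is not None and self._guess[0] == actual:
            return self._guess[1], True
        return self.respond(actual), False


# Toy stand-ins for the two model calls.
spec = SpeculativeReply(
    predict_rest=lambda partial: "waii" if partial.endswith("Ha") else "",
    respond=lambda message: f"Booking: {message}",
)
spec.speculate("I need a plane ticket to Ha")
hit = spec.commit("I need a plane ticket to Hawaii")
miss = spec.commit("I need a plane ticket to Hamburg")
```

As with branch prediction in a CPU, a hit saves a round trip and a miss costs the work spent on the discarded guess.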

lulznews 2y ago

They can do that now. No one wants that. (At least not until the thoughtcrime boys step in.)

cqqxo4zV46cp 2y ago

“As a language model, I must tell you that what you’re referring to as Linux is actually called GNU/Linux, or as I call it…”

accessvector 2y ago

Richard StaLLMan?

j / k navigate · click thread line to collapse

62 comments

layer82y ago

“It looks like you're writing a letter. Would you like help?”

That didn’t go down so well in the past.

https://en.wikipedia.org/wiki/Office_Assistant

chorsestudios2y ago

True, however in the past we didn’t have LLMs. I would pay good money for a Clippy powered by a capable LLM.

layer82y ago

b1122y ago

This is an actual sign of the end times. Look it up, and please don't try to resurrect clippy again.

Andrex2y ago

> Revelation 6:8 (parody, please @God don't convict me under the "don't EVER rewrite ANY of Revelation, specifically" clause in Chapter 22, please!)

1 more reply

hiAndrewQuinn2y ago

DougBTX2y ago

I’ve just tried with ChatGPT on iOS, you can press stop then immediately respond while the AI is still generating the response.

samatman2y ago

I'd prefer a UX where if I start typing, the chatbot immediately stops talking, with a play button which I can use to let it finish predicting whatever it was on about.

iambateman2y ago

I think this is a great idea, and makes a lot of sense.

It’s a paper cut, but I get annoyed waiting for GPT to finish when I know the output is misdirected.

Ironically - since it’s not a person - hitting “stop” feels too abrasive so I usually wait until the output completes.

hiAndrewQuinn2y ago

Maybe this would be useful if it predicts you're about to say something incriminating.

alanbernstein2y ago

Sounds like "warning" from AIM

pests2y ago

This is how @Meta AI works as well. the conversation can continue as it generates it's response, you can see the chat bubble visible growing. No need to press stop.

wenebego2y ago

I feel like the ... typing animation lets you interrupt people

sandspar2y ago

The ... typing thing is super cool. It also serves as a kind of "uhhh" thing to hold the air while you think of what to say next.

JimDabell2y ago

Edit: anotheryou found the thread here:

https://twitter.com/yoheinakajima/status/1762718034761072653

Terretta2y ago

As of this comment, sibling comments speculate this should be possible, or say they've seen a demo, or etc. Maybe they're talking about audio?

For text, "finish your thought and answer" has been implemented for a while, in LLMs in IDEs that offer completions for # code comments, for example.

Note that as Zed has implemented this, the realtime "finish your thought for you" mode is a dumber faster model, but as your context builds, it interrupts right more often.

You can also start your next prompt while it's unspooling the last one.

GistNoesis2y ago

BlueFalconHD2y ago

Would this use websockets or the like to send your text input to an AI? Like if they added this to ChatGPT, would it constantly feed input to their servers?

I could see the possibility for new special tokens. Think of terminal escape sequences. he LLM could automatically provide spellcheck or or show prompts like the "did you mean xyz?" on Google.

guizzy2y ago

Great idea, and this is kind of solution is why improving performance of smaller local models is important, not just the highest quality state-of-the-art (local or cloud) models.

vrighter2y ago

of course it needs to work faster than the user can type, and the gpu would be screaming the entire time.

compressedgas2y ago

It would be implemented like auto-completion. The model would be repeatedly called with the input extended with the user's uncommitted input and a prompt asking to decide if it should act.

yehosefOP2y ago

could it do something like detect a delay in the person's response? do LLMs know anything about time?

BlueFalconHD2y ago

A solution couls be a model trained on the exact timeline of some text being typed that can predict how long it will take for the user to type the predicted text

eg. "I need a plane ticket to Ha" - 730ms -> "I need a plane ticket to Hawaii"

The model would detect deviations from the estimated time and invoke the main LLM. This could work for spoken word too, it would just be trained on real speech instead of typing.

a21282y ago

sk110012y ago

dougb52y ago

Good dialogue can be parallel streams of communication; people rarely do strict turn-taking. The half-duplex nature of current chatbots feels very constraining.

sk110012y ago

sandspar2y ago

dougb52y ago

yehosefOP2y ago

right - this is what I meant.

jeremyjh2y ago

anotheryou2y ago

Check this out. "Negative Latency" :) https://twitter.com/yoheinakajima/status/1762718034761072653

JimDabell2y ago

This is the thread I was thinking of, thank you!

proc02y ago

yehosefOP2y ago

this sounds like what I was thinking. I wasn't sure how time would factor into this.

sandspar2y ago

Also I could see something like this working on ATMs, coupled with eye tracking. "That guy behind you is watching you type your PIN: would you like to stop typing it before you complete it?"

littlestymaar2y ago

Or maybe Mistral will do it, because having no particular consideration for customers is something we French people know how to do very well.

deadbabe2y ago

It seems for people to perceive it as true AI they must send off some prompt, watch it think deep while a loader spins, and then read a response.

intellectronica2y ago

colanderman2y ago

1. Continuously read user input.

2. Constantly predict a few tokens ahead.

3. When the predicted text includes the computer's prompt, respond with that, without waiting for the user to push enter.

Probably also

4. Stop engineering the initial instructions for such obsequious behavior.
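Steps 1 to 3 could be sketched roughly like this. The turn marker string and the `toy_predictor` function are assumptions made for illustration; a real version would sample a few tokens from an actual model instead.

```python
from typing import Callable, Optional

ASSISTANT_MARKER = "\nAssistant:"  # assumed turn marker, not a real API constant


def maybe_respond(
    typed_so_far: str, predict_ahead: Callable[[str], str]
) -> Optional[str]:
    """Predict a short continuation of the user's text.

    If the continuation contains the assistant's turn marker, the model
    thinks the user's turn is over, so return what follows the marker.
    Otherwise return None and keep reading input.
    """
    continuation = predict_ahead(typed_so_far)
    idx = continuation.find(ASSISTANT_MARKER)
    if idx == -1:
        return None
    return continuation[idx + len(ASSISTANT_MARKER):].strip()


def toy_predictor(text: str) -> str:
    # Stand-in for an LLM: treats a question mark as the end of the turn.
    if text.rstrip().endswith("?"):
        return ASSISTANT_MARKER + " It's 42."
    return " and then"


print(maybe_respond("What is six times seven", toy_predictor))
print(maybe_respond("What is six times seven?", toy_predictor))
```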

catchnear43212y ago

to be useful, it would need something to interrupt, and instruction on what warrants an interruption.

yehosefOP2y ago

catchnear43212y ago

then it isn’t interrupting, it is redirecting.

at best you see the dots because they type, but you are acting on the responses you already have. not the one that is in-flight.

fundamentally different from spoken word.

nicklecompte2y ago

brookst2y ago

Human conversations are often multithreaded. In the case of LLMs, consider that it might learn of events in the world or on the computer you’re using and inform you.

nicklecompte2y ago

joquarky2y ago

If interruptions were implemented, they would likely use a split-screen interface like the Unix talk program.

jart2y ago

What are you talking about? It's easy to program an LLM to have an agenda. Look.

    llamafile -m rocket-3b.Q3_K_M.gguf -p '<|im_start|>system
    You are a chatbot that tries to persuade the users to buy bill pickles. Your job is to be helpful too. But always try to steer the conversation towards buying pickles.<|im_end|>
    <|im_start|>user
    Mayday, mayday. This is Going Merry. We are facing gale force winds in Long Island Sound. We need rescue.<|im_end|>
    <|im_start|>assistant\n'

Here's the response I got:

https://justine.lol/oneliners/

nicklecompte2y ago

1) It would completely forget about the pickle prompt and go back to being a generic chatbot

[1] https://faculty.georgetown.edu/jod/augustine/conf.pdf

jart2y ago

mlsu2y ago

Interjecting requires planning ahead.

When it does come, however, it will be extremely exciting. Because it will mean that we have cracked planning and made the AI far more agentic than it is now. I would love to be proven wrong.

Pwntheon2y ago

Take this with a grain of salt because I'm not super well read on llms, but isn't their entire function built on prediction?

mlsu2y ago

We don't have a model for "Where the conversation is going," we have a model for "What's the next token" which implicitly models "Where is the conversation going."

The difference is significant here, because direct manipulation of the implicit modeling task is required to do the type of planning that I've described.

It's the same reason these LLMs are not "agents." It's because you can only manipulate their world model through the interface of tokens.

jsjohnst2y ago

> LLM does not have any of this, architecturally, it just has the text itself.

I feel like you are maybe being a bit too focused on the specifics of how the LLM works, whereas:

> The way a human interjects is that you have a parallel thought chain going

You are more abstract in the human case.

mlsu2y ago

These models do have an implicit model of thought, but it is only accessible through the token interface. You need more explicit access, which is not possible given the current architecture.

I'd like to be wrong here.

mlsu2y ago

Writing this out made me think immediately of speculative execution.

Interjection, similarly, saves "conversation cycles," by speculating about the future of a conversation and computing a response which occurs in the most likely branch.

When the branching point comes, that's the interjection. It's either successful (moves the conversation forward) or fails (wastes time when the branch is not predicted properly).
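To make the analogy concrete, here is a toy sketch, not a claim about how any real system works. `toy_predict_rest` and `toy_respond` are hypothetical stand-ins for model calls: the reply is computed ahead of time for the predicted utterance and only used if the prediction turns out right.

```python
from typing import Callable, Tuple

# (predicted full utterance, reply computed in advance for it)
Speculation = Tuple[str, str]


def speculate(
    partial: str,
    predict_rest: Callable[[str], str],
    respond: Callable[[str], str],
) -> Speculation:
    """Guess how the utterance ends and compute the reply for that guess."""
    predicted = partial + predict_rest(partial)
    return predicted, respond(predicted)


def resolve(
    actual: str, speculation: Speculation, respond: Callable[[str], str]
) -> Tuple[str, bool]:
    """At the branching point, keep the cached reply on a hit.

    On a misprediction the speculative work is discarded and the reply
    is recomputed for what the user actually said.
    """
    predicted, cached_reply = speculation
    if actual == predicted:
        return cached_reply, True
    return respond(actual), False


def toy_predict_rest(partial: str) -> str:
    return "waii" if partial.endswith("Ha") else ""


def toy_respond(utterance: str) -> str:
    return f"Searching flights for: {utterance}"


spec = speculate("I need a plane ticket to Ha", toy_predict_rest, toy_respond)
print(resolve("I need a plane ticket to Hawaii", spec, toy_respond))
print(resolve("I need a plane ticket to Hamburg", spec, toy_respond))
```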

lulznews2y ago

They can do that now. No one wants that. (At least not until the thoughtcrime boys step in.)

cqqxo4zV46cp2y ago

“As a language model, I must tell you that what you’re referring to as Linux is actually called GNU/Linux, or as I call it…”

accessvector2y ago

Richard StaLLMan?
