Let's say your beverage LLM is there to recommend drinks. At some point you told it "I hate espresso" or even something like "I don't take caffeine".
Before recommending coffee, the beverage LLM might do a vector search for "coffee", which would match both of those phrases. The LLM then processes the message history to figure out whether this person likes or dislikes coffee.
But searching SQL with `LIKE '%coffee%'` won't match either of them.
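To make that concrete, here's a toy sketch. The 2-d "embeddings" are hand-made coordinates for illustration only, not the output of a real model; the point is just that similarity search surfaces both phrases while the literal substring check misses them.

```python
import math

# Hypothetical toy vectors: coffee-related words sit near each other.
TOY_VECS = {
    "coffee":   (0.9, 0.1),
    "espresso": (0.85, 0.2),
    "caffeine": (0.8, 0.25),
    "weather":  (0.1, 0.9),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

query = TOY_VECS["coffee"]
for word in ("espresso", "caffeine", "weather"):
    # espresso and caffeine score high, weather scores low
    print(word, round(cosine(query, TOY_VECS[word]), 3))

# Meanwhile the literal substring match misses both phrases:
for phrase in ("I hate espresso", "I don't take caffeine"):
    print(phrase, "->", "coffee" in phrase.lower())
```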
The basic idea is that you don't search for a single term but for many. Depending on the instructions provided in the "Query Construction" stage, you may end up with a very high-level search term like 'beverage', or you may end up with terms like 'hot-drinks', 'cold-drinks', etc.
Once you have the query, you can do a "Broad Search" that returns an overview of each message, and from there the LLM can determine which messages it should analyze further, if required.
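A minimal sketch of that broad search, assuming a `message_overviews` table of per-message summaries (the table name and terms here are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message_overviews (id INTEGER, overview TEXT)")
conn.executemany("INSERT INTO message_overviews VALUES (?, ?)", [
    (1, "user states a hot-drinks preference: dislikes espresso"),
    (2, "user asks about the weather"),
    (3, "user mentions avoiding caffeine in any beverage"),
])

# Terms produced by the "Query Construction" stage
terms = ["beverage", "hot-drinks", "cold-drinks"]
clauses = " OR ".join("overview LIKE ?" for _ in terms)
sql = f"SELECT id FROM message_overviews WHERE {clauses}"
params = [f"%{t}%" for t in terms]

# The hits are the messages the LLM should analyze further
hits = [row[0] for row in conn.execute(sql, params)]
print(hits)
```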
Edit.
I should add, this search strategy will only work well if you have a post-message process. For example, after every message save/update, you have the LLM generate an overview. These are my instructions for my tiny overview https://github.com/gitsense/chat/blob/main/data/analyze/tiny... which focus on generating the purpose and keywords that can be used to help the LLM define search terms.
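The shape of that post-message hook might look something like this. The `llm_overview` function is a stand-in for whatever model/prompt you use (here it's faked deterministically), and the schema is hypothetical:

```python
import sqlite3

def llm_overview(text):
    # Placeholder for a real LLM call that generates a tiny overview
    # (purpose + keywords). Hard-coded here so the sketch is runnable.
    return {"purpose": "beverage preference", "keywords": "coffee caffeine"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE overviews (msg_id INTEGER, purpose TEXT, keywords TEXT)")

def save_message(msg_id, text):
    # Run the overview step after every message save/update
    o = llm_overview(text)
    conn.execute("INSERT INTO overviews VALUES (?, ?, ?)",
                 (msg_id, o["purpose"], o["keywords"]))

save_message(1, "I don't take caffeine")
print(conn.execute("SELECT keywords FROM overviews").fetchone()[0])
```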
And now you’ve reinvented vector embeddings.
Given how fast inference has become, and given the context window sizes supported by most SOTA models, I think summarizing and having the LLM decide what is relevant is not that fragile at all for most use cases. This is what I do with my analyzers, which I talk about at https://github.com/gitsense/chat/blob/main/packages/chat/wid...
The number is actually the order in the chat so 1.md would be the first message, 2.md would be the second and so forth.
If you go to https://chat.gitsense.com and click on "Load Personal Help Guide", you can see how it is used. Since I want you to be able to chat with the document, I will create a new chat tree and use the directory structure and the 1,2,3... markdown files to determine message order.
In fact, this is what ChatGPT came up with:
SELECT *
FROM documents
WHERE text ILIKE '%coffee%'
   OR text ILIKE '%espresso%'
   OR text ILIKE '%latte%'
   OR text ILIKE '%cappuccino%'
   OR text ILIKE '%americano%'
   OR text ILIKE '%mocha%'
   OR text ILIKE '%macchiato%';
(I gave it no direction as to the structure of the DB, but it shouldn't be terribly difficult to adapt to your exact schema.)

There are an unlimited number of items to add to your "like" clauses. Vector search allows you to efficiently query for all of them at once.
[1] Despite also somehow supporting MongoDB...
Main advantages of a vector lookup are built-in fuzzy matching and the potential to keep a large amount of documentation in memory for low latency. I can't see an RDBMS being ideal for either. LLMs are slow enough already; adding a slow document lookup isn't going to help.
It would become unwieldy real fast, though. Easier to get an embedding for the sentence.
If you're matching ("%card%" OR "%kad%"), you'll also match things like virtual card, debit card, kadar (rates), and akad (contract). The more languages you support, the more false hits you get.
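A quick check of that false-hit problem, approximating the LIKE clauses with Python substring tests (the phrase list is made up for illustration):

```python
# "%card%" OR "%kad%" as plain substring checks
phrases = ["virtual card", "debit card", "kadar faedah", "akad jual beli"]
hits = [p for p in phrases if "card" in p or "kad" in p]
print(hits)  # every phrase matches, including the Malay false positives
```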
Not to say SQL is wrong, but 30-year-old technology works with 30-year-old interfaces. It's not that people didn't imagine this back then; it's just that you end up with interfaces similar to dropdown filters and vending machines. If you're giving the user the flexibility of an LLM, you have to support the full range of inputs.
Certainly you're at the mercy of what the LLM constructs. But if it understands that, say, "debit card" isn't applicable to "card", it can add a negation filter. Like has already been said, you're basically just reinventing a vector database in a "relational" (that somehow includes MongoDB...) approach anyway.
But what is significant is the claim that it works better. That is a bold claim that deserves a closer look, and I'm not sure how you've added to that closer look by arbitrarily sharing your experience. I guess I've missed what you're trying to say. Everyone and their brother knows how a vector database works by this point.
A. Last month user fd8120113 said “I don’t like coffee”
B. Today they are back for another beverage recommendation
SQL is the place to store the relevant fact about user fd8120113 so that you can retrieve it into the LLM prompt to make a new beverage recommendation today. It's addressing the "how many fucking times do I fucking need to tell you I don't like fucking coffee" problem, not the word salad problem.
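In other words, something like this sketch: persist the fact once, then pull it into every new prompt (table name, schema, and the prompt format are all hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_facts (user_id TEXT, fact TEXT)")

# Stored last month when the user said "I don't like coffee"
conn.execute("INSERT INTO user_facts VALUES (?, ?)",
             ("fd8120113", "dislikes coffee"))

def build_prompt(user_id, request):
    # Retrieve the remembered facts and prepend them to today's request
    facts = [row[0] for row in conn.execute(
        "SELECT fact FROM user_facts WHERE user_id = ?", (user_id,))]
    return f"Known user facts: {'; '.join(facts)}\nRequest: {request}"

print(build_prompt("fd8120113", "recommend me a beverage"))
```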
The ggp comment is strawmanning.
"I hate espresso" "I love coffee"
What if the SQL query only retrieves the first one?
My comment described the problem.
The solution is left as an exercise for the reader.
Keep in mind that people change their minds, misspeak, and use words in peculiar ways.