The recent SILO-LM paper takes a slightly different approach: rather than embedding the query and prompting the LLM with retrieved documents, it searches the datastore with the LLM's output-side context vector and uses kNN search to blend the retrieved neighbours' next-token distribution into the model's own before generation (kNN-LM style). Done that way round, using LLM embeddings outperforms RAG, allegedly.
They did it with a custom language model. I really want to give this a try with llama2 embeddings but haven't had the bandwidth yet (and llama2's embedding vectors are inconveniently huge, but that's a different problem).
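The mechanism is easier to see in code. Here's a minimal sketch of the kNN-LM-style interpolation described above, assuming a toy datastore of (context vector, next-token id) pairs; the function name, the Euclidean-distance scoring, and the `lam` interpolation weight are my assumptions for illustration, not anything from the SILO-LM paper itself:

```python
import numpy as np

def knn_lm_interpolate(query, keys, values, model_probs, vocab_size,
                       k=4, lam=0.25, temp=1.0):
    """Blend a kNN next-token distribution into the model's own distribution.

    query       : context vector for the current generation step
    keys        : (N, d) stored context vectors
    values      : (N,) next-token id recorded for each stored context
    model_probs : (vocab_size,) the LLM's own next-token distribution
    """
    # Distance from the query context vector to every stored key.
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = np.argsort(dists)[:k]
    # Softmax over negative distances: closer neighbours get more weight.
    logits = -dists[nearest] / temp
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Scatter neighbour weights onto their recorded next-token ids.
    knn_probs = np.zeros(vocab_size)
    for idx, w in zip(nearest, weights):
        knn_probs[values[idx]] += w
    # Interpolate the two distributions before sampling a token.
    return lam * knn_probs + (1 - lam) * model_probs
```

In a real setup the datastore search would go through an ANN index rather than brute-force distances, and `query` would be the hidden state the model produces at each step, which is exactly where llama2's large hidden dimension starts to hurt.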