0 points · simonw · 2y ago

What are you using those embeddings for?

I can see a sliding window working for semantic search and RAG, but not so much for clustering or finding related documents.

egorfine · 2y ago

Ah yes, clustering is indeed something that would benefit from large context, I agree.

Even so, I would look at the documents themselves first and figure out whether it is needed at all. Let's say we are talking about clustering court proceedings. I'd rather extract the abstract from each of these documents, then embed and cluster those abstracts instead of the whole text.
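
A rough sketch of that idea in stdlib-only Python, just to show the shape of the pipeline. Everything in it is a placeholder I made up for illustration: `extract_abstract` simply takes the first two sentences, `embed` is a toy bag-of-words count standing in for a real embedding model, and `cluster` is a greedy cosine-threshold grouping rather than a proper clustering algorithm.

```python
import math
import re
from collections import Counter


def extract_abstract(text: str, max_sentences: int = 2) -> str:
    """Hypothetical stand-in: treat the first few sentences as the abstract."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would call an embedding model here."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(vectors: list[Counter], threshold: float = 0.3) -> list[int]:
    """Greedy grouping: join the first cluster whose seed is similar enough,
    otherwise start a new cluster. The threshold is an arbitrary assumption."""
    seeds: list[Counter] = []
    labels: list[int] = []
    for vec in vectors:
        for label, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels.append(label)
                break
        else:
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels


def cluster_by_abstract(documents: list[str]) -> list[int]:
    """Embed only the extracted abstract of each document, then cluster those."""
    return cluster([embed(extract_abstract(doc)) for doc in documents])
```

The point is only that the long body of each document never reaches the embedding step, so a small context window is enough. With real court proceedings you would swap in an actual abstract extractor, embedding model, and clustering method.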
