PhilippGille1y ago· 0 points · 0 comments

In chromem-go [1] I'm searching through 100,000 vectors in 40ms on a mid-range laptop CPU, even without SIMD. It's quick enough for many use cases.

[1] https://github.com/philippgille/chromem-go

6 comments · 1 top-level

PaulHoule1y ago· 5 in thread

It would be very hard to find a problem that has better mechanical sympathy than full-scan similarity search. Even if the operation count of some other algorithm was 1/10 on paper it might not be faster if the prefetcher and branch predictor aren't running at their best.

People who want to start with RAG should not start with a vector search engine but instead download

https://sbert.net/

and start messing around w/ Jupyter notebooks and maybe FAISS. People who think I'm going to upload vectors to an expensive cloud service over a slow ADSL connection are delusional.

rockwotj1y ago

Swap out FAISS for usearch and you get all the awesome SIMD acceleration (via dynamic dispatch) plus optional compression. Not affiliated, but it's really cool tech.

https://github.com/unum-cloud/usearch

vegabook1y ago

Just tried usearch against ol’ faithful np.dot, and found the latter to be roughly 7x faster than usearch on the 10M-vector brute-force scan described in their readme [1], for the top 50 matches. The output was identical. 1.74 seconds for numpy and around 12 seconds for usearch on an M2 Max with enough RAM to hold the vectors without swapping.

[1] https://github.com/unum-cloud/usearch?tab=readme-ov-file#exa...
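
For readers who want to see what the np.dot approach looks like, here is a minimal sketch of a brute-force top-k scan. It is not vegabook's benchmark code: the function name `top_k_dot`, the array sizes, and the random data are all assumptions for illustration. It assumes the vectors are L2-normalized, so the dot product equals cosine similarity.

```python
import numpy as np


def top_k_dot(vectors: np.ndarray, query: np.ndarray, k: int):
    """Return (indices, scores) of the k rows with the largest dot product.

    Hypothetical helper for illustration; assumes `vectors` has shape (n, d)
    and `query` has shape (d,).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # One sequential pass over all vectors: (n, d) @ (d,) -> (n,)
    scores = vectors @ query
    k = min(k, scores.shape[0])
    # argpartition selects the k largest scores without sorting all n of them
    top = np.argpartition(scores, -k)[-k:]
    # Sort only those k candidates, best match first
    top = top[np.argsort(scores[top])[::-1]]
    return top, scores[top]


# Made-up data: 10,000 random unit vectors of dimension 64
rng = np.random.default_rng(0)
vectors = rng.standard_normal((10_000, 64)).astype(np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

# Use a stored vector as the query, so it should be its own best match
query = vectors[123]
idx, scores = top_k_dot(vectors, query, k=50)
```

The whole scan is a single matrix-vector product, which is the kind of predictable, sequential memory access discussed upthread. Timings will depend on the BLAS build numpy is linked against, so measure on your own hardware rather than relying on the numbers quoted here.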

zackangelo1y ago

FWIW, FAISS, although a bit unwieldy, has an optimized full-scan search built in as well.

abhgh1y ago

I strongly advocate this! If you're starting off in this space, check whether this barebones implementation is all you need. You can't beat the accuracy of a full search; in theory you're trading off scalability, but validate whether you actually need the scale at which this tradeoff begins to show. And yes, sbert is great, and it lets you choose [1] between accuracy (MPNet) and speed (MiniLM). There are also multilingual options. And remember, you can also fine-tune MPNet with SetFit. New and interesting embeddings are always being released, e.g., LLM2vec or ModernBERT, so re-assess once in a while whether one of them fits better than what you're using. A good idea would be to keep checking MTEB [2].

[1] https://www.sbert.net/docs/sentence_transformer/pretrained_m...

[2] https://huggingface.co/spaces/mteb/leaderboard

VoVAllen1y ago

Hi, I'm the author of the article. I agree with your point. The model from https://www.mixedbread.ai/blog/mxbai-embed-xsmall-v1 also looks great, though I haven’t had the chance to try it yet.
