Show HN: Boomerang, a new embedding model for RAG and semantic search

(vectara.com)

18 points · TastyLamps · 2y ago · 13 comments

13 comments · 8 top-level

cjcenizal · 2y ago · 2 in thread

I work at Vectara and I'm curious -- are folks here using Retrieval-Augmented Generation (RAG)? What's your stack and what kind of improvements have you seen in answer quality?

Nischalj10 · 2y ago

Pinecone + custom-built ingestion & retrieval for codebase RAG (for code search and understanding)

noobcoder · 2y ago

ClickHouse + custom reranker

Nischalj10 · 2y ago · 2 in thread

how good is it for code RAG applications?

eskibars · 2y ago

So far, we haven't really focused on code ingestion. We've had a few users try it out for that use case, but code ingestion and generation are a bit different. We've found that a lot of users have success in the natural language areas (ingesting enterprise content, ecommerce content, etc.) and then building chatbots on top of the all-in-one API.

Nischalj10 · 2y ago

true, code is very different than natural language. any plans for incorporating it?

awadallah · 2y ago · 1 in thread

How does Boomerang handle the trade-off between speed and accuracy? Does it sacrifice the quality of the results for faster response time? (I know the answer, as I am one of the founders of Vectara; I'm asking this for the benefit of others.)

svcrunch · 2y ago

The metrics presented in the blog post are those of our production model. When designing Boomerang, we tried to balance latency and search relevance in a way that suits most use cases.

On the other hand, GTR-XXL is an example of a research model that biases in favor of search relevance, at the expense of latency. It's not really practical to deploy in production environments as a result.

llm-apprentice · 2y ago

Great to see another company join the LLM-builder tier. Good luck Vectara!

cyndaqu1l · 2y ago

As one of the people behind the development of Boomerang, I'd add that we've tried to be as objective as possible in evaluating our model. We've reported results on datasets where we do better, as well as ones where we do worse, than other commercial offerings and models available on HuggingFace.

awadallah · 2y ago

What are the limitations and challenges of Boomerang in terms of scalability to a large corpus with tens of millions of questions? (I know the answer, as I am one of the founders of Vectara; I'm asking this for the benefit of others.)

K0IN · 2y ago

> Note that while Boomerang is optimized for low-latency performance, models like GTR-XXL, which weighs in at 4.8 billion parameters, are very challenging to productionize.

So what is the size of your model then, or did I miss something?

ofermend · 2y ago

Using Boomerang can significantly improve your end-to-end RAG performance: retrieving the most relevant facts (or chunks) matters, a lot!
