But the holy grail is an LLM that can successfully work over a large corpus of documents and data (Slack history, huge wiki installations) and answer useful questions with proper references.
I tried a few, but they don’t really hit the mark. We need the usability of a simple search engine UI with private data sources.
The approach in the paper has rough edges, but the metrics are bonkers (a double-digit percentage-POINT improvement over dual encoders). The paper was written before the LLM craze, and I'm not aware of any further developments in that area. I think it might be ripe for some breakthrough innovation.
The best option, at least for now, is to just use OpenAI's custom GPTs; with some clever (but not hard) setup it's quite good.
But if all you want is a search engine, that's a bit easier.
The problem is that a huge wiki installation will often contain a lot of outdated data, which is still an issue for an LLM. And if you had already cleaned up the data, you might as well just search for what you need, no?
My questions are:
1 - Even if there is so much data that I can no longer find things, how much text data is needed to train an LLM to work OK? I'm not after an AI that can answer general questions, only one that can answer what I already know exists in the data.
2 - I understand that the more structured the data is, the better, but how important is structure when training an LLM? Does it mostly just figure things out anyway?
3 - Any recommendations on where to start, and on how to run an LLM locally and train it on your own data?
`text_splitter = RecursiveCharacterTextSplitter(chunk_size=8000, chunk_overlap=4000)`
Does this simple numeric chunking approach actually work? Or do more sophisticated splitting rules make a difference?
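For what it's worth, the "recursive" part of that splitter isn't purely numeric: it tries coarse separators first and only hard-cuts when nothing fits. A minimal stdlib-only sketch of that behavior (my own approximation of the LangChain logic, not its actual implementation):

```python
def split_recursive(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Greedy recursive splitter: prefer paragraph breaks, then lines,
    then words; hard-cut at chunk_size only as a last resort."""
    if len(text) <= chunk_size:
        return [text]
    sep = separators[0]
    rest = separators[1:] if len(separators) > 1 else separators
    if sep == "":
        # Last resort: cut at exact character boundaries.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = (current + sep + part) if current else part
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            if len(part) > chunk_size:
                # This piece alone is too big: recurse with finer separators.
                chunks.extend(split_recursive(part, chunk_size, rest))
                current = ""
            else:
                current = part
    if current:
        chunks.append(current)
    return chunks
```

So chunk boundaries mostly land on paragraph or sentence-ish breaks anyway; whether domain-aware rules (headings, code blocks) beat that probably depends on the corpus.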
`vector_store_ppt = FAISS.from_documents(text_chunks_ppt, embeddings)`
So we're embedding all 8000 chars behind a single vector per chunk. I wonder whether some documents hold up better at this granularity than others, to say nothing of missed "prompt expansion" opportunities.
Regarding the index: a mix of BM25 and a vector index usually seems to perform best for most generic data.
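A toy stdlib-only sketch of that hybrid idea: BM25 for the lexical side, a bag-of-words cosine standing in for the embedding side, merged with reciprocal rank fusion (RRF). All names and the toy scoring are my own assumptions, not from any library:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    # Standard Okapi BM25 over whitespace tokens.
    toks = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in toks) / len(toks)
    n = len(docs)
    scores = []
    for t in toks:
        tf = Counter(t)
        s = 0.0
        for q in query.lower().split():
            df = sum(1 for d in toks if q in d)
            if df == 0:
                continue
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            s += idf * tf[q] * (k1 + 1) / (tf[q] + k1 * (1 - b + b * len(t) / avgdl))
        scores.append(s)
    return scores

def cosine_scores(query, docs):
    # Bag-of-words cosine; a real system would use embedding vectors here.
    qv = Counter(query.lower().split())
    out = []
    for d in docs:
        dv = Counter(d.lower().split())
        dot = sum(qv[w] * dv[w] for w in qv)
        norm = (math.sqrt(sum(v * v for v in qv.values()))
                * math.sqrt(sum(v * v for v in dv.values())))
        out.append(dot / norm if norm else 0.0)
    return out

def hybrid_rank(query, docs, k=60):
    # RRF: each ranking contributes 1/(k + rank), so mismatched score
    # scales between BM25 and cosine don't matter.
    fused = [0.0] * len(docs)
    for scores in (bm25_scores(query, docs), cosine_scores(query, docs)):
        ranked = sorted(range(len(docs)), key=lambda i: -scores[i])
        for rank, i in enumerate(ranked):
            fused[i] += 1.0 / (k + rank + 1)
    return sorted(range(len(docs)), key=lambda i: -fused[i])
```

The RRF merge is the part that makes the mix painless: you never have to normalize BM25 scores against cosine similarities, only combine ranks.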
- how much RAM is needed
- what CPU do you need for decent performance
- can it run on a GPU? And if it does, how much VRAM do you need / does it work only on Nvidia?
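For the RAM/VRAM question, a rough back-of-the-envelope rule (my own rule of thumb, not a benchmark): weight memory is roughly parameter count times bytes per weight, plus some overhead for the KV cache and runtime buffers.

```python
def approx_model_memory_gb(n_params_billion, bits_per_weight, overhead=1.2):
    """Very rough estimate of RAM/VRAM needed just to hold the weights,
    with ~20% slack for KV cache and buffers (assumed, varies by runtime)."""
    weight_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9
```

By that estimate a 7B model quantized to 4 bits needs on the order of 4-5 GB, which is why such models are the usual starting point for consumer GPUs.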
Tinkering with building a RAG pipeline over some of my documents now, using the vector stores and chaining multiple calls.
Coming from ChatGPT-4, it was a huge breath of fresh air to not deal with the Judeo-Christian-biased censorship.
I think this is the ideal localllama setup: an uncensored, unbiased, unlimited (only by hardware) LLM+RAG.
https://shelbyjenkins.github.io/blog/retrieval-is-all-you-ne...
I have:
Processor: Ryzen 5 3600
Video card: GeForce GTX 1660 Ti 6 GB GDDR6 (Zotac)
RAM: 16 GB DDR4 2666 MHz
Any recommendations?