Show HN: Khoj – Chat offline with your second brain using Llama 2 (opens in new tab)

(github.com)

565 points1102y ago150 comments

Hi folks, we're Debanjum and Saba. We created Khoj as a hobby project 2+ years ago because: (1) Search on the desktop sucked; we just had keyword search on the desktop vs google for the internet; and (2) Natural language search models had become good and easy to run on consumer hardware by this point.

Once we made Khoj search incremental, I completely stopped using the default incremental search (C-s) in Emacs. Since then Khoj has grown to support more content types, deeper integrations and chat (using ChatGPT). With Llama 2 released last week, chat models are finally good and easy enough to use on consumer hardware for the chat with docs scenario.

Khoj is a desktop application to search and chat with your personal notes, documents and images. It is accessible from within Emacs, Obsidian or your Web browser. It works with org-mode, markdown, pdf, jpeg files and notion, github repositories. It is open-source and can work without internet access (e.g on a plane).

Our chat feature allows you to extract answers and create content from your existing knowledge base. Example: "What was that book Trillian mentioned at Zaphod's birthday last week". We personally use the chat feature regularly to find links, names and addresses (especially on mobile) and collate content across multiple, messy notes. It works online or offline: you can chat without internet using Llama 2 or with internet using GPT3.5+ depending on your requirements.

Our search feature lets you quickly find relevant notes, documents or images using natural language. It does not use the internet. Example: Search for "bought flowers at grocery store" will find notes about "roses at wholefoods".

Quickstart:

  pip install khoj-assistant && khoj

See https://docs.khoj.dev/#/setup for detailed instructions

We also have desktop apps (in beta) at https://github.com/khoj-ai/khoj/releases/tag/0.10.0 if you want to try them out.

Please do try out Khoj and let us know if it works for your use cases? Looking forward to the feedback!

Show HN: Khoj – Chat offline with your second brain using Llama 2

(github.com)

565 points1102y ago150 comments

Quickstart:

  pip install khoj-assistant && khoj

See https://docs.khoj.dev/#/setup for detailed instructions

We also have desktop apps (in beta) at https://github.com/khoj-ai/khoj/releases/tag/0.10.0 if you want to try them out.

Please do try out Khoj and let us know if it works for your use cases? Looking forward to the feedback!

150 comments

112 comments · 35 top-level

agg232y ago· 7 in thread

Just a heads up, your landing page on your website doesn't seem to mention Llama/the offline usecase at all, only online via OpenAI.

----

What model size/particular fine-tuning are you using, and how have you observed it to perform for the usecase? I've only started playing with Llama 2 at 7B and 13B sizes, and I feel they're awfully RAM heavy for consumer machines, though I'm really excited by this possibility.

How is the search implemented? Is it just an embedding and vector DB, plus some additional metadata filtering (the date commands)?

110OP2y ago

Thanks for the pointer, yeah the website content has gone stale. I'll try update it by end of day

Khoj is using the Llama 7B, 4bit quantized, GGML by TheBloke.

It's actually the first offline chat model that gives coherent answers to user queries given notes as context.

And it's interestingly more conversational than GPT3.5+, which is much more formal

jmorgan2y ago

This is a super cool project. Congrats! If you’re looking at trying different models with one API check out an open-source project a few folks and I have been working on in July in case it’s helpful https://github.com/jmorganca/ollama

Llama 2 gives great answers, even the 7B model. There’s an “uncensored” 7B version as well George Sung has fine-tuned for topics that the default Llama2 model won’t discuss - eg I had trouble having Llama2 review authentication/security code or topics: https://huggingface.co/TheBloke/llama2_7b_chat_uncensored-GG...

From just playing around with it the uncensored model still seems to know where to “draw the line” on sensitive topics but YMMV

If you do end up checking out Ollama you can try it with with this command or there’s an API too (it’s not in the docs yet)

  ollama run llama2-uncensored

1 more reply

agg232y ago

Oh interesting, so you're not using Llama 2, you're using the original. Have you begun to evaluate Llama 2 to determine the differences in performance?

How are you determining what notes (or snippets of notes?) to be injected as context? Especially given the small 2048 context limit with Llama 1.

1 more reply

M4v3R2y ago

Why only the 7B version? Would there be possibility to add support for the 13B as well if someone has enough RAM to run it?

1 more reply

moneywoes2y ago

Is a vector db used?

rapnie2y ago

> Just a heads up, your landing page on your website doesn't seem to mention Llama/the offline usecase at all, only online via OpenAI.

I am sufficiently uneducated on the ins and outs of AI integrations to always wonder if projects like this one can be used in local-only mode, i.e. when self-hosted ensuring me that never any of my personal information is sent to a remote service. So it would be very helpful to very explicitly give me that assurance of privacy, if that's the case.

110OP2y ago

Yes, this seems to be a common concern. We're trying to see how best to crisply address it. But yes, Khoj can be used even with your internet turned off

mlajtos2y ago· 6 in thread

Have anyone got something valuable from talking to your second brain? What kind of conversations are you trying to have?

bozhark2y ago

Traumatic Brain Injury. I can’t remember yesterday.

Would be hella nice to connect all the scattered lines of thoughts in various notes on a variety of platforms.

andai2y ago

If you're on mac I would strongly recommend Notational Velocity (or the Alt version), if they still run (I know Apple likes to break compatibility).

I've tried dozens of notetaking apps and that's the only one that truly felt like a second brain.

It's because of the speed. Infuriatingly, Obsidian for example can search just as fast, but they intentionally programmed in a lag after each keystroke... (I know because I removed it.)

3 more replies

andai2y ago

If you're on windows check out TimeSnapper. The classic version is free and works fine.

It screencaps your desktop every 5 sec so you can watch a timelapse of how you spent your day. (Assuming it was on the computer!)

I did find it heavy on the disk usage so I wrote a ffmpeg script to convert it to video (much more efficient).

110OP2y ago

Wow that's an intense use-case. I don't know how but we'd love to be able to support this.

If you can collate your notes into markdown or some such, then messy notes can be handled, at least using Khoj with GPT3.5+.

Do let us know how we can help out and what your current biggest pain-points are?

mlajtos2y ago

I am sorry.

Would some summary of previous day would be helpful to you? Is your memory problem only episodic, or does it extend to factual and kinesthetic as well?

1 more reply

fragmede2y ago

rewind.ai might be able to help

mavsman2y ago· 5 in thread

Really cool to see this! Local is the real future of AI.

I got really excited about this and fired it up on my petite little M2 Macbook Air only for it to grind it to a halt. Think the old days when you had a virus on your PC and you'd move the mouse then wait 45 seconds to see the cursor move. It honestly made me feel nostalgic. I guess I have to taper performance expectations with this Air, though this is the first time it's happened.

quickthrower22y ago

Just wait 10 years when computers have a dedicated AI-PU and you don't have to worry about freezing anything up to talk to your bot.

joaogui12y ago

I wonder if edge tpus (sold here https://coral.ai/products/) could help, though I guess the community would have to optimize for them instead of for standard hardware

kridsdale12y ago

The M1 line of Macs does have that unit.

jmorgan2y ago

How much memory do you have in your Macbook? The 7B models seem to work well with at least 16GB of unified memory, but I’ve seen Macs with 8GB really struggle.

mavsman2y ago

Indeed it's just a poor little 8GB of RAM.

overnight53492y ago· 5 in thread

Could this do something like take in the contents of my web history for the day and summarize notes on what I've been researching?

This is getting very close to my ideal of a personal AI. It's only gonna be a few more years until I can have a digital brain filled with everything I know. I can't wait

110OP2y ago

That would be pretty awesome. Building a daily web history summarizer as a browser extension shouldn't be too much work. I bet there's something like that already out there.

Having something that indexes all your digital travels and makes it easily digestible will be gold. Hopefully Khoj can become that :)

reitanqild2y ago

> I bet there's something like that already out there.

There was.

It was called Google Desktop Search, it was awesome, and it was axed.

That said, today I wouldn't use it anyway as both I and Google have changed a lot.

1 more reply

porcc2y ago

https://www.rewind.ai

aliasxneo2y ago

Have you used this? Looks fairly interesting.

1 more reply

usehackernews2y ago

Interesting, this is the exact question that came to mind for me. This would address a pain point for me.

Does anyone have recommendations for a tool that does it?

Or, anyone want to build it together?

kljuka2y ago· 5 in thread

I tried the search using Slavic language (all my notes are in Slovene) - it performed very poorly: if the searched keyword was not directly in the note itself, the search results seemed to be more or less random.

110OP2y ago

Search should work with Slavic languages including Russian and 50+ other languages.

You'll just need to configure the asymmetric search model khoj uses to paraphrase-multilingual-MiniLM-L12-v2 in your ~/.khoj/khoj.yml config file

See http://docs.khoj.dev/#/advanced?id=search-across-different-l...

110OP2y ago

Khoj chat with Llama2 will not work with non-english languages though. You'll have to enable OpenAI for that

tom9102y ago

Yes, I confirm. I have many articles in Russian language and the search can not find relative information, but if I try to search in English it works fine and can find documents that use English

omniglottal2y ago

So you're saying you got no results when searching for patterns which did not exist in the dataset...?

kljuka2y ago

There was always full list of results. They just weren't relevant.

coder5432y ago· 4 in thread

This seems like a cool project.

It would be awesome if it could also index a directory of PDFs, and if it could do OCR on those PDFs to support indexing scanned documents. Probably outside of the scope of the project for now, but just the other day I was just thinking how nice it would be to have a tool like this.

110OP2y ago

Yeah being able to search and chat with PDF files is quite useful.

Khoj can index directory of PDFs for search and chat. But it does not currently work with scanned PDF files (i.e not with ones without selectable text).

Being able to work with those would be awesome. We just need to get to it. Hopefully soon

adr1an2y ago

Check pdftotext it's a CLI tool (maybe a library too) that makes pdf text selectable. Oh sorry, I meant to say ocrmypdf. But hey, maybe it's worth checking both.

samstave2y ago

Ive wanted a crawler on my machine for auto-categorizing and organizing, tagging and moving ALL my files around based on all my machines - so the ability to crawl PDFs, downloads, screenshots, pictures, etc and give me a logical tree of the org of the files - and allow me to modify it by saying "add all PDF related to [subject] here and the organize by source/author etc... and then move all my screenshots, ordered by date here

etc...

I've wanted a "COMPUTER.", uh... I say "COMPUTER!", 'sir, you have to use the keyboard', ah a Keyboard, how quaint.... forever.

110OP2y ago

That.would.be.awesome! Khoj isn't their yet, but that actually shouldn't be too far away if you give it a voice interface and terminal access.

Of course, having it be stable enough to not `rm -rf /` soon after is definitely not part of the warranty

1 more reply

spdustin2y ago· 4 in thread

I see you’re using gpt4all; do you have a supported way to change the model being used for local inference?

A number of apps that are designed for OpenAI’s completion/chat APIs can simply point to the endpoints served by llama-cpp-python [0], and function in (largely) the same way, while using the various models and quants supported by llama.cpp. That would allow folks to run larger models on the hardware of their choice (including Apple Silicon with Metal acceleration or NVIDIA GPUs) or using other proxies like openrouter.io. I enjoy openrouter.io myself because it supports Anthropic’s 100k models.

[0]: https://github.com/abetlen/llama-cpp-python

syntaxing2y ago

The point of gpt4all is that you can change the model with minimal breaking. You should be able to change this line https://github.com/khoj-ai/khoj/blob/master/src/khoj/process... to the model you want. You'll need to build your own local image with docker-compose but should be relatively straight forward.

sabaimran2y ago

Yeah, the gpt4all project is super neat. If folks are inclined enough, it should be fairly straightforward for you to clone the Khoj project and swap out the model used. You'd have to update the model type in a few places, but should be easy enough just with normal string/keyword search. Then run it directly from inside your machine. You will, however, have to go in and modify the prompt structure to fit the model's expectation. Some guidance on that in this PR with Falcon: https://github.com/khoj-ai/khoj/pull/330/files#diff-7fa11396...

I'll provide my insight from experimentation integrating Llama V2/GPT4All into Khoj -- Falcon 7b is probably the runner up in models that can be supported on consumer hardware, and it really wasn't good enough (for me) on my machine to be useful. The token consumption with personal notes context is too large, and the content too variable for a small model like that to be able to understand it. It's fine if you're just doing normal question-answering back and forth, but you don't need Khoj for that.

110OP2y ago

No, we don't yet. Lots of developer folks want to try different models, we want to provide simple to use, but deep assistance. Kind of unsure what to focus on given our limited resources.

vunderba2y ago

I really like the idea of running a dedicated server that serves up various large language models via a standardized API, and then Khoj could just be pointed at one. Depending on the notes and the type of conversation I want to have, that would even allow for Khoj to swap models on the fly.

1 more reply

wanderingmind2y ago· 4 in thread

Two comments

1. If you want better adoption especially among corporations, GPL-3 wont cut it. Maybe think of some business friendly licenses (MIT etc)

2. I understand the excitement about llm's. But how about making something more accessible to people with regular machines and not state of art. I use rip-grep-all (rga) along with fzf [1] that can search all files including pdfs in a specific folders. However, I would like a GUI tool to

   (a) search across multiple folders, 

   (b) provide priority of results across folders, filetypes and 

   (c) store search histories where I can do a meta-search.

This is sufficient for 95% of my usecases to search locally and I don't need LLM. If khoj can enable such search as default without LLM that will be a gamechanger for many people without a heavy compute machine or who dont want to use OpenAI.

[1] https://github.com/phiresky/ripgrep-all/wiki/fzf-Integration

pvh2y ago

Just a note to suggest that giving away your hard work to those who will profit from it in the hope that they will remember you later seems like a pretty dubious exchange.

Have a look at how that worked out for the folks who built node and its libraries versus the ones who maintained control of their work (like npm).

quickthrower22y ago

What happened there. Surely the people who built node (or the people building the most popular fork at least) get to define what the default package manager is etc. and get some BATNA against the likes of a 3rd party package manager profiting from their thing. I don't know what the node/npm relationship story is though.

Zuiii2y ago

If corporations have no issue with using restrictive proprietary licenses, they should not have any issues with the GPL.

trenchgun2y ago

That seems like a pretty trivial thing to implement. Why not do it yourself?

ramesh312y ago· 4 in thread

Something I've noticed playing around with Llama 7b/13b on my Macbook is that it clearly points out just how little RAM 16GB really is these days. I've had a lot of trouble running both inference and a web UI together locally when browser tabs take up 5GB alone. Hopefully we will see a resurgence of lightweight native UIs for these things that don't hog resources from the model.

sabaimran2y ago

FWIW I've also had browser RAM consumption issues in life, but it's been mitigated by extensions like OneTab: https://chrome.google.com/webstore/detail/onetab/chphlpgkkbo...

For now, local LLMs take up an egregious about of RAM, totally agreed. But we trust the ecosystem is going to keep improving and growing and we'll be able to make improvements over time. They'll probably become efficient enough where we can run them on phones, which will unlock some cool scope for Khoj to integrate with on device, offline assistance.

thenickdude2y ago

The new Chrome "memory saver" feature that discards the contents of old tabs saves a lot of memory for me. Tabs get reloaded from the server if you revisit them.

Kwpolska2y ago

Or hopefully we will see an end of the LLM hype.

Or at least models that don’t hog so much RAM.

ramesh312y ago

>Or at least models that don’t hog so much RAM

The RAM usage is kind of the point though; we're trading space for time. It's not a problem that the model is using it, it's just that with the default choice for UI being web based now, the unnecessary memory usage of browsers is actually starting to be a real pain point.

1 more reply

IshKebab2y ago· 3 in thread

Interesting. The obvious question you haven't answered anywhere (as far as I can see) is what are the hardware requirements to run this locally?

110OP2y ago

Ah, you're right, forgot to mention that. We use the Llama 2 7B 4 bit quantized model. The machine requirements are:

Ideal: 16Gb (GPU) RAM

Less Ideal: 8GB RAM and CPU

danparsonson2y ago

So just to clarify, is that: Ideal is running the model on a GPU (any brand? Nvidia, AMD, etc.?) with 16GB of GPU RAM, less ideal is running it on the CPU, for which it needs 8GB system RAM? Presumably it will occupy all that memory while it's running?

What about if I have a GPU with 8GB?

1 more reply

HexDecOctBin2y ago

Sorry for the repetition, but do you mean 16 GB VRAM? That is a very high requirement, a RTX 4060 only has 8GB and even a RTX 4070 only ships with 12GB. Any upcoming further optimizations for reducing memory usage?

PS. Nice to see an Hindi name for a software. For those who don't speak Hindi: https://en.m.wiktionary.org/wiki/%E0%A4%96%E0%A5%8B%E0%A4%9C...

1 more reply

jigneshdarji912y ago· 3 in thread

This would be even great if available as a Spotlight Search replacement (with some additional features that Spotlight supports).

tough2y ago

Should be easy to plug it in with a Raycast.app or Alfred.app plugin.

110OP2y ago

Khoj exposes a local API. Hopefully that makes it easy to integrate with Raycast, Alfred (or even Spotlight)?

110OP2y ago

Yeah, this would be ideal for Mac users. Just need to look into what is required and how much work it is

IAmNotACellist2y ago· 2 in thread

What's the posthog telemetry used for? Why is there nothing on it in the docs? Why no clear way to opt out?

sabaimran2y ago

Thanks for pointing that out!

We use it for understanding usage -- like determining whether people are using markdown or org or more.

Everything is collected entirely anonymized, and no identifiable information is ever sent to the telemetry server.

To opt-out, you set the `should-log-telemetry` value in `khoj.yml` to false. Updated the docs to include these instructions and what we collect -- https://docs.khoj.dev/#/telemetry.

Kerbonut2y ago

It’s pretty easy to remove which is what I ended up doing. The project works remarkably well otherwise.

mmanfrin2y ago· 2 in thread

As someone who's been getting int o using Obsidian and messing around with chat ais, this is excellent, thank you!

110OP2y ago

Thanks! Do try it out and let us know if it works for your use-case?

tarwin2y ago

Really encourages me to move to Obsidion :D

smcleod2y ago· 2 in thread

I've been playing with Khoj for the past day - it's really neat, well done!

A few observations:

1. Telemetry is enabled by default, and may contain the API and chat queries. I've logged an issue for this along with some suggestions here: https://github.com/khoj-ai/khoj/issues/389

2. It would be advantageous to have configuration in the UI rather than baking it's YAML into the container image. (added a note on that in the aforementioned issue on Github).

3. It's not clear if you can bring your own models, e.g. can I configure a model from huggingface/gpt4all? if so, will it be automatically downloaded based on the name or should I put the .bin (and yaml?) in a volume somewhere?

4. AMD GPU/APU acceleration (CLBLAS) would be really nice, I've logged an issue for this feature request as well. https://github.com/khoj-ai/khoj/issues/390

sabaimran2y ago

Thanks for the feedback! Much appreciated.

I responded in the issue, but I'll paste here as well for those also curious:

Khoj does not collect any search or chat queries. As mentioned in the docs, you can see our telemetry server[1]. If you see anything amiss, point it out to me and I'll hotfix it right away. You can see all the telemetry metadata right here[2].

[1]: https://github.com/khoj-ai/khoj/tree/master/src/telemetry

[2]: https://github.com/khoj-ai/khoj/blob/master/src/khoj/routers...

Configuration with the `docker-compose` setup is a little bit particular, see the issue^ for details.

Thanks for the reference points for GPU integration! Just to clarify, we do use GPU optimization for indexing, but not for local chat with Llama. We're looking into getting that working.

tmzt2y ago

Would it be possible to support a custom URL for the local model, such as running ./server in ggml would give you?

This may be more difficult if you are pre-tokenizing the search context.

Very cool project.

LanternLight832y ago· 2 in thread

It's funny that you mention `C-s`, because `isearch-forward` is usually used for low-latency literal matches. In what workflow can Khoj offer acceptable latency or superior utility as a drop-in replacement for isearch? Is there an example of how you might use it to navigate a document?

110OP2y ago

That's (almost) exactly what khoj search provides a search-as-you-type experience but with a natural language (instead of keyword) search interface.

My workflow looks like: 1. Search with Khoj search[1]: `C-c s s` <search-query> RET 2. Use speed key to jump to relevant entry[2]: with `n n o 2`

[1]: `C-c s` is bound to `khoj` transient menu [2] https://orgmode.org/manual/Speed-Keys.html

Blackthorn2y ago

Can you elaborate on that a little bit? When you search like that, what do you type in to find stuff in a programming language source file? I'd like to better understand this workflow, it seems interesting and I might be missing out.

1 more reply

btbuildem2y ago· 2 in thread

Heads up, docker build fails with:

#12 2.017 ERROR: Could not find a version that satisfies the requirement pyside6>=6.5.1 (from khoj-assistant) (from versions: none)

#12 2.017 ERROR: No matching distribution found for pyside6>=6.5.1

------

executor failed running [/bin/sh -c sed -i 's/dynamic = \["version"\]/version = "0.0.0"/' pyproject.toml && pip install --no-cache-dir .]: exit code: 1

sabaimran2y ago

Darn, I've seen this error a couple of times. Can you drop a couple of details in this Github issue? https://github.com/khoj-ai/khoj/issues/391

I'm particularly interested in your OS/build environment.

btbuildem2y ago

Sure thing! I left a comment.

The "buildx" flag gets me past that one, and to the next error:

#0 12.37 ERROR: Could not find a version that satisfies the requirement gpt4all>=1.0.7 (from khoj-assistant) (from versions: 0.1.5, 0.1.6, 0.1.7)

#0 12.37 ERROR: No matching distribution found for gpt4all>=1.0.7

thangngoc892y ago· 2 in thread

I’m in search of a new Macbook Mx. what is the requirements for running these model locally without breaking the bank? Would 32GB be enough?

110OP2y ago

You do not need to break the bank to use Khoj for local chat, 16Gb RAM should be good enough

DoctorOetker2y ago

How slow would that be on an old non-Apple laptop, but also 16Gb RAM?

lscpu output: Architecture: x86_64

  CPU op-mode(s):        32-bit, 64-bit

  Address sizes:         36 bits physical, 48 bits virtual

  Byte Order:            Little Endian

CPU(s): 8

  On-line CPU(s) list:   0-7

Vendor ID: GenuineIntel

  Model name:            Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz

    CPU family:          6

    Model:               58

    Thread(s) per core:  2

    Core(s) per socket:  4

    Socket(s):           1

    Stepping:            9

    CPU(s) scaling MHz:  35%

    CPU max MHz:         3400.0000

    CPU min MHz:         1200.0000

    BogoMIPS:            4791.90

    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cp

                         uid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr
                         _shadow vnmi flexpriority ept vpid fsgsbase smep erms

xsaveopt dtherm ida arat pln pts md_clear flush_l1d

  L1d:                   128 KiB (4 instances)

  L1i:                   128 KiB (4 instances)

  L2:                    1 MiB (4 instances)

  L3:                    6 MiB (1 instance)

NUMA:

  NUMA node(s):          1

  NUMA node0 CPU(s):     0-7

1 more reply

Hypergraphe2y ago· 2 in thread

Hi, my dream app ! Will it work on non english sources ?

110OP2y ago

To use Chat with non-english sources you'll need to enable OpenAI. Offline chat with Llama 2 can't do that yet.

And Search can be configured to work with 50+ languages.

You'll just need to configure the asymmetric search model khoj uses to paraphrase-multilingual-MiniLM-L12-v2 in your ~/.khoj/khoj.yml config file

For setup details see http://docs.khoj.dev/#/advanced?id=search-across-different-l...

Hypergraphe2y ago

Thank you for your reply. I was thinking of using a translation model to translate all of my documents to english before indexing them.

wg02y ago· 1 in thread

I have not tried it but something like this should exist. I don't think it is going to be as useable on consumer hardware as yet unless you have a good enough GPU but within couple of years (or less), we'll be there I am sure.

Irrelevant opinion - The logo is beautiful, I like it and so are the colours used.

Lastly, LLMA2 for such use cases, I think is capable enough that paying for ChatGPT won't be as lucrative especially when privacy is of concern.

Keep it up. Good craftsmanship. :)

sabaimran2y ago

Thanks! I do think Llama V2 is going to be a good enough replacement for ChatGPT (aka GPT3.5) for a lot of use cases.

RomanHauksson2y ago· 1 in thread

Awesome work, I've been looking for something like this. Any plans to support Logseq in the future?

110OP2y ago

Yes, we hope to get to it soon! This has been an ask on our Github since a while[1]

[1]: https://github.com/khoj-ai/khoj/issues/141

ubertaco2y ago· 1 in thread

Hey, I saw Khoj hit HN a few weeks ago and get slaughtered because the messaging didn't match the product.

You've come a good way in both directions: the messaging is clearer about current state vs aspirations, and you've made good progress towards the aspirational parts.

Really glad to see the warm reception you're getting now. Nice job, y'all.

sabaimran2y ago

Hey ubertaco! I remember you. Appreciate the well-wishes. The landing page still needs some tweaking. It's kind of hard keeping what you're building in sync with what you're aspiring for, but we're definitely working towards it.

bajirao2y ago· 1 in thread

What's the recommended 'size' of the machine to run this?

I tried to run it on a pretty beefy machine (8 core cpu/32 GB RAM) to use with ~40 odd PDF documents. My observation is that the queries (chat) takes forever and also getting Segmentation fault (core dumped) for every other or so query.

110OP2y ago

Thanks for the feedback. Does your machine have a GPU? 32GB CPU RAM should be enough but GPU speeds up response time.

We have fixes for the seg fault[1] and improvement to the query speed[2] that should be released by end of day today[3].

Update khoj to version 0.10.1 with pip install --upgrade khoj-assistant later today to see if that improves your experience.

The number of documents/pages/entries doesn't scale memory utilization as quickly and doesn't affect the search, chat response time as much

[1]: The seg fault would occur when folks sent multiple chat queries at the same time. A lock and some UX improvements fixed that

[2]: The query time improvements are done by increasing batch size, to trade-off increased memory utilization for more speed

[3]: The relevant pull request for reference: https://github.com/khoj-ai/khoj/pull/393

asynchronous2y ago· 1 in thread

This is very cool, the Obsidian integration is a neat feature.

Please, someone make a home-assistant Alexa clone for this.

110OP2y ago

Thanks!

We've just been testing integrating over voice, whatsapp over the last few days[1][2] :)

[1]: https://github.com/khoj-ai/khoj/tree/khoj-chat-over-whatsapp...

[2]: https://github.com/khoj-ai/khoj/compare/master...features/wh...

bozhark2y ago· 1 in thread

I’m not a software dev.

Is there a way to have this bot read from a discord and google drive?

syntaxing2y ago

gpt4all itself (the library on the backend for this) has a similar program [1]. You just need to put everything into a folder. This should be straight forward for google drive. Harder for discord though but I’m sure theres a bot online that can do the extraction.

[1] https://gpt4all.io/index.html

pachico2y ago· 1 in thread

Would anybody be able to recommend any standalone solution (essentially data must not leave elsewhere) to chat with documents with a web interface?

I tried privategpt but results were not great.

110OP2y ago

Khoj provides exactly that; it runs on your machine, none of your data leaves your machine and it has a web interface to chat

Roark662y ago· 1 in thread

From previous answers it appears you're using standard lama-7b (quantized to 4 bits). I suppose you're doing a search on the notes than you pass what you found with the original query to lama. This technique is cool, but there are many limitations. For example lama's content length.

I can't wait for software that will take my notes each day and fine tune a LLM model on them so I can use entire context length for my question/answers.

ankit2192y ago

> I can't wait for software that will take my notes each day and fine tune a LLM model on them so I can use entire context length for my question/answers

Problem is finetuning does not work that way. Finetuning is useful when you want to teach a model about a certain pattern, not when you want it output it right. Eg: With enough finetuning and prompts, a model will be able to output the result in a certain format that you need, but it does not guarantee that it would not be hallucination prone. The best way to minimize hallucination is still embedding based retrieval passed along with the question/prompt.

In future, there can be a system where you can build a knowledge base for LLMs, and tell it to access that for any knowledge, and finetune it for the patterns you want the output in.

yberreby2y ago· 1 in thread

Cool project. I tried it last time this got posted, but it was still a bit buggy. Giving it another shot - I'm mainly interested in the local chat.

Could you elaborate on the incremental search feature? How did you implement it? Don't you need to re-encode the full query through a SBERT or such as each token is written (perhaps with debouncing)?

Also, having an easily-extended data connector interface would be awesome, to connect to custom data sources.

110OP2y ago

Buggy for setup? We've done some improvements and have desktop apps (in beta) too now to simplify this. Feel free to report any issues on the khoj github. I can have a look.

Yes, we don't do optimizations on the query encoding yet. So SBERT just re-encodes the whole query every time. It gets results in <100ms which is good enough for incremental search.

I did create a plugin system, so that a data plugin just has to convert the source data into a standardized intermeditate jsonl format. But this hasn't been documented or extensively tested yet.

throw031720192y ago· 1 in thread

Feedback for landing page: use a fixed height container for the example prompts. Without it, it causes jumping while scrolling down the page making other sections hard to read. iOS Safari

110OP2y ago

Thanks for the feedback! Someone else mentioned this issue the other day as well. I'll fix this issue on the landing page soon

calnayak2y ago· 1 in thread

How does one access this from a web browser?

sabaimran2y ago

We have a cloud product you can sign up for, but it's more limited in what data sources it supports. It currently only works for Notion and Github indexing. If you're interested in that, send me a dm on Discord - https://discord.gg/BDgyabRM6e

But that would allow you to access Khoj from the web.

Bonapara2y ago· 1 in thread

Congrats guys!

110OP2y ago

Thanks! :)

jsdeveloper2y ago· 1 in thread

will it work on linux?(ubuntu)

110OP2y ago

Yes, of course

blinkingled2y ago

Somewhat unrelated but do people have links to share that walk you through taking Llama2 model and feeding it local data - confluence links, Google docs, plain text documents etc. I came across embeddings and langchain but was curious if people had thoughts on better ways to go about it as a newcomer experiment.

marcus_holmes2y ago

Any chance it can look at Gitlab as well? I like the idea but I'm not giving all my work to Microsoft.

Crowberry2y ago

It would be pretty awesome if this could be hooked up into Jira and Confluence as well!

umanwizard2y ago

Markdown doesn't work on HN...

j / k navigate · click thread line to collapse

150 comments

112 comments · 35 top-level

agg232y ago· 7 in thread

Just a heads up, your landing page on your website doesn't seem to mention Llama/the offline usecase at all, only online via OpenAI.

----

How is the search implemented? Is it just an embedding and vector DB, plus some additional metadata filtering (the date commands)?

110OP2y ago

Thanks for the pointer, yeah the website content has gone stale. I'll try update it by end of day

Khoj is using the Llama 7B, 4bit quantized, GGML by TheBloke.

It's actually the first offline chat model that gives coherent answers to user queries given notes as context.

And it's interestingly more conversational than GPT3.5+, which is much more formal

jmorgan2y ago

From just playing around with it the uncensored model still seems to know where to “draw the line” on sensitive topics but YMMV

If you do end up checking out Ollama you can try it with with this command or there’s an API too (it’s not in the docs yet)

  ollama run llama2-uncensored

1 more reply

agg232y ago

Oh interesting, so you're not using Llama 2, you're using the original. Have you begun to evaluate Llama 2 to determine the differences in performance?

How are you determining what notes (or snippets of notes?) to be injected as context? Especially given the small 2048 context limit with Llama 1.

1 more reply

M4v3R2y ago

Why only the 7B version? Would there be possibility to add support for the 13B as well if someone has enough RAM to run it?

1 more reply

moneywoes2y ago

Is a vector db used?

rapnie2y ago

> Just a heads up, your landing page on your website doesn't seem to mention Llama/the offline usecase at all, only online via OpenAI.

110OP2y ago

Yes, this seems to be a common concern. We're trying to see how best to crisply address it. But yes, Khoj can be used even with your internet turned off

mlajtos2y ago· 6 in thread

Have anyone got something valuable from talking to your second brain? What kind of conversations are you trying to have?

bozhark2y ago

Traumatic Brain Injury. I can’t remember yesterday.

Would be hella nice to connect all the scattered lines of thoughts in various notes on a variety of platforms.

andai2y ago

If you're on mac I would strongly recommend Notational Velocity (or the Alt version), if they still run (I know Apple likes to break compatibility).

I've tried dozens of notetaking apps and that's the only one that truly felt like a second brain.

It's because of the speed. Infuriatingly, Obsidian for example can search just as fast, but they intentionally programmed in a lag after each keystroke... (I know because I removed it.)

3 more replies

andai2y ago

If you're on windows check out TimeSnapper. The classic version is free and works fine.

It screencaps your desktop every 5 sec so you can watch a timelapse of how you spent your day. (Assuming it was on the computer!)

I did find it heavy on the disk usage so I wrote a ffmpeg script to convert it to video (much more efficient).

110OP2y ago

Wow that's an intense use-case. I don't know how but we'd love to be able to support this.

If you can collate your notes into markdown or some such, then messy notes can be handled, at least using Khoj with GPT3.5+.

Do let us know how we can help out and what your current biggest pain-points are?

mlajtos2y ago

I am sorry.

Would some summary of previous day would be helpful to you? Is your memory problem only episodic, or does it extend to factual and kinesthetic as well?

1 more reply

fragmede2y ago

rewind.ai might be able to help

mavsman2y ago· 5 in thread

Really cool to see this! Local is the real future of AI.

quickthrower22y ago

Just wait 10 years when computers have a dedicated AI-PU and you don't have to worry about freezing anything up to talk to your bot.

joaogui12y ago

I wonder if edge tpus (sold here https://coral.ai/products/) could help, though I guess the community would have to optimize for them instead of for standard hardware

kridsdale12y ago

The M1 line of Macs does have that unit.

jmorgan2y ago

How much memory do you have in your Macbook? The 7B models seem to work well with at least 16GB of unified memory, but I’ve seen Macs with 8GB really struggle.

mavsman2y ago

Indeed it's just a poor little 8GB of RAM.

overnight53492y ago· 5 in thread

Could this do something like take in the contents of my web history for the day and summarize notes on what I've been researching?

This is getting very close to my ideal of a personal AI. It's only gonna be a few more years until I can have a digital brain filled with everything I know. I can't wait

110OP2y ago

That would be pretty awesome. Building a daily web history summarizer as a browser extension shouldn't be too much work. I bet there's something like that already out there.

Having something that indexes all your digital travels and makes it easily digestible will be gold. Hopefully Khoj can become that :)

reitanqild2y ago

> I bet there's something like that already out there.

There was.

It was called Google Desktop Search, it was awesome, and it was axed.

That said, today I wouldn't use it anyway as both I and Google have changed a lot.

1 more reply

porcc2y ago

https://www.rewind.ai

aliasxneo2y ago

Have you used this? Looks fairly interesting.

1 more reply

usehackernews2y ago

Interesting, this is the exact question that came to mind for me. This would address a pain point for me.

Does anyone have recommendations for a tool that does it?

Or, anyone want to build it together?

kljuka2y ago· 5 in thread

110OP2y ago

Search should work with Slavic languages including Russian and 50+ other languages.

You'll just need to configure the asymmetric search model khoj uses to paraphrase-multilingual-MiniLM-L12-v2 in your ~/.khoj/khoj.yml config file

See http://docs.khoj.dev/#/advanced?id=search-across-different-l...

110OP2y ago

Khoj chat with Llama2 will not work with non-english languages though. You'll have to enable OpenAI for that

tom9102y ago

Yes, I confirm. I have many articles in Russian language and the search can not find relative information, but if I try to search in English it works fine and can find documents that use English

omniglottal2y ago

So you're saying you got no results when searching for patterns which did not exist in the dataset...?

kljuka2y ago

There was always full list of results. They just weren't relevant.

coder5432y ago· 4 in thread

This seems like a cool project.

110OP2y ago

Yeah being able to search and chat with PDF files is quite useful.

Khoj can index directory of PDFs for search and chat. But it does not currently work with scanned PDF files (i.e not with ones without selectable text).

Being able to work with those would be awesome. We just need to get to it. Hopefully soon

adr1an2y ago

Check pdftotext it's a CLI tool (maybe a library too) that makes pdf text selectable. Oh sorry, I meant to say ocrmypdf. But hey, maybe it's worth checking both.

samstave2y ago

etc...

I've wanted a "COMPUTER.", uh... I say "COMPUTER!", 'sir, you have to use the keyboard', ah a Keyboard, how quaint.... forever.

110OP2y ago

That.would.be.awesome! Khoj isn't their yet, but that actually shouldn't be too far away if you give it a voice interface and terminal access.

Of course, having it be stable enough to not `rm -rf /` soon after is definitely not part of the warranty

1 more reply

spdustin2y ago· 4 in thread

I see you’re using gpt4all; do you have a supported way to change the model being used for local inference?

[0]: https://github.com/abetlen/llama-cpp-python

syntaxing2y ago

sabaimran2y ago

110OP2y ago

No, we don't yet. Lots of developer folks want to try different models, we want to provide simple to use, but deep assistance. Kind of unsure what to focus on given our limited resources.

vunderba2y ago

1 more reply

wanderingmind2y ago· 4 in thread

Two comments

1. If you want better adoption especially among corporations, GPL-3 wont cut it. Maybe think of some business friendly licenses (MIT etc)

   (a) search across multiple folders, 

   (b) provide priority of results across folders, filetypes and 

   (c) store search histories where I can do a meta-search.

[1] https://github.com/phiresky/ripgrep-all/wiki/fzf-Integration

pvh2y ago

Just a note to suggest that giving away your hard work to those who will profit from it in the hope that they will remember you later seems like a pretty dubious exchange.

Have a look at how that worked out for the folks who built node and its libraries versus the ones who maintained control of their work (like npm).

quickthrower22y ago

Zuiii2y ago

If corporations have no issue with using restrictive proprietary licenses, they should not have any issues with the GPL.

trenchgun2y ago

That seems like a pretty trivial thing to implement. Why not do it yourself?

ramesh312y ago· 4 in thread

sabaimran2y ago

FWIW I've also had browser RAM consumption issues in life, but it's been mitigated by extensions like OneTab: https://chrome.google.com/webstore/detail/onetab/chphlpgkkbo...

thenickdude2y ago

The new Chrome "memory saver" feature that discards the contents of old tabs saves a lot of memory for me. Tabs get reloaded from the server if you revisit them.

Kwpolska2y ago

Or hopefully we will see an end of the LLM hype.

Or at least models that don’t hog so much RAM.

ramesh312y ago

>Or at least models that don’t hog so much RAM

1 more reply

IshKebab2y ago· 3 in thread

Interesting. The obvious question you haven't answered anywhere (as far as I can see) is what are the hardware requirements to run this locally?

110OP2y ago

Ah, you're right, forgot to mention that. We use the Llama 2 7B 4 bit quantized model. The machine requirements are:

Ideal: 16Gb (GPU) RAM

Less Ideal: 8GB RAM and CPU

danparsonson2y ago

What about if I have a GPU with 8GB?

1 more reply

HexDecOctBin2y ago

PS. Nice to see an Hindi name for a software. For those who don't speak Hindi: https://en.m.wiktionary.org/wiki/%E0%A4%96%E0%A5%8B%E0%A4%9C...

1 more reply

jigneshdarji912y ago· 3 in thread

This would be even great if available as a Spotlight Search replacement (with some additional features that Spotlight supports).

tough2y ago

Should be easy to plug it in with a Raycast.app or Alfred.app plugin.

110OP2y ago

Khoj exposes a local API. Hopefully that makes it easy to integrate with Raycast, Alfred (or even Spotlight)?

110OP2y ago

Yeah, this would be ideal for Mac users. Just need to look into what is required and how much work it is

IAmNotACellist2y ago· 2 in thread

What's the posthog telemetry used for? Why is there nothing on it in the docs? Why no clear way to opt out?

sabaimran2y ago

Thanks for pointing that out!

We use it for understanding usage -- like determining whether people are using markdown or org or more.

Everything is collected entirely anonymized, and no identifiable information is ever sent to the telemetry server.

To opt-out, you set the `should-log-telemetry` value in `khoj.yml` to false. Updated the docs to include these instructions and what we collect -- https://docs.khoj.dev/#/telemetry.

Kerbonut2y ago

It’s pretty easy to remove which is what I ended up doing. The project works remarkably well otherwise.

mmanfrin2y ago· 2 in thread

As someone who's been getting int o using Obsidian and messing around with chat ais, this is excellent, thank you!

110OP2y ago

Thanks! Do try it out and let us know if it works for your use-case?

tarwin2y ago

Really encourages me to move to Obsidion :D

smcleod2y ago· 2 in thread

I've been playing with Khoj for the past day - it's really neat, well done!

A few observations:

1. Telemetry is enabled by default, and may contain the API and chat queries. I've logged an issue for this along with some suggestions here: https://github.com/khoj-ai/khoj/issues/389

2. It would be advantageous to have configuration in the UI rather than baking it's YAML into the container image. (added a note on that in the aforementioned issue on Github).

4. AMD GPU/APU acceleration (CLBLAS) would be really nice, I've logged an issue for this feature request as well. https://github.com/khoj-ai/khoj/issues/390

sabaimran2y ago

Thanks for the feedback! Much appreciated.

I responded in the issue, but I'll paste here as well for those also curious:

[1]: https://github.com/khoj-ai/khoj/tree/master/src/telemetry

[2]: https://github.com/khoj-ai/khoj/blob/master/src/khoj/routers...

Configuration with the `docker-compose` setup is a little bit particular, see the issue^ for details.

Thanks for the reference points for GPU integration! Just to clarify, we do use GPU optimization for indexing, but not for local chat with Llama. We're looking into getting that working.

tmzt2y ago

Would it be possible to support a custom URL for the local model, such as running ./server in ggml would give you?

This may be more difficult if you are pre-tokenizing the search context.

Very cool project.

LanternLight832y ago· 2 in thread

110OP2y ago

That's (almost) exactly what khoj search provides a search-as-you-type experience but with a natural language (instead of keyword) search interface.

My workflow looks like: 1. Search with Khoj search[1]: `C-c s s` <search-query> RET 2. Use speed key to jump to relevant entry[2]: with `n n o 2`

[1]: `C-c s` is bound to `khoj` transient menu [2] https://orgmode.org/manual/Speed-Keys.html

Blackthorn2y ago

1 more reply

btbuildem2y ago· 2 in thread

Heads up, docker build fails with:

#12 2.017 ERROR: Could not find a version that satisfies the requirement pyside6>=6.5.1 (from khoj-assistant) (from versions: none)

#12 2.017 ERROR: No matching distribution found for pyside6>=6.5.1

------

executor failed running [/bin/sh -c sed -i 's/dynamic = \["version"\]/version = "0.0.0"/' pyproject.toml && pip install --no-cache-dir .]: exit code: 1

sabaimran2y ago

Darn, I've seen this error a couple of times. Can you drop a couple of details in this Github issue? https://github.com/khoj-ai/khoj/issues/391

I'm particularly interested in your OS/build environment.

btbuildem2y ago

Sure thing! I left a comment.

The "buildx" flag gets me past that one, and to the next error:

#0 12.37 ERROR: Could not find a version that satisfies the requirement gpt4all>=1.0.7 (from khoj-assistant) (from versions: 0.1.5, 0.1.6, 0.1.7)

#0 12.37 ERROR: No matching distribution found for gpt4all>=1.0.7

thangngoc892y ago· 2 in thread

I’m in search of a new Macbook Mx. what is the requirements for running these model locally without breaking the bank? Would 32GB be enough?

110OP2y ago

You do not need to break the bank to use Khoj for local chat, 16Gb RAM should be good enough

DoctorOetker2y ago

How slow would that be on an old non-Apple laptop, but also 16Gb RAM?

lscpu output: Architecture: x86_64

  CPU op-mode(s):        32-bit, 64-bit

  Address sizes:         36 bits physical, 48 bits virtual

  Byte Order:            Little Endian

CPU(s): 8

  On-line CPU(s) list:   0-7

Vendor ID: GenuineIntel

  Model name:            Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz

    CPU family:          6

    Model:               58

    Thread(s) per core:  2

    Core(s) per socket:  4

    Socket(s):           1

    Stepping:            9

    CPU(s) scaling MHz:  35%

    CPU max MHz:         3400.0000

    CPU min MHz:         1200.0000

    BogoMIPS:            4791.90

    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cp

                         uid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr
                         _shadow vnmi flexpriority ept vpid fsgsbase smep erms

xsaveopt dtherm ida arat pln pts md_clear flush_l1d

  L1d:                   128 KiB (4 instances)

  L1i:                   128 KiB (4 instances)

  L2:                    1 MiB (4 instances)

  L3:                    6 MiB (1 instance)

NUMA:

  NUMA node(s):          1

  NUMA node0 CPU(s):     0-7

1 more reply

Hypergraphe2y ago· 2 in thread

Hi, my dream app ! Will it work on non english sources ?

110OP2y ago

To use Chat with non-english sources you'll need to enable OpenAI. Offline chat with Llama 2 can't do that yet.

And Search can be configured to work with 50+ languages.

You'll just need to configure the asymmetric search model khoj uses to paraphrase-multilingual-MiniLM-L12-v2 in your ~/.khoj/khoj.yml config file

For setup details see http://docs.khoj.dev/#/advanced?id=search-across-different-l...

Hypergraphe2y ago

Thank you for your reply. I was thinking of using a translation model to translate all of my documents to english before indexing them.

wg02y ago· 1 in thread

Irrelevant opinion - The logo is beautiful, I like it and so are the colours used.

Lastly, LLMA2 for such use cases, I think is capable enough that paying for ChatGPT won't be as lucrative especially when privacy is of concern.

Keep it up. Good craftsmanship. :)

sabaimran2y ago

Thanks! I do think Llama V2 is going to be a good enough replacement for ChatGPT (aka GPT3.5) for a lot of use cases.

RomanHauksson2y ago· 1 in thread

Awesome work, I've been looking for something like this. Any plans to support Logseq in the future?

110OP2y ago

Yes, we hope to get to it soon! This has been an ask on our Github since a while[1]

[1]: https://github.com/khoj-ai/khoj/issues/141

ubertaco2y ago· 1 in thread

Hey, I saw Khoj hit HN a few weeks ago and get slaughtered because the messaging didn't match the product.

You've come a good way in both directions: the messaging is clearer about current state vs aspirations, and you've made good progress towards the aspirational parts.

Really glad to see the warm reception you're getting now. Nice job, y'all.

sabaimran2y ago

bajirao2y ago· 1 in thread

What's the recommended 'size' of the machine to run this?

110OP2y ago

Thanks for the feedback. Does your machine have a GPU? 32GB CPU RAM should be enough but GPU speeds up response time.

We have fixes for the seg fault[1] and improvement to the query speed[2] that should be released by end of day today[3].

Update khoj to version 0.10.1 with pip install --upgrade khoj-assistant later today to see if that improves your experience.

The number of documents/pages/entries doesn't scale memory utilization as quickly and doesn't affect the search, chat response time as much

[1]: The seg fault would occur when folks sent multiple chat queries at the same time. A lock and some UX improvements fixed that

[2]: The query time improvements are done by increasing batch size, to trade-off increased memory utilization for more speed

[3]: The relevant pull request for reference: https://github.com/khoj-ai/khoj/pull/393

asynchronous2y ago· 1 in thread

This is very cool, the Obsidian integration is a neat feature.

Please, someone make a home-assistant Alexa clone for this.

110OP2y ago

Thanks!

We've just been testing integrating over voice, whatsapp over the last few days[1][2] :)

[1]: https://github.com/khoj-ai/khoj/tree/khoj-chat-over-whatsapp...

[2]: https://github.com/khoj-ai/khoj/compare/master...features/wh...

bozhark2y ago· 1 in thread

I’m not a software dev.

Is there a way to have this bot read from a discord and google drive?

syntaxing2y ago

[1] https://gpt4all.io/index.html

pachico2y ago· 1 in thread

Would anybody be able to recommend any standalone solution (essentially data must not leave elsewhere) to chat with documents with a web interface?

I tried privategpt but results were not great.

110OP2y ago

Khoj provides exactly that; it runs on your machine, none of your data leaves your machine and it has a web interface to chat

Roark662y ago· 1 in thread

I can't wait for software that will take my notes each day and fine tune a LLM model on them so I can use entire context length for my question/answers.

ankit2192y ago

> I can't wait for software that will take my notes each day and fine tune a LLM model on them so I can use entire context length for my question/answers

In future, there can be a system where you can build a knowledge base for LLMs, and tell it to access that for any knowledge, and finetune it for the patterns you want the output in.

yberreby2y ago· 1 in thread

Cool project. I tried it last time this got posted, but it was still a bit buggy. Giving it another shot - I'm mainly interested in the local chat.

Could you elaborate on the incremental search feature? How did you implement it? Don't you need to re-encode the full query through a SBERT or such as each token is written (perhaps with debouncing)?

Also, having an easily-extended data connector interface would be awesome, to connect to custom data sources.

110OP2y ago

Buggy for setup? We've done some improvements and have desktop apps (in beta) too now to simplify this. Feel free to report any issues on the khoj github. I can have a look.

Yes, we don't do optimizations on the query encoding yet. So SBERT just re-encodes the whole query every time. It gets results in <100ms which is good enough for incremental search.

I did create a plugin system, so that a data plugin just has to convert the source data into a standardized intermeditate jsonl format. But this hasn't been documented or extensively tested yet.

throw031720192y ago· 1 in thread

Feedback for landing page: use a fixed height container for the example prompts. Without it, it causes jumping while scrolling down the page making other sections hard to read. iOS Safari

110OP2y ago

Thanks for the feedback! Someone else mentioned this issue the other day as well. I'll fix this issue on the landing page soon

calnayak2y ago· 1 in thread

How does one access this from a web browser?

sabaimran2y ago

But that would allow you to access Khoj from the web.

Bonapara2y ago· 1 in thread

Congrats guys!

110OP2y ago

Thanks! :)

jsdeveloper2y ago· 1 in thread

will it work on linux?(ubuntu)