What if, instead of a codebase, the files were all your workplace docs? There would be a `Google_Drive` folder, a `Linear` folder, a `Slack` folder, and so on. Over the last week, we put together Craft to test this out.
It’s an interface to a coding agent (OpenCode, for model flexibility) running on a virtual machine with:
1. your company's complete knowledge base represented as directories/files (kept in sync)
2. free rein to write and execute Python/JavaScript
3. the ability to create and render artifacts for the user
Demo: https://www.youtube.com/watch?v=Hvjn76YSIRY
GitHub: https://github.com/onyx-dot-app/onyx/blob/main/web/src/app/c...
It turns out OpenCode does a very good job with docs. Workplace apps also have a natural structure (Slack channels about certain topics, Drive folders for teams, etc.). And since the full metadata of each document can be written to the file, the LLM can define arbitrarily complex filters. At scale, it can write and execute python to extract and filter (and even re-use the verified correct logic later).
Put another way, bash + a file system provides a much more flexible and powerful interface than traditional RAG or MCP, and today’s smarter LLMs can take advantage of it to great effect. This comes in especially handy for aggregation-style questions that require considering thousands of documents (or more).
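To make that concrete, here is a minimal sketch of the kind of throwaway script an agent might write for an aggregation question like "how many outages per backend service?". This is not Craft's actual code: the one-JSON-file-per-document layout, the `type` and `service` metadata fields, and the helper names `iter_docs` and `count_by` are all assumptions for illustration.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Callable, Iterable, Iterator


def iter_docs(root: Path) -> Iterator[dict]:
    """Yield every JSON document found under root (hypothetical layout)."""
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as f:
            doc = json.load(f)
        # Record which top-level folder (e.g. "Slack") the file came from.
        doc["_source_dir"] = path.relative_to(root).parts[0]
        yield doc


def count_by(docs: Iterable[dict], key: str, keep: Callable[[dict], bool]) -> Counter:
    """Count documents that pass `keep`, grouped by the metadata field `key`."""
    return Counter(doc.get(key, "unknown") for doc in docs if keep(doc))
```

With that in place, something like `count_by(iter_docs(Path("knowledge_base")), "service", lambda d: d.get("type") == "outage")` would return per-service counts, and the `keep` predicate can be as complicated as the question needs.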
Naturally, it can also create artifacts that stay up to date based on your company docs. So if you wanted “a dashboard to check realtime what % of outages were caused by each backend service” or simply “slides following XYZ format covering the topic I’m presenting at next week’s dev knowledge sharing session”, it can do that too.
Craft (like the rest of Onyx) is open-source, so if you want to run it locally (or mess around with the implementation) you can.
Quickstart guide: https://docs.onyx.app/deployment/getting_started/quickstart
Or, you can try it on our cloud: https://cloud.onyx.app/auth/signup (all your data goes into an isolated sandbox).
Either way, we’ve set up a “demo” environment that you can play with while your data gets indexed. Really curious to hear what y’all think!
Demo: https://youtu.be/2g4BxTZ9ztg
Two years ago, Yuhong and I had the same recurring problem. We were on growing teams and it was ridiculously difficult to find the right information across our docs, Slack, meeting notes, etc. Existing solutions required sending out our company's data, lacked customization, and frankly didn't work well. So, we started Danswer, an open-source enterprise search project built to be self-hosted and easily customized.
As the project grew, we started seeing an interesting trend—even though we were explicitly a search app, people wanted to use Danswer just to chat with LLMs. We’d hear, “the connectors, indexing, and search are great, but I’m going to start by connecting GPT-4o, Claude Sonnet 4, and Qwen to provide my team with a secure way to use them”.
Many users would add RAG, agents, and custom tools later, but much of the usage stayed ‘basic chat’. We thought: “why would people co-opt an enterprise search tool when other AI chat solutions exist?”
As we continued talking to users, we realized two key points:
(1) just giving a company secure access to an LLM with a great UI and simple tools is a huge part of the value add of AI
(2) providing this well is much harder than you might think and the bar is incredibly high
Consumer products like ChatGPT and Claude already provide a great experience—and chat with AI for work is something (ideally) everyone at the company uses 10+ times per day. People expect the same snappy, simple, and intuitive UX with a full feature set. Getting hundreds of small details right to take the experience from “this works” to “this feels magical” is not easy, and nothing else in the space has managed to do it.
So ~3 months ago we pivoted to Onyx, the open-source chat UI with:
- (Truly) world-class chat UX. Usable both by a fresh college grad who grew up with AI and an industry veteran who’s using AI tools for the first time.
- Support for all the common add-ons: RAG, connectors, web search, custom tools, MCP, assistants, deep research.
- RBAC, SSO, permission syncing, easy on-prem hosting to make it work for larger enterprises.
Through building features like deep research and code interpreter that work across model providers, we've learned a ton of non-obvious things about engineering LLMs that have been key to making Onyx work. I'd like to share two that were particularly interesting (happy to discuss more in the comments).
First, context management is one of the most difficult and important things to get right. We’ve found that LLMs really struggle to remember both system prompts and previous user messages in long conversations. Even simple instructions like “ignore sources of type X” in the system prompt are very often ignored. This is exacerbated by multiple tool calls, which can often feed in huge amounts of context. We solved this problem with a “Reminder” prompt—a short 1-3 sentence blurb injected at the end of the user message that describes the non-negotiables that the LLM must abide by. Empirically, LLMs attend most to the very end of the context window, so this placement gives the highest likelihood of adherence.
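As a rough illustration of the idea (not Onyx's actual implementation), a reminder can be appended to the most recent user turn just before the request is sent. The `role`/`content` message dictionaries follow the common chat-completions shape, and the `with_reminder` helper is a hypothetical name for this sketch.

```python
from typing import Dict, List

Message = Dict[str, str]


def with_reminder(messages: List[Message], reminder: str) -> List[Message]:
    """Return a copy of `messages` with `reminder` appended to the last user turn."""
    # Copy each message so the caller's stored history is left untouched.
    result = [dict(m) for m in messages]
    for message in reversed(result):
        if message["role"] == "user":
            message["content"] = f"{message['content']}\n\n{reminder}"
            break
    return result
```

Working on a copy means the reminder only exists in the outgoing request, so it is never saved into the conversation history or repeated on later turns.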
Second, we’ve needed to build an understanding of the “natural tendencies” of certain models when using tools, and build around them. For example, the GPT family of models is fine-tuned to use a Python code interpreter that operates in a Jupyter notebook. Even when told explicitly, these models refuse to add `print()` around the last line, since in Jupyter the value of the last line is displayed automatically. Other models don’t have this strong preference, so we’ve had to design our model-agnostic code interpreter to also automatically `print()` the last bare line.
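One way to get this behavior is to rewrite the code with Python's standard `ast` module before executing it. The sketch below is an assumption about how it could be done, not the code Onyx ships, and `print_last_expression` is a hypothetical helper name. It needs Python 3.9+ for `ast.unparse`.

```python
import ast


def print_last_expression(source: str) -> str:
    """Wrap a trailing bare expression in print(), roughly mimicking Jupyter."""
    tree = ast.parse(source)
    if not tree.body or not isinstance(tree.body[-1], ast.Expr):
        return source  # empty code, or the last statement is not a bare expression
    last = tree.body[-1]
    # Leave an explicit print(...) alone so we don't also print its None result.
    if (
        isinstance(last.value, ast.Call)
        and isinstance(last.value.func, ast.Name)
        and last.value.func.id == "print"
    ):
        return source
    call = ast.Call(func=ast.Name(id="print", ctx=ast.Load()), args=[last.value], keywords=[])
    tree.body[-1] = ast.Expr(value=call)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

Two caveats with this simple version: `ast.unparse` drops comments and normalizes formatting, and unlike Jupyter it will print `None` when the last line is a call that returns nothing (e.g. `items.append(1)`).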
So far, we’ve had a Fortune 100 team fork Onyx and provide 10k+ employees access to every model within a single interface, and create thousands of use-case specific Assistants for every department, each using the best model for the job. We’ve seen teams operating in sensitive industries completely airgap Onyx w/ locally hosted LLMs to provide a copilot that wouldn’t have been possible otherwise.
If you’d like to try Onyx out, follow https://docs.onyx.app/deployment/getting_started/quickstart to get set up locally w/ Docker in <15 minutes. For our Cloud: https://www.onyx.app/. If there’s anything you'd like to see to make it a no-brainer to replace your ChatGPT Enterprise/Claude Enterprise subscription, we’d love to hear it!
Demo video: https://www.youtube.com/watch?v=6cDiv-TShh4
Code here: https://github.com/danswer-ai/danswer
Setup docs: https://docs.danswer.dev/quickstart
Danswer is self-hostable and comes ready to use out of the box with a UI, admin features, 20+ auto-syncing connectors, and user/access management. All your data is persisted on-prem and you can easily customize it to hook up to any LLM provider of your choice (though we recommend GPT-4 for best results).
As you ask questions, the system runs retrieval as needed to fetch relevant documents and uses GenAI to help you find the answers you’re seeking. You can also select specific documents to chat with at any point and deselect them when you want the system to run more retrievals. You can further refine your search by applying time and source type filters.
With Danswer you can also create document-sets for specific use cases. You can easily group together knowledge for different purposes like onboarding, HR answers, team specific knowledge, finance, etc. You can combine different sources at different granularities such as creating a document-set which is a combination of Slack channels + some set of Google Drive Folders + some workspaces from Confluence, etc. You get the picture.
Finally, just like GPTs, you can configure custom prompts and create different assistants for varied tasks like debugging, summarizing, generating content, all while pulling in the context documents you want to include. Out of the box, Danswer comes with several pre-configured general use assistants.
You can use Danswer to improve operational efficiency for any team that needs it. We’ve seen users use it for onboarding, customer support, cross-team communications, sales-call prep, and as an engineering assistant.
We also have a Slack (https://join.slack.com/t/danswer/shared_invite/zt-1u3h3ke3b-...) and a Discord (https://discord.gg/TDJ59cGV2X) community. Always looking forward to user feedback as this helps us improve our software! See you there :)