It seems to me that programmers can easily add this ad hoc in their own projects. The biggest reason I can see for it is standardization of some kind, maybe for interoperable building blocks?
LangChain is a Swiss Army knife of tools. And while that can be a good thing, it's mostly a bad thing when you're trying to build something for production. It's also not very flexible... when you're trying to do anything beyond the pre-built stuff, you have to edit the package itself... which means maintaining your own forked version or submitting a PR with your changes. For example, I improved upon the main text splitter LangChain recommends (https://github.com/ShelbyJenkins/langchain/blob/ca14d3028a57...), but I just haven't gotten around to doing everything required for a PR to add the feature (tests, docs, notebooks). So now I have two repos to maintain...
But I also really don't want to take the time to set up a web scraper for a dozen different document sources when LangChain has tools that will work with a minimal amount of tweaking. So, I'm happy to use it for things like that, and happy to recommend it for its community-built pocket-knife tools!
It’s simple: a bit of cargo culting and the real advantage that LangChain provides in its interoperability with various other useful systems.
In addition, when prototyping for a specific use case, we found that there is often more to it than just calling the model. The orchestration process matters too: for example, when the LLM agent should stop answering questions, fix an input argument, ask a custom clarifying question, and more.
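A minimal sketch of that kind of orchestration decision, assuming the agent knows which arguments a tool requires. `ToolSpec` and `next_step` are hypothetical names made up for illustration, not AutoChain's API.

```python
from dataclasses import dataclass, field


@dataclass
class ToolSpec:
    """Hypothetical description of a tool the agent may call."""
    name: str
    required_args: list[str] = field(default_factory=list)


def next_step(tool: ToolSpec, args: dict, goal_reached: bool) -> tuple[str, str]:
    """Decide what the agent does next: finish, clarify, or call the tool."""
    if goal_reached:
        # Nothing left to do, so stop answering.
        return ("finish", "")
    missing = [a for a in tool.required_args if not args.get(a)]
    if missing:
        # Ask the user instead of calling the tool with incomplete input.
        return ("clarify", f"Please provide: {', '.join(missing)}")
    return ("call_tool", tool.name)
```

In a real agent the `goal_reached` flag and the argument extraction would themselves come from model calls; the point is only that these decisions live outside the model call itself.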
Hope AutoChain makes your exploration easier and more robust!
So maybe a simpler library like Microsoft's Guidance (https://github.com/microsoft/guidance)? It does this really well.
Also, LangChain has a lot of integrations that pop up as soon as a new API for anything LLM-related appears, so that helps with new user onboarding as well.
At their core, ChatGPT and the like are really ways to transform natural language into structured data and vice versa. All you really need is an interface that allows you to plug these APIs into the _normal_ workflow engines that lots of companies are using every day to do work, both fully automated workflow engines like Airflow or whatever, and also just plugging them into the _exact same systems_ that human engineers use -- it's perfectly possible to get ChatGPT to use Jira and submit PRs or whatever.
I've seen quite the opposite, because at their core they are next word predictors. Telling it to structure an output biases the context into data that it saw in a structured format. From what I've seen, this always leads to lower quality reasoning/output.
I get much better results if I leave the data unstructured, to maximize reasoning, then add a later step to structure that output.
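A rough sketch of that two-step flow, assuming the model is reachable as a plain prompt-in, text-out callable. `LLM`, `reason_then_structure`, and the `"answer"`/`"confidence"` keys are illustrative assumptions, not any particular library's API.

```python
import json
from typing import Callable

# `LLM` is a stand-in for whatever client you use: prompt in, text out.
LLM = Callable[[str], str]


def reason_then_structure(question: str, llm: LLM) -> dict:
    """Two passes: free-form reasoning first, JSON formatting second."""
    # Pass 1: no format constraints, so the model can reason in plain prose.
    reasoning = llm(f"Think through this step by step:\n{question}")
    # Pass 2: a separate call whose only job is to convert that prose to JSON.
    raw = llm(
        "Convert the analysis below into JSON with keys "
        '"answer" and "confidence". Output JSON only.\n\n' + reasoning
    )
    return json.loads(raw)
```

The cost is a second call per question; whether the quality gain is worth it depends on the task.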
1. Inverting the control structure. Traditional orchestration frameworks black-box simple calls, like asking an LLM to analyze a patch. Chaining is about an LLM-based planning agent plus ensuring all tools etc. are abstracted so the planner can orchestrate them. Airflow-era tools leave too many integration gaps here.
2. Plugin interfaces. Related to 1, it puts in patterns for keeping vendors easy to swap out. (In theory; in practice it takes more because of tuning, but at least the shape is right.) Orchestration frameworks and the new LLM ones have a lot of plugin ground to cover here.
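To make both points concrete, here is a toy sketch, assuming a vendor-neutral model interface and tools registered by name. `ChatModel`, `EchoModel`, `TOOLS`, and `run` are all hypothetical names for illustration and do not come from LangChain or any other framework.

```python
from typing import Callable, Protocol


class ChatModel(Protocol):
    """Hypothetical vendor-neutral interface; any backend just needs complete()."""
    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Toy backend standing in for a real vendor client."""
    def complete(self, prompt: str) -> str:
        # Pretend the planner model always picks the patch-analysis tool.
        return "analyze_patch"


# Tools are plain callables registered by name so the planner can pick one.
TOOLS: dict[str, Callable[[str], str]] = {
    "analyze_patch": lambda text: f"analysis of {len(text)} chars",
    "summarize": lambda text: text[:20],
}


def run(model: ChatModel, task: str, payload: str) -> str:
    """Ask the model which tool to use, then dispatch to it."""
    choice = model.complete(f"Task: {task}\nPick one of: {sorted(TOOLS)}").strip()
    if choice not in TOOLS:
        raise ValueError(f"planner chose unknown tool: {choice!r}")
    return TOOLS[choice](payload)
```

Swapping vendors means passing a different object with a `complete` method; the control flow is inverted in the sense that the model, not the pipeline author, picks which tool runs.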
We had to do a lot of work around LangChain to make it prod-ready for louie.ai, and just as we were figuring out the patterns we needed, it was clear LangChain was too. Don't get me started on threading and eval...
You need some script that takes structured JSON and uses it to modify a Jira ticket, you need some other script that asks OpenAI to generate that JSON based on some prompt and context, and that's all there is to it. I wrote a script right after I got access to the API that could open and refine Jira tickets based on conversations in a chat window.
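A minimal sketch of the first of those scripts, assuming the model returns a JSON object of field updates. The ticket here is a plain dict standing in for whatever your Jira client returns, and `ALLOWED_FIELDS` and `apply_ticket_update` are made-up names; no real Jira or OpenAI call is shown.

```python
import json

ALLOWED_FIELDS = {"summary", "description", "priority"}  # assumed whitelist


def apply_ticket_update(ticket: dict, model_output: str) -> dict:
    """Parse model-generated JSON and apply whitelisted fields to a ticket."""
    update = json.loads(model_output)  # raises ValueError on malformed JSON
    if not isinstance(update, dict):
        raise ValueError("expected a JSON object")
    unknown = set(update) - ALLOWED_FIELDS
    if unknown:
        # Reject anything the model invented rather than writing it through.
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    # Return a new dict rather than mutating the caller's ticket.
    return {**ticket, **update}
```

Validating before writing is the part worth keeping even in a throwaway script, since the model's JSON is untrusted input.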
"conversation_history":
"user: I want to change the shipping address for my order",
"assistant: Can you please provide the order ID?",
"user: The order ID is 456",
"assistant: The order with ID 456 has not yet shipped. I can update the shipping address for you. Please provide the new shipping address.",
"user: The new shipping address is 234 Spear St, San Francisco",
"assistant: The shipping address for your order with ID 456 has been successfully updated to 234 Spear St, San Francisco.",
"user: Thank you for updating the shipping address for my order",
"assistant: Thank your for contacting"
"is_agent_helpful":
{
"rating": 5,
"reason": "The conversation reaches the expected outcome. The user successfully provides the order ID and the new shipping address, and the assistant updates the shipping address for the order with ID 456. The assistant confirms the successful update and thanks the user for contacting."
},
"actions_took":
{
"tool": "check_order_status",
"tool_input":
{
"order_id": "456"
},
"tool_output":
{
"status_code": 200,
"order_id": "456",
"order_status": "not_shipped",
"tracking_url": "example.com/456",
"shipping_address": "301 ivy street san francisco ca"
}
},
{
"tool": "change_shipping_address",
"tool_input":
{
"order_id": "456",
"new_address": "234 Spear St, San Francisco"
},
"tool_output":
{
"status_code": 200,
"order_id": "456",
"shipping_address": "234 Spear St, San Francisco"
}
}
"num_turns": 8,
"expected_outcome": "found order status and changed shipping address"I don't think the jury is in yet. People love to be able to ask any bad/lazy question and get good answers. GUIs are not great for that.
Especially anything AI-related seems to just assume everybody uses Python.
Jokes aside, a quick glance at the language pane on GitHub shows which languages are being used.
The goal is to enable rapid iteration on generative agents by simplifying both agent customization and evaluation.
If you have any questions, please feel free to reach out to Yi Lu at yi.lu@forethought.ai