Cost to Solve < Remaining LTV * Profit Margin
In other words, do the details matter? If the customer leaves because you don’t take a fraudulent $10 return, but he’s worth $1,000 in the long term, that’s dumb. You might think that such a user doesn’t exist. Then you’d be getting the details wrong again! Examples: Should ISPs disconnect users for piracy? Should Apple close your iCloud sub for pirating Apple TV? Should Amazon lose accounts over rejected returns? Etc etc.
A business that makes CS more details-oriented is 200% the wrong solution.
There is a whole class of problems that doesn’t require low latency. But without consistency they’re pretty useless.
Frameworks don’t solve that. You’ll probably need some sort of ground-truth injection at every sub-agent level, i.e., you just need data.
Totally agree with you. Unreliability is the thing that needs solving first.
Sounds like management to me.
How does gpt o1 solve this?
I've been using the general agent to build specialised sub-agents. Here's an example search agent beating perplexity: https://x.com/xundecidability/status/1835059091506450493
I'm failing to see the point of the example, unless the agents can do things on multiple threads. For example, let's say we have Boss Agent.
I can ask Boss agent to organize a trip for five people to the Netherlands.
Boss agent can ask some basic questions, about where my friends are traveling from and what our budget is.
Then travel agent can go and look up how we each can get there, hotel agent can search for hotel prices, weather agent can make sure it's nice out, sightseeing agent can suggest things for us to do. And I guess correspondence agent can send out emails to my actual friends.
If this is multi-threaded, you could get a ton of work done much faster. But if it's all running on a single thread anyway, then couldn't boss agent just switch functionality after completing each job?
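For what it's worth, the fan-out described above is easy to sketch with coroutines; the sub-agents below are stubs standing in for real (slow) LLM calls, and all the names are hypothetical:

```python
import asyncio

# Stub sub-agents; in a real system each would be a network call to an LLM.
async def travel_agent(dest: str) -> str:
    return f"flights to {dest}"

async def hotel_agent(dest: str) -> str:
    return f"hotels in {dest}"

async def weather_agent(dest: str) -> str:
    return f"forecast for {dest}"

async def boss_agent(dest: str) -> list[str]:
    # Fan out: the sub-agents run concurrently instead of one after another.
    return await asyncio.gather(
        travel_agent(dest), hotel_agent(dest), weather_agent(dest)
    )

plan = asyncio.run(boss_agent("the Netherlands"))
```

If everything really runs on one thread with no concurrent I/O, the gather buys nothing, which is the parent's point; the win only appears when the sub-agent calls can overlap in time.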
Given that there is (a fairly standard) API to interact with LLMs, the next question is: what abstractions and primitives make it easy to build applications on top of these, while giving enough flexibility for complex use cases?
The features in Langroid have evolved in response to the requirements of various use cases that arose while building applications for clients, or that companies have requested.
Inference speed is being rapidly optimized, especially for edge devices.
> too expensive,
The half-life of OpenAI's API pricing is a couple of months. While the bleeding-edge model is always costly, each capability tier rapidly becomes cheap enough for the public.
> and too unreliable
Out of the 3 points raised, this is probably the most up in the air. Personally I chalk this up to side effects of OpenAI's rapid growth over the last few years. I think this gets solved, especially once price and latency have been figured out.
IMO, the biggest unknown here isn't a technical one, but rather a business one: I don't think it's certain that products built on multi-agent architectures will be addressing a need for end users. Most of the talk I see in this space is by people excited about building with LLMs, not by people who are asking to pay for these products.
I don’t think the tech is ready yet for other reasons, but absence of anyone publishing is not good evidence against.
https://en.wikipedia.org/wiki/Swarm_(simulation)
https://www.santafe.edu/research/results/working-papers/the-...
Fun fact: Swarm was one of the very few non-NeXT/Apple uses of Objective-C. We used the GNU Objective-C runtime. Dynamic typing was a huge help for multiagent programming compared to C++'s static typing and lack of runtime introspection. (Again, nearly 30 years ago. Things are different now.)
I enjoyed using it around 2002, got introduced via Rick Riolo at the University of Michigan Center for the Study of Complex Systems. It was a bit of a gateway drug for me from software into modeling, particularly since I was already doing OS X/Cocoa stuff in Objective-C.
A lot of scientific modelers start with differential equations, but coming from object-oriented software ABMs made a lot more sense to me, and learning both approaches in parallel was really helpful in thinking about scale, dimensionality, representation, etc. in the modeling process, as ODEs and complex ABMs—often pathologically complex—represent end points of a continuum.
Tangentially, in one of Rick's classes we read about perceptrons, and at one point the conversation turned to, hey, would it be possible to just dump all the text of the Internet into a neural net? And here we are.
C++ has added a ton of great features since (especially C++11 onward) but run-time reflection is still sorely missed.
https://youtube.com/playlist?list=PL6zSfYNSRHalAsgIjHHsttpYf...
The idea was to think about it from different directions including academia, industry, and education.
Nobody presented multi agent simulations but I agree with you that is a very interesting way of thinking about things. There was a talk on high dimensional systems modelled with networks but the speaker didn't want their talk published online.
Anyways I'm happy to chat more about these topics. I'm obsessed with understanding complexity using AI, modelling, and other methods.
As-is, it's hard to skim the playlist, and likely terrible for organic search on Google or YouTube <3
> Nobody presented multi agent simulations but I agree with you that is a very interesting way of thinking about things.
To answer your question: I did build a simulation of how a multi-model agent swarm - agents have different capabilities and run times - would impact end-user wait time, based on arbitrary message-passing graphs.
After playing with it for an afternoon I realized I was basically doing a very wasteful Markov chain enumeration algorithm and wrote one up accordingly.
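The direct (non-enumerating) version of that calculation treats the handoff graph as an absorbing Markov chain: each agent's expected time is its own service time plus the probability-weighted times of whoever it hands off to. A minimal sketch, with made-up agents, service times, and handoff probabilities:

```python
# Expected end-to-end latency over a message-passing graph of agents,
# modelled as an absorbing Markov chain. All numbers are illustrative.
# t[a] = service[a] + sum_b P[a][b] * t[b]; any leftover probability
# mass at each agent means "done" (the absorbing state).
service = {"router": 1.0, "worker": 2.0}   # per-agent run time
P = {
    "router": {"worker": 0.5},             # router hands off to worker w.p. 0.5
    "worker": {"router": 0.2},             # worker bounces back w.p. 0.2
}

# Fixed-point (Jacobi) iteration; converges because each agent's total
# handoff probability is strictly below 1.
t = {a: 0.0 for a in service}
for _ in range(200):
    t = {a: service[a] + sum(p * t[b] for b, p in P[a].items()) for a in service}

expected_wait = t["router"]   # requests enter at the router
```

Solving the two equations by hand gives t_router = 20/9 ≈ 2.22, which the iteration converges to; enumerating paths one by one computes the same sum far more wastefully, which matches the parent's realization.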
> Swarm is currently an experimental sample framework intended to explore ergonomic interfaces for multi-agent systems. It is not intended to be used in production, and therefore has no official support. (This also means we will not be reviewing PRs or issues!)
It’s literally not meant to replace anything.
IMO the reason there’s no langchain replacement is because everything langchain does is so darn easy to do yourself, there’s hardly a point in taking on another dependency.
Though griptape.ai also exists.
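To make the "easy to do yourself" claim concrete: the two things most people pull LangChain in for, prompt templating and chaining steps, fit in a handful of lines. `fake_llm`, `template`, and `chain` below are hypothetical stand-ins, not any library's API; swap `fake_llm` for a real API call:

```python
# Stub LLM: any callable str -> str slots in here (e.g. a real API client).
def fake_llm(prompt: str) -> str:
    return f"LLM({prompt})"

def template(tmpl: str):
    """A 'prompt template' is just deferred str.format."""
    def fill(**kwargs) -> str:
        return tmpl.format(**kwargs)
    return fill

def chain(*steps):
    """A 'chain' is just function composition, left to right."""
    def run(x):
        for step in steps:
            x = step(x)
        return x
    return run

summarize_then_translate = chain(
    lambda text: template("Summarize: {text}")(text=text),
    fake_llm,
    lambda s: template("Translate to French: {text}")(text=s),
    fake_llm,
)
out = summarize_then_translate("hello world")
```

Twenty-odd lines, no dependency, fully debuggable with a stack trace — which is roughly the parent's point.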
> Such a shame that there's nothing to replace Langchain with other than writing it all from the ground up yourself.
Check out Microsoft Semantic Kernel: https://github.com/microsoft/semantic-kernel
Supports .NET, Java, and Python. Lots of sample code[0] and support for agents[1], including a detailed guide[2].
We use it at our startup (the .NET version). It was initially quite unstable in the early days because of frequent breaking changes, but it has stabilized (for the most part). Note: the official docs may still be trailing, but the code samples in the repo and unit tests are up to date.
Highly recommended.
[0] https://github.com/microsoft/semantic-kernel/tree/main/pytho...
[1] https://github.com/microsoft/semantic-kernel/tree/main/pytho...
[2] https://github.com/microsoft/semantic-kernel/tree/main/pytho...
Their recent realtime demo had so many race conditions, function calling didn't even work, and the patch suggested by the community hasn't been merged for a week.
https://github.com/openai/openai-realtime-api-beta/issues/14
Not speaking for OpenAI here, only myself — but this is not an official SDK — only a reference implementation. The included relay is only intended as an example. The issues here will certainly be tackled for the production release of the API :).
I’d love to build something more full-featured here and may approach it as a side project. Feel free to ping me directly if you have ideas. @keithwhor on GitHub / X dot com.
Do they use their own product?
https://github.com/langroid/langroid
Among many other things, we have a mature tools implementation, especially tools for orchestration (for addressing messages, controlling task flow, etc) and recently added XML-based tools that are especially useful when you want an LLM to return code via tools -- this is much more reliable than returning code in JSON-based tools.
It's MIT licensed.
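A rough illustration of why JSON is a fragile envelope for LLM-emitted code while XML-style tags are forgiving; the snippet and its names are mine, not Langroid's:

```python
import json
import xml.etree.ElementTree as ET

code = 'print("hi")\nif x > 1:\n    y = "a & b"\n'

# JSON tool call: the model must correctly escape every quote and newline.
payload = json.dumps({"code": code})
# Simulate the model missing a single escape -- a common failure mode:
broken = payload.replace('\\"', '"', 1)
json_failed = False
try:
    json.loads(broken)
except json.JSONDecodeError:
    json_failed = True

# XML-style tool call: the code rides along nearly verbatim; only the
# rare '&' and '<' characters need escaping, quotes and newlines do not.
escaped = code.replace("&", "&amp;").replace("<", "&lt;")
roundtripped = ET.fromstring(f"<code>{escaped}</code>").text
```

One missed escape makes the whole JSON payload unparseable, whereas the XML body survives quotes and newlines untouched, which is presumably why code-in-tags is the more reliable channel.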
“Conretely, let's define a routine to be a list of instructions in natural langauge (which we'll repreesnt with a system prompt), along with the tools necessary to complete them.”
I count 3 in one mini paragraph. Is GPT writing this and being asked to add errors, or is GPT not worth using for their own content?
> Yes, basically. Delete any kyegomez link on sight. He namesquats recent papers for the clout, though the code never actually runs, much less replicates the paper results. We've had problems in /r/mlscaling with people unwittingly linking his garbage - we haven't bothered to set up an Automod rule, though.
[0] https://github.com/princeton-nlp/tree-of-thought-llm/issues/...
What really bothers me is that this kyegomez person wasted the time and energy of so many people, and for what?
The most likely outcome is that if they actually try to pursue this, they lose their "trademark" and the costs drive them out of business.
[1] I didn't misremember https://www.swarm.org/wiki/Swarm:Software_main_page
Bad press is still press XD
> "Swarms: The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework"
Nope, it doesn't mean that at all. You decided, additionally and independently of the other statements, that you don't allow collaboration at all.
Which is fine, but the sentence is still illogical.
The real challenge for at-scale inference is that model compute times are too long to keep normal API connections open, so you need a message-passing system in place. This system also needs to be able to deliver large files for multi-modal models if it's not going to be obsolete in a year or two.
I built a proof of concept using email of all things, but could never get anyone to fund the real deal, which could have run at larger-than-web scale.
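A minimal in-process sketch of that submit-a-ticket, poll-for-results pattern, with a thread and a queue standing in for a real broker and result store (all names hypothetical):

```python
import queue
import threading
import time
import uuid

jobs = queue.Queue()   # inbound requests (stand-in for a message broker)
results = {}           # job_id -> result (stand-in for a result store)

def inference_worker():
    # Stand-in for a slow model call that would outlive an HTTP connection.
    while True:
        job_id, prompt = jobs.get()
        if job_id is None:     # shutdown sentinel
            break
        time.sleep(0.1)        # pretend this takes minutes in production
        results[job_id] = f"completion for: {prompt}"

def submit(prompt: str) -> str:
    job_id = str(uuid.uuid4())
    jobs.put((job_id, prompt))
    return job_id              # client holds a ticket, not an open connection

def poll(job_id: str, timeout: float = 5.0) -> str:
    deadline = time.time() + timeout
    while time.time() < deadline:
        if job_id in results:
            return results[job_id]
        time.sleep(0.05)
    raise TimeoutError(job_id)

threading.Thread(target=inference_worker, daemon=True).start()
jid = submit("summarize this document")
answer = poll(jid)
jobs.put((None, None))         # stop the worker
```

The email proof of concept is the same shape: the message-in/message-out decoupling is what survives long compute times, and large multi-modal payloads would ride the same channel (or a blob store the messages point into).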
An example use with AWS Bedrock: https://temporal.io/blog/amazon-bedrock-with-temporal-rock-s...
But I find this approach works well overall.
Moreover, it is easily debuggable and testable in isolation, which is one of its biggest selling points.
(If anyone is building AI products, feel free to hit me up.)