Cost to Solve < Remaining LTV * Profit Margin
In other words, do the details matter? If the customer leaves because you don’t take a fraudulent $10 return, but he’s worth $1,000 in the long term, that’s dumb. You might think that such a user doesn’t exist. Then you’d be getting the details wrong again! Examples: Should ISPs disconnect users for piracy? Should Apple close your iCloud sub for pirating Apple TV? Should Amazon lose accounts over rejected returns? Etc etc.
A business that makes CS more details-oriented is 200% the wrong solution.
There is a whole class of problems that doesn’t require low latency. But without consistency they’re pretty useless.
Frameworks don’t solve that. You’ll probably need some sort of ground-truth injection at every sub-agent level, i.e., you just need data.
Totally agree with you. Unreliability is the thing that needs solving first.
Sounds like management to me.
How does gpt o1 solve this?
I've been using the general agent to build specialised sub-agents. Here's an example search agent beating perplexity: https://x.com/xundecidability/status/1835059091506450493
I'm failing to see the point of the example, unless the agents can do things on multiple threads. For example, let's say we have Boss Agent.
I can ask Boss agent to organize a trip for five people to the Netherlands.
Boss agent can ask some basic questions, about where my friends are traveling from and what our budget is.
Then travel agent can go and look up how we each can get there, hotel agent can search for hotel prices, weather agent can make sure it's nice out, sightseeing agent can suggest things for us to do. And I guess correspondence agent can send out emails to my actual friends.
If this is multi-threaded, you could get a ton of work done much faster. But if it's all running on a single thread anyway, then couldn't boss agent just switch functionality after completing each job?
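For what it's worth, the fan-out described above is easy to sketch with coroutines; the sub-agents below are stubs standing in for real (slow) LLM calls, and all the names are hypothetical:

```python
import asyncio

# Stub sub-agents; in a real system each would be a network call to an LLM.
async def travel_agent(dest: str) -> str:
    return f"flights to {dest}"

async def hotel_agent(dest: str) -> str:
    return f"hotels in {dest}"

async def weather_agent(dest: str) -> str:
    return f"forecast for {dest}"

async def boss_agent(dest: str) -> list[str]:
    # Fan out: the sub-agents run concurrently instead of one after another.
    return await asyncio.gather(
        travel_agent(dest), hotel_agent(dest), weather_agent(dest)
    )

plan = asyncio.run(boss_agent("the Netherlands"))
```

If everything really runs on one thread with no concurrent I/O, the gather buys nothing, which is the parent's point; the win only appears when the sub-agent calls can overlap in time.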
Given that there is (a fairly standard) API to interact with LLMs, the next question is: what abstractions and primitives make it easy to build applications on top of these, while giving enough flexibility for complex use cases?
The features in Langroid have evolved in response to the requirements of various use cases that arose while building applications for clients, or that companies have requested.
Inference speed is being rapidly optimized, especially for edge devices.
> too expensive,
The half-life of OpenAI's API pricing is a couple of months. While the bleeding-edge model is always costly, each capability tier rapidly becomes cheap enough for the public.
> and too unreliable
Out of the 3 points raised, this is probably the most up in the air. Personally I chalk this up to side effects of OpenAI's rapid growth over the last few years. I think this gets solved, especially once price and latency have been figured out.
IMO, the biggest unknown here isn't a technical one, but rather a business one: I don't think it's certain that products built on multi-agent architectures will be addressing a need for end users. Most of the talk I see in this space is by people excited about building with LLMs, not by people who are asking to pay for these products.
I don’t think the tech is ready yet for other reasons, but absence of anyone publishing is not good evidence against.
https://en.wikipedia.org/wiki/Swarm_(simulation)
https://www.santafe.edu/research/results/working-papers/the-...
Fun fact: Swarm was one of the very few non-NeXT/Apple uses of Objective-C. We used the GNU Objective-C runtime. Dynamic typing was a huge help for multiagent programming compared to C++'s static typing and lack of runtime introspection. (Again, nearly 30 years ago. Things are different now.)
I enjoyed using it around 2002, got introduced via Rick Riolo at the University of Michigan Center for the Study of Complex Systems. It was a bit of a gateway drug for me from software into modeling, particularly since I was already doing OS X/Cocoa stuff in Objective-C.
A lot of scientific modelers start with differential equations, but coming from object-oriented software ABMs made a lot more sense to me, and learning both approaches in parallel was really helpful in thinking about scale, dimensionality, representation, etc. in the modeling process, as ODEs and complex ABMs—often pathologically complex—represent end points of a continuum.
Tangentially, in one of Rick's classes we read about perceptrons, and at one point the conversation turned to, hey, would it be possible to just dump all the text of the Internet into a neural net? And here we are.
C++ has added a ton of great features since (especially C++11 onward) but run-time reflection is still sorely missed.
https://youtube.com/playlist?list=PL6zSfYNSRHalAsgIjHHsttpYf...
The idea was to think about it from different directions including academia, industry, and education.
Nobody presented multi agent simulations but I agree with you that is a very interesting way of thinking about things. There was a talk on high dimensional systems modelled with networks but the speaker didn't want their talk published online.
Anyways I'm happy to chat more about these topics. I'm obsessed with understanding complexity using AI, modelling, and other methods.
As-is, it's hard to skim the playlist, and likely terrible for organic search on Google or YouTube <3
> Nobody presented multi agent simulations but I agree with you that is a very interesting way of thinking about things.
To answer your question: I did build a simulation of how a multi-model agent swarm - agents have different capabilities and run times - would impact end-user wait time, based on arbitrary message-passing graphs.
After playing with it for an afternoon I realized I was basically doing a very wasteful Markov chain enumeration algorithm and wrote one up accordingly.
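The direct (non-enumerating) version of that calculation treats the handoff graph as an absorbing Markov chain: each agent's expected time is its own service time plus the probability-weighted times of whoever it hands off to. A minimal sketch, with made-up agents, service times, and handoff probabilities:

```python
# Expected end-to-end latency over a message-passing graph of agents,
# modelled as an absorbing Markov chain. All numbers are illustrative.
# t[a] = service[a] + sum_b P[a][b] * t[b]; any leftover probability
# mass at each agent means "done" (the absorbing state).
service = {"router": 1.0, "worker": 2.0}   # per-agent run time
P = {
    "router": {"worker": 0.5},             # router hands off to worker w.p. 0.5
    "worker": {"router": 0.2},             # worker bounces back w.p. 0.2
}

# Fixed-point (Jacobi) iteration; converges because each agent's total
# handoff probability is strictly below 1.
t = {a: 0.0 for a in service}
for _ in range(200):
    t = {a: service[a] + sum(p * t[b] for b, p in P[a].items()) for a in service}

expected_wait = t["router"]   # requests enter at the router
```

Solving the two equations by hand gives t_router = 20/9 ≈ 2.22, which the iteration converges to; enumerating paths one by one computes the same sum far more wastefully, which matches the parent's realization.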
> Swarm is currently an experimental sample framework intended to explore ergonomic interfaces for multi-agent systems. It is not intended to be used in production, and therefore has no official support. (This also means we will not be reviewing PRs or issues!)
It’s literally not meant to replace anything.
IMO the reason there’s no langchain replacement is because everything langchain does is so darn easy to do yourself, there’s hardly a point in taking on another dependency.
Though griptape.ai also exists.
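To make the "easy to do yourself" claim concrete: the two things most people pull LangChain in for, prompt templating and chaining steps, fit in a handful of lines. `fake_llm`, `template`, and `chain` below are hypothetical stand-ins, not any library's API; swap `fake_llm` for a real API call:

```python
# Stub LLM: any callable str -> str slots in here (e.g. a real API client).
def fake_llm(prompt: str) -> str:
    return f"LLM({prompt})"

def template(tmpl: str):
    """A 'prompt template' is just deferred str.format."""
    def fill(**kwargs) -> str:
        return tmpl.format(**kwargs)
    return fill

def chain(*steps):
    """A 'chain' is just function composition, left to right."""
    def run(x):
        for step in steps:
            x = step(x)
        return x
    return run

summarize_then_translate = chain(
    lambda text: template("Summarize: {text}")(text=text),
    fake_llm,
    lambda s: template("Translate to French: {text}")(text=s),
    fake_llm,
)
out = summarize_then_translate("hello world")
```

Twenty-odd lines, no dependency, fully debuggable with a stack trace — which is roughly the parent's point.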
> Such a shame that there's nothing to replace Langchain with other than writing it all from the ground up yourself.
Check out Microsoft Semantic Kernel: https://github.com/microsoft/semantic-kernel
Supports .NET, Java, and Python. Lots of sample code[0] and support for agents[1], including a detailed guide[2].
We use it at our startup (the .NET version). It was initially quite unstable in the early days because of frequent breaking changes, but it has stabilized (for the most part). Note: the official docs may still be trailing, but the code samples in the repo and unit tests are up to date.
Highly recommended.
[0] https://github.com/microsoft/semantic-kernel/tree/main/pytho...
[1] https://github.com/microsoft/semantic-kernel/tree/main/pytho...
[2] https://github.com/microsoft/semantic-kernel/tree/main/pytho...
Their recent realtime demo had so many race conditions, function calling didn't even work, and the patch suggested by the community hasn't been merged for a week.
https://github.com/openai/openai-realtime-api-beta/issues/14
Not speaking for OpenAI here, only myself — but this is not an official SDK — only a reference implementation. The included relay is only intended as an example. The issues here will certainly be tackled for the production release of the API :).
I’d love to build something more full-featured here and may approach it as a side project. Feel free to ping me directly if you have ideas. @keithwhor on GitHub / X dot com.
Do they use their own product?
https://github.com/langroid/langroid
Among many other things, we have a mature tools implementation, especially tools for orchestration (for addressing messages, controlling task flow, etc) and recently added XML-based tools that are especially useful when you want an LLM to return code via tools -- this is much more reliable than returning code in JSON-based tools.
It's MIT licensed.
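A rough illustration of why JSON is a fragile envelope for LLM-emitted code while XML-style tags are forgiving; the snippet and its names are mine, not Langroid's:

```python
import json
import xml.etree.ElementTree as ET

code = 'print("hi")\nif x > 1:\n    y = "a & b"\n'

# JSON tool call: the model must correctly escape every quote and newline.
payload = json.dumps({"code": code})
# Simulate the model missing a single escape -- a common failure mode:
broken = payload.replace('\\"', '"', 1)
json_failed = False
try:
    json.loads(broken)
except json.JSONDecodeError:
    json_failed = True

# XML-style tool call: the code rides along nearly verbatim; only the
# rare '&' and '<' characters need escaping, quotes and newlines do not.
escaped = code.replace("&", "&amp;").replace("<", "&lt;")
roundtripped = ET.fromstring(f"<code>{escaped}</code>").text
```

One missed escape makes the whole JSON payload unparseable, whereas the XML body survives quotes and newlines untouched, which is presumably why code-in-tags is the more reliable channel.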
“Conretely, let's define a routine to be a list of instructions in natural langauge (which we'll repreesnt with a system prompt), along with the tools necessary to complete them.”
I count 3 in one mini paragraph. Is GPT writing this and being asked to add errors, or is GPT not worth using for their own content?
> Yes, basically. Delete any kyegomez link on sight. He namesquats recent papers for the clout, though the code never actually runs, much less replicates the paper results. We've had problems in /r/mlscaling with people unwittingly linking his garbage - we haven't bothered to set up an Automod rule, though.
[0] https://github.com/princeton-nlp/tree-of-thought-llm/issues/...
What really bothers me is that this kyegomez person wasted the time and energy of so many people, and for what?
The most likely outcome is that if they actually try to pursue this, they lose their "trademark" and the costs drive them out of business.
[1] I didn't misremember https://www.swarm.org/wiki/Swarm:Software_main_page
Bad press is still press XD
> "Swarms: The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework"
Nope, it doesn't mean that at all. You decided, additionally and independently of the other statements, that you don't allow collaboration at all.
Which is fine, but the sentence is still illogical.
The real challenge for at-scale inference is that model compute times are too long to keep normal API connections open, so you need a message-passing system in place. This system also needs to be able to deliver large files for multi-modal models if it's not going to be obsolete in a year or two.
I built a proof of concept using email of all things, but could never get anyone to fund the real deal, which could have run at larger-than-web scale.
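A minimal in-process sketch of that submit-a-ticket, poll-for-results pattern, with a thread and a queue standing in for a real broker and result store (all names hypothetical):

```python
import queue
import threading
import time
import uuid

jobs = queue.Queue()   # inbound requests (stand-in for a message broker)
results = {}           # job_id -> result (stand-in for a result store)

def inference_worker():
    # Stand-in for a slow model call that would outlive an HTTP connection.
    while True:
        job_id, prompt = jobs.get()
        if job_id is None:     # shutdown sentinel
            break
        time.sleep(0.1)        # pretend this takes minutes in production
        results[job_id] = f"completion for: {prompt}"

def submit(prompt: str) -> str:
    job_id = str(uuid.uuid4())
    jobs.put((job_id, prompt))
    return job_id              # client holds a ticket, not an open connection

def poll(job_id: str, timeout: float = 5.0) -> str:
    deadline = time.time() + timeout
    while time.time() < deadline:
        if job_id in results:
            return results[job_id]
        time.sleep(0.05)
    raise TimeoutError(job_id)

threading.Thread(target=inference_worker, daemon=True).start()
jid = submit("summarize this document")
answer = poll(jid)
jobs.put((None, None))         # stop the worker
```

The email proof of concept is the same shape: the message-in/message-out decoupling is what survives long compute times, and large multi-modal payloads would ride the same channel (or a blob store the messages point into).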
An example use with AWS Bedrock: https://temporal.io/blog/amazon-bedrock-with-temporal-rock-s...
But I find this approach works well overall.
Moreover, it is easily debuggable and testable in isolation, which is one of its biggest selling points.
(If anyone is building AI products, feel free to hit me up.)