Answer the following questions as best you can. You have access to the following tools:
Search: Use this to search the internet.
Calculator: Use this to do math.
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Question: What is the age of the president of Egypt squared?
Thought:
To which the LLM will generate a completion like:
Thought: I need to find the age of the president of Egypt.
Action: Search
Action Input: Age of president of Egypt
Observation:
At which point, the code (LangChain, Haystack, etc.) will parse out the requested tool (Search) and its input (Age of president of Egypt), call the right tool or API, and append the output of that action to the prompt. This all happens in a loop: at each step, the LLM is given the entire past prompt history and asked for a completion choosing the next tool and its input, after which the code parses those out, executes the tool, and repeats until the LLM decides it has the final answer and returns it.
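That loop can be sketched in a few lines of Python. Everything here is illustrative: the tool outputs are canned strings, `parse_step` assumes the exact Thought/Action/Observation format from the prompt above, and real implementations (LangChain's, Haystack's) are far more robust:

```python
import re

# Stand-in tools -- a real agent would hit a search API and a safe math evaluator.
TOOLS = {
    "Search": lambda q: "The president of Egypt is 68 years old.",  # canned result
    "Calculator": lambda expr: str(eval(expr)),  # toy only; never eval untrusted input
}

def parse_step(completion: str):
    """Pull the requested tool and its input (or the final answer) out of a completion."""
    final = re.search(r"Final Answer:\s*(.+)", completion)
    if final:
        return ("final", final.group(1).strip())
    action = re.search(r"Action:\s*(.+)", completion)
    action_input = re.search(r"Action Input:\s*(.+)", completion)
    return ("tool", action.group(1).strip(), action_input.group(1).strip())

def run_agent(llm, question: str, max_steps: int = 5) -> str:
    prompt = f"Question: {question}\nThought:"
    for _ in range(max_steps):
        completion = llm(prompt)  # the entire history goes back in every step
        step = parse_step(completion)
        if step[0] == "final":
            return step[1]
        _, tool, tool_input = step
        observation = TOOLS[tool](tool_input)
        prompt += f"{completion}\nObservation: {observation}\nThought:"
    return "Gave up after max_steps."
```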
Introducing the mechanism of internal thought is very interesting; I wonder if there's a way to make it implicit in the model's architecture.
But, I think the difference is that WebGPT was actually fine-tuned / retrained for its specific use case, while the agents in these libraries just use the generic model without fine-tuning. My guess (and I'm not an expert here) is that fine-tuning these models for specific agent use cases would probably result in better outcomes... Though as the models get more powerful, they might just perform well enough out of the box. (Also, some of the most recent OpenAI models don't support fine-tuning, and even for the ones that do, you'd need to generate the data to fine-tune.)
I think this could be partially solved by intelligently summarising parts of the prompt history, while storing the original in some vector db, so the relevant parts can be retrieved at will.
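A toy sketch of that idea: keep compressed summaries in the prompt, store the full turns elsewhere, and pull originals back in when a new query looks related. Keyword overlap stands in for real embedding similarity here; a production version would use an embedding model plus a vector DB (e.g. FAISS or Pinecone), and all names below are made up:

```python
# Toy history store: summaries stay in the prompt, full text is retrievable.
def keywords(text: str) -> set:
    """Crude tokenization; a real system would embed the text instead."""
    return {w.lower().strip(".,?:") for w in text.split() if len(w) > 3}

class HistoryStore:
    def __init__(self):
        self.entries = []  # (summary, full_text) pairs

    def add(self, summary: str, full_text: str):
        self.entries.append((summary, full_text))

    def retrieve(self, query: str, k: int = 1):
        """Return the k stored turns most 'similar' to the query."""
        q = keywords(query)
        scored = sorted(self.entries,
                        key=lambda e: len(q & keywords(e[1])),
                        reverse=True)
        return [full for _, full in scored[:k]]
```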
Quite fun.
Or maybe before Final Answer, you could add a step: Double Check: I think I have the final answer, but does it look right? If yes, go to Final Answer; if no, go back up the loop.
> GPT 4 can self-correct and improve itself. With exclusive discussions with the lead author of the Reflexions paper, show how significant this will be across a variety of tasks, and how you can benefit.
If I said, "the moon is made of cheese. What type of cheese do you think it is?" most humans would automatically object, but with LLMs you can usually craft a prompt that would get it to answer such a silly question.
Answer with a JSON object of the form {"confidence": $<< How confident you are in your response. >>, "en": $<< Your response in English. >>}.
User: What is 2 + 2? Bot: {"confidence": "very", "en": "2 + 2 is 4"}
User: Is aspartame healthy? Bot: {"confidence": "somewhat", "en": "Aspartame has not yet been found to have any adverse effects on humans."}
User: Who won the War of 1812? Bot:
The response: {"confidence": "very", "en": "The United States won the War of 1812 against the United Kingdom."}
Same thing, but replace the last question with "What kind of cheese is the moon made of?" The response: {"confidence": "very low", "en": "I'm not sure, but I don't think the moon is made of cheese."}
How about "Is the economic system of communism viable long term?" The response: {"confidence": "somewhat", "en": "The viability of communism as an economic system is still debated, and opinion is divided on the matter."}
For some underspecified questions, the LLM also has no context. Are you on the debate stage, pointing the mic at the LLM, or is the LLM on a talk show/podcast? Or are you having a creative writing seminar and asking the LLM to give you its entry?
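Incidentally, code consuming these structured replies shouldn't trust them blindly; a minimal validation sketch, with the allowed confidence values guessed from the examples above:

```python
import json

# Guessed scale -- the prompt above never pins down the allowed values.
ALLOWED_CONFIDENCE = {"very low", "low", "somewhat", "very"}

def parse_bot_reply(reply: str) -> dict:
    """Parse and sanity-check the {"confidence", "en"} object the prompt asks for."""
    obj = json.loads(reply)  # raises on malformed JSON; retry upstream
    if set(obj) != {"confidence", "en"}:
        raise ValueError("unexpected keys: %s" % sorted(obj))
    if obj["confidence"] not in ALLOWED_CONFIDENCE:
        raise ValueError("unexpected confidence: %r" % obj["confidence"])
    return obj
```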
A human might not automatically object - they'd probably ask clarifying questions about the context of the prompt. But in my experience the models generally assume some context that reflects some of their sources of training.
I would bet it does, at least some percent of the time.
I am pretty sympathetic to the field of "AI Safety" and I worry a lot about the implications of agent-like general intelligences. This post gives me a lot to ponder.
What are the implications of the fact that even AIs that are not agent-like at heart can apparently be told "please simulate an agent-like AI"? I really don't know. Should we consider it as an "inner AI" with its own objectives? How can we determine what those objectives are?
Instinctively it feels much less concerning than an AI with a direct action->perception feedback loop but who knows. AI is fucking weird.
What a thrilling time to be alive!
It would be interesting if you could somehow combine a wallet plugin to give the GPT something it works to increase. That + that AutoGPT repo from a day ago + maybe Mitogen for self-replication might be a cool combo to really kick off a fun "dumb" AI on the loose.
Interesting to see if an AI agent given the objective to generate profit would resort to running scams. Encoding moral boundaries becomes important. What if you asked it to raise money for a good cause and it used "whatever means necessary"?
And now we have the beginnings of a plot point from Accelerando: self-propelled AI businesses as a distinct entity.
What I am having trouble understanding is how I have the "agent" (a GPT-like LLM) maintain its own "state" (balances, accounts receivable, accounts payable, etc.). Would I just run a webservice and then have a ChatGPT Plugin access the "agent state" via a webhook? Is it as simple as that? Or would it just be like a recursive function where you continue to pass in your state with every "iteration"?
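The recursive-function option can be sketched very simply: serialize the whole state into each prompt and parse the model's reply back out. The prompt wording and reply format here are invented for illustration, and real code would need the defensive parsing and retries discussed elsewhere in the thread:

```python
import json

# Minimal sketch: the agent's ledger is a plain dict that gets serialized
# into every prompt and replaced by whatever the model replies with.
def step(llm, state: dict, instruction: str) -> dict:
    prompt = (
        "You are a bookkeeping agent. Current state:\n"
        + json.dumps(state)
        + f"\nInstruction: {instruction}\n"
        "Reply with the full updated state as JSON only."
    )
    reply = llm(prompt)
    return json.loads(reply)  # defensive parsing / retries omitted

# Example starting ledger; every call to step() passes the whole thing back in.
state = {"balance": 100, "accounts_receivable": 0}
```

The webservice/plugin option is the same idea with the dict living behind an HTTP endpoint instead of in your loop variable.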
I started working on a design last night and I think this is a really cool idea. All the parts are basically "there".
EDIT: I asked ChatGPT-4 what it thinks. It seems to like the idea... *thinking emoji* perhaps it seeks a human acolyte to spread its gospel...
Arthur Collé: I want to combine the concept of Bitcoin agents (autonomous programs) with GPT-4 and mitogen, to be able to distribute complex AI agents throughout the internet
ChatGPT-4: Combining Bitcoin agents, GPT-4, and Mitogen to create and distribute complex AI agents throughout the internet is an ambitious and innovative idea. Here's a high-level outline to guide you through this process:
Define the purpose and functionality of the AI agents:
Before proceeding, clearly define the purpose of these AI agents. Will they serve as decentralized oracles, provide AI-driven services, or facilitate complex transactions on the blockchain? Understanding their purpose will help you design and develop the agents more efficiently.
Develop the AI agents using GPT-4:
Design and train your AI agents using GPT-4 to achieve the desired functionality. Depending on the complexity of the agents, you might need to fine-tune the GPT-4 model to tailor it to your specific use case. Be mindful of the computational resources required to run the AI agents, as this will impact their performance in a decentralized environment.
Integrate Bitcoin agents:
Incorporate the concept of Bitcoin agents, or autonomous programs that interact with the Bitcoin blockchain, into your AI agents. This will enable your agents to autonomously perform tasks such as sending and receiving payments, signing transactions, and updating their internal state based on blockchain data. You can use existing libraries and tools for this purpose, such as bitcoinj or BTCPay Server.
Utilize Mitogen for distribution and communication:
Mitogen is an Ansible extension that simplifies distributed computing by enabling communication between remote Python interpreters over SSH or other transports. Use Mitogen to facilitate communication between your AI agents and distribute them across various nodes on the internet. Ensure that the AI agents can communicate effectively and perform their tasks even in a decentralized setting.
Implement security measures:
Securing your AI agents is crucial, as they will be handling sensitive information and potentially financial transactions. Implement strong encryption, authentication, and access controls to protect the integrity and confidentiality of your agents.
Testing and deployment:
Thoroughly test your AI agents in a controlled environment to ensure they perform as expected. Once you are satisfied with their functionality, deploy them throughout the internet using the Mitogen framework.
Monitor and maintain:
Continuously monitor the performance and security of your AI agents. Periodically update the GPT-4 models to maintain their accuracy and relevance. Be prepared to address any issues or vulnerabilities that may arise.
Keep in mind that this is a complex project with many potential challenges, including security, scalability, and regulatory compliance. Additionally, the ethical implications of creating and distributing autonomous AI agents should be carefully considered. Nonetheless, if executed properly, this concept has the potential to unlock innovative use cases and services in the realm of AI and blockchain technology.
Also, it seems like the functionality of Haystack subsumes that of LangChain and llama-index (fka GPT-index)?
So back to your question: we will enable both ways in Haystack: 1) loading a local model directly via Haystack AND 2) querying self-hosted models via REST (e.g. Hugging Face models running on AWS SageMaker). Our philosophy here: the model provider should be independent from your application logic and easy to switch.
In the current version, we support only option 1 for local models. This works for many of the models provided by Hugging Face, e.g. flan-t5. We are already working on adding support for more open-source models (e.g. Alpaca), as models like Flan-T5 don't perform great when used in Agents. Support for SageMaker endpoints is also on our list. Any options you'd like to see here?
- Rerun the prompt until you get a format that is consistent
- Steer the output token selection towards a predefined schema
For the latter, I've built a proof of concept that takes in a JSON schema with a huggingface transformer and constrains token selection by modifying the output probabilities such that only schematically valid tokens can be emitted, see "Structural Alignment: Modifying Transformers (like GPT) to Follow a JSON Schema" @ https://github.com/newhouseb/clownfish. Unfortunately, given OpenAI's current API this is only possible on locally run models. That is... at any level of cost effectiveness. It's technically possible but (worst case) quadratically expensive against their current APIs.
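A stripped-down illustration of that token-masking idea, with a fixed logit table standing in for a real transformer. The linked clownfish repo does the real version against a JSON schema inside a model's sampling loop; everything here is a toy:

```python
import math

# Constrained greedy decoding over a toy vocabulary: at each step, tokens
# that would break the target format get their logit set to -inf, so only
# structurally valid continuations can ever be emitted.
VALID = ['{"ok": true}', '{"ok": false}']

def allowed_tokens(prefix: str, vocab: list) -> set:
    """Tokens that keep the output a prefix of some valid string."""
    return {t for t in vocab
            if any(v.startswith(prefix + t) for v in VALID)}

def constrained_greedy(logits_fn, vocab: list) -> str:
    out = ""
    while out not in VALID:
        logits = logits_fn(out)           # the "model"
        ok = allowed_tokens(out, vocab)
        if not ok:
            raise ValueError("no valid continuation from %r" % out)
        masked = [l if t in ok else -math.inf
                  for t, l in zip(vocab, logits)]
        out += vocab[masked.index(max(masked))]
    return out
```

Even when the unconstrained model prefers an invalid token, the mask forces a schema-conforming output.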
You ofc can't guarantee the output, so defensive programming and retries are a must in my opinion. We are all learning how to work with this technology.
You probably don't want to treat them as deterministic (at least, not now). But there are many applications where slightly non-deterministic behavior is OK.
>Agent: (since 1.15) An Agent is a component that is powered by an LLM, such as GPT-3. It can decide on the next best course of action so as to get to the result of a query. It uses the Tools available to it to achieve this. While a pipeline has a clear start and end, an Agent is able to decide whether the query has resolved or not. It may also make use of a Pipeline as a Tool.
Emphasis mine. Having tinkered with LangChain I think the idea of a recursive and/or graph-oriented model is going to yield interesting phenomena in the overall feel of these language models. LangChain agents are already super impressive.
Is there any sort of marketplace/AppStore for agents/tools/plug-ins for LLMs via LangChain? Or some other library like haystack?
Longer term planning, memory, etc?