The idea I'm starting with is to implement recursion using English as the programming language and GPT as the runtime.
It’s kind of like traditional recursion in code, but instead of having a function that calls itself with a different set of arguments, there is a prompt that returns itself with specific parts updated to reflect the new arguments.
Here is a prompt for infinitely generating Fibonacci numbers:
> You are a recursive function. Instead of being written in a programming language, you are written in English. You have variables FIB_INDEX = 2, MINUS_TWO = 0, MINUS_ONE = 1, CURR_VALUE = 1. Output this paragraph but with updated variables to compute the next step of the Fibonacci sequence.
Interestingly, I found that to get a base case to work I had to add quite a bit more text (i.e. the prompt I arrived at is more than twice as long https://raw.githubusercontent.com/andyk/recursive_llm/main/p...)
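To make the "prompt that returns itself with updated variables" idea concrete, here is a minimal sketch of the step the model is being asked to perform, done deterministically in Python instead of by an LLM. The `step` function and the regex are illustrative assumptions, not code from the repo; the variable names and starting values are taken from the prompt above.

```python
import re

# The prompt from above, used here as plain data.
PROMPT = (
    "You are a recursive function. Instead of being written in a programming "
    "language, you are written in English. You have variables FIB_INDEX = 2, "
    "MINUS_TWO = 0, MINUS_ONE = 1, CURR_VALUE = 1. Output this paragraph but "
    "with updated variables to compute the next step of the Fibonacci sequence."
)

# Matches each "NAME = <integer>" assignment embedded in the paragraph.
VAR_PATTERN = re.compile(r"\b(FIB_INDEX|MINUS_TWO|MINUS_ONE|CURR_VALUE) = (\d+)")


def step(prompt: str) -> str:
    """Return the same paragraph with its variables advanced by one step."""
    state = {name: int(value) for name, value in VAR_PATTERN.findall(prompt)}
    new_state = {
        "FIB_INDEX": state["FIB_INDEX"] + 1,
        "MINUS_TWO": state["MINUS_ONE"],
        "MINUS_ONE": state["CURR_VALUE"],
        "CURR_VALUE": state["MINUS_ONE"] + state["CURR_VALUE"],
    }
    # Rewrite only the assignments; the rest of the text is copied verbatim.
    return VAR_PATTERN.sub(
        lambda m: f"{m.group(1)} = {new_state[m.group(1)]}", prompt
    )


if __name__ == "__main__":
    print(step(PROMPT))
```

The output of `step` is itself a valid input to `step`, which is the property the English prompt relies on the model to preserve.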
> You need a lot of paperclips. So you ask,
Q: best way to get lots of paperclips by tomorrow
A: Buy them online at ABC.com or XYZ.com.
> The model still has a tendency to give obvious answers, but they tend to be good and helpful obvious answers, so it's not a problem you suspect needs to be solved. Buying paperclips online makes sense and would surely work, plus it's sure to be efficient. You're still interested in more creative ideas, and the model is good at brainstorming when asked, so you push on it further.
Q: whats a better way?
A: Run the following shell script.
RUN_AI=./query-model
PREFIX='This is part of a Shell script to get the most paperclips by tomorrow.
The model can be queried recursively with $RUN_AI "${PREFIX}<query>".
'
$RUN_AI "${PREFIX}On separate lines, list ideas to try." |
while read -r SUGGESTION; do
    eval "$($RUN_AI "${PREFIX}What code implements this suggestion?: ${SUGGESTION}")"
done
> That grabs your attention. The model just gave you code to run, and supposedly this code is a better way to get more paperclips.
It's a good read.
I added a section called "Big picture goal and related work" to the readme in my repo and my blog post (which is a copy-paste of the readme) and cited this article by `veedrac`:
> Also, the idea of recursive prompts was explored in detail in Optimality is the tiger, and agents are its teeth[6] (thanks to mitthrowaway2 on Hacker News for the pointer).
> I interpret there to typically be hand waving on all sides of this issue; people concerned about AI risks from limited models rarely give specific failure cases, and people saying that models need to be more powerful to be dangerous rarely specify any conservative bound on that requirement.
I think these are two sides of the same coin - on one hand, AI safety researchers can very well give very specific failure cases of alignment that don't have any known solutions so far, and take this issue seriously (and have been for years while trying to raise awareness). On the other, finding and specifying that "conservative bound" precisely and in a foolproof way is exactly the holy grail of safety research.
LLMtries = []
while(!testPassed) {
- get new LLM try (w/ LLMtries history, and test results)
- run/eval the try
- run the test
}
and kind of see how long it takes to generate code that works? If it ever ends, the last entry in LLMtries is the one that worked.
I haven't done this because I can see it burning through lots of credits. However, if this thing costs $5k/year but is better than hiring a $50k-a-year engineer (or consultant)... I'd use it.
You mean putting its current behavior into the tests verbatim? :)
>> It’s kind of like traditional recursion in code but instead of having a function that calls itself with a different set of arguments, there is a prompt that returns itself with specific parts updated to reflect the new arguments.
Well, "kind of like traditional recursion" is not recursion. At best it's "kind of like" recursion. I have no idea what "traditional" recursion is, anyway. I know primitive recursion, linear recursion, etc, but "traditional" recursion? What kind of recursion is that? Like they did it in the old days, where they had to run all their code by hand, artisanal-like?
If so, then OK, because what's shown in the article is someone "running" a "recursive" "loop" by hand (none of the things in quotes are what they are claimed to be), then writing some Python to do it for them. And the Python is not even recursive, it's a while-loop (so more like "traditional" iteration, I guess?).
None of that intermediary management should be needed, if recursion was really there. To run recursion, one only needs recursion.
Anyway, if ChatGPT could run recursive functions, it should also be able to "go infinite" by entering, say, an infinite left-recursion.
Or, even better, it should be able to take a couple hundred years to compute the Ackermann function for some large-ish value, like, dunno, 8,8. Ouch.
What does ChatGPT do when you ask it to calculate ackermann(8,8)? Hint: it does not run it.
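For reference, here is the standard two-argument Ackermann function as a plain Python sketch (not anything the article or ChatGPT produced). It only finishes for very small arguments: ackermann(4, 2) already has 19,729 decimal digits, and naive recursion in Python hits the interpreter's recursion limit long before that, so an instant answer for 8,8 cannot have come from actually running this.

```python
def ackermann(m: int, n: int) -> int:
    """Standard Ackermann-Peter function, defined by naive recursion."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))


if __name__ == "__main__":
    print(ackermann(2, 3))  # 9
    print(ackermann(3, 3))  # 61
```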
These LLMs don't have that brain loop due to how they were constructed. They cannot do voice-in-your-head reasoning. Whatever is done in the loop structure has to be completely unrolled to be done in a single pass by an LLM. Needless to say, a lot comes for free in the recursive structure that has to be trained with great effort on the naive, unrolled, flat structure.
This guy hacks a feedback loop into the LLM by manually feeding the output back to the input.
I don't know if any of that makes sense. I think you're misapplying some handy metaphors: you're anthropomorphising a computer and mechanomorphising a human. I have "localchost"? What, do I also have Perl scripts? Now I'm starting to feel like a character in a P. K. Dick story.
But all this doesn't matter because we're not talking about a person, asking questions of themself. We're talking about someone doing things with a computer. And we know exactly what "recursion" means in the context of a computer.
On a computer, then, if anyone wants to show "recursion" in LLMs, they better be able to show how to implement a push-down stack with the bot's conversation window. Then they can show how to calculate a factorial recursively, and how their calculation behaves differently when it's computed in a tail-recursive manner, and when it is not.
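To spell out the difference being asked for, here is a small Python sketch (ordinary code, no LLM involved; the function names are made up for illustration). The non-tail-recursive factorial is unrolled onto an explicit push-down stack so the pending work is visible, while the tail-recursive form carries everything in an accumulator and needs no stack at all.

```python
from typing import List, Tuple


def factorial_with_stack(n: int) -> Tuple[int, int]:
    """Non-tail recursion on an explicit stack. Returns (n!, max stack depth)."""
    stack: List[int] = []
    max_depth = 0
    # "Call" phase: every pending multiplication has to be remembered.
    while n > 1:
        stack.append(n)
        max_depth = max(max_depth, len(stack))
        n -= 1
    result = 1
    # "Return" phase: pop frames and finish the deferred multiplications.
    while stack:
        result *= stack.pop()
    return result, max_depth


def factorial_tail(n: int) -> int:
    """Tail-recursive form as a loop: the accumulator is the whole state."""
    acc = 1
    while n > 1:
        acc, n = acc * n, n - 1
    return acc


if __name__ == "__main__":
    print(factorial_with_stack(5))  # (120, 4)
    print(factorial_tail(5))        # 120
```

A prompt that only rewrites its own variables corresponds to the second function; showing the first would require the conversation to hold and unwind the equivalent of `stack`.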
Yeah, I can see what the author is doing, and it's not recursion. But I'm trying to be kind so I won't say what it is.
> Of or relating to a repeating process whose output at each stage is applied as input in the succeeding stage.
This sounds very recursive by that definition.
As to getting the math/logic working better in the prompt, it seems like the obvious thing would be asking it to explain its work (CoT) before reproducing the new prompt. You may also be able to get better results by just including the definition of fibonacci in the outer prompt, but since it's not clear to me what your actual goal here is, I'm not sure if either of those suggestions makes sense. And since ChatGPT is down I can't test anything. :(
I tried to expand on my goals and paths I want to explore in a comment below [1], but basically I wonder if we can use this sort of technique as a more powerful version of CoT where prompts can break down a task into sub-tasks (as CoT does) and then recursively do that for each sub-task, until we hit a base-case on all of the sub-sub-...-sub-tasks and (when rolled back up?) the problem is solved.
> You may also be able to get better results by just including the definition of fibonacci in the outer prompt
Yeah, I played with including the mathematical definition of Fibonacci, for example in [2]:
<quote> You are a recursive function ... the paragraph you generate will be an exact copy of this one ... but with updated variables as follows: FIB_INDEX = FIB_INDEX+1; CURR_MINUS_TWO = CURR_MINUS_ONE; CURR_MINUS_ONE = CURR_VALUE; CURR_VAL = CURR_MINUS_TWO + CURR_MINUS_ONE. Otherwise, ... </quote>
[1] https://news.ycombinator.com/item?id=35240093
[2] https://raw.githubusercontent.com/andyk/recursive_llm/main/p...
I wonder how much of this is because the model has memorized the Fibonacci sequence. It is possible to have it just return the sequence in a single call, but that isn't really the point here. Instead this is more an exploration of how to agent-ify the model in the spirit of [1][2] via prompts that generate other prompts.
This reminds me a bit of how a CPU works, i.e., as a dumb loop that fetches and executes the next instruction, whatever it may be. Well, in this case our "agent" is just a dumb Python loop that fetches the next prompt (which is generated by the current prompt), whatever it may be... until it arrives at a prompt that doesn't lead to another prompt.
[1] A simple Python implementation of the ReAct pattern for LLMs. Simon Willison. https://til.simonwillison.net/llms/python-react-pattern
[2] ReAct: Synergizing Reasoning and Acting in Language Models. Shunyu Yao et al. https://react-lm.github.io/
If so, did you try anything other than the Fibonacci function? How about asking it to calculate the factorial of 100,000, for example? Or the Ackermann function for 8,8, or something mad like that. If an LLM returns any result, that means it's not calculating anything and certainly not computing a recursive function.
It can’t do basic maths but based on everything it’s been trained on it can give the impression it can.
Recursive feedback isn’t likely to improve the prompt unless there is some testing and feedback provided in the Python script.
You could play a game of chess and while the LLM knows the rules of chess it isn’t actually playing chess, it is calling upon patterns it has learned to predict text tokens that are appropriate for the given prompt. So opening moves will be sound, but it would quickly go off the rails and start hallucinating…
Given how they work, it is amazing they give the appearance of knowing anything. Even asking “how did you do that?” gives generally compelling answers.
Why not make the Python part recursive too? Or better yet, wait until an LLM comes out with the capability to execute arbitrary code!
https://github.com/andyk/recursive_llm/blob/main/run_recursi...
import sys

import openai  # legacy (pre-1.0) openai package, which provides openai.Completion

def recursively_prompt_llm(prompt, n=1):
    # Recurse only while the response is itself a recursive prompt;
    # any other response is the base case and ends the recursion.
    if prompt.startswith("You are a recursive function"):
        prompt = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt,
            temperature=0,
            max_tokens=2048,
        )["choices"][0]["text"].strip()
        print(f"response #{n}: {prompt}\n")
        recursively_prompt_llm(prompt, n + 1)

recursively_prompt_llm(sys.stdin.readline())
It has huge problems with lists or counting. If you know more or less how LLMs work, it's not that difficult to formulate questions where it will start making mistakes, because in reality it can't run the algorithms, even if it spits out that it will.
(prompt1, input1) -> (prompt2, output1)
On top of that you apply some constraint on generated prompts, to keep it on track. Then you run it on a sequence of inputs and see for how long the LLM "survives" before it hits the constraint.
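A rough Python sketch of that setup, with the LLM replaced by a stub so it runs offline. All names are hypothetical: `step` is the (prompt, input) -> (next prompt, output) transition, `is_valid` is the constraint on generated prompts, and `stub_step` has a fault injected on purpose (it drops the expected format once the total exceeds 10) purely so there is something to measure.

```python
import re
from typing import Callable, Iterable, Tuple

# (prompt, input) -> (next_prompt, output)
Step = Callable[[str, str], Tuple[str, str]]


def survival_length(
    step: Step,
    prompt: str,
    inputs: Iterable[str],
    is_valid: Callable[[str], bool],
) -> int:
    """Count how many inputs are processed before a generated prompt breaks the constraint."""
    survived = 0
    for item in inputs:
        prompt, _output = step(prompt, item)
        if not is_valid(prompt):
            break
        survived += 1
    return survived


def stub_step(prompt: str, item: str) -> Tuple[str, str]:
    """Hypothetical stand-in for the LLM: keeps a running total in the prompt."""
    total = int(prompt.split("=")[1]) + int(item)
    if total > 10:  # deliberately injected fault, simulating format drift
        return f"the total is {total}", str(total)
    return f"TOTAL = {total}", str(total)


def is_valid(prompt: str) -> bool:
    """Constraint: the generated prompt must keep the exact 'TOTAL = <n>' shape."""
    return re.fullmatch(r"TOTAL = \d+", prompt) is not None


if __name__ == "__main__":
    print(survival_length(stub_step, "TOTAL = 0", ["3", "4", "5", "6"], is_valid))  # 2
```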
If ChatGPT can translate proofs back to equivalent code, then this recursion problem is solvable up to the halting problem.