ResearchAgent: Iterative Research Idea Generation Using LLMs

(arxiv.org)

124 points · milliondreams · 2y ago · 63 comments

not-chatgpt · 2y ago · 9 in thread

Cool idea. Never gonna work. LLMs are still generative models that spit out training data, incapable of highly abstract creative tasks like research.

I still remember all the GPT-2 based startup idea generators that spat out pseudo-feasible startups.

bigyikes · 2y ago

Ignoring the “spits out training data” bit, which is at best misleading, it’s interesting that you use the word “abstract” here.

I recently followed Karpathy’s GPT-from-scratch tutorial and was fascinated with how clearly you could see the models improving.

With no training, the model spits out uniformly random text. With a bit of training, the model starts generating gibberish. With further training, the model starts recognizing simple character patterns, like putting a consonant after a vowel. Then it learns syllables, and then words, and then sentences. With enough training (and data and parameters, of course) you eventually get a model like GPT-4 that can write better code than many programmers.

It’s not always that clear cut, but you can clearly observe it moving up the chain of abstraction as the training loss decreases.

What happens when you go even bigger than GPT-4? We have every reason to believe that the models will be able to think more abstractly.

Your “never gonna work” comment flies in the face of the exponential curve we find ourselves on.

ethanwillis · 2y ago

If we keep extrapolating, eventually GPT will be omniscient. I really can't think of any reason why that wouldn't be the case, given the exponential curve we find ourselves on.

ramraj07 · 2y ago

I have asked ChatGPT to generate hypotheses on my PhD topic, where I know every single piece of existing literature, and it actually threw out some very interesting ideas that do not exist out there yet (this was before they lobotomized it).

ta988 · 2y ago

Did you try with the API directly? I've had great results with my own prompts, much less so with the ChatGPT one.

voxl · 2y ago

> (this was before they lobotomized it)

Of course, of course. Because god forbid anyone be able to reproduce your suggestion. Funnily enough, I tried the same and had the exact opposite experience.

growthwtf · 2y ago

I think that ship has sailed, if you believe the paper (which I do).

LLMs are already super-human at some highly abstract creative tasks, including research.

There are numerous examples of LLMs solving problems that couldn't be found in the training data. They can also be improved by using reasoning methods like truth tables or causal language. See Orca from Microsoft for example.

CuriouslyC · 2y ago

They don't just spit out training data, they generalize from training data. They can look at an existing situation and suggest lines of experimentation or analysis that might lead to interesting results based on similar contexts in other sciences or previous research. They're undertrained on bleeding-edge science, so they're going to falter there, but they can apply methodology just fine.

llm_trw · 2y ago

They just need to be better at it than humans, which is a rather low bar when you go beyond two unrelated fields.

krageon · 2y ago

When you're this confident and making blanket statements that are this unilateral, that should tell you you need to take a step back and question yourself.

pedalpete · 2y ago · 7 in thread

I've found where LLMs can be useful in this context is around free-associations. Because they don't really "know" about things, they regularly grasp at straws or misconstrue intended meaning. This, along with the volume of language (let's not call it knowledge), results in the LLMs occasionally bringing in a new element which can be useful.

gotts · 2y ago

Can you list some examples where free-associations from LLMs were useful to you?

pedalpete · 2y ago

A lot of where I've benefited is in some marketing language. Rarely, if ever, has ChatGPT come up with something that made me think "that's exactly what we wanted", but through iterations, it's taken me down paths I might not have found myself.

Unfortunately, ChatGPT doesn't have a good search interface, so I can't search through older chats, but I know when I was looking at renaming our company, it didn't come up with our new name, but it led me down a path which did lead to our name.

I was trying to understand a patent, and we were looking at the algorithm which was being used. ChatGPT misunderstood how the algorithm worked, but pointed to its knowledge of a similar algorithm which worked differently, but was better suited to our purposes.

Calling this "free-association" may be taking some liberty. Many people would consider these errors, or hallucinations, but in some ways, they do look very similar to what many would call free-association IMO.

PeterStuer · 2y ago

A long, long time ago (1999, before LLMs) I made a virtual museum exhibit creator for education. The collection explorer created a connected graph where the nodes were the works of art and the edges were based on commonalities from their textual descriptions. It used very rudimentary language technology, so it 'suffered' from things like homographs. Rather than being seen as a problem, the users liked the serendipity it brought for ideation.

I assume free but not random association could be a comparable support for ideation in research.
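The collection graph described above can be approximated in a few lines. This is only a minimal sketch, not a reconstruction of the 1999 system: the artwork titles, descriptions, stopword list, and the `keywords` / `build_graph` helpers are all invented for illustration. It links two works whenever their descriptions share a keyword, and shows how a homograph ("bank" of a river vs. "bank" as an institution) creates an unintended but possibly inspiring edge.

```python
from itertools import combinations

# Hypothetical catalogue entries; titles and descriptions are invented.
ARTWORKS = {
    "River Landscape": "fishermen resting on the bank of a river at dusk",
    "The Money Lender": "a clerk counting coins inside a bank",
    "Still Life with Pears": "pears and a pewter jug on a table",
}

# Assumed stopword list, kept tiny on purpose.
STOPWORDS = {"a", "an", "the", "of", "on", "at", "and", "with", "inside"}


def keywords(description):
    """Lowercase, split on whitespace, drop stopwords. Deliberately naive."""
    return {w for w in description.lower().split() if w not in STOPWORDS}


def build_graph(artworks):
    """Return {(title_a, title_b): shared_words} for each pair sharing a keyword."""
    edges = {}
    for a, b in combinations(sorted(artworks), 2):
        shared = keywords(artworks[a]) & keywords(artworks[b])
        if shared:
            edges[(a, b)] = shared
    return edges


edges = build_graph(ARTWORKS)
for (a, b), shared in edges.items():
    print(f"{a} <-> {b} via {sorted(shared)}")
```

With this toy data the only edge joins the river scene and the money lender through the word "bank", which is exactly the kind of accidental link the comment describes as serendipity.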

bongodongobob · 2y ago

Assume free-associations = hallucinations. Assume hallucinations are exactly what makes LLMs useful and your question can be rephrased as "Can you list some examples where LLMs were useful to you?"

robwwilliams · 2y ago

This approach is already useful in functional genomics. A common type of question requires analysis of hundreds of potentially functional sequence variants.

Hybrid LLM+ approaches are beginning to improve efficiency of ranking candidates and even proposing tests and, soon I hope, higher-order non-linear interactions among DNA variants.

ssn · 2y ago

I am interested in this. Can you point to a reference about the application of LLMs to sequence screening? Thanks.

deegles · 2y ago

I like thinking of LLMs as "word calculators", which I think really encapsulates how they aren't "intelligent" as the marketing would have you believe but also shows how important the inputs are.

KhoomeiK · 2y ago · 5 in thread

A group of PhD students at Stanford recently wanted to take AI/ML research ideas generated by LLMs like this and have teams of engineers execute on them at a hackathon. We were getting things prepared at AGI House SF to host the hackathon with them when we learned that the study did not pass ethical review.

I think automating science is an important research direction nonetheless.

srcreigh · 2y ago

That’s pretty wild. What was the reason behind failing ethics review?

robbomacrae · 2y ago

I'm generally a proponent of AI and LLMs, but to me the decision was the right one. You are tasking people with implementing an idea generated by an algorithmic model with (I'm guessing) zero oversight that might have very little training that teaches it the importance of coming up with ideas worth implementing. Some may be more useful than others, so it won't be fair from an accomplishment or motivation point of view.

Imagine you've already invested time going to this event and want to win the prize/credit but to do so you have to implement a plugin that makes webpages grayscale because of a random idea generator. Maybe some people would find that interesting but others would see it as wasting their time.

CJefferson · 2y ago

One obvious problem is, what if the ideas were obviously unethical?

I would personally let this pass ethics if someone read all the generated ideas, and took personal responsibility for them passing the basic ethics rules, or got them through the ethics committee if required, exactly the same as they would their own ideas.

brigadier132 · 2y ago

I don't think LLMs are the right approach for this. Coordinated science would basically be a search problem where we verify different facts using experiments and use what we learn to determine what experiment to do next.

visarga · 2y ago

When you can run experiments quickly, it becomes feasible to use ML and evolutionary methods to make novel discoveries, like AlphaTensor finding matrix multiplication algorithms better than Strassen's, and AlphaGo's move 37, upending centuries of game strategy.

The paper "Evolution through Large Models" shows the way. Just use LLMs as genetic mutation operators. Evolutionary methods are great at search, LLMs are great at intuition but get stuck on their own, they combine well. https://arxiv.org/abs/2206.08896

The interplay between LLMs and Evolutionary Algorithms, despite differing in objectives and methodologies, share a common pursuit of applicability in complex problems. Meanwhile, EA can provide an optimization framework for LLM's further enhancement under black box settings, empowering LLM with flexible global search capacities.
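To make the "LLM as mutation operator" idea concrete, here is a toy evolutionary loop. It is a sketch under loud assumptions, not the method from the linked paper: `mutate` is a stand-in that swaps one random character where a real system would prompt an LLM to propose a variation, and `fitness` scores against a fixed toy string where a real system would run an experiment or a test suite. `TARGET`, `ALPHABET`, and `evolve` are hypothetical names.

```python
import random

TARGET = "hypothesis"  # toy objective; a real loop would score experiment results
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def fitness(candidate):
    """Number of positions that match the toy target."""
    return sum(c == t for c, t in zip(candidate, TARGET))


def mutate(candidate, rng):
    """Placeholder for an LLM call: here it just swaps one random character."""
    i = rng.randrange(len(candidate))
    return candidate[:i] + rng.choice(ALPHABET) + candidate[i + 1:]


def evolve(generations=200, pop_size=20, seed=0):
    """Run elitist selection and return the best candidate plus fitness history."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in TARGET) for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        children = [mutate(parent, rng) for parent in population]
        # Elitist selection: keep the best pop_size of parents plus children,
        # so the best fitness can never decrease between generations.
        population = sorted(population + children, key=fitness, reverse=True)[:pop_size]
        history.append(fitness(population[0]))
    return population[0], history


best, history = evolve()
print(best, history[-1])
```

The division of labour matches the comment: the outer loop does the search and selection, while the mutation step is the only place where "intuition" would enter once the random swap is replaced by a model call.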

Since ChatGPT was first released, hundreds of millions of people have been using it for assistance, and the model outputs have influenced their actions, maybe even helped scientists make new discoveries. The LLM text is filtered through people and ends up as real-world consequences and discoveries that are reported in text and get into the next training set, closing the loop.

Trillions of AI tokens per month do this slow feedback game. AI speeds up the circulation of useful information and ideas in human society, and AI feedback gets filtered by the contact with people and the real world.

UncleOxidant · 2y ago · 5 in thread

The ideas aren't the hard part.

tokai · 2y ago

This. Any researcher should, over a lunch, be able to generate more ideas than can be tackled in a lifetime.

falcor84 · 2y ago

The fact that a human expert can also do it doesn't mean the AI isn't valuable. Even if you just consider the monetary aspect, those few API calls would definitely be cheaper than buying the researcher lunch. But the big benefit is being able to generate those ideas immediately and autonomously every time there's new data.

kordlessagain · 2y ago

The number of ideas has nothing to do with the quality of the ideas. Some ideas are gold, many aren’t.

fpgamlirfanboy · 2y ago

Tell that to the PhD advisor who took credit for all my work because they were his ideas (at least so he claimed).

passwordoops · 2y ago

Unfortunately the good ones who do not steal credit are few and far between. Current incentives select for this behaviour. Not just in academia, but just about everywhere.

Go to any meeting and state the obvious fact that "any idiot can have an idea. Making it happen is the tough part", then watch how the decision makers react.

SubiculumCode · 2y ago · 2 in thread

In some fields of research, the amount of literature out there is stupendous, with little hope of a human reading, much less understanding, the whole of it. It's becoming a major problem in some fields, and I think, in some ways, approaches that can combine knowledge algorithmically are needed, perhaps LLMs.

wizzwizz4 · 2y ago

Traditionally, that's what meta-analyses and published reviews of the literature have been for.

SubiculumCode · 2y ago

Even so.

deegles · 2y ago · 2 in thread

It would be fun to pair this with an automated lab that could run experiments and feed the results into generating the next set of ideas.

imranq · 2y ago

Check out: https://www.insitro.com/

They have an automated, robotics-powered research lab.

geraneum · 2y ago

What would an automated lab look like?

barathr · 2y ago

This strikes me as similar to Cargo Cult Science.

https://calteches.library.caltech.edu/51/2/CargoCult.htm

https://metarationality.com/upgrade-your-cargo-cult
