But this is not what happened. Instead, some guys told AI agents to explore the way those guys think animals explore. "Stanford researchers invented the “curious replay” training method based on studying mice to help AI agents"
That arXiv stuff looks perfectly normal, but I kind of hate how it got more and more caricatured as it went through the university press office and Hacker News clickbait pipeline.
I've been wondering for a while what the next steps in adding 'inefficiencies' to AI processing would look like. I was commenting the other day to a friend that what's needed in the next 18 months is getting AI to replicate the Eureka moments in the shower, where latent information is reconstructed in parallel with processing tangential topics.
Going from "attention is all you need" to "attention and curiosity is what you need" seems like a great next step!
Not being an RL person (I'm in generative): have people not tried re-increasing the exploration variable after the model has been initially trained? It seems natural to vary that exploration-exploitation trade-off.
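To make the question concrete, here is a minimal sketch of an epsilon-greedy schedule that decays exploration and then optionally "re-warms" it. The names `epsilon_at`, `rewarm_every`, and `choose_action` are made up for illustration and are not taken from the paper or from any RL library.

```python
import random

def epsilon_at(step, decay_steps=1000, eps_start=1.0, eps_end=0.05, rewarm_every=None):
    """Linearly decay epsilon from eps_start to eps_end over decay_steps.

    If rewarm_every is set (a hypothetical knob), the decay restarts every
    rewarm_every steps, which raises exploration again after training.
    """
    if rewarm_every:
        step = step % rewarm_every
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)

def choose_action(q_values, eps, rng=random):
    """Epsilon-greedy: explore with probability eps, otherwise take the best action."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

With `rewarm_every=2000`, epsilon is back at `eps_start` at step 2000 instead of staying at `eps_end`. Whether that helps in practice is exactly the open question in the comment.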
Looks like you shadowbanned this account. Maybe for posting a URL in the first comment.
Something, something, The Bitter Lesson.
As far as simply differing goes, much of the time there's a character limit that's hit. I've seen many posts with a comment from the poster calling out their edit to the title, and the character limit is usually cited.
It would be especially difficult to keep the character limit (I think there are legitimate design reasons for this) while also requiring that the title matches the submission as closely as possible. Who decides what words are omitted without it potentially being any of: patronizing, inaccurate, or misleading?
History has been repeating itself for thousands of years. We keep killing the prophets, and putting the absolute worst of us on pedestals. What's rational about that?
Dolphins mucking about in the water - that's rational.
If it is irrational that history repeats itself, do you think that it would be rational if history progressed towards some goal, and if yes, what is that goal?
You hopefully get the picture. We may get better at remembering history if united via a common cause under a common leadership. Otherwise it's just an organism looking for food and trying to survive.
My current imagining says that novelty and unexpected inputs drive our immediate understanding of the world around us. To have expectations you have to have a model. When that model breaks and is adjusted, you have a novel experience and the model can be updated. This feedback loop is critical.
Example: the other day I was grilling food and my digital food thermometer was on the metal prep area near the hot griddle. As I was walking away I reached for it, grabbed it, and expected to pick it up. However! I didn't know it had a magnet, and it gave me back an unexpected stimulus.
I immediately jerked my hand away and several thoughts happened near instantly. My thoughts went from I burned my hand to no, no pain, maybe a really bad burn, to no, no heat, no sizzling of flesh, to oops, wrong stimulus, something resisted, resisted how, it slid but wouldn't pick up easy, ah, a magnet.
The researchers here are right, I expect. You need curiosity and some goal, but you need to constantly tune the input for expectations and tweak the (mental) model of the world.
How many times do you, for a split second, totally misinterpret what you see or feel but near instantly self-correct? Better AI will require putting forth its initial result and then validating that result with feedback. The more unexpected the feedback, the more novel the experience and the more learning that can happen.
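A rough sketch of that "more surprise, more learning" idea, written as a replay buffer that samples experiences in proportion to prediction error. This is a generic illustration, not the method from the paper; the class `SurpriseBuffer` and the choice of absolute error as the surprise score are assumptions.

```python
import random

class SurpriseBuffer:
    """Toy replay buffer: experiences with larger prediction error are replayed more often."""

    def __init__(self, seed=0):
        self.items = []
        self.surprise = []
        self.rng = random.Random(seed)

    def add(self, item, predicted, observed):
        # Surprise is assumed to be the absolute gap between expectation and outcome.
        self.items.append(item)
        self.surprise.append(abs(observed - predicted))

    def sample(self, k):
        # Small constant keeps fully expected experiences from having zero weight.
        weights = [s + 1e-6 for s in self.surprise]
        return self.rng.choices(self.items, weights=weights, k=k)

buf = SurpriseBuffer()
buf.add("picked up thermometer as expected", predicted=1.0, observed=1.0)
buf.add("thermometer stuck to the griddle (magnet)", predicted=0.0, observed=5.0)
batch = buf.sample(100)
```

Here the magnet experience dominates `batch` because its prediction error is far larger, which is the feedback loop described above in miniature.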
[1] 2018, "The Markov blankets of life: autonomy, active inference and the free energy principle", https://royalsocietypublishing.org/doi/10.1098/rsif.2017.079...
And things like GPT are not 'embodied'. Since they don't live in the 'world', they can't associate language with physical reality. Put them in a simulated environment like a game, and it looks a lot more 'conscious'.
Is that a coincidence?