This is a more recent (Dec 2020) paper by one of the authors on combining empowerment and extrinsic goals: https://ieeexplore.ieee.org/abstract/document/9284556
I recall seeing a study (although I don't remember where) suggesting novelty seeking was a key hallmark of intelligence. Maybe this means the entropy-utility calibration drives their intelligence? (Alongside their actual material circumstances)
But also in loops, because we have to unlearn from time to time, and break the ice, ok now I'm lost in my own metaphor :D
As far as I can tell, here it seems humans value choices over expected value. In other words, humans pay for the perception of freedom.
I.e. it's not that humans value choices over expected value, since valuing choices actually is the correct way to get a larger expected value (with caveats, such as how the explore vs. exploit tradeoff needs to change over time). The message isn't that humans "pay for the perception of freedom" but that evolved human values, even seemingly irrational ones such as a "need for perception of freedom", are actually close to mathematically optimal behavior.
Evidence implies an observable phenomenon (whose description is a statement of fact). A chain of reasoning alone is never evidence, because it is usually based on flawed premises or domain-specific logic.
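To make the explore vs. exploit caveat concrete, here is a minimal sketch of epsilon-greedy action selection where the exploration rate shrinks over time. This is a standard textbook technique used for illustration, not the paper's model; the names `epsilon` and `choose`, the linear schedule, and all the default numbers are assumptions.

```python
import random

def epsilon(step, start=1.0, end=0.05, decay_steps=1000):
    """Exploration rate: decays linearly from `start` to `end`, then stays flat.

    The linear schedule and the default values are illustrative assumptions.
    """
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)

def choose(q_values, step, rng):
    """Pick an action index given estimated values `q_values`.

    With probability epsilon(step), explore (uniform random action);
    otherwise exploit (action with the highest estimated value).
    """
    if rng.random() < epsilon(step):
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)

if __name__ == "__main__":
    rng = random.Random(0)
    estimates = [0.1, 0.9, 0.3]  # made-up value estimates for three options
    early = [choose(estimates, 0, rng) for _ in range(10)]
    late = [choose(estimates, 5000, rng) for _ in range(10)]
    print("early picks:", early)  # mostly random: keeping options open
    print("late picks:", late)    # mostly the best-looking option
```

Early on the agent samples almost uniformly, i.e. it "pays" for choice; later it mostly commits to the option it currently rates highest.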
Now pay me.
Through an AI lens:
In a way, you can compare this to novelty seeking and intelligent exploration which is quite an active field in Artificial Life and game AI[1]. If you find this interesting: Jeff Clune, Kenneth Stanley and Joel Lehman conducted interesting related research.
Also, isn't this somehow related to Karl Friston's free energy principle, if you look at entropy maximization as a way to minimize surprise?
Human behavior appears to point towards the maximization of current order as an investment in power/potential to drive future entropy, as opposed to simply maximizing entropy. This is the difference between building a nuclear bomb and keeping it, as opposed to building the bomb to use it. When one was used, it was meant to end a war, not start one. And success in life may as well be defined by hoarding order, be it technologically, financially, socially, or just objects. The pyramids were a feat in lowering entropy, not increasing it. And we love our diamonds.
This is also an extrapolation from the evidence in biology that energy entering a system increases order and contributes to the orderly structuring of matter and hence life [1].
[1] https://www.quantamagazine.org/a-new-thermodynamics-theory-o...
He has thanked me many times for that advice, which has resulted in a high-value path for him.
[1] https://www.alexwg.org/publications/PhysRevLett_110-168702.p...
A very strong heuristic that works well in many games (the exceptions are usually very interesting games) and is the root of other heuristic concepts such as piece value, central positioning, "protected king", ... in chess and similar concepts in, e.g., StarCraft.
Also very easy to implement: for discrete turn-based games it's just the number of moves available in a given state.
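A rough sketch of that "count the moves" heuristic, assuming a made-up toy game: a single king-like piece on a square grid with optional blocked cells. The function names (`legal_moves`, `mobility`, `best_move_by_mobility`) and the game itself are hypothetical, chosen only for illustration.

```python
# Toy assumption: one piece that moves like a chess king on a size x size grid.
KING_STEPS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

def legal_moves(pos, size, blocked=frozenset()):
    """Return all squares reachable in one step from `pos`."""
    r, c = pos
    moves = []
    for dr, dc in KING_STEPS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < size and 0 <= nc < size and (nr, nc) not in blocked:
            moves.append((nr, nc))
    return moves

def mobility(pos, size, blocked=frozenset()):
    """The heuristic itself: the number of moves available in this state."""
    return len(legal_moves(pos, size, blocked))

def best_move_by_mobility(pos, size, blocked=frozenset()):
    """One-ply lookahead: pick the move that leaves the most options open."""
    return max(legal_moves(pos, size, blocked),
               key=lambda m: mobility(m, size, blocked))

if __name__ == "__main__":
    print(mobility((0, 0), 8))               # corner square
    print(mobility((3, 3), 8))               # central square
    print(best_move_by_mobility((0, 0), 8))  # steps away from the corner
```

Even in this tiny setup the heuristic pulls the piece away from the corner and toward the middle, which is the "central positioning" idea falling out of move counting alone.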
"Yet we lack a decision-making framework that integrates preference for choice with traditional utility maximisation in free choice behaviour." -> utility maximisation "has charm for economists, but it rests on the shaky foundation of an implausible and untestable assumption" - Daniel Kahneman [2] -> TL;DR the author of "Thinking Fast and Slow" proves it false
"We found that participants were biased towards states that kept their options open, even when both states were balanced in the total number of goal locations. This bias was evident not only when both contexts were equally valuable but throughout all value conditions..." AND "Participants were not informed of the precise values ..." -> seeing the utilitarian variable being forced upon conclusions is disheartening
[1] https://www.thebalance.com/index-funds-vs-actively-managed-f... [2] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=870494