This is related, and it is the paper that lives rent-free in my head. I think it will be viewed as revolutionary in retrospect:
https://www.alexwg.org/publications/PhysRevLett_110-168702.p...
Basically, intelligent behavior is optimizing for "future asymptotic entropy" rather than maximizing any immediate value. How intelligent a system is then becomes a measure of how far into the future it can model, and effectively optimize, entropy.
(updated with pdf link)
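
To make the idea concrete, here is a toy sketch of my own, not the paper's method. It assumes a 1D line of cells with walls at both ends and uses the Shannon entropy of the agent's end position under uniformly random moves as a crude stand-in for the paper's entropy over future paths. The names `step`, `future_entropy`, and `choose_action` are made up for illustration.

```python
import math
from collections import defaultdict

ACTIONS = (-1, 0, 1)  # move left, stay, move right


def step(pos, action, size):
    """Move on a line of `size` cells; the walls clamp the position."""
    return min(max(pos + action, 0), size - 1)


def future_entropy(pos, horizon, size):
    """Shannon entropy (bits) of the position after `horizon` uniformly random moves.

    Assumption: end-state entropy is only a proxy for the entropy over
    whole future paths that the paper works with.
    """
    dist = {pos: 1.0}
    for _ in range(horizon):
        nxt = defaultdict(float)
        for p, prob in dist.items():
            for a in ACTIONS:
                nxt[step(p, a, size)] += prob / len(ACTIONS)
        dist = nxt
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def choose_action(pos, horizon, size):
    """Pick the first move that leaves the most future entropy over `horizon` steps."""
    return max(
        ACTIONS,
        key=lambda a: future_entropy(step(pos, a, size), horizon - 1, size),
    )


if __name__ == "__main__":
    # An agent pinned against the left wall steps away from it,
    # because the wall cuts off half of its possible futures.
    print(choose_action(pos=0, horizon=3, size=11))
```

There is no reward or goal anywhere in this: the agent just moves toward wherever it keeps the most options open, and a longer `horizon` lets it "see" constraints that are further away.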