
bulletninja 1y ago

Wow, very well put. Any suggestions for academic papers, books, or even online resources on these topics would be greatly appreciated.


Rhapso 1y ago

This is related, and it is the paper that lives constantly rent free in my head. I think it will retroactively be viewed as revolutionary: https://www.alexwg.org/publications/PhysRevLett_110-168702.p...

Basically, intelligent behavior is optimizing for "future asymptotic entropy" rather than maximizing any immediate value. How intelligent a system is then becomes a measure of how far into the future it can effectively model and optimize entropy.

(updated with pdf link)
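
To make the "optimize future entropy" idea concrete, here is a loose toy sketch, not the linked paper's method. It assumes a hypothetical one-dimensional corridor with walls, and it scores each action by the Shannon entropy of where a uniform random walk ends up after a fixed number of steps. That end-state entropy is a simplification chosen for brevity; the names `N`, `ACTIONS`, `step`, `end_state_entropy`, and `choose_action` are all made up for illustration.

```python
import math

N = 11                 # corridor cells 0..N-1 (hypothetical toy world)
ACTIONS = (-1, 0, 1)   # step left, stay, step right


def step(x, a):
    """Move by a, clamped at the corridor walls."""
    return min(max(x + a, 0), N - 1)


def end_state_entropy(start, horizon):
    """Shannon entropy (nats) of where a uniform random walk ends after `horizon` steps."""
    dist = {start: 1.0}
    for _ in range(horizon):
        nxt = {}
        for x, p in dist.items():
            for a in ACTIONS:
                y = step(x, a)
                # Each action is taken with equal probability.
                nxt[y] = nxt.get(y, 0.0) + p / len(ACTIONS)
        dist = nxt
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def choose_action(x, horizon=4):
    """Pick the action whose successor state keeps the most future options open."""
    return max(ACTIONS, key=lambda a: end_state_entropy(step(x, a), horizon))


if __name__ == "__main__":
    for x in (1, N - 2):
        print(x, choose_action(x))
```

In this toy, an agent next to a wall moves away from it, because the wall cuts off part of the reachable future. A longer `horizon` lets the agent notice walls that are further away, which is the sense in which looking further ahead changes behavior here.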

programjames 1y ago

Great paper! There are some similar ideas to this in game theory and reinforcement learning (RL):

[1]: Thermodynamic Game Theory: https://adamilab.msu.edu/wp-content/uploads/AdamiHintze2018....

[2]: piKL - KL-regularized RL: https://arxiv.org/abs/2112.07544

[3]: Soft Actor-Critic - Entropy-regularized RL: https://arxiv.org/abs/1801.01290

[4]: "Soft" (Boltzmann) Q-learning = Entropy-regularized policy gradients: https://arxiv.org/abs/1704.06440
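
For anyone who wants to see what "entropy-regularized" means in the simplest case, here is a minimal single-state sketch (a bandit, with no environment dynamics). It is not code from any of the papers above. It checks the standard identity that the Boltzmann policy over Q-values maximizes expected Q plus temperature times policy entropy, and that the maximum equals temperature times the log-sum-exp of Q / temperature. The Q-values and function names are made up for illustration.

```python
import math


def boltzmann_policy(q_values, temperature):
    """Softmax over Q / temperature, shifted by the max for numerical stability."""
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    z = sum(weights)
    return [w / z for w in weights]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def regularized_objective(probs, q_values, temperature):
    """Expected Q under the policy plus temperature * policy entropy."""
    expected_q = sum(p * q for p, q in zip(probs, q_values))
    return expected_q + temperature * entropy(probs)


def soft_value(q_values, temperature):
    """temperature * log-sum-exp(Q / temperature), the optimum of the objective."""
    m = max(q_values)
    return m + temperature * math.log(
        sum(math.exp((q - m) / temperature) for q in q_values)
    )


if __name__ == "__main__":
    q = [1.0, 2.0, 0.5]   # hypothetical Q-values for three actions
    tau = 0.5             # hypothetical temperature
    pi = boltzmann_policy(q, tau)
    greedy = [0.0, 1.0, 0.0]
    print(regularized_objective(pi, q, tau), soft_value(q, tau))
    print(regularized_objective(greedy, q, tau))
```

The greedy policy scores lower on this objective than the Boltzmann policy because it has zero entropy. As the temperature goes to zero the two coincide.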

Rhapso 1y ago

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9372954/
