0 points · ur-whale · 7y ago

Doesn't AlphaGo use some form of bandit algorithm in its Monte Carlo code?

1 comment · 1 top-level

I believe so: Monte Carlo Tree Search, which AlphaGo uses, is built on bandit algorithms. Each node of the search tree is treated as a multi-armed bandit when deciding which move to explore next. On top of that, AlphaGo uses reinforcement learning, which is closely tied to bandits as well (in Sutton & Barto's book, "Reinforcement Learning: An Introduction", all of Chapter 2 is about multi-armed bandits).
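
For what it's worth, here is a minimal sketch of UCB1, the classic bandit rule that UCT-style tree search applies at each node. To be clear, this is not AlphaGo's code (as far as I know it used a PUCT-style variant guided by a policy network); the function names, the exploration constant `c`, and the win probabilities are all made up for illustration.

```python
import math
import random


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Return the index of the arm with the highest UCB1 score.

    counts[i] is how often arm i was played, values[i] its mean reward.
    """
    # Play every arm once before the formula applies (avoids division by zero).
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    # Mean reward plus an exploration bonus that shrinks as an arm is played more.
    scores = [
        values[arm] + c * math.sqrt(math.log(total) / counts[arm])
        for arm in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(win_probs, steps=2000, seed=0):
    """Simulate Bernoulli arms (hypothetical win probabilities) under UCB1."""
    rng = random.Random(seed)
    counts = [0] * len(win_probs)
    values = [0.0] * len(win_probs)  # running mean reward per arm
    for _ in range(steps):
        arm = ucb1_select(counts, values)
        reward = 1.0 if rng.random() < win_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values
```

In a tree search, the "arms" would be the child moves of a node and the reward would be the outcome of a simulated game, so something like `run_bandit([0.2, 0.8])` is only a flat, single-node stand-in for what happens at every level of the tree.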
