I believe Monte Carlo Tree Search, which AlphaGo uses, is built on bandit algorithms: the common UCT variant treats the choice of which child node to explore next as a multi-armed bandit problem. On top of that, AlphaGo is trained with reinforcement learning, which also builds on bandit ideas (in Sutton & Barto's book, "Reinforcement Learning: An Introduction", all of chapter 2 is about multi-armed bandits).
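To make the bandit connection concrete, here is a minimal sketch of UCB1, the bandit rule that the UCT flavour of tree search applies at each node when picking a child. The function names (`ucb1_select`, `run_bandit`), the exploration constant `c`, and the Bernoulli reward setup are my own assumptions for illustration. This is not AlphaGo's code; as far as I know its actual selection rule is a variant that also weighs in a prior from its policy network.

```python
import math
import random


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Return the index of the arm with the highest UCB1 score.

    counts[a] is how often arm a was played, values[a] its mean reward.
    c is an assumed exploration constant; sqrt(2) gives classic UCB1.
    """
    # Play every arm once before the formula is defined for it.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [
        values[a] + c * math.sqrt(math.log(total) / counts[a])
        for a in range(len(counts))
    ]
    # Ties go to the lowest index.
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(probs, steps=2000, seed=0):
    """Play a hypothetical Bernoulli bandit with the given win probabilities."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    values = [0.0] * len(probs)
    for _ in range(steps):
        arm = ucb1_select(counts, values)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean reward.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values
```

In a tree search, each child of a node plays the role of an arm, `counts` are the visit counts, and `values` are the average results of the simulations that went through that child.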