Bandit Algorithms Book [pdf]

(downloads.tor-lattimore.com)

195 points · csabapalfi · 7y ago · 16 comments

tomkat0789 · 7y ago

Never heard of bandit algorithms before! Or if I did, I didn't recognize them as something different from probability. What have people around here used them for?

haffi112 · 7y ago

You can use them to find the best of the options being tested in as few trials as possible.

Say you are selling a product and you are A/B testing something related to buying the product. When a user visits the site, you ideally want to show them the version you are more confident is better. With a bandit approach you can determine whether, say, option A is currently better (w.r.t. some confidence bounds). After each visit you update the bounds, and after sufficiently many visits you have a winner. The main difference from more traditional A/B testing is that the process is more adaptive, so less time is wasted exposing users to an inferior version.
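For the curious, here is a minimal sketch of that loop using a UCB1-style rule, which is one common way to turn "confidence bounds" into a choice. The comment above does not name a specific algorithm, so this is an assumption. The function name `ucb1`, the `pull` callback, and the conversion rates in the demo are all hypothetical, and rewards are assumed to lie in [0, 1].

```python
import math
import random


def ucb1(pull, n_arms, rounds):
    """Run a UCB1-style loop; `pull(arm)` must return a reward in [0, 1]."""
    counts = [0] * n_arms    # how often each version was shown
    means = [0.0] * n_arms   # running mean reward per version

    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1      # show every version once to initialise estimates
        else:
            # Pick the version with the highest upper confidence bound:
            # the bonus term shrinks as a version is shown more often.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean

    return counts, means


if __name__ == "__main__":
    rng = random.Random(0)
    conversion = [0.04, 0.06]  # hypothetical conversion rates for A and B
    counts, means = ucb1(lambda a: float(rng.random() < conversion[a]), 2, 10_000)
    print("times shown:", counts)
    print("estimated rates:", means)
```

Unlike a fixed 50/50 split, traffic shifts toward whichever version currently looks better, while the bonus term keeps the other version from being abandoned too early.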

bochi · 7y ago

Bandits are probably one of the most underrated families of machine learning algorithms. One possible application is recommendation systems. Shameless self-promotion: I wrote an article about it: https://towardsdatascience.com/how-not-to-sort-by-popularity...

mlechha · 7y ago

They're probably the most fundamental kind of reinforcement learning algorithm. Understanding bandit algorithms is crucial to developing a good understanding of RL.

zdkl · 7y ago

This Rust project uses them to manage the number of threads in a Monero miner, AFAIR: https://github.com/Ragnaroek/mithril

ur-whale · 7y ago

Doesn't AlphaGo use some form of bandit algorithm in its Monte Carlo code?

HugoDaniel · 7y ago

Is this the book that is going to make me a master poker player?

srean · 7y ago

If you play long enough, it will make you regret less.

pronoiac · 7y ago

This came up a couple of days ago: https://news.ycombinator.com/item?id=17637683

clickok · 7y ago

I skimmed through this and have already found a bunch of interesting sections, but there's also a ton of background information on topics related to bandit algorithms.

The authors say that this is the first draft of the book submitted to the publisher, so I suppose it's nearly complete? More details are available at the site they put up: http://banditalgs.com/

shoo · 7y ago

Readers who enjoy banditry may also enjoy John Langford's http://hunch.net

joshuamorton · 7y ago

It always makes me sad that Thompson Sampling isn't (or at least doesn't appear to be) mentioned alongside things like UCB1. It's theoretically optimal, relatively easy to grok, and not significantly more difficult to implement.
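To illustrate the "not significantly more difficult to implement" point, here is a rough sketch of Thompson Sampling for yes/no rewards. It assumes a Beta(1, 1) prior on each arm's success rate; the function name `thompson_sampling`, the `pull` callback, and the rates in the demo are hypothetical and not taken from the book.

```python
import random


def thompson_sampling(pull, n_arms, rounds, rng=None):
    """Beta-Bernoulli Thompson Sampling; `pull(arm)` returns True on success."""
    rng = rng or random.Random()
    successes = [0] * n_arms
    failures = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible success rate per arm from its Beta posterior
        # (Beta(1, 1) prior, i.e. uniform before any data is seen).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled rate is highest.
        arm = max(range(n_arms), key=samples.__getitem__)
        if pull(arm):
            successes[arm] += 1
        else:
            failures[arm] += 1

    return successes, failures


if __name__ == "__main__":
    rng = random.Random(0)
    rates = [0.04, 0.06]  # hypothetical success rates
    s, f = thompson_sampling(lambda a: rng.random() < rates[a], 2, 10_000, rng)
    print("pulls per arm:", [s[a] + f[a] for a in range(2)])
```

The exploration comes from the randomness of the posterior draws: an arm with little data has a wide posterior, so it still gets picked occasionally.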

dsvmn · 7y ago

I really appreciate you sharing the book. However, to everyone in charge of naming these files: please don't call it "book.pdf". Everyone has to rename the file after downloading it so that they can find it later. Give it a more descriptive name.

Thanks

daleroberts · 7y ago

Cool, nice to see that Tor was a student at ANU.

sureaboutthis · 7y ago

Well that's really great! What is it?
