Whether any others have been made before now is anyone's guess. Botting is a known problem in online poker. If there's a golden goose out there, I'm sure it's being kept under wraps.
On another point, CMU just can't seem to catch a break: UofAlberta keeps stealing their thunder in poker research, first in limit and now in no limit. UofA clearly tried to publish this before the CMU poker challenge that's supposed to begin soon.
To read more about the CMU challenge: http://www.cmu.edu/news/stories/archives/2017/january/poker-...
And then of course you don't get anything unless you're one of the top three winners against the bot, so there's likely nothing to be gained from grinding out a marginal victory. You should just go ahead and play kinda stupid/aggro and hope you win some of the big flips and whatnot. There's literally nothing at stake for you except time value, so you might as well flame out early and then quit or run up a big stake to give yourself a shot at top 3.
Basically, the study design ensures the bot faces off against weak players playing in a way that would be sub-optimal in any other situation. Not surprised the bot won by a decent margin, nor that they are trying to spin this real hard in advance of the CMU poker bot matchup next week, which will be much more rigorous.
Top professionals are building their strategy around game theory. They'll attempt to play in such a way that they aren't exploitable and look to deviate when they've spotted a weakness in their opponent's play.
Basically, the game theory optimal strategy is unexploitable. The best you can do against it is break even in the long run, for example by also playing the optimal strategy. If you deviate from it, you can't gain and will typically lose, though it's possible that a strategy tailored to taking advantage of your specific deviations would beat you more quickly.
Unexploitable play typically means that you bet a size with a range of holdings that makes your opponent indifferent between all of his options (and the converse is true when facing a bet). For humans, this means gravitating to a few standard bet sizes, while a computer could, in theory, balance its range with much more granularity.
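To put rough numbers on the indifference idea, here is a minimal sketch for a single river bet. It assumes the caller holds a pure bluff-catcher (beats every bluff, loses to every value hand), that there are no raises, and that there is no rake. The function names are made up for illustration, not taken from any poker library.

```python
from fractions import Fraction

def bluff_fraction(pot, bet):
    """Share of the betting range that should be bluffs so that a
    bluff-catcher is indifferent between calling and folding.
    `pot` and `bet` are whole chip amounts."""
    # The caller risks `bet` to win `pot + bet`, so calling has zero EV when
    # P(bluff) * (pot + bet) == P(value) * bet.
    return Fraction(bet, pot + 2 * bet)

def min_defense_frequency(pot, bet):
    """How often the defender must continue so that a pure bluff
    only breaks even."""
    # The bluffer risks `bet` to win `pot`, so bluffing has zero EV when
    # P(fold) * pot == P(call) * bet.
    return Fraction(pot, pot + bet)

print(bluff_fraction(100, 100))         # 1/3 of a pot-sized betting range is bluffs
print(min_defense_frequency(100, 100))  # 1/2: defend half the time vs a pot-sized bet
print(bluff_fraction(100, 50))          # 1/4: smaller bets support fewer bluffs
```

Every bet size has its own pair of frequencies, which is why balancing lots of sizes at once is hard for a human and cheap for a computer.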
Last I read, to train the neural net it plays billions of hands against versions of itself designed to exploit various weaknesses. It starts out by performing random actions; for example, it might call, raise, or fold to your bet with a 33% chance each. It then starts to see that it does better when it raises your bet with the nuts and also as a bluff. Eventually, it arrives at an equilibrium strategy.
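For a feel of how "start random, play yourself, end up at an equilibrium" can work, here is a toy sketch. To be clear about what it is: this is plain counterfactual regret minimization (CFR) on Kuhn poker, a three-card toy game, and not the bot's actual training procedure or anything involving a neural net. All names in it are mine.

```python
from itertools import permutations

CARDS = "JQK"    # ranked low to high
ACTIONS = "pb"   # p = pass (check or fold), b = bet (bet or call)

class Node:
    """Regret and strategy accumulators for one information set."""
    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]
        self.current = [0.5, 0.5]   # start out acting at random

    def refresh(self):
        # Regret matching: play each action in proportion to its positive regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        self.current = [p / total for p in positive] if total > 0 else [0.5, 0.5]

    def average_strategy(self):
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [0.5, 0.5]

def terminal_payoff(cards, history):
    """Payoff to the player who would act next, or None if play continues."""
    if len(history) < 2:
        return None
    player = len(history) % 2
    higher = cards[player] > cards[1 - player]
    if history[-2:] == "pp":    # check-check: showdown for the antes
        return 1 if higher else -1
    if history[-2:] == "bb":    # bet-call: showdown for ante plus bet
        return 2 if higher else -2
    if history[-1] == "p":      # bet-fold: the bettor collects the ante
        return 1
    return None                 # check-bet: first player still has to respond

def cfr(nodes, cards, history, reach):
    """Walk the tree; return the expected payoff for the player to act."""
    payoff = terminal_payoff(cards, history)
    if payoff is not None:
        return payoff
    player = len(history) % 2
    node = nodes.setdefault(CARDS[cards[player]] + history, Node())
    strategy = node.current
    utils = []
    for action, prob in zip(ACTIONS, strategy):
        child_reach = list(reach)
        child_reach[player] *= prob
        # The child's value is from the opponent's point of view, so negate it.
        utils.append(-cfr(nodes, cards, history + action, child_reach))
    node_util = sum(p * u for p, u in zip(strategy, utils))
    for i in range(2):
        # Regrets are weighted by the opponent's reach, the average by our own.
        node.regret_sum[i] += reach[1 - player] * (utils[i] - node_util)
        node.strategy_sum[i] += reach[player] * strategy[i]
    return node_util

def train(iterations):
    """Self-play over all six deals; returns (nodes, average payoff to player 1)."""
    nodes, total = {}, 0.0
    deals = list(permutations(range(3), 2))
    for _ in range(iterations):
        for cards in deals:
            total += cfr(nodes, cards, "", [1.0, 1.0])
        for node in nodes.values():
            node.refresh()
    return nodes, total / (iterations * len(deals))
```

After something like `nodes, value = train(10000)`, `value` should settle near the known game value of -1/18 for the first player, and `nodes["Jp"].average_strategy()` (second player holding the Jack after a check) should show a bet frequency around one third. That is the worst hand bluffing at a fixed rate, learned from nothing but self-play.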
Since computers are much better at randomness than humans are, they can play these types of strategies more effectively and with more complex bet sizing. This is what's called a mixed strategy: given the same situation and the same hole cards, you call, raise, or fold to a bet, each with some non-zero probability. Doing that as a human is very difficult, but computers manage it quite easily.
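The "some non-zero probability" part really is the easy bit for a machine. A minimal sketch, assuming a made-up strategy table for one spot (the probabilities are placeholders I picked, not solver output):

```python
import random

# Hypothetical mixed strategy for one spot (same hole cards, same bet faced).
# The probabilities are illustrative placeholders, not solver output.
MIXED_STRATEGY = {"fold": 0.2, "call": 0.5, "raise": 0.3}

def sample_action(strategy, rng):
    """Pick one action at random, weighted by the strategy's probabilities."""
    actions = list(strategy)
    weights = [strategy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

rng = random.Random(2017)   # seeded only so the run is repeatable
counts = {action: 0 for action in MIXED_STRATEGY}
for _ in range(10_000):
    counts[sample_action(MIXED_STRATEGY, rng)] += 1
print(counts)   # frequencies should land near 20% / 50% / 30%
```

For actual play you would not want a seeded generator; something like `random.SystemRandom()` has the same `choices` method and is not predictable from earlier outputs.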
First, how does one become a pro poker player?
Second, does it work like a sport where you get paid from sponsorships, or do you just directly take home what you win? Or a combination of both?
Third, is this something that you can do part-time, or does it require full time attention?
Fourth, why did you quit?
Exploitative strategies, based on understanding opponent weaknesses and tendencies, will win $ at a higher rate but are themselves exploitable.
For example, almost never bluffing and playing only strong cards crushes beginners who play too many hands and call too much.
This strategy is easily beaten though by stealing most pots and then not paying off the infrequent big bets (strong hands don't come often enough).
A "perfect" game theory strategy is like armor, slowly bleeding the opponent every time they deviate from perfection themselves.
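A rough numeric version of the point above, as a sketch only. It assumes a single river bet, a caller holding a pure bluff-catcher, no raises, and no rake; the function names are hypothetical.

```python
def caller_ev(bluff_freq, pot, bet):
    """EV of calling with a bluff-catcher: win pot + bet against a bluff,
    lose the bet against a value hand."""
    return bluff_freq * (pot + bet) - (1 - bluff_freq) * bet

def best_response(bluff_freq, pot, bet):
    """Folding has EV 0, so call only when calling is profitable."""
    return "call" if caller_ev(bluff_freq, pot, bet) > 0 else "fold"

def bluff_ev(fold_freq, pot, bet):
    """EV of a pure bluff: win the pot when they fold, lose the bet when they call."""
    return fold_freq * pot - (1 - fold_freq) * bet

# Against someone who almost never bluffs (5% of bets), the exploit is to stop paying off.
print(best_response(0.05, 100, 100))   # fold
# But a player who now folds every time is handing over the pot to any bluff.
print(bluff_ev(1.0, 100, 100))         # 100.0
# Against a defender who calls a pot-sized bet half the time, bluffing gains nothing.
print(bluff_ev(0.5, 100, 100))         # 0.0
```

The first two prints are the exploit and the counter-exploit; the last one is the "armor" case, where the defender calls often enough that stealing stops paying.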
Not sure if that helps, but maybe it gives you some seeds to google at least.
To dig deeper:
Or you can try actively punishing the big hands by folding out early. Of course, that strategy opens you up to being bled by your opponent bluffing strong hands. Attempting to actively punish the big hands here is a deviation. This is what jspiral means by "deviations from perfection".
The reason why simply playing the numbers fails is that if I know an opponent is playing this way, I'll just fold every time he decides to play.
Bluffing is a major component in No Limit, and there are very different profitable playing strategies.
Computers are better at bluffing and randomness than humans are. Bluffing is an important part of playing poker well. It entails tracking the expected value of a pot (which includes cost expectations, don't forget), and it entails randomness, which is necessary to obscure betting patterns that could give away your bluffing strategy. Like chess and go, we may not be "there" yet with computers, but n00bs need to understand the theory.
What computers can't do is read "tells", so if you are a master poker player via tells (whether it's unconscious or conscious thinking on your part) then you will beat other humans better than a computer will; but, by the same token, the computer will not give you tells to read nor be fooled by your fake tells. I think the mistake newbies (and even highly experienced players) make is mixing together "the psychology" of the game with the mathematics of the game.
So to give an oversimplified concrete example of a poker bluffing strategy (inspired by Nesmith Ankeny's book): if the odds of drawing one of the cards you need to win a showdown are 1 out of 4 but the expected payoff is 20x, then you not only need to stay in purely on expected value, but it is also an optimal time to bluff if you don't get your card. It is informationally better to have a bluffing strategy that masquerades as an "I have good cards" strategy and gives random information after the showdown than to treat bluffing as something you do purely when you have shit cards. And to enforce a random strategy on yourself, he recommends using the cards in your hand as the random number generator that tells you whether to bluff or not. As you can see, his strategy, designed for human players, is more perfectly implemented by a computer.
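Here is the arithmetic behind that example as a small sketch. I'm reading "1 out of 4" as a 25% chance to hit and "20x" as winning 20 times the cost of drawing. The suit-based rule at the end is a hypothetical stand-in for a card-driven randomizer, not the book's actual system.

```python
from fractions import Fraction

def draw_ev(hit_prob, payoff_multiple, cost=1):
    """EV of paying `cost` to draw: win payoff_multiple * cost on a hit,
    lose `cost` on a miss."""
    return hit_prob * payoff_multiple * cost - (1 - hit_prob) * cost

def break_even_hit_prob(payoff_multiple):
    """Hit probability at which the draw has exactly zero EV."""
    return Fraction(1, payoff_multiple + 1)

SUITS = ("clubs", "diamonds", "hearts", "spades")

def should_bluff(marker_suit, bluff_suits=("spades",)):
    """Hypothetical randomizer: bluff a missed draw only when a pre-chosen
    marker card has one of the listed suits."""
    return marker_suit in bluff_suits

print(draw_ev(Fraction(1, 4), 20))   # 17/4, i.e. +4.25 bets per draw
print(break_even_hit_prob(20))       # 1/21, far below the 1/4 you actually have

# Over a full deck, the one-suit rule fires exactly a quarter of the time.
deck = [(rank, suit) for rank in range(2, 15) for suit in SUITS]
print(Fraction(sum(should_bluff(suit) for _, suit in deck), len(deck)))   # 1/4
```

The marker card needs to be one that has nothing to do with what makes the hand good, otherwise the bluffing frequency ends up correlated with hand strength. A computer skips the trick entirely and just draws a random number.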
But they don't yet do that: this paper is about beating humans at heads up, which is a much more limited domain than a full table.
If you want to learn about why to bluff I'd recommend reading about using game theory to solve Kuhn poker.
I did a quick google review of Kuhn poker and I don't see how any of that would not benefit from the understanding I was attempting to convey in my initial post.
Timing can be informative, but it's actually weaker online than in person. In person, you know whether the person is physically present, and can generally gauge when they're paying attention also. Online, taking a long time could simply mean that they're not paying attention. (I've watched streams on Twitch of pro players working multiple tables online.)