However, I also wish that there were comparable efforts to create AIs that train humans. Basically, figure out a way to systematically, efficiently, and scalably train amateurs into masters. That IMO would be absolutely amazing (and something I'd gladly pay for).
Computers can do a lot of accurate brute-forcing; humans must see the position in a more holistic, intuitive way.
Excellent human players and excellent computer players are presumably doing completely different calculation tasks.
I would suggest that computers are still bad at approaching the task in a human-like way, but they will always be able to improve their methods via Moore's law (at a minimum), whereas humans are stuck at their current level.
Stockfish might be able to tell you what it was doing, but not in a way that it would be reasonable for a human to follow.
What you are looking for is a teacher. You can pay for those ;) The closest we have got to a teacher app is perhaps a MOOC: not much computational progress has been made.
I (a non-chess player) just tried to play a game against the highest level AI (and lost, obviously).
Doing an analysis of the game afterwards, this is exactly what I experience: I do "f4" and I'm told (through the analysis tool) that the best move was "Nf3".
Now, the obvious question this leads to is: why? Why was this a better move? I don't think that, as a human being, memorizing "best moves" is going to lead to much improvement: we need to know WHY that move was the best move.
I'm sure there is a human-friendly way to explain why one move is the best move, and why my move wasn't, but the computer probably doesn't know this explanation, because it's approaching it from a brute-force perspective.
Surely, a chess computer can brute-force all possible combinations and deduce that this was the best move. But when this is not possible for a human being, just informing me that "what you just did was not the best move" doesn't really do much to help me (as an amateur player).
The game, for reference: http://en.lichess.org/ehWjHnIc#0
Looking at the code posted, if the sub-scores were stored in an array and only added at the end, it would be possible to compare the positions after two moves sub-score by sub-score and find the biggest differences between them. Then you could say that position A is better than position B because it avoids doubled pawns, or has better bishops, etc.
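A minimal sketch of that idea in Python, assuming the engine's evaluation terms have been extracted into per-feature score tables (the feature names and centipawn values below are invented for illustration, not taken from any real engine):

```python
# Compare two candidate positions by evaluation sub-score and report the
# features that differ most, as a human-readable "why" for the better move.

def explain_difference(pos_a, pos_b, top_n=2):
    """pos_a, pos_b: dicts mapping feature name -> centipawn sub-score.
    Returns the top_n features with the largest absolute difference."""
    diffs = {f: pos_a.get(f, 0) - pos_b.get(f, 0)
             for f in set(pos_a) | set(pos_b)}
    return sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_n]

# Hypothetical sub-scores for the positions after two candidate moves
after_nf3 = {'material': 0, 'pawn structure': 10, 'king safety': 25, 'mobility': 15}
after_f4  = {'material': 0, 'pawn structure': -5, 'king safety': -20, 'mobility': 12}

for feature, delta in explain_difference(after_nf3, after_f4):
    print(f"Nf3 is better on {feature} by {delta} centipawns")
```

The same ranking could drive a canned-text generator ("this avoids weakening your king"), which is about as far as a hand-written evaluation function can go toward a human explanation.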
The neurological topiary for chess is strictly harder, because chess has fewer analogues in day-to-day experience.
It's not a huge leap from modelling the game of chess to modelling the skill of chess-playing. Stockfish already offers post-game analysis. Imagine extending that to analyze a player's entire career and offer a series of problems, games against specially constructed opponents, analysis of relevant historical games, etc., all designed to help the player improve. Given Siri, Google Now, Watson and so on, we're probably not far from being able to have a meaningful natural-language conversation with a computer on a narrow subject like chess.
One could imagine this kind of thing being extended to teaching other, similarly focussed skills, at least for beginners. Piano. Tennis. Rock climbing. Maybe even things like soccer and basketball.
But that kind of teaching-based-on-deep-analysis is a long way off for subjects like physics or electrical engineering. Computers can't do physics, much less evaluate human physicists. The best we can do in these areas is something like the Khan Academy, where computers present "content" created by humans, administer standardized tests designed by humans and present the results to humans for interpretation.
So yeah, teaching chess in a really sophisticated way isn't all that useful in the sense that physics or EE are useful. But really, if we could teach computers to understand physics better than people do, we'd use them to make breakthroughs in physics, and that would be a much bigger deal than being able to teach physics more effectively.
On the other hand, we don't play chess or tennis or piano because they're useful, so expert AI teachers for these subjects would be really valuable.
Why do you need an AI? I don't play chess, but I suppose the above is more or less what an elite chess school provides, and you could likely reproduce it with books + practice + private lessons. That is, what Ericsson calls deliberate practice and coaching.
Real players at each level tend to have characteristic weaknesses that are expressible in terms of the same factors used in a program's evaluation function. Consider some of the following, which players at a certain level will exhibit and then get past as they improve.
* Bringing the queen out too early.
* Missing pins and discoveries.
* Failing to contest the center.
* Creating bad bishops.
* Bad pawn structure.
* Failing to use the king effectively in the endgame.
These are all quantifiable. They could all be used to create a more realistic and satisfying opponent at 1200, 1500, 1800, etc. All it takes is some basic machine learning applied to a corpus of lower-level games, and a way to plug the discovered patterns into the playing engine.
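As a toy sketch of that idea, here is one such feature ("bringing the queen out too early") measured over a corpus, assuming games are available as simple (rating, move-list) pairs. The move-list format, the move-6 threshold, and the sample data are all invented for illustration:

```python
# Toy sketch: measure how often "queen out too early" occurs per rating band,
# given games as (player rating, list of SAN move strings) pairs.

def queen_out_early(moves, threshold=6):
    """True if a queen move occurs within the first `threshold` full moves."""
    for i, move in enumerate(moves):
        if move.startswith('Q') and i // 2 < threshold:
            return True
    return False

def weakness_rate(games, band):
    """Fraction of games in the rating band exhibiting the weakness."""
    lo, hi = band
    in_band = [moves for rating, moves in games if lo <= rating < hi]
    if not in_band:
        return 0.0
    return sum(queen_out_early(m) for m in in_band) / len(in_band)

# Hypothetical corpus
games = [
    (1150, ['e4', 'e5', 'Qh5', 'Nc6']),   # early queen sortie
    (1210, ['d4', 'd5', 'Nf3', 'Nf6']),
    (1850, ['c4', 'e5', 'Nc3', 'Nf6']),
]
print(weakness_rate(games, (1000, 1400)))
```

Fit the per-feature rates at each rating band, and you have a crude statistical profile of a 1200 or 1500 player that an engine could be biased toward imitating.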
The other problem is that the devs of the current top Scrabble engines, Quackle and Elise, are (naturally) focused on getting better and better at playing, not on plausible ways to play badly. It's something I keep meaning to work on when I have some spare time; I have a few ideas, but nothing I've had the time to explore properly.
But it's not hard to get the thing to use xboard in linux.
This is what I use to invoke it: xboard -fUCI -fcp stockfish -sUCI -scp stockfish &
It's very challenging, and in my experience it will unfortunately peg a core unless you explicitly pause the front end (the "P" button between the two arrows in the upper right).
But the xboard way allows me to modify the engine, get debug output, try various different weights for pieces, put two engines against each other to see how specific features affect game play...
It helps me demystify such sophisticated software.
From a statistical point of view, this isn't actually significant, despite the fact that draws help reduce the variance.
45 of those games are draws, leaving a 13-6 score in favor of Stockfish. Considering a null hypothesis of a binomial distribution with n=19 and equal chance of winning, the two-sided p-value for that score is 0.115. Unless you already have strong evidence that Stockfish is better than Komodo, you shouldn't conclude anything about which one is best.
Suppose you have a coin, which gives a random outcome X. But you can only observe the outcome of X when another independent binary random variable Y is true. How can you tell if X is biased? Since X and Y are independent, the observations where Y is false are irrelevant since they don't tell you anything about X. So you just keep the observations where Y is true, and from there you can apply a binomial statistical test to the observations of X.
[ In case you're wondering whether applying statistical tests to variable sample sizes is valid, the answer is yes: a p-value is a uniform random variable from the set of observables (augmented by a continuous random variable, since our set of observables is discrete) to [0,1]. Our p-value is a mixture of p-values on smaller sample sizes, so it is still uniform. ]
This is exactly what happens here: consider a random outcome {win,lose,draw}. If you don't have a draw, let Y be true and X be the outcome of the game. If you have a draw, let Y be false and X be a random coin with the same distribution as for non-drawn games. Then X and Y are independent random variables and the above applies.
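For concreteness, a minimal Python sketch of that recipe: discard the drawn games, then apply an exact two-sided binomial test to the decisive ones. The match outcomes below are made up for illustration:

```python
from math import comb

def binom_two_sided(k, n, p=0.5):
    """Exact two-sided binomial p-value: sum the probabilities of all
    outcomes no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = probs[k]
    return min(1.0, sum(q for q in probs if q <= observed + 1e-12))

# Outcomes of a short match; draws (Y false) are discarded before testing.
outcomes = ['w', 'd', 'w', 'l', 'w', 'd']
decisive = [o for o in outcomes if o != 'd']
wins = decisive.count('w')
print(binom_two_sided(wins, len(decisive)))
```

Only the decisive games enter the test, exactly as the X/Y argument above prescribes; the sample size n is itself random, which the bracketed aside below justifies.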
Informally: draws are not useful information in determining whether there are more wins than losses.
The draws are at best evidence of equality, not against it. Allow them to vary, and the likelihood of seeing a lead of 7 wins in 64 games with 45 draws moves up to 0.13, or 13%, under the appropriate null hypothesis that the two players are identical (even less significant). So in about one tournament in 8 you would expect this much of a lead, even if it were one algorithm playing itself. From one tournament we can say it is likely that one algorithm is in fact better, but it doesn't rise to the standard of being statistically significant.
<code>
# R code to empirically estimate the two-sided probability of
# seeing a lead of 7 games when 64 games are played,
# the assumed probability of a draw is 45/64,
# and the null assumption is that win/loss odds are equal
simulate <- function(nplay, ndraw) {
  sample(c('w','d','l'), size=nplay, replace=TRUE,
         prob=c((nplay-ndraw)/2, ndraw, (nplay-ndraw)/2)/nplay)
}
wldiff <- function(v) { abs(sum(v=='w') - sum(v=='l')) }
set.seed(350920)
stats <- replicate(10000, wldiff(simulate(64,45)))
print(sum(stats>=13-6)/length(stats))
## [1] 0.1341
</code>
(It is weird that somebody, not me, created a throwaway account to make the original comment. Likely they are involved in chess development, or know how quickly stats discussions go sideways.)
If this is two humans playing against each other in real time, that is really impressive! It's so mind-bogglingly fast (mind you, I'm not a chess player).
Can anyone inform an amateur like me what happens at 2:14 (https://www.youtube.com/watch?v=7YWYS209ydE#t=134) and before that as well. It seems to me that black captures his own pawn using his knight. Is that legal?
EDIT: Ah, I see. White pawn captures black pawn, and the knight captures the white pawn. It just happens so fast I didn't see the white move.
Can someone explain what's going on at 20:24 : http://youtu.be/7YWYS209ydE?t=20m20s ?
He forks the king and the queen, then his opponent moves his queen!?! That's not a legal move, right?
A year ago Stockfish was near the top, but not quite there. Since then its development model became more open, and the open testing framework was introduced. The rate at which it's been adding Elo rating points since then has been amazing.
that seems insane.
You'd learn more by studying a book on AI, and maybe trying to implement an engine yourself, IMO.
The difference in score between the top engines is not that big.
The biggest value in studying computer chess for a programmer is IMO in seeing all the different performance optimization tricks.
https://www.youtube.com/watch?v=pNvVWeHZG00
As you can tell from his commentary, he thinks the machine is rather stupid (and he is winning after the opening) but the machine is a much better calculator than he is and when the situation becomes more concrete he has to force a draw.
Of course, the computer on his phone is considerably worse than the best chess engines, but top chess players generally consider computers to be excellent calculators but dumb in terms of general strategy.
Is computer-assisted chess a thing? Perhaps with standardized hardware, but any software allowed.
In chess I'm good at strategy but terrible at calculating and I miss obvious stuff all the time in my fight for strategy. I always thought I'd do great with computer assistance to look for the obvious stuff, and me telling the computer the long term strategy.
> That either shows a profound understanding... or a profound lack of understanding. ... ROOK B6! He has NO SHAME
As far as I understand the app (whose avatar's age goes up to 23) actually plays better than the living world champion?
— Magnus Carlsen, in an interview with Peter Thiel at the Churchill Club: https://www.youtube.com/watch?v=ZBnSU-LX1ss&t=23m51s
It's fantastic to see it take on the best commercial engines toe-to-toe and come out on top.
These ratings are computer ratings, and because the engines play very much in isolation (computers vs. other computers), without enough encounters against humans, the ratings are consistent within the computer sphere but not really like-for-like with human rating lists.
A lot has happened since Deep Blue - that was just a 6-game match, which really only proved that machines can play near grandmaster level, but without nerves. And at times that is good enough to beat a world champion in match conditions, because the human is more susceptible to psychological pressure.
Since Deeper Blue we've seen the emergence of chess engines on personal computers (rather than racks and racks of RS6000 mid-range servers), so Chess Genius, Fritz, Shredder, Junior, Rybka, Houdini and now Stockfish pushing chess engines ahead.
The Chess Genius - Junior levels show chess engines capable of matching grandmasters at blitz/active-play speeds, but still struggling at slower time controls.
Rybka onwards shows computer engines that seriously challenge elite grandmasters.
Chess hasn't been solved. Just watching Magnus Carlsen's play shows we are expanding chess knowledge incrementally, one game at a time. Yes, progress has slowed down, but the incremental improvements are still there.
Chess technique is refined to such a super high degree, but still far short of "solving chess". Perhaps this current generation can beat grandmasters when psychological factors are minimised, it's hard to tell.
It's certainly not as clear cut as a Porsche vs Usain Bolt over 100 meters.
1) Computer vs. computer games can still go both ways, so clearly the mistakes one chess engine makes, others can exploit; 2) Some sequences are still hard for computers to find, e.g. [1]. The long king walk by White is beyond the computer's horizon. 3) Computer + man vs. computer + man correspondence matches do benefit from human intervention.
[1]: http://www.chessgames.com/perl/chessgame?gid=1124533&kpage=5
For example, it may be reasonable to play an aggressive variation against an opponent because you think they might have difficulty finding a response in time pressure, but a computer can make precise calculations in any situation, and so such a strategy almost always backfires.
What's more, chess is often abstracted at higher levels in terms of things like long term plans ("I want to put pressure on the c7 pawn and control c6") instead of concrete material gains, which allows players to make progress even in positions where there are no direct threats and no sensible exchanges of pieces. Computers when faced with such situations will know that the position is objectively drawn and so will just shuffle their pieces around aimlessly, and playing against this kind of thing is not very good practice for actual human opponents who will try to find ways to beat you anyway.
Computers don't play like humans, and they are consistent in their weaknesses as well as their strengths. As players are more exposed to playing against various chess programs, they learn what positions computers are not so good at (long-range strategical themes, beyond the typical move horizon). A local optimisation, of sorts.
So players start adopting an anti-computer style of play, which requires a different mind set to pull off. That in turn affects their natural style of play.
So I argue the opposite: when preparing for top-flight tournaments, players probably avoid or severely limit playing against chess engines, to keep this anti-computer bias out of their natural play.
Sure, the engines are good for checking variations and analysis, as an assistant. But not as a leader, nor as an opponent.
As mentioned elsewhere in this thread, Stockfish is just an engine - you must install a GUI separately. XBoard is well known, but there are better alternatives:
[1] https://github.com/mcostalba/Stockfish/blob/master/src/evalu...
There is also a strong bias towards simplification, so if an evaluation feature is not proven to be an improvement it will be removed. Over the last few Stockfish versions, the # of lines has actually decreased in each version.
[1] http://en.wikipedia.org/wiki/Simultaneous_perturbation_stoch... [2] http://tests.stockfishchess.org/tests
Is there a Turing Test for computer chess, where humans and computers play each other and they, and commentators, analyse the play, but no-one knows who is a computer or human until after the commentary is published?
And if we ignore humans, are people playing computers against other computers for some kind of machine-learning play?
And how optimized for speed is the software? Do they really crunch out all performance they can?
(Sorry for the barrage of questions but I don't know enough about this space to do efficient websearches).
Depends on what you mean by dull. Computers play so well that it tends to be a complete mop-up regardless of any "anti-computer" strategies people may try. The dull aspect is the one-sidedness. What's hard to do is make computers play weakly in a human-like way. Lobotomized computers tend to make very inhuman blunders.
> Is there a Turing Test for computer chess, where humans and computers play each other and they, and commentators, analyse the play, but no-one knows who is a computer or human until after the commentary is published?
Not that I've ever seen, though it still wouldn't be much of a challenge. Computer moves are pretty easy to spot--generally unintuitive or seemingly paradoxical moves that have a very concrete justification. Especially, as I said above, if they were set to play at a weaker level.
> And if we ignore humans, are people playing computers against other computers for some kind of machine-learning play?

I'll let others answer this, but it would surprise me if nobody was. Still, improving evaluation heuristics and analysis efficiency seems to be yielding better results than just machine learning.
> And how optimized for speed is the software? Do they really crunch out all performance they can?

Yes. The more positions you can examine per second, the deeper your search tree can go, and the better your evaluations, etc.
Without knowing anything about chess, I'd say that makes humans sound like sore losers. I wonder if this will lead to tournaments where you don't get points for winning matches but rather get judged on style by a panel of (human) judges...
In other words, modern humans have been playing in dull, more reliable ways for over a century, so even if not sore loserish, it's at least a bit like the pot calling the kettle black.
On my first try I managed to draw the highest AI level, rated at 2510, while my rating is under 2000 irl. (I was unable to find an "offer draw" button so relied on the 50-move rule) Against stockfish running on my PC that would be impossible. http://en.lichess.org/reDfuSvI
I estimate that it would take me six months of work to get to the top-20 in the world, and I don't see how I can justify that work to myself.
Your comment seems either ignorant of the amount of work involved, or very arrogant about your skill, but does not provide any information for me to judge.
Why do you think you could beat even rybka with just six months of work?
I was a half-pro ~10 years ago. I won the championship of my country. I wrote an M.Sc. thesis on evaluation tuning. If I used the standard approach I already know, 6 months is a conservative estimate. The question is - what for?