[LINK] The other chess

Fri Dec 8 12:27:35 AEDT 2017

On December 5 the DeepMind group published a new paper on arXiv, the
preprint server hosted by Cornell University, called "Mastering Chess and
Shogi by Self-Play with a General Reinforcement Learning Algorithm", and the
results were nothing short of staggering. AlphaZero had done more than just
master the game; it had attained new heights in ways considered
inconceivable. The proof is in the pudding of course, so before going into
some of the fascinating nitty-gritty details, let’s cut to the chase. It
played a match against the
latest and greatest version of Stockfish, and won by an incredible score of
64 : 36, and not only that, AlphaZero had zero losses (28 wins and 72
draws)!
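
As a quick check of the arithmetic behind that score, here is a minimal
sketch assuming the usual chess scoring of 1 point for a win and half a
point for a draw. The function name match_score is only illustrative and
does not come from the paper or the article.

```python
def match_score(wins, draws, losses):
    """Return (our_points, opponent_points) under standard chess scoring.

    Assumes 1 point per win, 0.5 per draw and 0 per loss, so the two
    totals always add up to the number of games played.
    """
    games = wins + draws + losses
    points = wins + 0.5 * draws
    return points, games - points


# AlphaZero's reported result: 28 wins, 72 draws, no losses.
alphazero, stockfish = match_score(28, 72, 0)
print(f"{alphazero:g} : {stockfish:g}")  # prints "64 : 36"
```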

Stockfish needs no introduction to ChessBase readers, but it's worth noting
that it was searching nearly 900 times faster than AlphaZero! Indeed,
AlphaZero was calculating roughly 80 thousand positions per second, while
Stockfish, running on a PC with 64 threads (likely a 32-core machine), was
running at 70 million positions per second. To better understand how big a
deficit that is, if another version of Stockfish were to run 900 times
slower, this would be equivalent to searching roughly 8 moves less deep.
How is this possible?
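
The speed figures can be checked with a few lines of arithmetic. The sketch
below also shows one common back-of-the-envelope way to turn a slowdown into
lost search depth. The article does not say how it arrived at "roughly 8
moves", so the model used here (node count growing as ebf ** depth) and the
effective branching factors tried are assumptions, not the author's method.

```python
import math

# Figures quoted in the article (positions per second).
ALPHAZERO_NPS = 80_000
STOCKFISH_NPS = 70_000_000

# 70 million / 80 thousand = 875, i.e. "nearly 900 times".
speed_ratio = STOCKFISH_NPS / ALPHAZERO_NPS


def plies_lost(slowdown, ebf):
    """Depth (in plies) lost when an engine searches `slowdown` times slower.

    Assumes the number of positions searched grows as ebf ** depth, where
    ebf is a hypothetical effective branching factor of the search.
    """
    return math.log(slowdown) / math.log(ebf)


print(f"speed ratio: {speed_ratio:g}")
# The branching factors below are illustrative guesses, not measured values.
for ebf in (2.0, 2.5, 3.0):
    print(f"ebf={ebf}: about {plies_lost(900, ebf):.1f} plies shallower")
```

Under this model, the lower the effective branching factor, the more depth a
given slowdown costs, which is why the estimate is sensitive to that
assumption.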

...

In other words, instead of a hybrid brute-force approach, which is the core
of chess engines today, AlphaZero went in a completely different
direction, opting for an extremely selective search that emulates how
humans think. A top player may be able to outcalculate a weaker player in
both consistency and depth, but it still remains a joke compared to what
even the weakest computer programs are doing. It is the human’s sheer
knowledge and ability to filter out so many moves that allows them to reach
the standard they do. Remember that although Garry Kasparov lost to Deep
Blue, it is not clear at all that it was genuinely stronger than him even
then, and this was despite reaching speeds of 200 million positions per
second. If AlphaZero is really able to use its understanding to not only
compensate for searching 900 times fewer positions, but to outperform an
engine that searches them all, then we are looking at a major paradigm shift.

https://en.chessbase.com/post/the-future-is-here-alphazero-learns-chess