Since the earliest days of computing, computers have been used to search out ways of optimizing known functions. Deep Blue’s approach was just that: a search aimed at optimizing a function whose form, while complex, mostly expressed existing chess knowledge. It was clever about how it did this search, but it wasn’t that different from many programs written in the 1960s.
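To make the search-and-optimize idea concrete, here is a minimal sketch of a depth-limited minimax search guided by a hand-written evaluation function. It is not Deep Blue's code. The game (players alternately take 1 to 3 stones, and whoever takes the last stone wins), the scores, and the names `evaluate`, `minimax`, and `best_move` are all assumptions chosen for illustration. The human knowledge is encoded in `evaluate`, and the search simply looks for the move that optimizes it.

```python
def evaluate(stones, maximizing_to_move):
    """Hand-written heuristic, scored from the maximizing player's side."""
    if stones == 0:
        # The player who just moved took the last stone and won.
        return -1.0 if maximizing_to_move else 1.0
    # Encoded human knowledge (an assumption of this toy game): the player
    # to move is in trouble when the pile is a multiple of 4.
    mover_is_winning = stones % 4 != 0
    score = 0.5 if mover_is_winning else -0.5
    return score if maximizing_to_move else -score


def minimax(stones, depth, maximizing_to_move):
    """Search `depth` plies ahead, then fall back on the heuristic."""
    if stones == 0 or depth == 0:
        return evaluate(stones, maximizing_to_move)
    values = [
        minimax(stones - take, depth - 1, not maximizing_to_move)
        for take in (1, 2, 3)
        if take <= stones
    ]
    return max(values) if maximizing_to_move else min(values)


def best_move(stones, depth=4):
    """Pick the move whose resulting position scores best for the mover."""
    moves = [take for take in (1, 2, 3) if take <= stones]
    return max(moves, key=lambda take: minimax(stones - take, depth - 1, False))
```

With 7 stones, `best_move(7)` takes 3 and leaves the opponent a pile of 4. The search contributes nothing beyond what the evaluation function and the rules already imply, which is the sense in which this style of program expresses existing knowledge.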
AlphaGo also uses the search-and-optimization idea, although it is somewhat cleverer about how it does the search. But what is new and unusual is the prior stage, in which it uses a neural network to learn a function that helps capture some sense of good board position. It was by combining those two stages that AlphaGo became able to play at such a high level.
This ability to replicate intuitive pattern recognition is a big deal. It’s also part of a broader trend. In an earlier paper, the same organization that built AlphaGo, Google DeepMind, described a neural network that learned to play 49 classic Atari 2600 video games, in many cases reaching a level that human experts couldn’t match. The conservative way to solve this problem with a computer would be in the style of Deep Blue: a human programmer would analyze each game and work out detailed control strategies for playing it.
By contrast, DeepMind’s neural network simply explored lots of ways of playing. Initially, it was terrible, flailing around wildly, rather like a human newcomer. But occasionally the network would accidentally do clever things. It learned to recognize good patterns of play — in other words, patterns leading to higher scores — in a manner not unlike the way AlphaGo learned good board position. And when that happened, the network would reinforce the behavior, gradually improving its ability to play.
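The explore-then-reinforce loop can be sketched with tabular Q-learning, a much simpler relative of the neural-network approach described above. This is an illustrative stand-in, not DeepMind's system: the six-square corridor "game", the reward of one point for reaching the last square, and every name and parameter value below are assumptions. The agent starts out moving at random, and each time a move turns out to lead toward a higher score, its estimate of that move's value is nudged upward.

```python
import random

N_STATES = 6        # squares 0..5; reaching square 5 scores a point
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Apply a move and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the length of one episode
            # Explore at random some of the time, or when nothing is known yet.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Reinforce: move the estimate toward reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best known move for each non-terminal square."""
    return [ACTIONS[0] if left > right else ACTIONS[1] for left, right in q[:-1]]
```

After training, `greedy_policy(train())` steps right from every square. No one told the agent that moving right is good; it discovered this from the scores alone.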