AlphaGo Zero.

It seems that every day you hear more and more about Artificial Intelligence. From speech recognition to genome mapping and most things in between, AI's potential is ever growing. But just how intelligent these systems really are is open to speculation.


Films like Ex Machina and I, Robot imagine a future in which sentient machines take over the world. It is true that AI can outperform humans at certain complex tasks, but these programs usually serve a single, narrow purpose and fall short of anything we would call "intelligent". Closing that gap was the main goal of the Alpha series of programs.

 

AlphaGo

It all started with AlphaGo. The project set out to create a computer program that would become the greatest player of the ancient Chinese board game Go. Go is considered one of the hardest games for AI to master: the sheer number of possible positions makes outcomes far harder to evaluate than in Chess, which by comparison is a much easier game for a computer to play well.
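To give a rough sense of the scale, here is a back-of-the-envelope comparison of the two game trees, using the commonly cited average branching factors (about 35 for Chess, about 250 for Go) and typical game lengths (roughly 80 and 150 moves):

```python
# Back-of-the-envelope game-tree sizes: branching_factor ** game_length,
# using commonly cited averages rather than exact figures.
chess_branching, chess_moves = 35, 80
go_branching, go_moves = 250, 150

chess_tree = chess_branching ** chess_moves
go_tree = go_branching ** go_moves

print(f"Chess game tree: roughly 10^{len(str(chess_tree)) - 1}")   # ~10^123
print(f"Go game tree:    roughly 10^{len(str(go_tree)) - 1}")      # ~10^359
```

Brute-force search that copes with the first number has no hope against the second, which is part of why Go resisted traditional game-playing programs for so long.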

AlphaGo was developed by DeepMind and is built on neural networks: technologies that try to mimic the human brain by connecting together a network of artificial nodes. A neural network models complex relationships between inputs and outputs in order to find patterns in data; DeepMind has also used related techniques to simulate aspects of human short-term memory.
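As a very rough illustration of the idea (and nothing like DeepMind's actual architecture), here is a tiny two-layer network that learns a simple input-to-output mapping by adjusting the connections between its nodes:

```python
import numpy as np

# A tiny two-layer network that learns an input -> output mapping (XOR here).
# This only illustrates "artificial nodes connected together"; AlphaGo's
# networks are vastly deeper and read full 19x19 board positions.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    hidden = sigmoid(X @ W1 + b1)       # the "nodes" in the middle of the network
    out = sigmoid(hidden @ W2 + b2)     # the network's output
    # Backpropagate the squared error to adjust every connection.
    d_out = (out - y) * out * (1 - out)
    d_hidden = (d_out @ W2.T) * hidden * (1 - hidden)
    W2 -= lr * hidden.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hidden
    b1 -= lr * d_hidden.sum(axis=0)

print(out.round(2))   # should approach [[0], [1], [1], [0]]
```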

The AlphaGo program was first trained using supervised learning: a machine learning technique that learns a mapping from inputs to outputs based on example input/output pairs. Essentially, a large dataset of previous high-level games was collected, giving the program a cheat sheet on how best to play the game, which strategies to use and where. Building on this, the program then used reinforcement learning, a method of discovering the ideal behaviour within a specific context, in order to maximise its performance.
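A toy sketch of that supervised step might look like the following, with synthetic stand-in data in place of the real professional game records and a single softmax layer in place of AlphaGo's deep convolutional policy network:

```python
import numpy as np

# Toy supervised "policy": a single softmax layer mapping board features to a
# move, trained on hypothetical (position, expert_move) records. AlphaGo's real
# policy network was a deep convolutional net trained on millions of positions
# from professional games; this only illustrates the training signal.
rng = np.random.default_rng(1)
n_features, n_moves, n_examples = 16, 9, 500

# Synthetic stand-in for the "cheat sheet": positions and the expert's chosen move.
positions = rng.normal(size=(n_examples, n_features))
true_W = rng.normal(size=(n_features, n_moves))
expert_moves = (positions @ true_W).argmax(axis=1)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

W, lr = np.zeros((n_features, n_moves)), 0.5
for _ in range(2000):
    probs = softmax(positions @ W)                     # P(move | position)
    grad = probs.copy()
    grad[np.arange(n_examples), expert_moves] -= 1.0   # cross-entropy gradient
    W -= lr * positions.T @ grad / n_examples

accuracy = (softmax(positions @ W).argmax(axis=1) == expert_moves).mean()
print(f"agreement with the 'expert' moves: {accuracy:.0%}")
```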

Computer simulation of branching architecture within neural networks.

All of this combined into a program that could calculate the probability of winning the game turn by turn, based on thousands of games played by some of the greatest human players. AlphaGo was able to determine the best course of action at any given moment and change its strategy accordingly.
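In spirit, move selection boils down to scoring candidate positions and keeping the most promising one. The sketch below shows a one-step version of that idea with a stand-in evaluator; the real AlphaGo goes much further, running Monte Carlo tree search guided by its policy and value networks:

```python
import random

# One-step lookahead with a win-probability estimate: score the position each
# legal move leads to and keep the best. The principle of "evaluate candidate
# positions, keep the most promising" is the same one AlphaGo applies, at far
# greater depth, inside its tree search.

def estimate_win_probability(position):
    """Stand-in evaluator; AlphaGo uses a trained value network here."""
    random.seed(hash(position))        # deterministic toy score per position
    return random.random()

def best_move(position, legal_moves, apply_move):
    scored = [(estimate_win_probability(apply_move(position, m)), m) for m in legal_moves]
    return max(scored)[1]

# Hypothetical usage, with strings standing in for board positions:
apply = lambda pos, move: f"{pos}+{move}"
print(best_move("start", ["A", "B", "C"], apply))
```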

In October 2015 AlphaGo beat Fan Hui, the reigning European Champion; it was the first time an AI had beaten a professional Go player. In March 2016 it beat Lee Sedol, one of the highest-ranked players in the world, 4-1 in a five-game series. Later, at the Future of Go Summit, AlphaGo won a three-game match against Ke Jie, who at the time was the world's top-ranked player.

 

Future of AI and Go

The AlphaGo project proved that AI could become better than human players at Go. However, it still required human input as the basis of its success: for all its complexity, the program still needed a cheat sheet in order to determine its own moves. From a human point of view this is not as intelligent as it first seems. The program can perform calculations far faster than a human, but given the same cheat sheet the program had, a strong human might also be able to best the top players.

 

Enter AlphaGo Zero

After the success of AlphaGo, DeepMind set out to create a similar program that used no human input/output data at all; instead, they built a program that learned solely by playing against itself. The program initially started off knowing nothing about the game besides the rules. It then used reinforcement learning, playing against itself until it was capable of anticipating its own moves and how those moves would affect the game's outcome as a whole. In three days AlphaGo Zero played 4.9 million games against itself, one after the other, developing the skills needed to beat human players in a few days rather than the few months it took AlphaGo.
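A heavily simplified sketch of that self-play loop, using a toy game (Nim) and a simple table of position values in place of AlphaGo Zero's deep network and tree search, might look like this:

```python
import random
from collections import defaultdict

# Toy self-play on Nim: remove 1-3 stones per turn, taking the last stone wins.
# Starting from nothing but the rules, the agent plays itself and nudges the
# value of each position it visited towards the eventual result. At vastly
# larger scale, with a deep network and tree search instead of a lookup table,
# this is the kind of loop AlphaGo Zero ran millions of times.

N_STONES, MOVES = 21, (1, 2, 3)
value = defaultdict(float)     # estimated result for the player about to move
EPSILON, LR = 0.1, 0.05

def choose(stones):
    legal = [m for m in MOVES if m <= stones]
    if random.random() < EPSILON:                       # occasionally explore
        return random.choice(legal)
    # Greedy: leave the opponent in the position we currently rate worst for them.
    return min(legal, key=lambda m: value[stones - m])

def self_play_game():
    history, stones, player = [], N_STONES, 0
    while stones > 0:
        history.append((player, stones))
        stones -= choose(stones)
        player ^= 1
    winner = player ^ 1                                 # whoever just moved won
    for who, pos in history:
        result = 1.0 if who == winner else -1.0
        value[pos] += LR * (result - value[pos])        # pull value towards the outcome

for _ in range(20000):
    self_play_game()

# Multiples of 4 are theoretically lost for the player to move,
# so their learned values should come out clearly negative.
print({pos: round(value[pos], 2) for pos in (4, 8, 12, 16, 20)})
```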

Graph of AlphaGo Zero's training time.

AlphaGo Zero was so powerful because it was “no longer constrained by the limits of human knowledge”.

DeepMind published a paper about AlphaGo Zero in October 2017; shortly afterwards, in December 2017, it released a preprint about AlphaZero, which within 24 hours achieved a superhuman level of play not only in Go but also in Chess and Shogi. Because AlphaGo Zero wasn't dependent on human knowledge, the program could be tailored to other games merely by changing the initial input of rules. After running for a few days it would quickly become the best player of those games.

All of these projects showed the potential of AI. Raw computational power has long been a given, since computers have been able to process data far more efficiently than humans for quite a while now. Speed, however, is a factor that doesn't necessarily spring to mind when thinking about AI: AlphaGo Zero was able to play millions of games, and learn from all of that information, in just a few days. This is the true power, and the real fear, of machine learning techniques.

 

AlphaGo Zero and open-source

Since the release of these programs there have been a few open-source implementations of the technology.

Minigo : https://github.com/tensorflow/minigo is a Python implementation of the AlphaGo Zero approach.

Alpha Zero General : https://github.com/suragnair/alpha-zero-general is another Python implementation of the AlphaZero approach. Unlike Minigo, Alpha Zero General can be adapted to play any game, provided you can code the rules.
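As a purely illustrative sketch (the repository's actual class and method names may differ), the kind of game interface such a framework asks you to implement looks something like this:

```python
from abc import ABC, abstractmethod

# Illustrative sketch only, not the repository's exact API: the training and
# search code talks to the rules exclusively through an interface like this,
# which is why plugging in a new game mostly comes down to coding its rules.

class Game(ABC):
    @abstractmethod
    def initial_state(self):
        """Return the starting position."""

    @abstractmethod
    def legal_moves(self, state, player):
        """Return the moves available to `player` in `state`."""

    @abstractmethod
    def next_state(self, state, player, move):
        """Apply `move` and return the resulting position and the next player."""

    @abstractmethod
    def outcome(self, state, player):
        """Return None while the game is still running, otherwise the result for `player`."""
```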

 
