Google’s AlphaGo can now teach itself from scratch to beat humans
It’s official: artificial intelligence no longer needs to learn from humans. For the first time, a neural network has taught itself how to play a game entirely on its own, assembling in just a few days knowledge that took humans thousands of years to accumulate.
A new and improved version of Google’s AlphaGo has taught itself to beat humans at the game of Go. Called AlphaGo Zero, it is the first artificial intelligence to become better than humans at the strategy game without any human input, starting from a blank slate.
The team behind the AI said this breakthrough means AlphaGo is no longer “constrained by the limits of human knowledge”.
AlphaGo became the overall champion after winning three matches against the world’s best Go player earlier this year. The AI had been trained using a combination of supervised learning, based on millions of human moves by both amateur and expert players, and reinforcement of what it had learnt by playing against itself. This training required many machines running over several months and used 48 specialised neural-network chips called tensor processing units (TPUs).
The ancient game of Go is played by two players using a board and pieces called stones. Stones are placed on the intersections of the lines on the board, and the aim is to surround more of the board than your opponent. There are ten to the power of 170 possible board configurations in Go – more than the number of atoms in the known Universe.
AlphaGo Zero was given no supervised training based on expert moves. Instead, it trained purely by playing against itself, starting with completely random moves. With each game of self-play, it got a little better.
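To give a flavour of how self-play learning works, here is a minimal sketch in Python. It is not AlphaGo Zero’s actual algorithm (which combines a deep neural network with Monte Carlo tree search); it is a toy tabular value-learning loop on the simple game of Nim, chosen for illustration. Both “players” share one value table, start by moving at random, and improve purely from the win/lose outcomes of their own games.

```python
import random
from collections import defaultdict

# Toy illustration of self-play reinforcement learning (not AlphaGo's
# actual method): tabular learning on Nim. Two players alternately take
# 1 or 2 stones from a pile; whoever takes the last stone wins.
random.seed(0)

PILE, ALPHA, EPSILON, GAMES = 10, 0.5, 0.1, 20000
Q = defaultdict(float)  # Q[(pile, move)] = estimated value for the mover

def best_move(pile):
    moves = [m for m in (1, 2) if m <= pile]
    return max(moves, key=lambda m: Q[(pile, m)])

def play_one_game():
    history = []  # (pile, move) pairs, players alternating
    pile = PILE
    while pile > 0:
        moves = [m for m in (1, 2) if m <= pile]
        # Epsilon-greedy: mostly play the best-known move, sometimes explore.
        move = random.choice(moves) if random.random() < EPSILON else best_move(pile)
        history.append((pile, move))
        pile -= move
    # The player who took the last stone wins. Walking backwards through
    # the game, reward alternates +1 (winner's moves) / -1 (loser's moves).
    reward = 1.0
    for state_move in reversed(history):
        Q[state_move] += ALPHA * (reward - Q[state_move])
        reward = -reward

for _ in range(GAMES):
    play_one_game()

# After training from random play, the table should encode the known
# winning strategy: always leave your opponent a multiple of 3 stones.
print(best_move(10))  # optimal play takes 1, leaving 9
```

The key point mirrored from AlphaGo Zero is that no expert examples appear anywhere: the only training signal is who won each self-played game.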
Instead of 48 TPUs, AlphaGo Zero used only four, and ran on a single machine.
“It’s amazing to see just how far AlphaGo has come in only two years,” said Demis Hassabis, co-founder of DeepMind, the team behind the AI. “AlphaGo Zero is now the strongest version of our program and shows how much progress we can make even with less computing power and zero use of human data.”
This demonstrates how effective the technique of ‘reinforcement learning’ can be for artificial intelligence. “The results suggest that AIs based on reinforcement learning can perform much better than those that rely on human expertise,” said Satinder Singh, from the University of Michigan, who was not involved in the study. “Indeed, AlphaGo Zero will probably be used by human Go players to improve their gameplay and to gain insight into the game itself.”
Singh does not think we should be worried about the increasing abilities of artificial intelligence compared to what we can do as humans.
“Yes, another popular and beautiful game has fallen to computers, and yes, the authors’ reinforcement-learning method will be applicable to other tasks,” he says. “However, this is not the beginning of any end because AlphaGo Zero, like all other successful AI so far, is extremely limited in what it knows and in what it can do compared with humans and even other animals.”
But the DeepMind team thinks that, one day, this development might lead to AIs being able to solve scientific problems that we, as humans, are unable to tackle. “Ultimately we want to harness algorithmic breakthroughs like this to help solve all sorts of pressing real world problems like protein folding or designing new materials,” said Hassabis. “If we can make the same progress on these problems that we have with AlphaGo, it has the potential to drive forward human understanding and positively impact all of our lives.”