The first ‘big’ AI moment that made the world sit up and take note was Deep Blue defeating chess grandmaster Garry Kasparov back in 1997. However, it was acknowledged that Deep Blue had to be pre-programmed with huge amounts of data and rules on how to use that data. It was an impressive use of AI, but one limited to a very confined context. The next headline-grabbing demonstrations of leaps in AI technology didn’t come until the present decade. IBM’s ‘Watson’ defeated two of the world’s best Jeopardy! players in 2011, and in 2016 Lee Sedol, a grandmaster of the ancient Chinese strategy game ‘Go’, lost to AlphaGo, an AI opponent.
However, the pace of development in AI was clearly picking up when, earlier this year, Libratus, a creation of Carnegie Mellon University computer scientists, defeated four professional no-limit Texas Hold’em poker players. Poker combines playing to statistical odds with the ‘soft’ psychological element of bluffing, which makes it a much more rigorous testing ground for AI than the other games in which AI has defeated the best human adversaries.
The achievement of another AI, however, has now surpassed even Libratus. AlphaGo Zero (AGZ), a second-generation version of the AlphaGo AI that defeated grandmaster Lee Sedol in 2016, defeated its predecessor in all 100 games of a 100-game series. While it might be expected that a next-generation AI would be superior to its predecessor, it is the way in which it won that is raising eyebrows around the world.
AlphaGo was programmed with data from thousands of games of ‘Go’ before taking on Sedol. AGZ started with nothing more than the rules of the game pre-programmed. Starting from a clean slate, the AI trained itself by playing against itself. It did not have access to a single historical game of Go. With no supervision or influence from human data, the system self-improved and optimised its own ‘brain’. AGZ was essentially its own teacher.
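To make the idea of learning from nothing but the rules more concrete, here is a minimal Python sketch of self-play on a toy game. It is not DeepMind’s method: AGZ pairs deep neural networks with tree search, whereas this sketch uses a simple lookup table. The game (players alternately take one, two or three stones from a pile, and whoever takes the last stone wins), the function names and every parameter value are assumptions chosen purely for illustration.

```python
import random


def legal_moves(stones):
    """The only built-in knowledge: take 1, 2 or 3 stones, never more than remain."""
    return [m for m in (1, 2, 3) if m <= stones]


def self_play_train(start=10, episodes=50000, epsilon=0.1, lr=0.02, seed=0):
    """Learn move values purely by playing the toy game against itself."""
    rng = random.Random(seed)
    q = {}  # (stones_left, move) -> estimated value for the player making the move
    for _ in range(episodes):
        stones, history = start, []
        while stones > 0:
            moves = legal_moves(stones)
            if rng.random() < epsilon:
                move = rng.choice(moves)  # occasionally explore a random move
            else:
                move = max(moves, key=lambda m: q.get((stones, m), 0.0))
            history.append((stones, move))
            stones -= move
        # Whoever made the last move took the last stone and won.
        # Walk back through the game, alternating +1 (winner) and -1 (loser).
        outcome = 1.0
        for state_move in reversed(history):
            old = q.get(state_move, 0.0)
            q[state_move] = old + lr * (outcome - old)
            outcome = -outcome
    return q


def best_move(q, stones):
    """The move the trained table currently rates highest."""
    return max(legal_moves(stones), key=lambda m: q.get((stones, m), 0.0))


if __name__ == "__main__":
    table = self_play_train()
    for pile in (5, 6, 7, 10):
        print(pile, "stones left -> take", best_move(table, pile))
```

No example games are supplied at any point. The table starts empty and is filled in only from the results of games the program plays against itself, which is the sense in which such a system acts as its own teacher. For this toy game the well-known winning strategy is to leave the opponent a multiple of four stones, so the learned moves can be checked against it.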
Within a few days of self-learning, AGZ was able to surpass the limits of human knowledge by ‘an order of magnitude’. The AI was then able to teach human Go players new strategies and moves that they would either never have arrived at themselves or would have reached only after a great deal of time.
It is the efficiency of the self-learning process that DeepMind (a Google-owned company), the creators of AGZ, as well as academic observers, believe has a significance that will be seen in our daily lives in the future. There are specifics to the way AGZ works that mean its application is still limited for now. It works for problems in which there is a finite number of possible actions and the environment has clearly defined ‘rules’.
AGZ does, however, mark a move away from the pattern of training a model to imitate a batch of human-labelled data.
In the future, if AGZ’s human-bias-free self-learning capabilities can be merged with other forms of AI, such as machine learning and evolutionary algorithms, so that it is more adaptable to imperfect physical environments, we will be getting close to humanoid AI. While we’re still some way away from ‘The Singularity’, the mythical moment at which artificial intelligence becomes an independent, ‘sentient’ being, recent developments such as AGZ demonstrate that it may not be outside the realms of possibility within our lifetimes.