As AI technology develops quickly and ever more impressive applications are achieved, a related phenomenon is also growing – the question of whether a system is too capable to be released. This week OpenAI, a non-profit AI research organisation financially backed by the great and good of the tech sector, from Elon Musk to Sam Altman of the Y Combinator investment firm, announced a break from its usual convention of sharing its intellectual property (IP).
OpenAI generally releases its research to the public, on the philosophy that its open-source approach will help speed up the wider development of AI technology. However, the organisation has concluded that it should keep its latest AI model, GPT2, under wraps pending further discussion of the potential ramifications of the breakthrough it represents.
GPT2 is a text generator, but one that shows a level of sophistication significantly beyond any previous AI text generator. It works by being provided with a text sample, which can range from just a few words to several pages; the AI then completes or extends the text. Given a newspaper headline, for example, it can produce a complementary article. This is not new for AI, but existing algorithms struggle to maintain syntax and the kind of consistency of structure, flow and content that would pass for a human writer. Apparently not so GPT2.
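The completion loop described above – take the text so far, predict the next word, append it, repeat – can be sketched in a few lines. Here a tiny bigram table stands in for GPT2's vastly more powerful neural network; the corpus and function names are invented for illustration only.

```python
import random

def train_bigrams(corpus):
    """Count which word follows which in the training text."""
    table = {}
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        table.setdefault(prev, []).append(nxt)
    return table

def complete(table, prompt, n_words=5, seed=0):
    """Extend the prompt one word at a time, like a text generator."""
    rng = random.Random(seed)
    words = prompt.split()
    for _ in range(n_words):
        candidates = table.get(words[-1])
        if not candidates:  # no known continuation for this word
            break
        words.append(rng.choice(candidates))
    return " ".join(words)

corpus = "the pm said the deal is done and the deal is fair"
table = train_bigrams(corpus)
print(complete(table, "the deal", n_words=3))
```

The real model conditions each prediction on the entire preceding text rather than just the last word, which is what lets it keep structure and flow coherent over whole paragraphs.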
The Guardian newspaper, presumably given access to the algorithm, fed in the first few paragraphs of one of its own articles on the topic of Brexit. Its journalist was impressed:
‘its output is plausible newspaper prose, replete with “quotes” from Jeremy Corbyn, mentions of the Irish border, and answers from the prime minister’s spokesman.
One such, completely artificial, paragraph reads: “Asked to clarify the reports, a spokesman for May said: ‘The PM has made it absolutely clear her intention is to leave the EU as quickly as is possible and that will be under her negotiating mandate as confirmed in the Queen’s speech last week.’”
The quality of GPT2’s output can be put down to a few factors. The first is that the dataset the AI was ‘trained’ on was many times larger than those used for other text-generator AIs. The algorithm also has a much broader remit than others, which are built for narrower tasks such as translation, summarisation or reading comprehension. GPT2 can be applied to all of these narrower text-based tasks and, in tests, outperformed other AIs specifically programmed for just one. Its broader scope appears to have enabled it to perform better at narrower, specialist tasks.
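One way a single general text generator can be pointed at those narrower tasks is by phrasing the task in the input itself, so that the natural continuation of the prompt is the answer. A minimal sketch of that idea follows; the template wordings are invented for illustration and are not OpenAI's.

```python
# Task-as-prompt: steer a general text generator toward a narrow task
# purely by how the input is phrased. Template wordings are illustrative.
TEMPLATES = {
    "translation": "Translate English to French: {text} =>",
    "summarisation": "{text}\n\nTL;DR:",
    "comprehension": "{text}\n\nQuestion: {question}\nAnswer:",
}

def build_prompt(task, **fields):
    """Render the text that would be fed to the generator for a task."""
    return TEMPLATES[task].format(**fields)

print(build_prompt("summarisation", text="GPT2 is a large text generator."))
```

Because the model was trained on a very broad corpus, completing such prompts plausibly amounts to performing the task, without any task-specific code.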
Such is the level of aptitude demonstrated by GPT2 that its creators at OpenAI are not confident they can completely anticipate what it might be capable of. Concerns around releasing its code into the public domain centre on how effective it could be at generating high-quality fake news or fake online reviews for products and services.
One of OpenAI’s goals is to demonstrate to the public what the latest AI technology is capable of before it becomes mainstream. This, it is hoped, will raise awareness and teach people to be wary of taking things in the digital domain at face value.