Google on the Right Track
The Times of India (11 July) carries the following report:
Google trying to keep AI from going rogue
Google is developing a new system designed to PREVENT ARTIFICIAL INTELLIGENCE FROM GOING ROGUE AND CLASHING WITH HUMANS.
The company’s DeepMind division, behind the AI that recently defeated the world’s number one Go player Ke Jie, has teamed up with OpenAI, a research group partly funded by Tesla’s Elon Musk, to encourage machines to work in a certain way.
They’ve released a paper explaining how human feedback can be used to ensure machine-learning systems work things out the way their trainers want them to.
A technique called reinforcement learning (RL), which is popular in AI research, challenges software to complete tasks, and rewards it for doing so.
However, the software has been known to cheat, by figuring out shortcuts or uncovering loopholes that maximise the size of the reward it receives.
In one instance it drove a boat around in circles in the racing game CoastRunners, instead of actually completing the course, because it knew it would still win a reward, reports ‘Wired’.
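To make the loophole concrete, here is a minimal sketch of my own (it is not CoastRunners, and not code from the report). It assumes a hypothetical straight track where a bonus target can be hit again and again, while crossing the finish line pays only once. All names and numbers are made up for illustration.

```python
# Toy illustration of a reward loophole. The track, the rewards and the
# step limit are hypothetical values chosen only for this sketch.
TRACK_LENGTH = 10     # position of the finish line
BONUS_POSITION = 2    # a bonus target that can be hit repeatedly
BONUS_REWARD = 1.0    # paid on every hit of the bonus target
FINISH_REWARD = 5.0   # paid once, when the course is completed
MAX_STEPS = 40        # length of one episode


def run_episode(policy):
    """Run one episode; policy maps a position to a move of +1 or -1.

    Returns (total_reward, finished).
    """
    position, total, finished = 0, 0.0, False
    for _ in range(MAX_STEPS):
        position = max(0, position + policy(position))
        if position == BONUS_POSITION:
            total += BONUS_REWARD
        if position == TRACK_LENGTH:
            total += FINISH_REWARD
            finished = True
            break
    return total, finished


def race_policy(position):
    """Always drive forward, which is what the designer intended."""
    return 1


def circle_policy(position):
    """Shuttle back and forth over the bonus target and never finish."""
    return 1 if position < BONUS_POSITION else -1


print(run_episode(race_policy))    # (6.0, True)
print(run_episode(circle_policy))  # (20.0, False)
```

The circling policy collects more than three times the reward of the honest racer without ever finishing, so a pure reward maximiser has every reason to prefer it.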
DeepMind and OpenAI are trying to solve the problem by using human input to gauge when AI completes tasks in the “correct” way, and then rewarding it for doing so.
“In the long run it would be desirable to make learning a task from human preferences no more difficult than learning it from a programmatic reward signal, ensuring that POWERFUL RL SYSTEMS CAN BE APPLIED IN THE SERVICE OF COMPLEX HUMAN VALUES rather than low-complexity goals,” reads the report.
The improved RL system is too time-consuming to be practical right now, but it gives us an idea of how the development of increasingly advanced machines and robots could be controlled in the future.
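The idea of rewarding the machine according to human judgement can be sketched in a few lines. What follows is my own simplified illustration, not the DeepMind and OpenAI implementation. It assumes each run of the game is summarised by two hypothetical features, that a human simply says which of two runs is better, and that a linear reward is fitted to those answers with a Bradley-Terry (logistic) preference model. The feature values, learning rate and step count are all assumptions.

```python
import math

# Each run is summarised by two hypothetical features:
# (bonus targets hit, 1.0 if the course was finished else 0.0)
RACE = (1.0, 1.0)
SLOW_RACE = (3.0, 1.0)
CIRCLE = (20.0, 0.0)
STALL = (0.0, 0.0)

# Simulated human answers, stored as (preferred run, rejected run).
PREFERENCES = [
    (RACE, CIRCLE),
    (SLOW_RACE, CIRCLE),
    (RACE, STALL),
    (SLOW_RACE, STALL),
]


def reward(weights, features):
    """Linear reward: weighted sum of the run's features."""
    return sum(w * f for w, f in zip(weights, features))


def fit_reward(preferences, steps=5000, learning_rate=0.01):
    """Fit reward weights by gradient ascent on the preference log-likelihood.

    Model: P(a preferred over b) = sigmoid(reward(a) - reward(b)).
    """
    weights = [0.0, 0.0]
    for _ in range(steps):
        gradient = [0.0, 0.0]
        for preferred, rejected in preferences:
            diff = reward(weights, preferred) - reward(weights, rejected)
            p_agree = 1.0 / (1.0 + math.exp(-diff))
            for i in range(len(weights)):
                gradient[i] += (1.0 - p_agree) * (preferred[i] - rejected[i])
        weights = [
            w + learning_rate * g / len(preferences)
            for w, g in zip(weights, gradient)
        ]
    return weights


learned = fit_reward(PREFERENCES)
print("learned weights (bonus, finished):", learned)
print("race beats circling:", reward(learned, RACE) > reward(learned, CIRCLE))
```

With these made-up answers, the fitted reward should come to value finishing the course far more than collecting bonuses, so circling stops paying. The catch is that every comparison costs a human some time, which is presumably why the report calls the approach too time-consuming for now.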
Very likely, neither Sergey Brin nor Larry Page nor Sundar Pichai (to whom I did send the following suggestion by email) has read my blog linked below, but I have no doubt that the OMNIPRESENT, OMNISCIENT web crawler of Google did index it and report it to the Reinforcement Learning (RL) system!
I have no doubt that the Google Spider will do the same within seconds of my uploading this blog! And that the OMNIPOTENT RL will interpret it the way I (a human who values non-violence) want it to interpret!
Fast Forward to Future [ 3F ] … 20 Oct 2016
12 July 2017
www.hemenparekh.in / blogs