Google’s artificial intelligence company DeepMind became famous by beating the South Korean professional Go player Lee Sedol.
Lee Sedol played a historic five-game match against DeepMind’s AlphaGo program in March 2016. AlphaGo won the match, becoming the world’s first computer program to defeat a world-class human player at Go.
After achieving the seemingly impossible, DeepMind now has a very different challenge to focus on: social dilemmas.
Google’s artificial intelligence (AI) division developed new game-theoretic scenarios to see whether AI agents can learn to work together for mutual benefit or not.
As the DeepMind team states in a blog post, one of the most famous “games” for such social experiments is the Prisoner’s Dilemma.
The Prisoner’s Dilemma is a standard example of a game analyzed in game theory that shows why two completely “rational” individuals might not cooperate, even when it appears to be in their best interests to do so. It’s just a long way of saying that people (or intelligent agents) don’t like to be taken advantage of. ¹ ² ³
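To make that incentive structure concrete, here is a minimal sketch of the classic payoff matrix in Python. The payoff values are the usual textbook convention, not anything from DeepMind’s post:

```python
# A minimal sketch of the Prisoner's Dilemma payoff structure.
# Payoffs are (my reward, their reward); values are the textbook convention.
PAYOFFS = {
    ("cooperate", "cooperate"): (-1, -1),  # both stay silent: light sentences
    ("cooperate", "defect"):    (-3,  0),  # I am betrayed: worst outcome for me
    ("defect",    "cooperate"): ( 0, -3),  # I betray: I walk free
    ("defect",    "defect"):    (-2, -2),  # mutual betrayal: both punished
}

def best_response(their_move: str) -> str:
    """Return the move that maximizes my payoff, given the other's move."""
    return max(("cooperate", "defect"),
               key=lambda my_move: PAYOFFS[(my_move, their_move)][0])

# Defecting is the best response whatever the other player does,
# even though mutual cooperation beats mutual defection.
assert best_response("cooperate") == "defect"
assert best_response("defect") == "defect"
```

No matter what the other player chooses, defecting pays better for you individually, which is exactly the trap the dilemma describes.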
Instead of putting two computers in jail, of course, the DeepMind team devised two new games that closely imitate this “compete or cooperate” dilemma.
The first game in which the DeepMind team tested this concept is called “Gathering”. Two agents in the game, represented as red and blue pixels, share a world with the mission of collecting as many apples (green pixels) as possible to earn rewards. As the apples diminish, each AI can temporarily disable the other by simply “tagging” it, gaining more time to collect apples for itself. Here, watch a video of the game they’ve played thousands of times together:
When there are enough apples in the game for both parties, the agents learn to coexist peacefully and collect as many apples as they can. However, as the number of apples diminishes, the agents learn that they are better off disabling the other with a tag, earning more time to collect even more apples.
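As a rough illustration of those incentives, here is a toy sketch, not DeepMind’s actual environment; the reward, disable duration, and respawn rate below are all assumed parameters, and the respawn probability acts as the scarcity knob:

```python
import random

APPLE_REWARD = 1     # reward for collecting one apple (assumed value)
DISABLE_STEPS = 25   # steps a tagged agent sits out (assumed value)
RESPAWN_PROB = 0.05  # per-step chance an eaten apple regrows; lower = scarcer

class ToyGathering:
    """A toy two-agent apple-gathering world with a tagging action."""

    def __init__(self, n_apples: int):
        self.apples = n_apples        # apples currently on the map
        self.eaten = 0                # apples waiting to respawn
        self.disabled = {0: 0, 1: 0}  # remaining time-out per agent

    def step(self, actions):
        """actions[i] is either "collect" or ("tag", other_agent_id)."""
        rewards = [0, 0]
        for i, action in enumerate(actions):
            if self.disabled[i] > 0:  # tagged agents sit out this step
                self.disabled[i] -= 1
                continue
            if action == "collect" and self.apples > 0:
                self.apples -= 1
                self.eaten += 1
                rewards[i] = APPLE_REWARD
            elif isinstance(action, tuple) and action[0] == "tag":
                # Tagging pays nothing directly; its only value is the
                # extra apples you can grab while the rival is frozen.
                self.disabled[action[1]] = DISABLE_STEPS
        # Scarcity knob: eaten apples regrow slowly and independently.
        regrown = sum(random.random() < RESPAWN_PROB for _ in range(self.eaten))
        self.eaten -= regrown
        self.apples += regrown
        return rewards
```

When apples regrow quickly, tagging wastes a turn that could have earned an apple; when they regrow slowly, freezing the rival for many steps is worth more than any single apple, which is the shift in behavior DeepMind observed.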
The second game, “Wolfpack”, puts two AI agents into a two-versus-one situation of hunters (red pixels) and prey (a single blue pixel).
In this case, a hunter gains points every time it catches the prey, but the reward it earns increases if the other hunter, its teammate, is nearby.
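Put as a toy formula (my sketch with assumed numbers, not DeepMind’s actual reward function), the capture reward might look like this:

```python
import math  # requires Python 3.8+ for math.dist

BASE_REWARD = 1.0     # assumed payoff for a lone-wolf capture
TEAM_BONUS = 1.0      # assumed extra payoff for a cooperative capture
CAPTURE_RADIUS = 3.0  # assumed distance within which the teammate counts

def capture_reward(capture_pos, teammate_pos):
    """Reward for the capturing hunter, boosted by a nearby teammate."""
    if math.dist(capture_pos, teammate_pos) <= CAPTURE_RADIUS:
        return BASE_REWARD + TEAM_BONUS  # cooperative capture pays more
    return BASE_REWARD                   # lone capture pays less

# A capture with the teammate two cells away beats a solo capture:
assert capture_reward((0, 0), (2, 0)) > capture_reward((0, 0), (10, 0))
```

Because the bonus only pays when both hunters converge on the prey, the agents have a built-in reason to learn coordination rather than to race each other.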
After playing many games in both competitive and cooperative situations, DeepMind’s researchers examined each AI’s decisions. As you can see in the graphic below, greater scarcity in the game leads to more “tagging” behavior between the agents, i.e., a higher level of aggression.
The DeepMind researchers say these results could help us better understand and control complex multi-agent systems such as the economy, traffic systems, or the overall ecological health of our planet.