A few of the world’s top Dota 2 pros learned a valuable lesson this week: Don’t underestimate the power of artificial intelligence (AI). At an event in San Francisco on Sunday, a team ranked in the 99.95th percentile faced off against OpenAI Five, OpenAI’s eponymous game-playing AI, and won just one match in a series of three.
After handily besting members of the audience in a warmup round, OpenAI Five got down to business with a five-person team of four Dota 2 players — Ben “Merlini” Wu, William “Blitz” Lee, David “MoonMeander” Tan, and Ioannis “Fogged” Lucas — and play-by-play commentator Austin “Capitalist” Walsh.
In the first of the three matches, OpenAI Five started and finished strongly, managing to prevent its human opponents from destroying any of its defensive towers. The second match was a tad less one-sided, with the humans taking out one of OpenAI Five’s towers, but the AI emerged victorious nonetheless.
Both of those matches ended in under half an hour, compared with the 30-40 minutes an evenly matched Dota game generally takes.
Only in the third match did the human players eke out a victory.
More than 100,000 Twitch viewers tuned in to watch the events unfold live.
Today has been an emotional ride. Moment of panic after Five declared such high win probabilities. But seems like those predictions were warranted. Extremely proud of the OpenAI team and very excited to continue improving the system as we prepare for The International.
— Greg Brockman (@gdb) August 5, 2018
OpenAI Five isn’t quite able to handle the full game yet — it can only play 18 out of the 115 different playable heroes, and it can’t use abilities like summons and illusions. But it’s able to draft a team in response to the opposing side’s choices, and to adopt human-like tactics like sustaining heroes’ health and mana in firefights.
Somewhat cheekily, it broadcasts its win prediction from the first frame of a game to Dota 2’s global chat channel. In the first and second matches, it predicted a 95 percent and a 76.2 percent win probability, respectively. And in the third, which saw Twitch chat and the live audience choose its lineup of heroes, it predicted a 2.9 percent chance of winning.
OpenAI last benchmarked OpenAI Five in June against five teams of amateur players, including one made up of Valve employees. It’s made a few improvements to the bot network since then, increasing its reaction time and introducing new strategies of play.
OpenAI Five plays 180 years’ worth of games every day, 80 percent against itself and 20 percent against past selves, on 256 Nvidia Tesla P100 graphics cards and 128,000 processor cores on Google’s Cloud Platform. (Its predecessor ran on a 60,000-core instance on Microsoft Azure.) It’s made up of five single-layer, 1,024-unit long short-term memory (LSTM) recurrent neural networks, each assigned to a single hero and trained using deep reinforcement learning, which rewards them for achieving goals like maximizing kills, minimizing deaths, and assisting fellow teammates.
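To make the 80/20 self-play split concrete, here is a minimal Python sketch of how an opponent might be chosen for each training game. It is an illustration only, not OpenAI's code: the function name `pick_opponent`, the snapshot labels, and the uniform choice among past selves are all assumptions, since the article gives only the two percentages.

```python
import random

# From the article: 80 percent of games are played against the current self,
# the remaining 20 percent against past selves.
CURRENT_SELF_FRACTION = 0.8


def pick_opponent(current_policy, past_policies, rng):
    """Return the opponent for one training game (hypothetical helper)."""
    # Fall back to the current policy when no snapshot has been saved yet.
    if not past_policies or rng.random() < CURRENT_SELF_FRACTION:
        return current_policy
    # Assumption: past selves are drawn uniformly; the article does not
    # say how a past version is selected.
    return rng.choice(past_policies)


rng = random.Random(0)  # fixed seed so the run is repeatable
snapshots = ["snapshot-1", "snapshot-2", "snapshot-3"]  # placeholder labels
picks = [pick_opponent("current", snapshots, rng) for _ in range(10_000)]
share_current = picks.count("current") / len(picks)
print(f"share of games against the current self: {share_current:.2f}")
```

Over many games the printed share should land close to 0.80. Mixing in older versions is a common way to keep a self-play agent from overfitting to its own latest habits.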
Months ago, when OpenAI kicked off training, the AI-controlled Dota 2 heroes “walked aimlessly around the map.” But it wasn’t long before the AI mastered basics like lane defense and farming, and soon after, it nailed advanced strategies like rotating heroes around the map and stealing items from opponents.
Later this month, OpenAI plans to square OpenAI Five off against Dota 2 players at Valve’s annual multimillion-dollar International esports competition.
“People used to think that this kind of thing was impossible using today’s deep learning,” Greg Brockman, one of the cofounders of OpenAI, told VentureBeat in an interview last month. “But it turns out that these networks [are] able to play at the professional level in terms of some of the strategies they discover … and really do some long-term planning. The shocking thing to me is that it’s using algorithms that are already here, that we already have, that people said were flawed in very specific ways.”
OpenAI, a nonprofit, San Francisco-based AI research company backed by Elon Musk, Reid Hoffman, and Peter Thiel, among other tech luminaries, is applying some of the insights gleaned from OpenAI Five to other fields. In February, it released Hindsight Experience Replay (HER), an open source algorithm that effectively helps robots to learn from failure, and last week it published research on a self-learning robotics system that can manipulate objects with humanlike dexterity.
Updated at 2:17 p.m. Pacific: Added details from OpenAI’s blog post.