Dota 2


Above: OpenAI employees gathered to watch a match.

Image Credit: OpenAI

Valve’s Dota 2 — a follow-up to Defense of the Ancients (DotA), a community-created mod for Blizzard’s Warcraft III: Reign of Chaos — debuted to great fanfare in 2013. It’s what’s known as a multiplayer online battle arena, or MOBA. Two teams of five players, each given a base to occupy and defend, attempt to destroy a structure — the Ancient — at the opposing team’s base. Each player character (hero) has a distinct set of abilities and collects experience points and items that unlock new attacks and defensive moves.

It’s more complex than it sounds. The average match runs about 45 minutes at 30 frames per second — roughly 80,000 frames — and at any given moment each hero can choose from a space of up to 170,000 possible actions, about 1,000 of which are valid on average. The observable game state, meanwhile, spans roughly 20,000 dimensions.
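Those headline figures follow directly from the game’s pacing: per OpenAI’s published description, matches run at 30 frames per second for about 45 minutes, and the agent acts on every fourth frame. A quick back-of-the-envelope check (constant names are illustrative):

```python
# Sanity-check the match statistics cited above, using OpenAI's published
# figures: 30 frames per second, ~45-minute matches, and an agent that
# acts on every fourth frame.
FRAMES_PER_SECOND = 30
MATCH_MINUTES = 45
ACT_EVERY_N_FRAMES = 4

frames_per_match = FRAMES_PER_SECOND * 60 * MATCH_MINUTES
decisions_per_match = frames_per_match // ACT_EVERY_N_FRAMES

print(frames_per_match)     # 81000, i.e., roughly 80,000 frames
print(decisions_per_match)  # 20250, i.e., roughly 20,000 decisions per game
```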

OpenAI’s been chipping away at the Dota 2 dilemma for a while now, and demoed an early iteration of a MOBA-playing bot — one which beat one of the world’s top players, Danil “Dendi” Ishutin, in a 1-on-1 match — in August 2017. But it kicked things up a notch in June with OpenAI Five, an improved system capable of playing five-on-five matches with top-ranking human opponents. It beat five groups of players — an OpenAI employee team, a team of audience members who watched the OpenAI employee match, a Valve employee team, an amateur team, and a semi-pro team — in early summer, and in August won two out of three matches against a team ranked in the 99.95th percentile.

To self-improve, OpenAI Five plays 180 years’ worth of games every day — 80 percent against itself and 20 percent against past versions of itself — on 256 Nvidia Tesla P100 graphics cards and 128,000 processor cores on Google Cloud Platform. It’s made up of five single-layer, 1,024-unit long short-term memory (LSTM) recurrent neural networks, each assigned to a single hero and trained with a scaled-up version of the Proximal Policy Optimization (PPO) reinforcement learning algorithm, which rewards the “hero” networks for achieving goals like maximizing kills, minimizing deaths, and assisting teammates.
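The 80/20 self-play mix can be pictured as a simple opponent sampler. The sketch below is a toy illustration of that idea, not OpenAI’s training infrastructure; the function name and argument shapes are assumptions:

```python
import random

def pick_opponent(current_policy, past_policies, rng=random):
    """Return the current policy ~80% of the time, else a past snapshot.

    Playing a fraction of games against frozen past selves keeps the
    agent robust to strategies it has otherwise trained away from.
    """
    if not past_policies or rng.random() < 0.8:
        return current_policy
    return rng.choice(past_policies)

# Usage: sample opponents for a batch of training games.
rng = random.Random(0)
picks = [pick_opponent("current", ["v1", "v2"], rng) for _ in range(10000)]
print(picks.count("current") / len(picks))  # close to 0.8
```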

Fully trained OpenAI Five agents are surprisingly sophisticated. Despite being unable to communicate with one another (a “team spirit” hyperparameter determines how heavily each agent weights its individual rewards against the team’s), they’re masters of basic strategies like lane defense and farming, and even of advanced tactics like rotating heroes around the map and stealing runes from opponents.
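One plausible reading of the “team spirit” hyperparameter is a linear blend between each hero’s own reward and the team average. The function below is a sketch of that interpretation; the name and exact formula are assumptions, not OpenAI’s published code:

```python
def blend_team_spirit(individual_rewards, team_spirit):
    """Mix each hero's own reward with the team's mean reward.

    team_spirit = 0.0 -> purely selfish agents
    team_spirit = 1.0 -> every agent optimizes the shared team reward
    """
    team_mean = sum(individual_rewards) / len(individual_rewards)
    return [
        (1.0 - team_spirit) * r + team_spirit * team_mean
        for r in individual_rewards
    ]

# Example: one hero scored a kill (+1.0); the other four earned nothing.
# With team_spirit=0.5 the killer keeps 0.6 and teammates each get 0.1.
print(blend_team_spirit([1.0, 0.0, 0.0, 0.0, 0.0], 0.5))
```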

“Games have really been the benchmark [in AI research],” Brockman told VentureBeat in an earlier interview. “These complex strategy games are the milestone that we … have all been working towards because they start to capture aspects of the real world.”

StarCraft II

Above: StarCraft II: Wings of Liberty launch

Image Credit: Blizzard

Blizzard’s StarCraft II was released in three installments over roughly five years. It’s a real-time strategy game that’s been hailed as one of the genre’s greatest (though it never gained the following the original built), owing in large part to its difficulty. In-game resources needed to construct and maintain units and buildings have to be constantly collected and protected, and while match objectives depend on the selected game type, effective StarCraft strategies typically require players to juggle not only unit quantities and movements but also economics and upgrades.

It’s a lot for an AI system to handle, but Chinese tech giant Tencent made some progress in September. In a paper, researchers at the company described two AI agents — TStarBot1 and TStarBot2 — each trained to play one-on-one matches pitting two teams of the same race (Zerg) against each other. On the notoriously tricky Abyssal Reef map, the agents managed to defeat the game’s “Cheater AI,” which has full knowledge of resource and unit locations.

It took training. Lots of training. According to the paper’s authors, 1,920 parallel actors running on 3,840 processors across 80 machines generated replay transitions at about 16,000 frames per second, and the agents churned through billions of game frames over days of training.

The results spoke for themselves. The TStarBots — one of which kept track of the overall strategy while the other handled lower-level tasks like unit management — beat StarCraft II’s built-in AI on its highest difficulty, level 10, 90 percent of the time. Moreover, they held their own against human players who’d achieved the ranks of Platinum and Diamond, the latter of which is two tiers below the highest (Grandmaster).
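The division of labor described above — one component choosing a coarse strategy, another translating it into unit-level commands — is the essence of a hierarchical agent. The toy sketch below illustrates the pattern only; the class names, rules, and state fields are assumptions, not Tencent’s implementation:

```python
class MacroPolicy:
    """Picks a coarse strategy from a summary of the game state."""
    def choose(self, state):
        # Toy rule: build up the economy early, then switch to attacking.
        return "expand" if state["minerals"] < 400 else "attack"

class MicroController:
    """Expands a high-level strategy into concrete per-unit orders."""
    def execute(self, strategy, units):
        if strategy == "expand":
            return [(u, "gather_minerals") for u in units]
        return [(u, "attack_move_enemy_base") for u in units]

# Usage: the macro level decides, the micro level issues unit orders.
macro, micro = MacroPolicy(), MicroController()
orders = micro.execute(macro.choose({"minerals": 600}),
                       ["zergling_1", "zergling_2"])
print(orders)  # every unit receives an attack order
```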

Quake III Arena

Quake III Arena, unlike StarCraft II and Dota 2, is a first-person shooter, notable for its minimalist design; its advanced locomotion features, such as strafe-jumping and rocket-jumping; its range of unique weapons; its speedy pace of play; and its emphasis on multiplayer gameplay. Up to 16 players face off against each other in arenas, or two battle it out head-to-head in tournament stages.

In a blog post in July, DeepMind shared the results of its research and experiments in Quake III. It revealed that it had taught an AI agent — cheekily dubbed “For the Win” (FTW) — to beat “most” of the human players it faced. After completing nearly 450,000 matches involving multiple agents (as many as 30, in some cases, across up to four games concurrently), FTW went undefeated against human-only teams in Capture the Flag and won 95 percent of games against teams in which humans played alongside a machine partner.

“We train agents that learn and act as individuals, but which must be able to play on teams with and against any other agents, artificial or human,” the paper’s authors wrote. “From a multi-agent perspective, [Capture the Flag] requires players to both successfully cooperate with their teammates as well as compete with the opposing team, while remaining robust to any playing style they might encounter.”

The AI agents weren’t provided the rules of the game beforehand, and the only reinforcement signal was the victory condition — i.e., capturing the most flags within five minutes. But over time, as DeepMind researchers modulated parameters like terrain type, elevation, and locomotion, FTW began to learn strategies like home base defense, following a teammate, and camping out in an opponent’s base to tag opponents (i.e., touch them to send them back to their spawn point) just after a flag has been captured.
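A defining feature of this setup is how sparse the reward is: nothing during the match, and only the final flag count at the end. The sketch below illustrates that structure under those stated assumptions; the function names and the +1/-1/0 scale are illustrative, not DeepMind’s code:

```python
def terminal_reward(own_flags, opponent_flags):
    """+1 for capturing more flags, -1 for fewer, 0 for a draw."""
    if own_flags > opponent_flags:
        return 1.0
    if own_flags < opponent_flags:
        return -1.0
    return 0.0

def episode_rewards(step_count, own_flags, opponent_flags):
    """Zero reward at every step of the match except the very last."""
    rewards = [0.0] * (step_count - 1)
    rewards.append(terminal_reward(own_flags, opponent_flags))
    return rewards

# A five-step episode the agent's team wins 3 flags to 1.
print(episode_rewards(5, own_flags=3, opponent_flags=1))
# -> [0.0, 0.0, 0.0, 0.0, 1.0]
```

Learning useful behavior from so delayed a signal is exactly why the emergent tactics above are notable.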

Bonus round: AI in game design


Above: An Nvidia system models digital environments on video footage.

Image Credit: Nvidia

This year’s state-of-the-art game-playing algorithms didn’t just beat the pants off of humans. They also demonstrated a knack for game design.

For instance, researchers at the Politecnico di Milano in Italy described a system that can automatically generate Doom levels.

To “teach” their two-GAN system how to create new stages, they sourced a public database containing all official levels from Doom and Doom 2, plus more than 9,000 levels contributed by the community. From these, they produced 1) a set of images — one per level — capturing features such as walls, objects, floor heights, and walkable areas, and 2) vectors representing key level characteristics, like size, area, and number of rooms, in numerical form.
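The dual representation described above — an image-like grid plus a vector of summary statistics — can be sketched with a toy encoder. The ASCII encoding, function name, and chosen features below are assumptions for illustration, not the Politecnico di Milano pipeline:

```python
def encode_level(ascii_map):
    """Turn an ASCII level ('#' = wall, '.' = walkable floor) into
    (1) a binary wall grid and (2) a feature vector of level statistics."""
    grid = [[1 if ch == "#" else 0 for ch in row] for row in ascii_map]
    walkable = sum(ch == "." for row in ascii_map for ch in row)
    features = {
        "width": len(ascii_map[0]),
        "height": len(ascii_map),
        "walkable_area": walkable,
    }
    return grid, features

# Usage: a tiny one-corridor "level".
level = ["#####",
         "#...#",
         "#####"]
grid, features = encode_level(level)
print(features["walkable_area"])  # 3
```

In the actual paper, pairs like these condition the GANs so that generated images inherit the statistics of handcrafted levels.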

After 36,000 iterations, the model was able to generate new levels that “captured [the] intrinsic structure of [handcrafted] Doom levels” — a possible step toward systems that might one day free up human designers to focus on “high-level features.”

“Our promising results, although preliminary, represent an excellent starting point for future improvements and highlight a viable alternative to classical procedural generation,” they wrote. “Most generated levels have proved to be interesting to explore and play due to the presence of typical features of Doom maps (like narrow tunnels and large rooms).”

They aren’t the only ones to achieve some success at AI level generation. In December, Nvidia took the wraps off of a system that’s capable of automatically crafting digital environments from video sources.

The development team accomplished the feat by training object classification algorithms to recognize specific objects in scenes, such as buildings, pedestrians, trees, and cars. Next, they used a GAN to model those objects virtually, in three dimensions.

“It’s a new kind of rendering technology, where the input is basically just a sketch, a high-level representation of objects and how they are interacting in a virtual environment,” Nvidia vice president of applied deep learning Bryan Catanzaro told VentureBeat in a phone interview. “Then the model actually takes care of the details, elaborating the textures, and the lighting, and so forth, in order to make a fully rendered image.”

Such a model promises to take a load off of game developers’ shoulders. Currently, blockbusters like Red Dead Redemption 2 and Grand Theft Auto V take teams of hundreds of people years — sometimes close to a decade — to create.

