The intricacies of evolutionary linguistics are myriad and underexplored, but research involving artificial intelligence (AI) might unlock the door to new theories about how dialects develop among users. In a paper (“Emergent Linguistic Phenomena in Multi-Agent Communication Games“) on the preprint server, scientists at Facebook AI, Google AI, and New York University describe a framework in which agents trained with deep reinforcement learning — a technique that uses a system of rewards to drive them toward certain goals — while “playing” a series of games demonstrated some of the “linguistic phenomena” observed in natural language.

This work isn’t the first to investigate language with machine learning algorithms — a paper published by Facebook researchers in June 2017 detailed how two agents learned to “negotiate” with each other in chat messages. But they claim it’s the first to use “latest-generation deep neural agents” capable of dealing with “rich perceptual input,” and they say it convincingly demonstrates that language can evolve from simple exchanges.

The team began by deploying groups — communities — of agents equipped with the ability to communicate in a simulated environment, with complexities ranging from simple (a set of equations) to relatively complicated (a deep neural network). The “games” the agents were tasked with playing had several key properties: They were symmetric, enabling the agents to act as both “speakers” and “listeners”; they allowed the agents to communicate about something “external” to themselves, such as the sensory experience of something in their environment; and they took place in a world the agents could at least partially observe.

In the first of several experiments, in which between three and 10 autonomous agents “played” with each other, the success rates between self-play — an agents’ ability to complete a game alone — and cross-play — agents’ success in solving a game as a team — were “indistinguishable” after 150-200,000 plays, regardless of the size of the community. The researchers claim that this implies a common, shared language emerges only if there are more than two users of that language.

In a subsequent test, two different linguistic agent communities — communities that had developed separate communication tools — were exposed to each other. The researchers report that all agents learned to “speak” with the agents from the other community and that, moreover, certain agents — “bridge” agents — managed to get a handle on the new shared protocol “faster” and “better.”

Even agents that had never interacted with each other before successfully completed games 65.6 percent of the time after 200,000 plays, but only when intergroup connectivity — i.e., communication among agents within their respective communities — was high. When intergroup interaction occurred only half as frequently as intragroup interaction (interaction between agents of different communities), the success rate dropped to 52.4 percent.

“All that is needed in order for a common language to emerge is a minimum number of agents,” the paper’s authors wrote. “This finding demonstrates the rapid shift toward a common protocol in both groups where all agents learn to speak a shared language, regardless of whether they actually interact with agents from the other group, [and] implies that there is a critical level of intergroup connectivity.”

Those weren’t the only discoveries of note. Across all scenarios, the agents preferred to integrate or assimilate, rather than segregate, when it came to language, the researchers say, and language complexity — the range with which the agents expressed their states — tended to plateau earlier when there was a larger imbalance between two communities’ population.

“This implies that the new-born contact languages arising from the contact of two similarly sized communities tend to be substantially simpler,” they noted.

In a final experiment, a chain of communities of equal population (five agents) communicated with each other in sequence, almost like a game of telephone. The researchers found that while agents from a pair of adjacent communities could communicate with each other almost as well as those within a single community, communicability degraded as the distance between pairs of communities grew; the agents from the first community in the chain and last community couldn’t understand each other at all.

These findings taken together suggest that language doesn’t depend on evolved, complex linguistic capabilities, the authors claim, but can arise from “simple social exchanges” between “perceptually enabled agents” playing communication games.

“We observed that a symmetric communication protocol emerges without any innate, explicit mechanism built in an agent, when there were three or more of them in a linguistic community,” they wrote.