MIT CSAIL's CommPlan AI helps robots efficiently collaborate with humans

In a new study, researchers at MIT's Computer Science and Artificial Intelligence Lab propose a framework called CommPlan, which gives robots that work alongside humans principles for "good etiquette" and leaves it to the robots to make decisions that let them finish tasks efficiently. They claim it's a superior approach to handcrafted rules because it enables the robots to perform cost-benefit analyses on their decisions rather than follow task- and context-specific policies.

CommPlan weighs a combination of factors, including whether a person is busy or likely to respond given past behavior, leveraging a dedicated module -- the Agent Markov Model -- to represent that person's sequential decision-making behaviors. It consists of a model specification process and an execution-time partially observable Markov decision process (POMDP) planner, derived as the robot's decision-making model, which CommPlan uses in tandem to arrive at the robot's actions and communications policies.
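To make the idea of tracking whether a person is busy more concrete, here is a minimal sketch of a discrete Bayes filter, the belief update that underlies POMDP planning in general. It is not CommPlan's implementation: the state names ("busy", "free"), the observations ("reply", "silence"), and every probability below are made-up assumptions for illustration.

```python
def update_belief(belief, observation, transition, observation_model):
    """One step of a discrete Bayes filter over the human's hidden state.

    All names and numbers here are hypothetical; they are not taken
    from the CommPlan paper.
    """
    states = list(belief)
    # Predict: push the current belief through the transition model.
    predicted = {
        s2: sum(belief[s1] * transition[s1][s2] for s1 in states)
        for s2 in states
    }
    # Correct: weight each state by how likely it makes the observation.
    unnormalized = {
        s: predicted[s] * observation_model[s][observation] for s in states
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}


# Toy numbers (assumptions): a busy person tends to stay busy and rarely replies.
belief = {"busy": 0.5, "free": 0.5}
transition = {
    "busy": {"busy": 0.8, "free": 0.2},
    "free": {"busy": 0.3, "free": 0.7},
}
observation_model = {
    "busy": {"reply": 0.1, "silence": 0.9},
    "free": {"reply": 0.7, "silence": 0.3},
}

posterior = update_belief(belief, "silence", transition, observation_model)
```

In this toy setup, hearing no reply shifts the belief toward "busy", which is the kind of evidence a planner could use before deciding whether another message is worth sending.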

Using CommPlan, developers first specify five modules -- a task model, a communication capability, a communication cost model, a human response model, and a human action-selection model -- with data, domain expertise, and learning algorithms. All modules are analytically combined to arrive at a decision-making model, and during task execution, the robot computes its policy using hardware sensors, the decision-making model, and a POMDP solver. Finally, the policy is executed using the robot's actuators and communication modality.
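One way to picture the specification step is as a container holding the five modules, plus a function that folds them into a single objective. The sketch below is purely illustrative: the class name `ModuleSpec`, its field names, the function signatures, and the rule "task reward minus communication cost" are all assumptions, not the combination the researchers derive.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class ModuleSpec:
    """Hypothetical container for the five developer-specified modules."""
    task_reward: Callable[[str, str], float]        # task model
    messages: Sequence[str]                         # communication capability
    communication_cost: Callable[[str], float]      # communication cost model
    response_prob: Callable[[str, str], float]      # human response model
    human_action_prob: Callable[[str, str], float]  # human action model


def combined_reward(spec: ModuleSpec, state: str, action: str,
                    message: Optional[str] = None) -> float:
    """Assumed combination rule: task reward minus the cost of speaking."""
    reward = spec.task_reward(state, action)
    if message is not None:
        if message not in spec.messages:
            raise ValueError(f"robot cannot say: {message!r}")
        reward -= spec.communication_cost(message)
    return reward


# Toy instantiation with made-up numbers.
spec = ModuleSpec(
    task_reward=lambda state, action: 1.0 if (state, action) == ("cup_empty", "fill") else 0.0,
    messages=("inform", "ask"),
    communication_cost=lambda message: 0.25,
    response_prob=lambda human_state, message: 0.1 if human_state == "busy" else 0.7,
    human_action_prob=lambda human_state, human_action: 0.5,
)

silent = combined_reward(spec, "cup_empty", "fill")
spoken = combined_reward(spec, "cup_empty", "fill", "inform")
```

Here speaking lowers the immediate reward, so a planner built on such an objective would only communicate when the message is expected to pay for itself later, for example by avoiding a conflict with the human.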

CommPlan communicates in the following ways:

It informs humans about the state of its decision-making (e.g., "I am going to do action at landmark.") and asks about theirs ("Where are you going?")
It commands humans to perform specific actions and plans ("Please make the next sandwich at landmark.")
It answers humans' questions

To evaluate CommPlan, the researchers staged an experiment involving a Universal Robots UR10 arm with a Robotiq gripper and 15 human participants, who were tasked with performing meal prep in a kitchen. Within a planning time of 0.3 seconds, the robot had to reason about a large state space and determine (1) which of four cups should be filled next, (2) whether to wait to ensure safety or to move to complete the task, (3) if it chose to move, its trajectory to reach the cup, (4) whether to use its communication modality, and (5) which communication message to convey.
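The wait-or-move and speak-or-stay-silent choices can be illustrated with a one-step cost-benefit comparison. This is a deliberately simplified, myopic sketch rather than a POMDP solver, and it is not the researchers' planner: the action names, the single "inform" message, and all rewards, penalties, and probabilities are assumptions chosen for illustration.

```python
from itertools import product

MOTIONS = ("wait", "move")
MESSAGES = (None, "inform")  # None means the robot stays silent


def expected_value(p_human_in_way, motion, message,
                   progress_reward=1.0, collision_penalty=4.0,
                   message_cost=0.2, risk_reduction=0.5):
    """Expected one-step value of a (motion, message) pair.

    All parameters are hypothetical. A message is assumed to halve the
    chance of a conflict, at a fixed cost for interrupting the human.
    """
    p_conflict = p_human_in_way
    value = 0.0
    if message is not None:
        p_conflict *= 1.0 - risk_reduction
        value -= message_cost
    if motion == "move":
        value += progress_reward - p_conflict * collision_penalty
    return value


def best_decision(p_human_in_way):
    """Pick the (motion, message) pair with the highest expected value."""
    return max(
        product(MOTIONS, MESSAGES),
        key=lambda decision: expected_value(p_human_in_way, *decision),
    )


clear_path = best_decision(0.0)
uncertain = best_decision(0.3)
likely_blocked = best_decision(0.9)
```

With these toy numbers the robot moves silently when the path is clear, moves while announcing itself when a conflict is plausible, and waits silently when a conflict is very likely. A full planner would also look ahead over future steps, which this sketch omits.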

The team reports that the robot successfully worked in conjunction with humans to complete tasks like assembling ingredients, wrapping sandwiches, and pouring juice. Importantly, it did so more safely and efficiently than baseline handcrafted and communications-free silent policies.

"Many of these handcrafted policies are kind of like having a coworker who keeps bugging you on Slack, or a micromanaging boss who repeatedly asks you how much progress you've made," paper coauthor and MIT graduate student Shen Li said. "If you're a first responder in an emergency situation, excessive communication from a colleague might distract you from your primary task."

In the future, the researchers hope to extend CommPlan to other domains, like health care, aerospace, and manufacturing. They've only used the framework for spoken language so far, but they say it could be applied to visual gestures, augmented reality systems, and other modalities.
