
AI researchers from the Allen Institute for Artificial Intelligence and the University of Washington have trained a drone agent with a box on top to catch a range of 20 objects in a simulated environment. In trials, the drone had the lowest catch success rate with toilet paper (0%) and the highest with toasters (64.4%). Other objects included alarm clocks, heads of lettuce, books, and basketballs. Overall, the system’s success rate in catching objects outpaces two variations of a current position predictor model for 3D spaces, as well as a frequently cited reinforcement learning framework proposed in 2016 by Google AI researchers.

For the study, a launcher threw each object two meters (6.5 feet) toward a drone agent. Each simulation was set in a living room and took place in AI2-THOR, a photo-realistic simulated environment. In the virtual environment, the weight, size, and structure of objects determined the way they were thrown, the velocity of a throw, and whether an object bounced off a wall. In every trial scenario, the launcher was positioned just above average human height. The model was trained on a data set of 20,000 throws of the 20 objects, with the launcher randomly repositioned for each throw.

“Our proposed solution is an adaptation of the model-based reinforcement learning paradigm. More specifically, we propose a forecasting network that rolls out the future trajectory of the thrown object from visual observation. We integrate the forecasting network with a model-based planner to estimate the best sequence of drone actions for catching the object,” reads a paper describing the model.

The authors propose a new framework that predicts the future movement of objects and reacts “to make decisions on the fly and receive feedback from the environment to update our belief about the future movements.”
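The two-part idea described above, a forecaster that rolls out the object's future trajectory, feeding a model-based planner that picks drone actions, can be sketched in simplified form. This is an illustrative mock-up, not the authors' code: the real forecasting network predicts from visual observations, whereas here a simple ballistic rollout stands in for it, and the `DT`, `max_speed`, and greedy-pursuit planner are all assumptions made for the example.

```python
import numpy as np

GRAVITY = np.array([0.0, -9.8, 0.0])  # m/s^2, y-axis up
DT = 0.05  # planning time step in seconds (assumed value)

def forecast_trajectory(position, velocity, steps):
    """Stand-in for the learned forecasting network: a naive ballistic
    rollout of the thrown object's future positions. The actual model
    forecasts the trajectory from visual observations instead."""
    points = []
    pos = position.astype(float)
    vel = velocity.astype(float)
    for _ in range(steps):
        vel = vel + GRAVITY * DT          # gravity updates velocity
        pos = pos + vel * DT              # velocity updates position
        points.append(pos.copy())
    return np.array(points)

def plan_actions(drone_pos, trajectory, max_speed=3.0):
    """Toy model-based planner: at each step, move the drone toward
    where the object is forecast to be, capped at max_speed (m/s).
    Returns the action sequence and the drone's final position."""
    actions = []
    pos = drone_pos.astype(float)
    for target in trajectory:
        step = target - pos
        dist = np.linalg.norm(step)
        if dist > max_speed * DT:         # clip to the drone's speed limit
            step = step / dist * (max_speed * DT)
        actions.append(step)
        pos = pos + step
    return actions, pos

# Example: object launched from ~human height toward a hovering drone.
traj = forecast_trajectory(np.array([0.0, 1.7, 0.0]),
                           np.array([2.0, 2.0, 0.0]), steps=20)
actions, final_pos = plan_actions(np.array([2.0, 0.5, 0.0]), traj)
```

As the object is observed in flight, the forecast would be re-run and the plan updated each step, which is the "decisions on the fly" feedback loop the authors describe.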

The paper was published as part of the Computer Vision and Pattern Recognition (CVPR) conference happening online this week. According to the AI Index report, CVPR had more than 9,000 attendees and was the second-largest annual AI research conference in the world in 2019. Organizers told VentureBeat in a statement that more than 6,500 people registered for CVPR 2020. Other major AI conferences — like ICLR and ICML — have also shifted online, and last week NeurIPS organizers said the machine learning conference will go all-digital in December.

Other works introduced at CVPR this week include details on an MIT model that predicts how people paint landscapes and still lifes and generative models that create 3D avatars from photos.
