
Almost every gadget you use in the near future will benefit from AI. We’ll continue using Alexa on speakers like the Echo, but even a fan in your living room will use AI to control the temperature. The lights will notice when you pull a book off a shelf and will brighten in the corner of the room where you like to read. A coffee maker will use machine learning to make adjustments based on your taste preferences.

This is already happening in several new products. One that stands out is the Meeting Owl, a $799 device that sits on a conference room table and automatically determines who is talking. It uses machine learning to understand not just who is speaking, but also whether two people are in conversation and whether the videoconferencing feed should show two or more views. It does all of this on the fly, and the device doesn’t need to be pre-programmed (by setting how many people are in a room, for example).

Mark Schnittman, the Owl Labs cofounder and CTO who created the Meeting Owl, explained to VentureBeat that there’s a lot going on behind the scenes.

“The Meeting Owl’s artificial intelligence allows it to act as a real-time director and editor,” he said. “When it’s placed in an arbitrary room in an arbitrary location, it can figure out where people are, which people are most relevant to the conversation at any given moment, and how best to display those people to the viewer.”



Inside the device, which looks a bit like an owl, there’s a 360-degree camera and a 360-degree microphone array. Schnittman says a technique called static beam analysis helps identify the speaker. Basically, the method finds where the energy in an audio sample is most concentrated: the signals from all of the device’s microphones are compared to identify the strongest one. “These and other sensing systems are fused together to give a frame-by-frame analysis of where people are and how wide of a field of view the camera needs to reliably show each person,” he said.
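To make the idea of picking the strongest signal concrete, here is a minimal sketch in Python. It is not Owl Labs’ implementation: the function names, the four fixed listening directions and the synthetic audio are all assumptions made for illustration. It simply computes the average energy of each direction’s audio frame and picks the loudest.

```python
import math


def beam_energy(samples):
    """Mean-square energy of one audio frame (hypothetical helper)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def strongest_beam(beams):
    """Return the angle whose frame carries the most energy.

    `beams` maps an angle in degrees to a list of audio samples.
    The fixed set of angles is an assumption made for this sketch.
    """
    return max(beams, key=lambda angle: beam_energy(beams[angle]))


def tone(amplitude, n=160):
    """Synthetic sine-wave frame standing in for real microphone data."""
    return [amplitude * math.sin(2 * math.pi * 5 * i / n) for i in range(n)]


# Four assumed listening directions; the person at 90 degrees is talking.
beams = {0: tone(0.05), 90: tone(0.8), 180: tone(0.1), 270: tone(0.02)}
loudest = strongest_beam(beams)
print(f"Strongest signal comes from {loudest} degrees")
```

A real device would combine the raw microphone signals into directional beams first and would fuse the result with camera data, as Schnittman describes. This sketch only covers the last step of comparing energy levels.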

Then machine learning takes over. The Meeting Owl gathers the sensor data and uses AI to determine who should appear on screen or if there should be multiple people at the same time. “To do this, the Meeting Owl uses a statistical model to optimize how it decides to display the various people,” he explained. “Of course, people move, so all this is done dynamically on the fly, all with the goal of optimizing the experience for the remote user.”
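Schnittman did not describe the statistical model itself, so the following Python sketch is only one plausible way to approach the problem, not the Meeting Owl’s method. It assumes each frame reports who is currently speaking, keeps a smoothed activity score per person, and shows everyone whose score is above a threshold. The names `update_scores` and `choose_layout`, and the values of `alpha` and `threshold`, are hypothetical.

```python
def update_scores(scores, active, alpha=0.5):
    """Exponentially smooth each person's speaking activity.

    `scores` maps a person to a value between 0 and 1; `active` is the
    set of people detected as speaking in the current frame.
    """
    people = set(scores) | set(active)
    return {
        p: (1 - alpha) * scores.get(p, 0.0) + alpha * (1.0 if p in active else 0.0)
        for p in people
    }


def choose_layout(scores, threshold=0.3):
    """Return the people to show; an empty list means a wide room view."""
    return sorted(p for p, s in scores.items() if s >= threshold)


def run(frames):
    """Feed a sequence of frames through the model and return the final layout."""
    scores = {}
    for active in frames:
        scores = update_scores(scores, active)
    return choose_layout(scores)


# One person talking, then a back-and-forth between two people.
monologue = [{"alice"}, {"alice"}]
dialogue = monologue + [{"bob"}, {"alice"}, {"bob"}]
print(run(monologue))  # a single view
print(run(dialogue))   # a split view of both speakers
```

The smoothing is what keeps the picture from flickering: a brief interjection raises a person’s score only a little, while a sustained exchange keeps both speakers on screen.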

What’s interesting about this technique is that the users in the room barely notice it. With the best AI, that’s always the goal. The device thinks like a human, showing the right person (or group) at the right time to match the flow of conversation. To those in the room, it almost seems as if a person is pointing a camera at whoever is speaking, whether that is one person or several.

It’s a good use case: no one has to tweak the software before the meeting, and no one has to push a button on a remote. The meeting flows. From all indications, this is how many gadgets will operate in the near future. It will seem like someone is behind the curtain, making our gadgets function more efficiently. Now, if someone could just create a device for that one guy in marketing who is constantly checking his phone.
