SAN FRANCISCO — Here in the U.S., we like our Google search box. We type into it almost without thinking, and we revise our searches again and again. It feels natural.
But in the future, we might depend on additional interfaces to find information on the Internet, namely by speaking and by taking pictures of the things around us. Large groups of computers in the cloud can now understand the words and sentences we dictate into our phones and identify objects that appear in our photographs.
And Baidu — the second-largest web search provider in the world, with its biggest user base in its home country of China — has been preparing its systems for a time when text will be just another option for searching, and not necessarily the default.
“In five years, we think 50 percent of queries will be on speech or images,” Andrew Ng, Baidu’s chief scientist and the head of Baidu Research, said Wednesday during a Gigaom meetup on his area of expertise, deep learning.
A type of artificial intelligence, deep learning involves training systems called artificial neural networks on lots of information derived from audio, images, and other inputs, and then presenting the systems with new information and receiving inferences about it in response.
At Baidu in particular, deep learning has informed speech recognition, image search, web ranking, and advertising systems, said Ng, who joined Baidu from Google earlier this year.
Making Baidu’s neural networks more accurate — a key focus for Ng now — could yield more effective searches in countries where significant percentages of populations are illiterate.
“Speech and images are, in my view, a much more natural way to communicate [than text],” Ng said.
Indeed, one out of every 10 queries Baidu receives already comes through speech, he said. And pointing a smartphone camera at a handbag might identify a particular model more quickly than endlessly rephrasing a typed query. As Ng put it, “It’s easier to show us a picture.”
But regardless of the ways consumers attempt to search the web, it sounds like the underlying technology will matter a lot.
“I think that whoever wins AI will win the Internet,” Ng said.