SUNNYVALE, California — Chinese tech company Baidu has yet to make its popular search engine and other web services available in English. But consider yourself warned: Baidu could someday wind up becoming a favorite among consumers.
The strength of Baidu lies not in youth-friendly marketing or an enterprise-focused sales team. It lives instead in Baidu’s data centers, where servers run complex algorithms on huge volumes of data and gradually make its applications smarter, including not just Web search but also Baidu’s tools for music, news, pictures, video, and speech recognition.
Despite lacking the visibility (in the U.S., at least) of Google and Microsoft, in recent years Baidu has done a lot of work on deep learning, one of the most promising areas of artificial intelligence (AI) research in recent years. This work involves training systems called artificial neural networks on lots of information derived from audio, images, and other inputs, and then presenting the systems with new information and receiving inferences about it in response.
Two months ago, Baidu hired Andrew Ng away from Google, where he started and led the so-called Google Brain project. Ng, whose move to Baidu follows Hugo Barra’s jump from Google to Chinese company Xiaomi last year, is one of the world’s handful of deep-learning rock stars.
Ng has taught classes on machine learning, robotics, and other topics at Stanford University. He also co-founded massively open online course startup Coursera.
He makes a strong argument for why a person like him would leave Google and join a company with a lower public profile. His argument can leave you feeling like you really ought to keep an eye on Baidu in the next few years.
“I thought the best place to advance the AI mission is at Baidu,” Ng said in an interview with VentureBeat.
Baidu’s search engine only runs in a few countries, including China, Brazil, Egypt, and Thailand. The Brazil service was announced just last week. Google’s search engine is far more popular than Baidu’s around the globe, although Baidu has already beaten out Yahoo and Microsoft’s Bing in global popularity, according to comScore figures.
And Baidu co-founder and chief executive Robin Li, a frequent speaker on Stanford’s campus, has said he wants Baidu to become a brand name in more than half of all the world’s countries. Presumably, then, Baidu will one day become something Americans can use.
Now that Ng leads Baidu’s research arm as the company’s chief scientist out of the company’s U.S. R&D Center here, it’s not hard to imagine that Baidu’s tools in English, if and when they become available, will be quite brainy — perhaps even eclipsing similar services from Apple and other tech giants. (Just think of how many people are less than happy with Siri.)
A stable full of AI talent
But this isn’t a story about the difference a single person will make. Baidu has a history in deep learning.
A couple years ago, Baidu hired Kai Yu, a engineer skilled in artificial intelligence. Based in Beijing, he has kept busy.
“I think Kai ships deep learning to an incredible number of products across Baidu,” Ng said. Yu also developed a system for providing infrastructure that enables deep learning for different kinds of applications.
“That way, Kai personally didn’t have to work on every single application,” Ng said.
In a sense, then, Ng joined a company that had already built momentum in deep learning. He wasn’t starting from scratch.
Only a few companies could have appealed to Ng, given his desire to push artificial intelligence forward. It’s capital-intensive, as it requires lots of data and computation. Baidu, he said, can provide those things.
Baidu is nimble, too. Unlike Silicon Valley’s tech giants, which measure activity in terms of monthly active users, Chinese Internet companies prefer to track usage by the day, Ng said.
“It’s a symptom of cadence,” he said. “What are you doing today?” And product cycles in China are short; iteration happens very fast, Ng said.
Plus, Baidu is willing to get infrastructure ready to use on the spot.
“Frankly, Kai just made decisions, and it just happened without a lot of committee meetings,” Ng said. “The ability of individuals in the company to make decisions like that and move infrastructure quickly is something I really appreciate about this company.”
That might sound like a kind deference to Ng’s new employer, but he was alluding to a clear advantage Baidu has over Google.
“He ordered 1,000 GPUs [graphics processing units] and got them within 24 hours,” Adam Gibson, co-founder of deep-learning startup Skymind, told VentureBeat. “At Google, it would have taken him weeks or months to get that.”
Not that Baidu is buying this type of hardware for the first time. Baidu was the first company to build a GPU cluster for deep learning, Ng said — a few other companies, like Netflix, have found GPUs useful for deep learning — and Baidu also maintains a fleet of servers packing ARM-based chips.
Now the Silicon Valley researchers are using the GPU cluster and also looking to add to it and thereby create still bigger artificial neural networks.
But the efforts have long since begun to weigh on Baidu’s books and impact products. “We deepened our investment in advanced technologies like deep learning, which is already yielding near term enhancements in user experience and customer ROI and is expected to drive transformational change over the longer term,” Li said in a statement on the company’s earnings the second quarter of 2014.
Next step: Improving accuracy
What will Ng do at Baidu? The answer will not be limited to any one of the company’s services. Baidu’s neural networks can work behind the scenes for a wide variety of applications, including those that handle text, spoken words, images, and videos. Surely core functions of Baidu like Web search and advertising will benefit, too.
“All of these are domains Baidu is looking at using deep learning, actually,” Ng said.
Ng’s focus now might best be summed up by one word: accuracy.
That makes sense from a corporate perspective. Google has the brain trust on image analysis, and Microsoft has the brain trust on speech, said Naveen Rao, co-founder and chief executive of deep-learning startup Nervana. Accuracy could potentially be the area where Ng and his colleagues will make the most substantive progress at Baidu, Rao said.
Matthew Zeiler, founder and chief executive of another deep learning startup, Clarifai, was more certain. “I think you’re going to see a huge boost in accuracy,” said Zeiler, who has worked with Hinton and LeCun and spent two summers on the Google Brain project.
One thing is for sure: Accuracy is on Ng’s mind.
“Here’s the thing. Sometimes changes in accuracy of a system will cause changes in the way you interact with the device,” Ng said. For instance, more accurate speech recognition could translate into people relying on it much more frequently. Think “Her”-level reliance, where you just talk to your computer as a matter of course rather than using speech recognition in special cases.
“Speech recognition today doesn’t really work in noisy environments,” Ng said. But that could change if Baidu’s neural networks become more accurate under Ng.
Ng picked up his smartphone, opened the Baidu Translate app, and told it that he needed a taxi. A female voice said that in Mandarin and displayed Chinese characters on screen. But it wasn’t a difficult test, in some ways: This was no crowded street in Beijing. This was a quiet conference room in a quiet office.
“There’s still work to do,” Ng said.
‘The future heroes of deep learning’
Meanwhile, researchers at companies and universities have been hard at work on deep learning for decades.
Google has built up a hefty reputation for applying deep learning to images from YouTube videos, data center energy use, and other areas, partly thanks to Ng’s contributions. And recently Microsoft made headlines for deep-learning advancements with its Project Adam work, although Li Deng of Microsoft Research has been working with neural networks for more than 20 years.
In academia, deep learning research groups all over North America and Europe. Key figures in the past few years include Yoshua Bengio at the University of Montreal, Geoff Hinton of the University of Toronto (Google grabbed him last year through its DNNresearch acquisition), Yann LeCun from New York University (Facebook pulled him aboard late last year), and Ng.
But Ng’s strong points differ from those of his contemporaries. Whereas Bengio made strides in training neural networks, LeCun developed convolutional neural networks, and Hinton popularized restricted Boltzmann machines, Ng takes the best, implements it, and makes improvements.
“Andrew is neutral in that he’s just going to use what works,” Gibson said. “He’s very practical, and he’s neutral about the stamp on it.”
Not that Ng intends to go it alone. To create larger and more accurate neural networks, Ng needs to look around and find like-minded engineers.
“He’s going to be able to bring a lot of talent over,” Dave Sullivan, co-founder and chief executive of deep-learning startup Ersatz Labs, told VentureBeat. “This guy is not sitting down and writing mountains of code every day.”
And truth be told, Ng has had no trouble building his team.
“Hiring for Baidu has been easier than I’d expected,” he said.
“A lot of engineers have always wanted to work on AI. … My job is providing the team with the best possible environment for them to do AI, for them to be the future heroes of deep learning.”
The audio problem: Learn how new cloud-based API solutions are solving imperfect, frustrating audio in video conferences. Access here