Microsoft beats Google, Intel, Tencent, and Qualcomm in image recognition competition

Microsoft Research has taken first place in several categories at the sixth annual ImageNet image recognition competition. Technology from Microsoft was able to outperform systems from Google, Intel, Qualcomm, and Tencent, as well as entries from startups and academic labs, according to the results.

The winning system from Microsoft researchers Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun is called "Deep Residual Learning for Image Recognition." (A paper detailing the system has just been published.)

The technology is notable partly because of the sheer depth of its neural networks.

"We train neural networks with depth of over 150 layers," the team wrote in a description of their method. "We propose a 'deep residual learning' framework that eases the optimization and convergence of extremely deep networks. Our 'deep residual nets' enjoy accuracy gains when the networks are substantially deeper than those used previously. Such accuracy gains are not witnessed for many common networks when going deeper."

This research area has become quite popular among technology companies, which are seeking to improve their own internal systems as well as consumer-facing products. The broad category of deep learning, which is at the core of these high-performing networks, involves training artificial neural networks on large sets of data, such as photographs, and then showing them new data to make inferences.
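
The team's quote does not spell out how "deep residual learning" works. In the published paper, the core idea is that a block of layers learns a correction F(x) that is added back to its own input x, rather than learning the whole transformation from scratch. The following is a minimal sketch of that idea in plain Python, not Microsoft's implementation; the function names, the two-layer block, and the toy weights are all illustrative assumptions.

```python
# Illustrative sketch of a residual (shortcut) connection.
# Vectors are plain Python lists; every name here is hypothetical.

def relu(v):
    """Element-wise rectifier: negative values become 0."""
    return [max(0.0, x) for x in v]

def linear(weights, v):
    """Multiply a weight matrix (list of rows) by the vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def plain_block(w1, w2, x):
    """Two stacked layers with no shortcut: the output is relu(F(x))."""
    return relu(linear(w2, relu(linear(w1, x))))

def residual_block(w1, w2, x):
    """The same two layers, but the input is added back: relu(F(x) + x)."""
    fx = linear(w2, relu(linear(w1, x)))
    return relu([f + xi for f, xi in zip(fx, x)])

# With all-zero weights the plain block wipes out its input,
# while the residual block simply passes the input through.
zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, 2.0]
print(plain_block(zeros, zeros, x))     # [0.0, 0.0]
print(residual_block(zeros, zeros, x))  # [1.0, 2.0]
```

Because a block that learns nothing still passes its input along unchanged, stacking many such blocks is less likely to degrade the signal, which is one intuition for why very deep versions remain trainable.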

Microsoft has playfully shown off this type of technology through its "How Old Do I Look?" and "How's My Moustache Doing?" apps. It has also been commercializing image recognition technology through Microsoft Research's Project Oxford initiative.

The big competition at ImageNet requires entrants to correctly locate and classify objects in 100,000 photographs from Flickr and various search engines, placing them into 1,000 object categories (tarantula, iPod, mosque, toy shop, modem, and so on) with as few errors as possible.

Microsoft's winning entry had a classification error rate of 3.5 percent and a localization error rate of 9 percent.
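
The results as quoted here do not say how the classification error is computed. ImageNet classification results are conventionally reported as top-5 error: a prediction counts as correct if the true label appears among a system's five highest-ranked guesses. Assuming that convention applies to the figure above, a rough sketch of the calculation looks like this. The function name and the toy data are hypothetical, and the example uses k=2 only to keep the lists short.

```python
# Illustrative top-k error calculation; names and data are made up.

def top_k_error(predictions, labels, k=5):
    """Fraction of images whose true label is not among the top-k guesses.

    predictions: one ranked list of guessed labels per image, best guess first.
    labels: the true label for each image, in the same order.
    """
    misses = sum(
        1 for guesses, truth in zip(predictions, labels)
        if truth not in guesses[:k]
    )
    return misses / len(labels)

# Four toy images, each with three ranked guesses.
predictions = [
    ["tarantula", "modem", "iPod"],
    ["mosque", "toy shop", "modem"],
    ["iPod", "modem", "mosque"],
    ["toy shop", "mosque", "iPod"],
]
labels = ["tarantula", "modem", "modem", "tarantula"]

print(top_k_error(predictions, labels, k=2))  # 0.5
print(top_k_error(predictions, labels, k=1))  # 0.75
```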

In previous years, Google, startup Clarifai, and NEC have come out in front when it comes to classification.

This year, the Microsoft system from He, Zhang, Ren, and Sun also took first place in the ImageNet object detection competition.

"We even didn’t believe this single idea could be so significant," Sun is quoted as saying in a Microsoft blog post.

Baidu does not show up in this year's rankings. The company made more submissions than were permitted, ultimately apologized, and fired the team leader who had directed junior team members to make the excess submissions.

For this competition, IBM provided Nvidia GPUs (graphics processing units) in the SoftLayer public cloud for participating teams to use.
