Baidu Research's breast cancer detection algorithm outperforms human pathologists

Baidu Research today announced it has developed a deep learning algorithm that in initial tests outperforms human pathologists in its ability to identify breast cancer metastasis.

The convolutional neural network was trained by splitting 400 large images into grids of tens of thousands of smaller images, then randomly selecting 200,000 of those smaller images. The algorithm then analyzes each of the smaller images together with its neighbors in the grid to classify them.
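The grid-and-sample step described above can be sketched in a few lines of Python. This is a minimal illustration and not Baidu's code: the function names, the 256-pixel patch size, the slide dimensions, and the fixed random seed are all assumptions made for the example, and it tracks patch coordinates only, without loading any pixel data.

```python
import random


def grid_patch_origins(width, height, patch_size):
    """Return top-left (x, y) coordinates of non-overlapping patches tiling an image."""
    return [
        (x, y)
        for y in range(0, height - patch_size + 1, patch_size)
        for x in range(0, width - patch_size + 1, patch_size)
    ]


def sample_patches(slides, patch_size, n_samples, seed=0):
    """Pool patch origins across all slides, then draw n_samples without replacement."""
    pool = [
        (slide_id, x, y)
        for slide_id, (width, height) in slides.items()
        for (x, y) in grid_patch_origins(width, height, patch_size)
    ]
    rng = random.Random(seed)  # fixed seed (an assumption) so the draw is repeatable
    return rng.sample(pool, min(n_samples, len(pool)))


# Hypothetical toy slides: real whole-slide images are far larger.
slides = {"slide_001": (4096, 2048), "slide_002": (3072, 3072)}
picked = sample_patches(slides, patch_size=256, n_samples=10)
for slide_id, x, y in picked:
    print(slide_id, x, y)
```

Sampling coordinates first and reading pixels later keeps memory use low, which matters when each source image can be gigabytes in size.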

A variety of algorithms have been introduced to help pathologists examine images that can be gigabytes in size by cutting them into smaller parts. Baidu Research's algorithm attempts to move this technique forward by mimicking a pathologist's method to examine the area surrounding a breast cancer tumor cell, at once examining individual cells and nearby cells.

"Our innovation and our algorithm, we've taken this grid of images and then we are jointly predicting each one of them are tumor cells or normal cells ... by modeling their spatial correlations. And because of knowing the spatial correlations between each of these patches, the algorithm can make much more confident predictions," Yi Li, a machine learning research scientist at Baidu's Silicon Valley Artificial Intelligence Lab, told VentureBeat in a phone interview.
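The article does not say how Baidu models these spatial correlations, so the following Python sketch substitutes a much simpler stand-in: blending each patch's predicted tumor probability with the average of its four grid neighbors. The function name, the `weight` parameter, and the toy grid are assumptions for illustration only. It shows the general intuition that an isolated prediction gets pulled toward what its neighbors suggest.

```python
def smooth_with_neighbors(probs, weight=0.5, iterations=1):
    """Blend each patch probability with the mean of its 4-connected neighbors.

    probs: 2D list of per-patch tumor probabilities (hypothetical classifier output).
    weight: how much influence the neighbors have (an assumed tuning knob).
    """
    rows, cols = len(probs), len(probs[0])
    current = [row[:] for row in probs]
    for _ in range(iterations):
        updated = [row[:] for row in current]
        for r in range(rows):
            for c in range(cols):
                neighbors = [
                    current[nr][nc]
                    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= nr < rows and 0 <= nc < cols
                ]
                if neighbors:
                    mean = sum(neighbors) / len(neighbors)
                    updated[r][c] = (1 - weight) * current[r][c] + weight * mean
        current = updated
    return current


# Toy 3x3 grid: the center patch disagrees with everything around it.
grid = [
    [0.9, 0.9, 0.9],
    [0.9, 0.2, 0.9],
    [0.9, 0.9, 0.9],
]
smoothed = smooth_with_neighbors(grid)
print(round(smoothed[1][1], 2))
```

In this toy grid the outlying center value moves toward its neighbors while the corners are unchanged. A learned model of spatial correlation would be more selective than this fixed averaging.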

On the FROC score, a measurement that averages detection sensitivity at six predefined false-positive rates per slide, Baidu's algorithm achieves a score of 80.9, higher than the 72.4 average from human pathologists and the 80.74 of the winner of the Camelyon16 challenge, a competition organized by the Consortium for Open Medical Image Computing. A cancer detection algorithm released last year by Google AI research achieved a FROC score of 89.
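To make the metric concrete, here is a simplified Python sketch of a FROC-style score. It assumes the six false-positive rates are 1/4, 1/2, 1, 2, 4, and 8 per slide, as used in the Camelyon16 evaluation; the article itself does not list them. The function name and input format are hypothetical, and the sketch takes the best sensitivity reached at or below each rate without interpolating or handling tied confidence values, so it is not a substitute for the official evaluation script.

```python
def froc_score(detections, n_lesions, n_slides,
               fp_rates=(0.25, 0.5, 1, 2, 4, 8)):
    """Average sensitivity at the given false-positive rates per slide.

    detections: list of (confidence, lesion_id) pairs pooled over all slides;
                lesion_id is None when the detection hits no real lesion.
    """
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    found = set()
    false_positives = 0
    curve = [(0.0, 0.0)]  # (false positives per slide, sensitivity)
    for _, lesion_id in ordered:
        if lesion_id is None:
            false_positives += 1
        else:
            found.add(lesion_id)
        curve.append((false_positives / n_slides, len(found) / n_lesions))

    sensitivities = [
        max(sens for fp, sens in curve if fp <= rate) for rate in fp_rates
    ]
    return sum(sensitivities) / len(sensitivities)


# Toy data: 2 slides, 4 real lesions, 7 detections ranked by confidence.
toy = [
    (0.9, "a"), (0.8, None), (0.7, "b"), (0.6, None),
    (0.5, "c"), (0.4, None), (0.3, None),
]
print(froc_score(toy, n_lesions=4, n_slides=2))
```

Lowering the confidence threshold finds more lesions but also admits more false alarms; averaging over several false-positive budgets rewards models that stay sensitive even when few false alarms are tolerated.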

Images used to train Baidu's algorithm were obtained from the Camelyon16 challenge and include a combination of images with tumors, images without tumors, and images that have not been classified one way or the other.

The ability of algorithms to detect conditions like skin cancer or diabetic retinopathy in medical images at rates equal to or better than medical experts has led some to refer to computer vision as one of the most promising forms of artificial intelligence to date.

To advance development of the technique, which takes into account the health of surrounding cells, Baidu plans to open-source the algorithm, Li said.

"We are actually working on open-sourcing this algorithm so this can benefit the whole medical research community and even the entire health care industry," he said. "In order to really test if these algorithms are actually applicable in clinical settings, we think we need to seek more collaboration with hospitals or other medical resources to evaluate our algorithms in a much larger setting or dataset and different types of cancer cells to see if our algorithm can still hold high levels of accuracy and even outperforming experienced pathologists."

Also open-sourced in recent days was breast density classification with deep convolutional neural networks by researchers from New York University. Breast density is also a critical part of image analysis for breast cancer screenings.
