AI classifies sleep disorders like sleep apnea, hypopnea, and arousal

AI capable of detecting restless sleep is nothing new. In April, researchers at Stanford and Université Paris-Saclay proposed a system that can predict the location, duration, and type of sleep events in EEG charts, and in November Oxford scientists described a framework that could automatically detect REM sleep behavior disorder. But a method described in a preprint paper published on arXiv.org ("SleepNet: Automated sleep disorder detection via dense convolutional neural network") takes a slightly different tack.

Rather than look for patterns of disordered sleep in slices of sensor data, it takes into account a range of data collected during polysomnography (sleep studies). This, the paper's authors say, is what helped it nab first place in Computing in Cardiology's 2018 PhysioNet challenge for detecting sleep arousal.

"Very little research has been done concerning the effect that non-apnea [and] hypopnea arousals have on sleep quality and general health because they are difficult to detect, [and] sleep arousals have been shown to have lower inter-scorer reliability when compared to apnea [and] hypopnea," the paper's authors write. "A more robust method of detecting [sleep] arousals would allow health researchers to determine the effects that these events have on health, as well as develop more effective treatments to reduce their frequency. The purpose of this work is to determine how accurately ... arousals can be detected with the use of deep learning methods."

The team built their AI system on a convolutional neural network, a class of neural networks commonly applied to visual imagery analysis, and added a remapping mechanism to "simplify the network decision-making process." To improve generalization, they used a multitask learning mechanism that looked for correlations among three conditions: arousal, apnea, and hypopnea. To train it, they used the 12 measurement channels provided in the open source PhysioNet challenge corpus, which contains manually annotated polysomnography data from 1,985 patients monitored at Massachusetts General Hospital's sleep laboratory.
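Multitask learning is commonly implemented by training one shared network against a single loss that combines all of the tasks. The article does not say which loss the authors used, so what follows is only a minimal sketch, assuming a binary cross-entropy per condition summed with equal weights. The names `TASKS`, `binary_cross_entropy`, and `multitask_loss` are hypothetical and do not come from the paper.

```python
import math

# The three conditions named in the article.
TASKS = ("arousal", "apnea", "hypopnea")


def binary_cross_entropy(prob, label, eps=1e-7):
    """Cross-entropy for one predicted probability and one 0/1 label."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def multitask_loss(probs, labels, weights=None):
    """Weighted sum of per-task losses (equal weights are an assumption)."""
    weights = weights or {task: 1.0 for task in TASKS}
    return sum(
        weights[task] * binary_cross_entropy(probs[task], labels[task])
        for task in TASKS
    )


# Hypothetical predictions for a single time step of one recording.
loss = multitask_loss(
    probs={"arousal": 0.9, "apnea": 0.1, "hypopnea": 0.2},
    labels={"arousal": 1, "apnea": 0, "hypopnea": 0},
)
```

Because one number drives the update, an error on any of the three conditions adjusts the shared layers, which is how the tasks can inform one another.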

The researchers repeated the full training process four times, once for each of four different folds of training and validation data. Each fold used 794 records for training and 100 for validation, and the same 100 test records were held out across all folds. They then averaged the outputs of the resulting models to obtain a final prediction.
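The split sizes add up to 994 records (794 + 100 + 100). As a rough illustration of that scheme, the sketch below holds out one fixed test set, builds four train/validation folds, and averages the per-sample outputs of four models. How the authors chose each validation set is not stated, so the non-overlapping slices used here are an assumption, and `make_folds` and `average_predictions` are hypothetical names.

```python
import random
from statistics import fmean


def make_folds(record_ids, n_folds=4, n_val=100, n_test=100, seed=0):
    """Hold out one fixed test set, then build n_folds train/validation splits."""
    ids = list(record_ids)
    random.Random(seed).shuffle(ids)
    test, rest = ids[:n_test], ids[n_test:]
    folds = []
    for k in range(n_folds):
        # Assumption: each fold validates on a different slice of the rest.
        val = rest[k * n_val:(k + 1) * n_val]
        val_set = set(val)
        train = [r for r in rest if r not in val_set]
        folds.append({"train": train, "val": val, "test": test})
    return folds


def average_predictions(per_model_probs):
    """Average per-sample probabilities across models with equal weights."""
    return [fmean(sample) for sample in zip(*per_model_probs)]


folds = make_folds(range(994))
fold_sizes = [(len(f["train"]), len(f["val"]), len(f["test"])) for f in folds]

# Hypothetical outputs of four models on two samples.
averaged = average_predictions(
    [[0.2, 0.8], [0.4, 0.6], [0.0, 1.0], [0.6, 0.4]]
)
```

Every entry in `fold_sizes` comes out as 794 training, 100 validation, and 100 test records, and all four folds share the same test list.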

In experiments, the researchers found that an ensemble strategy (that is, one that tapped multiple trained models) improved performance compared with several single-model strategies. It wasn't perfect: it sometimes overestimated apnea-hypopnea severity. But they claim it was able to predict arousal, apnea, and hypopnea accurately enough to generate a sleep monitoring report with "sufficiently low estimation errors."
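To see why overestimation matters, it helps to look at how apnea-hypopnea severity is usually summarized. The sketch below assumes the standard apnea-hypopnea index (apneas plus hypopneas per hour of sleep) and the commonly used adult cutoffs of 5, 15, and 30 events per hour. The article does not say which severity measure the report uses, and the event counts are invented for illustration.

```python
def apnea_hypopnea_index(n_apneas, n_hypopneas, sleep_hours):
    """Apnea and hypopnea events per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return (n_apneas + n_hypopneas) / sleep_hours


def severity(ahi):
    """Map an index value to a category using common adult cutoffs."""
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


# Hypothetical night: manually scored events versus model-detected events.
scored = apnea_hypopnea_index(n_apneas=40, n_hypopneas=58, sleep_hours=7.0)
predicted = apnea_hypopnea_index(n_apneas=45, n_hypopneas=67, sleep_hours=7.0)
labels = (severity(scored), severity(predicted))
```

In this made-up case, 14 extra detected events over seven hours raise the index from 14 to 16, enough to move the label from mild to moderate.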
