Android gains support for hardware-accelerated PyTorch inference

Google's Android team today unveiled a prototype feature that allows developers to use hardware-accelerated inference with Facebook's PyTorch machine learning framework. This lets more developers use the Android Neural Networks API (NNAPI) to run computationally intensive AI models on-device. Google says the partnership between the Android team and Facebook will allow millions of Android users to benefit from experiences powered by real-time computer vision and audio enhancement models, like Facebook Messenger's 360-degree virtual backgrounds.

On-device machine learning can power features that run locally without transferring data to a remote server. Processing the data on-device lowers latency, can improve privacy, and allows apps to work without connectivity.

The latest release of the NNAPI, which is designed to run AI operations on Android devices, provides a single set of APIs to take advantage of accelerators like graphics processing units (GPUs), digital signal processors (DSPs), and neural processing units (NPUs). Android 11 marked the launch of NNAPI 1.3, which added support for quality of service APIs, memory domains, and more. The PyTorch prototype builds on NNAPI's support for over 100 operations, floating point and quantized data types, and hardware implementations from partners across the Android ecosystem.
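
For readers unfamiliar with the term, "quantized data types" generally means storing tensor values as small integers plus a scale and a zero point, so that a real value is approximately (integer - zero point) x scale. The Python sketch below illustrates that idea for unsigned 8-bit values. It is an illustration only: the function names are hypothetical and are not part of NNAPI or PyTorch, and real frameworks choose the parameters and handle rounding in their own ways.

```python
def choose_params(lo, hi):
    """Pick a scale and zero point for mapping [lo, hi] onto 0..255.

    The range is widened to include 0.0 so that zero is exactly representable.
    """
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # degenerate range (all zeros); any positive scale works
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point):
    """Map floats to integers in 0..255: q = round(x / scale) + zero_point."""
    return [max(0, min(255, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Recover approximate floats: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in q_values]


# Hypothetical example values, not taken from any real model.
weights = [-1.0, 0.0, 0.5, 3.0]
scale, zero_point = choose_params(min(weights), max(weights))
quantized = quantize(weights, scale, zero_point)
restored = dequantize(quantized, scale, zero_point)
print(quantized)
print(restored)
```

Each value now occupies one byte instead of four, at the cost of a rounding error of at most half the scale. That trade is what lets accelerators process quantized models faster and with less power.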

Facebook says it saw a twofold speedup and a twofold reduction in power requirements using the NNAPI prototype feature for PyTorch. This is in addition to offloading work from the processor, which frees the processor to perform other tasks.

The NNAPI can be accessed via an Android C API or a higher-level framework like Google's TensorFlow Lite. The initial release of the PyTorch prototype includes support for well-known linear convolutional and multilayer perceptron models on Android 10 and above. Performance testing using the popular MobileNetV2 computer vision model showed a roughly 10 times speedup compared with single-threaded execution on the processor. As part of work toward a stable release, Google says future updates will incorporate additional operators and model architectures, including Mask R-CNN, an object detection and instance segmentation model.

The PyTorch-focused enhancements on Android follow the debut of PyTorch 1.4, which introduced a framework for distributed model parallel training and Java support for PyTorch inference based on the PyTorch Mobile for Android interface. At release, the experimental Java support was available only for Linux and only for inference. (PyTorch currently supports Python and C++.) PyTorch Mobile for iOS and Android devices launched last fall as part of the rollout of PyTorch 1.3, with speed gains coming from quantization, Google TPU support, and a JIT compiler upgrade.
