Baidu releases PaddlePaddle 1.6 with federated learning tools and more

Baidu's PaddlePaddle (PArallel Distributed Deep LEarning), a platform originally developed by Baidu scientists for the purpose of applying AI to products internally, was open-sourced in September 2016. Since then, the capabilities on offer have grown substantially, and today it gained 21 new features intended to "improve usability" and "accelerate [the] widespread ... deployment" of AI.

"[Machine] learning is highly generalized, standardized, automatic, and modularized, bringing AI from laboratory to industrial scale," said Baidu CTO Haifeng Wang, who revealed that more than 1.5 million developers have used PaddlePaddle since its release, including engineers at Chao Fang and Guangdong Power Grid Company. "We will continue to open-source PaddlePaddle and drive technological development, industrial innovation, and social progress together with developers."

Chief among the enhancements is perhaps Paddle Lite 2.0, the second generation of the Paddle Lite module Baidu released last year. It's tailored for inference on mobile, embedded, and internet of things devices, and it's compatible with both PaddlePaddle models and pretrained models from other sources. Now, Paddle Lite lets developers implement ResNet-50 -- a popular image recognition AI model -- with roughly seven lines of code while supporting edge-based field-programmable gate arrays (FPGAs) and low-precision inference using operators with the INT8 data type.
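For readers unfamiliar with low-precision inference, the idea behind INT8 operators can be sketched in a few lines. The snippet below is a generic illustration of symmetric per-tensor quantization, not Paddle Lite's implementation; the function names (`quantize_int8`, `dequantize_int8`, `int8_dot`) are hypothetical and do not come from the PaddlePaddle API.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale.

    Assumption: symmetric per-tensor quantization; real toolkits may use
    per-channel scales or calibration data instead.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the INT8 representation."""
    return [q * scale for q in quantized]


def int8_dot(a_q, a_scale, b_q, b_scale):
    """Dot product accumulated in integers, rescaled to float at the end."""
    accumulator = sum(x * y for x, y in zip(a_q, b_q))
    return accumulator * a_scale * b_scale


weights = [0.82, -0.41, 0.05, -1.20]
weights_q, weights_scale = quantize_int8(weights)
restored = dequantize_int8(weights_q, weights_scale)
```

The payoff on edge hardware is that the multiply-accumulate work happens on small integers, with only a single floating-point rescale per output; the cost is a rounding error of at most half a quantization step per value.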

On the development kit side, PaddlePaddle now packs four tools in total: ERNIE for semantic understanding (NLP), PaddleDetection and PaddleSeg for computer vision (CV), and Elastic CTR for recommendation. By way of a refresher, ERNIE is a pretraining framework for semantic understanding that incrementally gains knowledge through multi-task learning, while PaddleSeg is an image segmentation library supporting tasks from data augmentation to modular design. PaddleDetection, an object detection suite, has been upgraded with the addition of over 60 models. As for the newly released Elastic CTR, it serves parameter deployment forecasts and provides process documentation for distributed training on Kubernetes.

PaddlePaddle 1.6 also ships with a novel framework -- Paddle Graph Learning (PGL) -- for heterogeneous graph learning on both walk-based and message passing-based paradigms, boosting the number of graph learning models that PaddlePaddle supports to 13. Plus, there's the PaddleFL federated learning framework, which taps the open source FedAvg and differential privacy-based SGD algorithms to enable distributed model training on a corpus of decentralized data.
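The two algorithms named above can be sketched briefly. This is a generic, minimal illustration of federated averaging and of the clip-and-noise step used in differentially private SGD, not PaddleFL's code; the function names and the flat-list model representation are assumptions made for this example.

```python
import math
import random


def federated_average(client_updates):
    """FedAvg aggregation: weight each client's model by its sample count.

    client_updates is a list of (weights, num_samples) pairs, where weights
    is a flat list of floats (an assumption made to keep the sketch short).
    """
    total_samples = sum(n for _, n in client_updates)
    if total_samples <= 0:
        raise ValueError("at least one client must contribute samples")
    averaged = [0.0] * len(client_updates[0][0])
    for weights, num_samples in client_updates:
        share = num_samples / total_samples
        for i, w in enumerate(weights):
            averaged[i] += w * share
    return averaged


def clip_and_add_noise(gradient, clip_norm, noise_std, rng):
    """DP-SGD style step: bound the gradient's L2 norm, then add Gaussian noise."""
    norm = math.sqrt(sum(g * g for g in gradient))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * factor + rng.gauss(0.0, noise_std) for g in gradient]


# Two hypothetical clients; the second holds three times as much data.
global_weights = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
private_gradient = clip_and_add_noise([3.0, 4.0], 1.0, 0.1, random.Random(0))
```

In both cases the raw training data stays on the client: the server sees only model weights or gradients, and the clipping plus noise limits how much any single example can influence what is shared.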

The new PaddlePaddle release includes an upgraded version of EasyDL, a platform with a drag-and-drop interface that's been used by more than 65,000 enterprises to build over 169,000 models in manufacturing, agriculture, service industries, and more. EasyDL Pro -- the newest iteration -- is a one-stop development platform for engineers looking to deploy algorithms with fewer lines of code. As for the new Master mode, it's designed to help developers better customize models for tasks using a library of pretrained models and tools for transfer learning.

Baidu also open-sourced PALM, which won the Machine Reading for Question Answering (MRQA) 2019 championship, and added support in PaddlePaddle for more operators and APIs and over 100 new models across natural language processing, computer vision, speech, and recommendation. PARL, a high-performance distributed training framework for reinforcement learning (RL), has been upgraded with more parallel mechanisms and support for parallel evolutionary algorithms. Lastly, PaddleSlim (a PaddlePaddle module for model compression) gained a quantization training function and a hardware-based small model search capability, coinciding with the arrival of an auto fine-tune function for hyperparameter optimization in PaddleHub (a toolkit for managing pretrained models).

"By providing hardware support, cloud-to-edge deployment, development kits, and Master mode, we've significantly improved PaddlePaddle's performance and feature set," said executive director of Baidu AI Group Tian Wu. "In the future, PaddlePaddle will keep advancing large-scale distributed computing and heterogeneous computing, providing the most powerful production platform and infrastructure for developers to accelerate the development of intelligent industries."

Baidu has made its AI ambitions clear in the two years since it launched Baidu Brain, its eponymous platform for enterprise AI. Last year, Baidu took the wraps off Kunlun, an AI chip designed to handle models for on-device edge computing and datacenter processing that's able to achieve up to 260 tera-operations per second (TOPS) and 512GB/second memory bandwidth. More recently, Baidu announced that its conversational AI assistant DuerOS has reached an install base of 400 million devices, up from 150 million last November. To date, more than 200 partners have launched 110 DuerOS-powered devices and about 16,000 software developers are actively contributing to its apps marketplace.