Chinese Web company Baidu announced today that it is releasing key artificial intelligence (AI) software under an open-source Apache license. The WARP-CTC C library and optional Torch bindings are now available on GitHub from Baidu Research’s Silicon Valley AI Lab (SVAIL).
The connectionist temporal classification (CTC) approach dates back to 2006, when it was documented in a paper from the Swiss AI lab IDSIA. Baidu Research developed WARP-CTC on top of that technology in order to improve its own speech recognition capability.
“We found that currently available implementations of CTC generally required significantly more memory and/or were tens to hundreds of times slower,” the Baidu Research team wrote in a blog post on the news.
The CTC approach is a way of training recurrent neural networks (RNNs) to label sequences, such as speech audio, without a pre-aligned transcript. RNNs are an increasingly common component of a type of AI called deep learning, and they have been shown to work well even in noisy environments.
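For readers curious about what a CTC library actually computes, here is a minimal sketch in plain Python of the forward recursion described in the 2006 paper. It is not Baidu's code and does not use the WARP-CTC API; the function names `ctc_likelihood` and `ctc_loss` are hypothetical, and the sketch assumes the network's per-frame output probabilities are already available as a list of lists, with index 0 reserved for the blank symbol. Production implementations typically work in log space and in batches, which this sketch skips for readability.

```python
import math


def ctc_likelihood(probs, labels, blank=0):
    """Probability that the per-frame outputs collapse to `labels`.

    probs  -- list of T frames, each a list of probabilities per symbol
    labels -- target label indices (no blanks)
    blank  -- index of the blank symbol (assumed to be 0 here)
    """
    if not probs:
        raise ValueError("need at least one frame")

    # Interleave blanks around every label: [a, b] -> [_, a, _, b, _]
    ext = [blank]
    for lab in labels:
        ext.extend([lab, blank])
    size = len(ext)

    # alpha[s] = total probability of all partial paths ending at ext[s]
    alpha = [0.0] * size
    alpha[0] = probs[0][blank]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        prev = alpha
        alpha = [0.0] * size
        for s in range(size):
            total = prev[s]                  # stay on the same symbol
            if s >= 1:
                total += prev[s - 1]         # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * probs[t][ext[s]]

    # Valid paths end on the last label or on the trailing blank.
    likelihood = alpha[size - 1]
    if size > 1:
        likelihood += alpha[size - 2]
    return likelihood


def ctc_loss(probs, labels, blank=0):
    """Negative log-likelihood; infinite if no valid alignment exists."""
    likelihood = ctc_likelihood(probs, labels, blank)
    return math.inf if likelihood == 0.0 else -math.log(likelihood)
```

As a quick check, with two frames that each assign probability 0.5 to blank and 0.5 to a single symbol, `ctc_likelihood([[0.5, 0.5], [0.5, 0.5]], [1])` returns 0.75, because three of the four possible two-frame paths collapse to that one symbol.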
Andrew Ng, Baidu Research’s chief scientist, is noted for his research on artificial neural networks running on graphics processing units (GPUs), and indeed WARP-CTC runs on GPUs and x86 CPUs alike.
Facebook, Google, and Microsoft, among others, have open-sourced their AI software as well. Recently Facebook went so far as to share its AI server hardware designs with the public. Today’s move marks a big step forward for Baidu in sharing its knowledge beyond academic papers.
“A lot of open source software for deep learning exists, but previous code for training end-to-end networks for sequences (like our Deep Speech engine) has been too slow,” Baidu wrote. “We want to start contributing to the machine learning community by sharing an important piece of code that we created.”