Google today open-sourced Lyra in beta, an audio codec that uses machine learning to produce high-quality voice calls. The code and demo, which are available on GitHub, compress raw audio down to 3 kilobits per second for “quality that compares favorably to other codecs,” Google says.
While mobile connectivity has steadily increased over the past decade, the explosive growth of on-device compute power has outstripped access to reliable, fast internet. Even in areas with reliable connections, the emergence of work-from-anywhere and telecommuting have stretched data limits. For example, early in the pandemic, nearly 90 out of the top 200 U.S. cities saw internet speeds decline as bandwidth became strained, according to BroadbandNow.
It’s Google’s assertion that Lyra can make a difference in these scenarios.
Lyra’s architecture is separated into two pieces, an encoder and decoder. When someone talks into their phone, the encoder captures distinctive attributes, called features, from their speech. Lyra extracts these features in 40-millisecond chunks and then compresses and sends them over the network. It’s the decoder’s job to convert the features back into an audio waveform that can be played out over the listener’s phone.
An exclusive invite-only evening of insights and networking, designed for senior enterprise executives overseeing data stacks and strategies.
According to Google, Lyra’s architecture is similar to traditional audio codecs, which form the backbone of internet communication. But while these traditional codecs are based on digital signal processing techniques, the key advantage for Lyra comes from the ability of its decoder to reconstruct a high-quality signal.
Google believes there are a number of applications Lyra might be uniquely suited to, from archiving large amounts of speech and saving battery to alleviating network congestion in emergency situations.
“We are excited to see the creativity the open source community is known for applied to Lyra in order to come up with even more unique and impactful applications,” Google Chrome engineers Andrew Storus and Michael Chinen wrote in a blog post. “We [want] to enable developers and get feedback as soon as possible.”
The Lyra code is written in C++ using the Bazel build framework. The core API provides an interface for encoding and decoding at the file and packet levels, and the complete signal processing toolchain is provided, which includes filters as well as transforms. Google’s example code integrates with the Android NDK to show how Lyra can work with Java-based Android apps, and Google has also provided the weights and vector quantizers necessary to run Lyra.
“This release provides the tools needed for developers to encode and decode audio with Lyra, optimized for the 64-bit ARM android platform, with development on Linux,” Storus and Chinen continued. “We hope to expand this codebase and develop improvements and support for additional platforms in tandem with the community.”
VentureBeat's mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact. Discover our Briefings.