Microsoft's new settings let users contribute recordings to improve its speech recognition systems

Microsoft today announced it will give customers finer-grained control over whether their voice data is used to improve its speech recognition products. The new policy will allow customers to decide if reviewers, including Microsoft employees and contractors, can listen to recordings of what they said while speaking to Microsoft products and services that use speech recognition technology, including Microsoft Translator, SwiftKey, Windows, Cortana, HoloLens, Mixed Reality, and Skype voice translation.

Maintaining privacy when it comes to voice recognition is a challenging task, given that state-of-the-art AI techniques have been used to infer attributes like intention, gender, emotional state, and identity from timbre, pitch, and speaker style. Recent reporting revealed that accidental voice assistant activations exposed private conversations, and a study by Clemson University School of Computing researchers found that Amazon Alexa and Google Assistant voice app privacy policies are often "problematic" and violate baseline requirements. The risk is such that law firms including Mishcon de Reya have advised staff to mute smart speakers when they talk about client matters at home.

Microsoft stopped storing voice clips processed by its speech recognition technologies on October 30, and Google Assistant, Siri, Cortana, Alexa, and other major voice recognition platforms allow users to delete recorded data. But this requires some (and in several cases, substantial) effort. That's why over the next few months, Microsoft says it'll roll out new settings for voice clip review across all of its applicable products. If customers choose to opt in, the company says people may review these clips to improve the performance of Microsoft's AI systems "across a diversity of people, speaking styles, accents, dialects, and acoustic environments."

"The goal is to make Microsoft’s speech recognition technologies more inclusive by making them easier and more natural to interact with," Microsoft wrote in a pair of blog posts published this morning. "Voice clips will be de-identified as they are stored -- they won't be associated with [a] Microsoft account or any other Microsoft IDs that could tie them back to [a customer]. New voice data will no longer show up in [the] Microsoft account privacy dashboard."

If a customer chooses to let Microsoft employees or contractors listen to their recordings to improve the company's technology, in part by manually transcribing what they hear, Microsoft says it will retain the data for up to two years. If a contributed voice clip is sampled for transcription, the company says it might retain it for more than two years to "continue training and improving the quality of speech recognition AI."

Microsoft says that customers who choose not to contribute their voice clips for review will still be able to use its voice-enabled products and services. However, the company reserves the right to continue accessing information associated with user voice activity, such as the transcriptions automatically generated during user interactions with speech recognition AI.

Tech giants including Apple and Google have been the subject of reports uncovering the potential misuse of recordings collected to improve assistants such as Siri and Google Assistant. In April 2019, Bloomberg revealed that Amazon employs contract workers to annotate thousands of hours of audio from Alexa-powered devices, prompting the company to roll out user-facing tools that quickly delete cloud-stored data. And in July 2019, a third-party contractor leaked Google Assistant voice recordings for users in the Netherlands that contained personally identifiable data, like names, addresses, and other private information. Following the latter revelation, a German privacy authority briefly ordered Google to stop harvesting voice data in Europe for human reviewers.

For its part, Microsoft says it removes certain personal information from voice clips as they're processed in the cloud, including strings of letters or numbers that could be telephone numbers, social security numbers, and email addresses. Moreover, the company says it doesn't use human reviewers to listen to audio collected from speech recognition features built into its enterprise offerings.
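
Microsoft hasn't detailed how this scrubbing works. As a rough illustration only, a pattern-based pass over a text transcript might look something like the Python sketch below. The patterns, the placeholder tags, and the `redact` function name are all assumptions made for this example, not Microsoft's implementation, and the sketch operates on transcribed text rather than on audio.

```python
import re

# Hypothetical patterns chosen for illustration; a production system
# would need far more robust detection than these simple expressions.
PATTERNS = [
    # Email addresses such as jane.doe@example.com
    ("[EMAIL]", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    # U.S. social security numbers written as 123-45-6789
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    # 10-digit North American phone numbers with optional +1 and separators
    ("[PHONE]", re.compile(
        r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
    )),
]


def redact(text: str) -> str:
    """Replace each match of the illustrative patterns with its tag."""
    for tag, pattern in PATTERNS:
        text = pattern.sub(tag, text)
    return text


if __name__ == "__main__":
    print(redact("Call me at 555-123-4567 or email jane.doe@example.com"))
```

A rule-based pass like this only catches strings that fit a known format, which is consistent with the company's wording that it removes "certain" personal information rather than all of it.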

Increasingly, privacy isn't merely a question of philosophy, but table stakes in the course of business. Laws at the state, local, and federal levels aim to make privacy a mandatory part of compliance management. Hundreds of bills that address privacy, cybersecurity, and data breaches are pending or have already been passed in all 50 U.S. states, territories, and the District of Columbia. Arguably the most comprehensive of them all -- the California Consumer Privacy Act -- was signed into law roughly two years ago. That's not to mention the Health Insurance Portability and Accountability Act (HIPAA), which requires companies to seek authorization before disclosing individual health information. And international frameworks like the EU's General Data Protection Regulation (GDPR) aim to give consumers greater control over personal data collection and use.