Voice assistant technology is supposed to make our lives easier, but security experts say it comes with some uniquely invasive risks. Since the beginning of the year, multiple Nest security camera users have reported strangers breaking into their devices. The intruders issued voice commands to Alexa, falsely announced a North Korean missile attack, and targeted one family by speaking directly to their child, turning their home thermostat up to 90 degrees, and shouting insults. These incidents are alarming, but the potential for silent compromises of voice assistants could be even more damaging.
Nest owner Google — which recently integrated Google Assistant support into Nest control hubs — has blamed weak user passwords and a lack of two-factor authentication for the attacks. But even voice assistants with strong security may be vulnerable to stealthier forms of hacking. Over the past couple of years, researchers at universities in the US, China, and Germany have successfully used hidden audio files to make AI-powered voice assistants like Siri and Alexa follow their commands.
These findings highlight the possibility that hackers and fraudsters could hijack freestanding voice assistant devices as well as voice-command apps on phones to open websites, make purchases, and even turn off alarm systems and unlock doors — all without humans hearing anything amiss. Here’s an overview of how this type of attack works and what the consequences could be.
Speech-recognition AI can process audio humans can’t hear
At the heart of this security issue is the fact that the neural networks powering voice assistants have much better “hearing” than humans do. People can’t identify every single sound in the background noise at a restaurant, for example, but AI systems can. AI speech recognition tools can also process audio frequencies outside the range that people can hear, and the microphones in many smart speakers and smartphones pick up those frequencies.
These facts give bad actors at least two options for issuing “silent” commands to voice assistants. The first is to bury malicious commands in white noise, as US students at Berkeley and Georgetown did in 2016. The New York Times reported that students were able to play the hidden voice commands in online videos and over loudspeakers to get voice-controlled devices to open websites and to switch to airplane mode.
Another example of this sort of attack comes from researchers at Ruhr University Bochum in Germany. In September, they reported success with encoding commands in the background of louder sounds at the same frequency. In their short demonstration video, both humans and the popular speech-recognition toolkit Kaldi can hear a woman reading a business news story. Embedded in the background data, though, is a command only Kaldi can recognize: “Deactivate security camera and unlock front door.” Experts say in theory this approach could be used at scale, through apps or broadcasts, to steal personal data or make fraudulent purchases. Such purchases could be hard for retailers to screen out because they would come from a trusted device and use valid payment information.
The second option is to launch what researchers at Zhejiang University in China call a DolphinAttack: creating and broadcasting commands at frequencies outside the range of human hearing. This type of attack relies on ultrasonic transmissions, which means the attacker must be near the target devices to make it work. But the Zhejiang researchers have used this technique to get a locked iPhone to make phone calls in response to inaudible commands. They said DolphinAttack can also get voice-controlled devices to take photos, send texts, and visit websites. That could lead to malware, theft of personal data, fraudulent purchases, and possibly extortion or blackmail.
How tech companies can guard against inaudible command threats
Amazon, Google, and Apple are always working on improvements for their voice assistants, although they don’t typically delve into the technical specifics. A paper presented by the Zhejiang researchers recommends redesigning device microphones either to limit input from the ultrasonic range or to block inaudible commands by identifying and canceling the specific signal that carries them. The authors also suggested using machine learning to recognize the frequencies most likely to be used in inaudible command attacks and to learn the differences between inaudible and audible commands.
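To make the idea of screening out ultrasonic input more concrete, here is a minimal Python sketch. It is not the Zhejiang researchers' method and it is not machine learning; it is a simple threshold check that measures how much of a signal's power sits at or above 20 kHz. The function names, the 20 kHz cutoff, the 0.5 threshold, and the 96 kHz sample rate are all illustrative assumptions, and the check only makes sense on hardware that captures audio at a sample rate high enough to represent ultrasonic content.

```python
import cmath
import math


def ultrasonic_energy_fraction(samples, sample_rate, cutoff_hz=20_000.0):
    """Return the share of signal power at or above cutoff_hz (0.0 to 1.0)."""
    n = len(samples)
    # Precompute the DFT twiddle factors once; bin k reuses them by index.
    twiddles = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    total = 0.0
    high = 0.0
    # Skip bin 0 (DC offset) and stop at the Nyquist bin.
    for k in range(1, n // 2 + 1):
        coeff = sum(x * twiddles[(k * i) % n] for i, x in enumerate(samples))
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate / n >= cutoff_hz:
            high += power
    return high / total if total > 0 else 0.0


def looks_inaudible(samples, sample_rate, threshold=0.5):
    """Flag audio whose power is mostly ultrasonic (threshold is an assumption)."""
    return ultrasonic_energy_fraction(samples, sample_rate) >= threshold


def tone(freq_hz, sample_rate=96_000, n=480):
    """Generate a short pure sine tone for demonstration purposes."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


spoken_band = tone(1_000)    # well inside the range of human hearing
ultrasonic = tone(25_000)    # above what people can hear

print(looks_inaudible(spoken_band, 96_000))  # False
print(looks_inaudible(ultrasonic, 96_000))   # True
```

A real device would face messier input than pure tones, so a fixed threshold like this is only a starting point for the kind of signal analysis the paper describes.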
In addition to those short-term fixes, scientists and lawmakers will need to address longer-term challenges to the safety and efficacy of voice-recognition technology. In the US right now, there’s no national legislative or regulatory framework for voice data and privacy rights. California was the first state to pass a law limiting the sale and data-mining of consumer voice data, but it only applies to voice data collected through smart televisions.
As the number of use cases for voice recognition grows along with the Internet of Things, and as the number of players in the space rises, the risk of voice-data breaches will rise, too. That raises the possibility of fraud committed with recordings of consumers’ voice data. Sharing audio files is much easier and faster than cloning a credit card or copying someone’s fingerprint with silicone, which means voice data could be valuable to organized criminals. Fraud prevention professionals will need to build and maintain clean, two-way databases of consumer voice data to ensure that companies can recognize legitimate customer contacts. And merchants may need to analyze voices for links to previous fraud incidents when they screen orders.
How to protect your voice-controlled devices
Right now the dangers of voice-command hijacking seem mostly theoretical and isolated, but the recent past has shown us that fraudsters adapt quickly to new technology. It’s wise to follow safety practices that can protect your devices from voice hacking and safeguard your data in other ways, too. Use strong, unique passwords for your IoT devices. Don’t leave your phone unlocked when you’re not using it. PIN-protect your voice assistant tasks that involve your home security, personal data, finances, or health records — or simply don’t link that information to your voice-command devices.
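For readers curious what PIN protection amounts to behind the scenes, the following Python sketch shows one possible shape for the rule: low-risk requests go through, while sensitive ones are refused unless a matching PIN is supplied. Everything here is hypothetical, including the intent names, the `authorize` function, and the example PIN; no vendor's actual API is being described.

```python
import hmac

# Hypothetical intent names; real assistants define their own.
SENSITIVE_INTENTS = {
    "unlock_door",
    "disarm_alarm",
    "make_purchase",
    "read_health_record",
}


def authorize(intent, spoken_pin, stored_pin):
    """Allow low-risk intents freely; require a matching PIN for sensitive ones."""
    if intent not in SENSITIVE_INTENTS:
        return True
    if not spoken_pin:
        return False
    # Constant-time comparison avoids leaking how much of the PIN matched.
    return hmac.compare_digest(spoken_pin.encode(), stored_pin.encode())


print(authorize("play_music", None, "4921"))     # True
print(authorize("unlock_door", None, "4921"))    # False
print(authorize("unlock_door", "4921", "4921"))  # True
```

The point of the design is that a hidden command alone is not enough to trigger a sensitive action; the attacker would also need to know the PIN.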
Research into inaudible voice commands and the risks they pose is still relatively new, and the security and tech industries have seen that every new advancement gives bad actors new opportunities. As more researchers shed light on the weak spots in speech-recognition AI, the industry has the opportunity to make its products more secure. In the meantime, it’s up to each of us to protect our devices and be discerning about the types of information we share with Alexa, Siri, Cortana, and other voice assistants.
Rafael Lourenco is Executive Vice President at retail fraud prevention company ClearSale.