Presented by Infineon 

How much smarter can a smart speaker get? Quite a lot smarter, if two major market trends play out as expected.

The first trend driving the uptake of smart speakers is their role in strengthening the ecosystems of companies such as Amazon, Apple, and Google. Each has introduced smart speakers as a way of extending the reach of their services into our lives and gathering more data about what we want and do. Each company is evolving its smart-speaker hardware quite rapidly, to extend its functionality and make it suitable for use in more settings. These companies are also continually upgrading the supporting software and services, to make them more engaging.

The second trend that is increasing the smarts of smart speakers is the addition of more functionality. Smart speakers integrate so easily into our homes and, through their voice user interfaces, into our busy lives that they have become an excellent platform for additional functionality. Some even expect that smart speakers will evolve into the role of smart-home hubs, securing and coordinating the actions of multiple smart-home devices across home networks, while acting as a gateway between those networks and the wider internet.

Presence detection, gesture-based control, and environmental monitoring

What sort of additional functionality can we expect to see in smart speakers? Smart speakers can already sense people’s presence in a room using their microphone arrays, but it is also possible to build more sophisticated presence-detection systems, operating over longer distances, using radar technology. Presence sensing can then be used as part of a home security system, or as a way of monitoring the sick and elderly in their homes by building up profiles of their usual daily actions and sending alerts if people deviate from them.
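The profile-and-alert idea can be illustrated with a minimal sketch. It assumes, hypothetically, that a presence sensor reports a count of detection events for each hour of the day; the function names, the hourly bucketing, and the three-standard-deviation threshold are illustrative choices rather than part of any product described here.

```python
from statistics import mean, stdev

def build_profile(history):
    """Summarise usual activity per hour of day.

    history maps an hour (0-23) to a list of presence-event counts,
    one count per observed day. Hours with fewer than two days of
    data are skipped because no spread can be estimated for them.
    """
    return {
        hour: (mean(counts), stdev(counts))
        for hour, counts in history.items()
        if len(counts) >= 2
    }

def is_unusual(profile, hour, count, threshold=3.0):
    """Return True if today's count deviates strongly from the profile."""
    if hour not in profile:
        return False  # no baseline yet, so do not raise an alert
    usual, spread = profile[hour]
    spread = max(spread, 1.0)  # floor the spread to avoid over-sensitive alerts
    return abs(count - usual) / spread > threshold

# Hypothetical data: movement events seen between 08:00 and 09:00 on five days.
profile = build_profile({8: [10, 12, 11, 9, 13]})
print(is_unusual(profile, 8, 0))   # no movement at all in a normally busy hour
print(is_unusual(profile, 8, 12))  # an ordinary morning
```

A real monitoring service would need a far richer model (weekday versus weekend routines, several occupants, sensor outages), but the structure of learning a baseline and then flagging departures from it would be similar.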

Developers are also exploring gesture-based control that works alongside voice interfaces in the near field of smart speakers. Think of a smart speaker on a countertop in a noisy kitchen: it would be useful to be able to raise or lower your hand over the speaker to raise or lower its volume, wave at it dismissively to skip a song in a playlist, or hold up a hand to cancel an alarm. It might even be possible to combine voice and gesture controls. If stroking a dog’s back and talking to it quietly can calm it down, surely interface designers can exploit similar combined interactions to develop new control metaphors for smart speakers?

The ubiquity of smart speakers in homes also makes them good platforms for environmental monitoring. The pandemic has taught us how important it is to monitor the air quality in our homes, offices, and schools, and particularly to keep an eye on CO2 levels, which can quickly affect concentration and productivity. Indeed, the California Energy Commission recommended in July 2020 that school rooms should be fitted with CO2 sensors to monitor and enable the control of CO2 concentrations in class. Researchers at the University of Colorado have even explored using CO2 levels as a proxy with which to measure the concentration of aerosols containing SARS-CoV-2 viral particles exhaled by infected people.

Some smart devices are already offering health monitoring facilities. Google’s second-generation Nest Hub adds a screen to a smart speaker and so is described as a ‘smart display’. It uses radar to track a user’s breathing and movement during sleep, combining the resultant radar data with information from its temperature and ambient light sensors and its microphone array to analyze the user’s sleep patterns and sleep quality.

Many technologies are available to help make smart speakers even smarter.

Audio: Output is important — but microphones will determine functionality

Of course, audio is at the heart of all smart speakers.

Consumers expect high-quality audio from smart speakers, and many designers are now using multichannel approaches to create room-filling sound from small enclosures. Traditional class D switching amplifiers need relatively large and costly filters, and their inefficiency leads to excessive heat generation in small enclosures. Infineon has adapted the traditional class D audio amplifier IC topology to use a multilevel amplification approach that cuts power consumption, reduces electromagnetic interference, and limits out-of-band noise, while retaining the high-quality audio output that consumers now expect.

Microphone performance is also vital to smart speakers, since voice user interfaces need high-quality audio to achieve robust recognition in noisy environments.

One way to improve the accuracy of voice recognition is to use multiple microphones, favouring the signal from the nearest microphone and using inputs from the other microphones to characterize and remove background noise. This works best when the microphones are closely matched, so that the noise-cancelling algorithms don’t have to compensate for the bias that mismatched microphones would otherwise introduce.

Micro-electromechanical systems (MEMS) microphones have a volume of just a few cubic millimetres, are physically robust, and can be mounted to a PCB in the same way as an IC. The microphones can handle a very wide range of input levels without distortion. They are trimmed during manufacture to ensure that they closely match their specification, and their performance won’t drift over time. These attributes make it much easier to create matched microphone arrays that provide a consistent input to voice-recognition algorithms.

Micro and macro movements: Someone breathing or walking by?

Infineon’s IM69D130 exemplifies some of the other advantages of using MEMS techniques to build microphones. The part has a highly linear response within a dynamic range of 105 dB, and its distortion is kept below 1% at sound pressure levels of up to 128 dB SPL. The microphone also has a high signal-to-noise ratio of 69 dB(A), which gives it the headroom to distinguish quiet or distant sounds from background noise. All these characteristics help improve the utility of smart speakers, for example by making it possible to control them from another room or when they are playing loud music.
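To see how these decibel figures relate to one another, the short calculation below may help. It assumes the common datasheet convention that microphone signal-to-noise ratio is quoted against a 94 dB SPL (1 Pa) reference tone; that convention and the function names are assumptions, not statements from this article, and the upper limit it derives is an inference rather than a quoted specification.

```python
REFERENCE_SPL_DB = 94.0  # assumed SNR reference level: 1 Pa, a common convention

def noise_floor_db_spl(snr_db, reference_spl_db=REFERENCE_SPL_DB):
    """Equivalent input noise: the reference level minus the SNR."""
    return reference_spl_db - snr_db

def implied_max_spl(noise_floor_db, dynamic_range_db):
    """Loudest level covered if the dynamic range starts at the noise floor."""
    return noise_floor_db + dynamic_range_db

def amplitude_ratio(db):
    """Convert a level difference in dB to a sound-pressure (amplitude) ratio."""
    return 10 ** (db / 20)

floor = noise_floor_db_spl(69.0)         # 25.0 dB SPL (A-weighted)
ceiling = implied_max_spl(floor, 105.0)  # 130.0 dB SPL
print(floor, ceiling)
print(round(amplitude_ratio(105.0)))     # roughly 178,000 to 1 in sound pressure
```

Under these assumptions, a 105 dB dynamic range means the loudest sound the microphone can handle has a sound pressure roughly 178,000 times that of its own noise floor.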

Arrays of Infineon MEMS microphones can be used to enable beam-forming strategies that locate a user within a room by comparing the signals from each microphone. For more sophisticated presence detection, radar ICs can detect someone’s presence in a room even if they are not speaking.
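Comparing the signals from each microphone usually comes down to measuring how much later a sound arrives at one microphone than at another. The sketch below illustrates this for a single pair of microphones using a brute-force cross-correlation. It assumes a distant source (far-field), a speed of sound of 343 m/s, and hypothetical sample-rate and spacing values; it is not the algorithm used in any particular product.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at about 20 degrees C

def best_lag(a, b, max_lag):
    """Find the lag in samples at which signal b best lines up with signal a.

    A positive result means b is a delayed copy of a, i.e. the sound
    reached microphone a first.
    """
    def score(lag):
        return sum(
            a[i] * b[i + lag]
            for i in range(len(a))
            if 0 <= i + lag < len(b)
        )
    return max(range(-max_lag, max_lag + 1), key=score)

def arrival_angle_deg(lag, sample_rate_hz, mic_spacing_m):
    """Convert a lag to an angle from broadside (0 = straight ahead)."""
    delay_s = lag / sample_rate_hz
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement error
    return math.degrees(math.asin(ratio))

# Hypothetical test signal: a short click that reaches mic b three samples late.
mic_a = [0.0] * 20
mic_a[5], mic_a[6] = 1.0, 0.5
mic_b = [0.0] * 3 + mic_a[:-3]

lag = best_lag(mic_a, mic_b, max_lag=8)
print(lag, arrival_angle_deg(lag, sample_rate_hz=48_000, mic_spacing_m=0.05))
```

With more than two microphones, the same pairwise delays can be combined to estimate direction in two or three dimensions, which is one reason closely matched microphones matter: mismatches between channels distort the very differences the algorithm is trying to measure.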

Infineon’s XENSIV™ 60 GHz radar technology is implemented as a low-power chip of just 5 x 5.6 mm. It can sense micro and macro movements at a range of up to 5 metres, enabling one device to sense movement as subtle as a person breathing, and as obvious as someone walking by. The parts are supplied with presence detection software, and provide a good alternative to other ways of detecting human presence, such as lasers, ultrasonic sensors, or passive infrared (PIR) sensors.

Smart hubs and privacy

The other key function of smart speakers, especially as they evolve into smart hubs for home automation, is to secure their users’ privacy, protect the integrity of any data flowing over the home network, and act as gatekeeper for communications with the wider internet.

Infineon’s OPTIGA™ Trust M security controller IC offers physical security, by handling security functions in a separate device to the main processor. It can store arbitrary user data and cryptographic keys in a secure element onboard. It can be authenticated as a unique and genuine part, making it more difficult to clone devices. And it is tamperproof, so that the secrets it holds cannot be extracted by physical attack.

With these features, the device can enable encrypted communications and can be programmed with the credentials needed to automatically make secure connections to cloud servers at boot time. It also has the facilities necessary to check that any firmware updates being downloaded come from a trusted source and have not been tampered with in transit.
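The general idea behind checking a firmware update can be sketched in a few lines. The example below uses an HMAC-SHA256 tag with a shared secret purely as a stand-in, because it needs only the Python standard library; a production design would more likely verify an asymmetric signature inside the security controller, and none of the function names here correspond to the OPTIGA™ Trust M interface.

```python
import hashlib
import hmac

def sign_firmware(image: bytes, key: bytes) -> bytes:
    """Vendor side: compute an authentication tag over the firmware image."""
    return hmac.new(key, image, hashlib.sha256).digest()

def firmware_is_trusted(image: bytes, tag: bytes, key: bytes) -> bool:
    """Device side: recompute the tag and compare it in constant time."""
    expected = hmac.new(key, image, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

# Hypothetical values for illustration only.
secret = b"device-provisioned-secret"
update = b"\x7fFIRMWARE v2.0 payload"
tag = sign_firmware(update, secret)

print(firmware_is_trusted(update, tag, secret))            # untouched image
print(firmware_is_trusted(update + b"\x00", tag, secret))  # altered in transit
```

The check fails if even one byte of the image changes or if the tag was produced with a different key, which covers both of the concerns raised above: the origin of the update and its integrity in transit.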

How much smarter can smart speakers get? The ecosystem players will continue to evolve their hardware, software, and services to increase user engagement. Others will be able to take advantage of the kind of additional functionality described in this article to create innovative features.

In doing so, smart speakers may evolve into the role of smart home hubs, coordinating and protecting the activities of multiple devices on smart home networks. Infineon has the hardware, firmware, and software, as well as the experience and advice, to enable developers to make their smart speakers smarter, while ensuring they remain secure.

Dale Wedel is Application Marketing Manager, Smart Speakers and Emerging IoT at Infineon.

Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. Content produced by our editorial team is never influenced by advertisers or sponsors in any way.