A day after Microsoft used its Build 2019 developer conference to talk about AI and accessibility, Google is doing the same at its I/O 2019 developer conference. The Mountain View company unveiled three separate efforts: Project Euphonia (to help people with speech impairments), Live Relay (to help people who are deaf or hard of hearing), and Project Diva (to give people some independence and autonomy via Google Assistant).
Google cited a few numbers from the World Health Organization to back its efforts. Over 1 billion people, or 15% of the population, live with some sort of disability. That number is expected to rise as people get older and live longer.
Google has an accessibility sandbox at I/O 2019 where attendees will be able to try out these research products. Whether you’re at I/O or not, however, you’ll still want to read on and watch the videos. But first, grab a tissue.
Project Euphonia, which is in the early research stages, aims to help people with speech impairments communicate more easily. Speech impairments can be caused by developmental disorders such as cerebral palsy and autism, or neurologic conditions such as stroke, ALS (amyotrophic lateral sclerosis), MS (multiple sclerosis), TBI (traumatic brain injuries), and Parkinson’s. With Project Euphonia, Google is hoping AI can improve computers’ ability to understand impaired speech. And in turn, computers can help ensure everyone is understood.
The Project Euphonia team is part of Google’s AI for Social Good program. The team partnered with the nonprofit organizations ALS Therapy Development Institute (ALS TDI) and ALS Residence Initiative (ALSRI) to record the voices of people who have ALS. By learning about the communication needs of people with ALS, the team was able to work on optimizing AI-based algorithms to more reliably recognize and transcribe the words they say.
On an ongoing basis, Google is collecting slurred (dysarthric) speech from individuals who have ALS, and turning their recorded voice samples into a spectrogram. The team is then using correctly transcribed spectrograms to train its AI system to better recognize this type of speech.
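For readers curious what "turning a voice sample into a spectrogram" involves, here is a minimal sketch. It is not Google's pipeline: the function name, frame size, and hop length are illustrative assumptions, and a synthetic tone stands in for a real recording. The idea is to slice the audio into short overlapping frames and measure how much energy each frame holds at each frequency.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency-bin magnitudes per overlapping frame."""
    # A Hann window tapers each frame's edges to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Naive discrete Fourier transform for bin k (slow, but dependency-free).
            total = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


# Synthetic stand-in for a voice recording: a 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]

spec = spectrogram(tone)
# The strongest bin in the first frame should sit at the tone's frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / 64
```

Each row of `spec` is one time slice and each column one frequency band. A grid like this, paired with a correct transcript, is the kind of input a speech recognition model can be trained on.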
The models are currently limited: they work only for people who speak English and have impairments typically associated with ALS. Google does, however, believe the research can later be applied to larger groups of people and to other speech impairments.
The AI tools that provide these improvements to speech recognition are only possible with speech samples. The more speech samples to train the models on, the greater the potential to understand more people. Google is also training personalized AI algorithms to detect sounds or gestures, and take actions such as generating spoken commands to Google Home or sending text messages.
The video above features Google speech researcher Dimitri Kanevsky, who learned English after he became deaf as a young child in Russia, and Steve Saling, who was diagnosed with ALS 13 years ago. Kanevsky is using Live Transcribe with a customized model trained uniquely to recognize his voice. Saling is using non-speech sounds to trigger smart home devices and facial gestures to cheer during a sports game.
The reason Google is talking about this project at I/O is simple: The company needs more samples. If you have slurred or hard-to-understand speech, or know someone who does, Google asks that you fill out this form and record a set of phrases.
People who are deaf or hard of hearing often communicate via sign language or chat. But what about when they can’t see the person they are talking to and texting isn’t available? Voice calls weren’t an option, until Google software engineer Sapir Caduri decided they were.
Live Relay uses on-device speech recognition and text-to-speech to let your phone listen and speak on your behalf. The research project makes it possible for a person who is speaking to call someone who is deaf or hard of hearing. The tool converts speech into text in real time and sends back written messages as spoken voice. The person who is speaking can simply talk on the phone, and the person who is deaf or hard of hearing can text on their phone.
Live Relay also leverages Google’s Smart Compose and Smart Reply features. Predictive writing suggestions and instant responses help the person typing keep up with the speed of a voice call.
Importantly, Live Relay runs entirely on your device, meaning your private calls don’t get sent to Google. The tool doesn’t require a data connection (just cell service) and only relies on audio. That means Live Relay works with any incoming voice call from any phone, including landlines.
Google considers Live Relay an alternative to Real-Time Text (RTT) and Relay Services. In fact, the team argues Live Relay could come in handy for all users. Ever get an important phone call but can’t step out and talk? Live Relay would let you take that call by typing instead of talking.
Google even plans to integrate real-time translation into Live Relay, further breaking down communication barriers. Imagine being able to call anyone in the world and communicate regardless of what language they speak. (The speaking person talks in their preferred language and the text appears in the receiver’s language, and vice versa.)
Live Relay was born last year, when Caduri read a young woman’s social media post about how her deaf boyfriend struggled to fix their home internet connection. The ISP’s tech support knew he was deaf, but had no way to communicate with him via text, email, or chat. Caduri got in touch and asked the woman why she didn’t just make the call on his behalf. She explained it was important to her boyfriend to feel independent and be empowered. Caduri realized Google had all the technology to help people make and receive phone calls without having to speak or listen.
Google employee Lorenzo Caggioni has a nonverbal 21-year-old brother, Giovanni, who was born with Down syndrome, West syndrome, and congenital cataracts. As voice and touchscreen technologies started to emerge, Caggioni decided to help his brother access his music and movies without any help.
Project Diva, which stands for DIVersely Assisted, helps people give the Google Assistant commands without using their voice. A person who is nonverbal or has limited mobility can use an external switch device to trigger Google Assistant commands.
The team examined various ways to trigger commands, including pressing a big button with the chin or foot, or even biting down on one. After months of brainstorming and presentations at different accessibility and tech events, the team built a prototype and won an Alphabet accessibility innovation competition. The solution was a box that you plug an assistive button into using a 3.5mm jack. The signal coming from the button is then converted to a command sent to the Google Assistant.
Caggioni’s Milan-based Google team eventually partnered with the Google Assistant Connect team to turn the prototype into Project Diva. Giovanni can now listen to music on the same devices and services his family and friends use by just touching a button with his hand.
The Project Diva team is now investigating moving from single-purpose buttons to RFID tags. By attaching tags to objects and associating a command with each tag, Caggioni envisions, you could have a cartoon puppet trigger a cartoon on TV, or a physical CD trigger music on your speaker.
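The tag idea boils down to a lookup table: each tag ID maps to one command. The sketch below illustrates that idea only and is not Project Diva's actual code. The tag IDs, the command phrases, and the `send_command` callback are all hypothetical, and a plain list stands in for whatever would really deliver the command to the Assistant.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical tag IDs and command phrases, for illustration only.
TAG_COMMANDS: Dict[str, str] = {
    "tag-puppet": "play cartoons on the TV",
    "tag-cd": "play music on the living room speaker",
}


def handle_tag(tag_id: str, send_command: Callable[[str], None]) -> Optional[str]:
    """Look up the command for a scanned tag and hand it to the sender."""
    command = TAG_COMMANDS.get(tag_id)
    if command is None:
        return None  # Unknown tag: do nothing rather than guess.
    send_command(command)
    return command


# Stand-in for the real delivery mechanism: collect the commands in a list.
sent: List[str] = []
handle_tag("tag-cd", sent.append)
handle_tag("tag-unknown", sent.append)  # Ignored, since no command is mapped.
```

Keeping the mapping in data rather than in code means a family member could add a new object by adding one table entry, without touching the logic.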
Since this is I/O, Google is giving developers the technical details so they can build their own Project Diva device.