Adam Cheyer, a Brandeis University and UCLA alum with degrees in computer science and AI, knows a thing or two about digital assistants. He previously led the Cognitive Assistant that Learns and Organizes (CALO) project at SRI International’s Artificial Intelligence Center, which sought to integrate cutting-edge machine learning techniques into a platform-agnostic cognitive assistant. Cheyer was on the founding team of Siri, the startup behind the eponymous AI assistant technology that Apple acquired in 2010 for $200 million, and he cofounded Viv Labs, which emerged from stealth in 2016 after spending four years developing an assistant platform designed to handle complex queries.
Samsung acquired Viv in October 2016 for roughly $215 million and soon after tasked Cheyer and colleagues with building their startup’s technology into the company’s Bixby assistant, which rolled out in March 2017 alongside the Samsung Galaxy S8 and S8+. The fruit of their labor — Bixby 2.0 — made its debut in October 2017 at Samsung’s Bixby Developer Conference, and it formally launched on the Galaxy Note9 in August 2018.
Today, Bixby is available in over 200 countries and on over 500 million devices, including Samsung’s Family Hub 2.0 refrigerators, its latest-gen Smart TV lineup, and smartphone and tablet series that include the Galaxy S, Galaxy Note, and mid-range Galaxy C, J, and A. (Sometime this year, the first-ever smart speaker with Bixby built in — the Galaxy Home — will join the club.) On the features front, Bixby has learned to recognize thousands of commands and speak German, French, Italian, U.K. English, and Spanish. And thanks to a newly released developer toolkit — Bixby Developer Studio — it supports more third-party apps and services.
But the Bixby team faces formidable challenges, perhaps chief among them boosting adoption. Market researcher Ovum estimates that 6% of Americans used Bixby as of November 2018, compared with 24% who used Alexa and 20% who favored Google Assistant. For insight into Bixby’s development and a glimpse of what the future might hold, VentureBeat spoke with Cheyer ahead of a Bixby Developer Session in Brooklyn today.
Here’s a lightly edited transcript of our discussion.
VentureBeat: I’d love to learn more about Bixby Marketplace, Bixby’s upcoming app store. What can customers expect? Will they have to search for Bixby apps and add them manually or will they be able to launch apps with trigger words and phrases?
Adam Cheyer: Bixby Marketplace will eventually be available as part of the Galaxy Store, the app store [on Galaxy phones, Samsung Gear wearables, and feature phones]. Samsung is committed to having a single place where you can buy apps, watch faces, and other items. You’ll be able to find Capsules there as well, and Capsules contributed by developers in Samsung’s Premier Development Program will have key placement. But I think the coolest way to interact with the Marketplace will be through Bixby itself.
There are a number of approaches you can take already. One is Bixby Home, [the left-most dashboard] on Galaxy phones’ home screens. On [smartphones], you just tap on the Bixby button and swipe to see featured Capsules and other Capsules in all sorts of categories.
You can also discover Capsules automatically through natural language. For instance, if you say something like “Get me a ride to San Francisco,” Bixby will respond, “Well, you don’t have any rideshare providers enabled right now, but here are several providers in the Marketplace.” You’ll then be prompted to try [the different options] and decide whether you like one brand, another brand, or both. If you enable more than one, Bixby will ask which you’d like to use by default.
Also, as you suggested, you can invoke Capsules with a name or phrase. For instance, you can say “Uber, get me a ride to San Francisco.”
VentureBeat: Right. So eventually, will developers be able to charge for voice experiences — either for Capsules themselves or Capsule functionality? I’m envisioning something akin to Amazon’s In-Skill Purchases, which supports one-time purchases and subscriptions.
Cheyer: Absolutely. The first version of the Bixby Marketplace will not feature what I call “premium Capsules,” which means paid apps or subscription apps. But we’re working hard on that, and we’ll have some announcements around that soon. We know that the content providers of the world need to make a living, and we will absolutely support that.
Transactional Capsules can charge money — we have providers like Ticketmaster and 1-800-Flowers who are accepting purchases today, and we’ve worked really hard to lower purchase friction for our commerce partners. If you’ve saved your card on file anywhere within the Samsung ecosystem, Bixby will know about it — you just say “Send some flowers to my mom,” and the 1-800-Flowers Capsule will say “Great — do you want to pay with your usual card?”
Additionally, we support OAuth for partners like Uber, which have cards on file within user accounts. You’re able to attach Bixby and give it account access privileges so that you can make purchases in these partners’ payment flows.
VentureBeat: You added new languages to Bixby recently — they joined English, Korean, and Mandarin Chinese. What are a few of the localization barriers the team’s facing as they bring Bixby to new territories?
Cheyer: We’re working hard to launch at least five new languages a year, and we may up that in the future.
We believe that offering the right tools and building an ecosystem that scales will enable the world’s developers to create fantastic content for end-users. This is especially important when it comes to globalization because it means that we don’t have to localize every single service. Instead, we provide a platform that has the same capabilities in each language.
VentureBeat: So on the subject of developer tools, has the Bixby team investigated neural voices like those adopted by Amazon and Google? I’m referring to voices generated by deep neural networks that sound much more human-like than the previous generation of synthetic voices.
Cheyer: I’m not going to announce anything that’s not yet in production, but I will say that Samsung has significant capabilities not only on the text-to-speech side of things but on the speech recognition side, as well. There are significant advances being made in AI, and neural network voices are certainly one of them. There’s also a lot of work ongoing in automatic speech recognition (ASR) — we’re transitioning from hidden Markov model approaches to pure end-to-end neural networks — and we’re seeing ASR models move from the cloud to edge devices like phones.
We’re definitely aware of all of this, and you can rest assured that we’re working hard on these areas.
VentureBeat: You briefly mentioned privacy. As you’re probably aware, there’s some concern about how recorded commands from voice assistants are being stored and used. Bixby already offers a way to delete recordings, but would the team consider introducing new commands or in-app settings that’d make it even easier to delete this data?
Cheyer: Sure — we’re open to all of those things. Privacy is an important and multifaceted issue. For me personally, it’s not just the fact that my voice was used to tune a particular speech recognition model somewhere. I’m much more concerned about what an assistant’s doing on a semantic level — what it knows about me and why it’s showing me certain information.
But different users are going to worry about different things. You have to offer a variety of ways to let users control the data that companies have, and how they use that data.
One thing that’s important to note is that we’ve made control over what Bixby learns a fundamental platform capability. I’ll give you an example: With Bixby, developers can opt to use machine learning to process requests from users. If I ask Bixby about the weather in Boston, it might not be obvious, but which “Boston” I’m referring to is actually a preference. Most people are going to choose Boston, Massachusetts, but people who live in Texas might choose Boston, Texas. It’s kind of annoying to have to repeatedly specify which Boston you want, which is why Bixby is built to learn preferences about things like restaurants, products, and ridesharing globally and locally.
We surface these learnings to users in the Understandings page. They’ll see that Bixby guessed that they meant Boston, Massachusetts, the last time they asked about the weather. If that guess was wrong, they’re able to update it or make it clear that they don’t want Bixby to know this information about them. They always have total visibility of what is known about them and how it’s being used at a very granular level.
VentureBeat: Would you say that this degree of customization and personalization is one of Bixby’s strong suits?
Cheyer: Yes. At a high level, most of what competitors provide developers is speech recognition and language understanding, but that’s pretty much it. From there, the developer has to hardwire each use case and hand-code everything that happens next, such as querying a mapping server to figure out which “Boston” a user is referring to.
Bixby is a radically different platform, where every request a user makes goes through a machine learning process. No other platform on the market from any competitor has anything like that. We have an AI tool, and for every use case that a developer comes up with, it will write the code and take care of assembly, calling out to different APIs, translating the information from one API to another, interacting with the user, and learning from interactions. All that comes for free with the platform. This type of learning — we call it dynamic program generation — has a lot of benefits around privacy and security. It makes for a much richer experience and saves a lot of development time.
Another thing I think is super exciting about the Capsule approach is that developers can use the same Capsule across multiple devices. They don’t need a Capsule for a TV, a different Capsule for a refrigerator, and a third Capsule for a phone. When they build a Capsule, that Capsule is the same Capsule that supports all languages and all devices. We have this new Bixby Views system that’s publicly available so that developers can build multimodal graphical experiences with the richest palette of components available. And there are automatic conversions so that unmodified Capsules will run brilliantly on things as small as watches, as big as the Wall TV, and everything in between.
The primary usage of most other assistants goes to the built-in services that come out of the box with those assistants. Despite tens of thousands of skills or actions that third parties create, very little usage goes to third-party developers for a variety of reasons.
One of the things that we’re committed to with Bixby is giving as much power and focus to third-party developers as possible. That’s really our long-term plan for winning — having a vibrant ecosystem where most of the business and most of the user traffic goes to third parties.
VentureBeat: Samsung recently announced Bixby Routines, which learn habits to preemptively launch apps and settings. Do you have any updates to share on Routines or any other predictive features that might soon come to Bixby?
Cheyer: Samsung acquired not only Viv Labs’ technology but also SmartThings [in August 2014], and one of the things SmartThings does is manage complex routines [through SmartThings Cloud]. So there’s that already, and I think there will be continued enhancements in existing integrations between Bixby and SmartThings — that’s something to look for.
A single use case in the Bixby framework — like booking a flight — gets broken down into around 60 different automatically generated steps executed interactively. If you think about Routines, it’s just chaining together different steps — the base technology is already being used in Capsule development. Requests are still mostly Capsule-specific, and so there’s limited cross-Capsule capability. As soon as we open that up on the natural language side, the execution and dynamic program generation capabilities are all there already.
VentureBeat: It’s fair to say that a key appeal of assistants is how seamlessly they command home devices. I’d love to hear about how the Bixby team is thinking about third-party device integration, and perhaps areas where they’re investing time and resources into making these interactions more powerful.
Cheyer: For me, one of the big announcements this year for Bixby will be device footprint. Last year, Bixby 2.0 was out on a single phone, but Samsung is aggressively working to backport the new Bixby to every smartphone with a Bixby button, and even to some without one. As a result, the number of devices that have Bixby will go up significantly just in the handset dimension. Samsung’s also beginning to ship Bixby today on Family Hub refrigerators and Smart TVs, and in the future on Galaxy Home speakers, and there are many, many more devices that I can’t talk about that will be getting Bixby this year.
I think when you look back at the end of the year, you’ll see Bixby nearly everywhere. And, as you know, Samsung has a device footprint of about a billion devices, and the company is on record as saying that by 2020, all of the devices they sell will be connected. They’re investing heavily to make Bixby a ubiquitous control interface.
VentureBeat: Speaking of ubiquity, can we expect to see Bixby on third-party devices in the near future? Will developers and manufacturers eventually be able to add Bixby to their devices?
Cheyer: That was something extremely important to me when Samsung acquired Viv. We always intended to make the assistant as important an ecosystem as the web and mobile — once you have a thriving ecosystem and businesses are supporting it, the assistant becomes more than just a device feature. You want to get it on every single device in the world to drive more requests and allow users to benefit from its scale.
Samsung committed to me and has publicly committed to shipping Bixby on non-Samsung devices. This year, our work will focus on getting it out to millions of Samsung devices and opening a market for third-party developers. And next year or sometime soon, our intention absolutely is to open up Bixby to non-Samsung devices.