As usage of chat programs like Slack and HipChat explodes, companies are looking to integrate more of what they do with these programs. App builders, especially bot makers, are rushing in to take advantage of this rise in conversational interfaces. While using natural language processing (NLP) — that is, letting users talk to your bot the way they would a human — does seem like the secret sauce to delighting customers, it’s not always the easiest path.
At Talla, we’ve invested in and learned a lot about both NLP and the associated user experience. NLP is still very much an open problem in academic research, and while we have an extremely strong data science team, it is still difficult to build. NLP technologies remain imperfect and immature, and there aren’t a lot of engineering best practices around building your tech stack or creating a good user experience. The tools and techniques that exist for other areas of engineering don’t exist at the same scale for NLP.
We launched our Task Assistant product, a natural language to-do list manager for Slack, a few weeks back in order to test out some of our theories and collect feedback to use in future intelligent assistants. After having over 700 companies start using the product, here’s what we learned.
Lesson 1: Human language is extremely varied
Even with something as straightforward and limited as a task list, we were surprised at all the different ways people engaged with Talla. Add in things like appreciations, metaphors, misspellings, and slang and you have a wide variety of things that Talla needs to be trained in.
Alan Packer, director of engineering at Facebook’s Language Technology team, has a great talk on how they built their machine translation technology. The language of the workplace isn’t quite as varied, and we’ve been able to constrain it somewhat by the type of assistant we offer. That still doesn’t make NLP easy, but it shrinks the problem: we don’t have to spend as much effort constantly adjusting to the new, cool way to talk.
Lesson 2: You can’t just pass off the unclear use cases to a human
A lot of bot companies have humans behind the scenes. When a bot isn’t able to understand something, it passes the request off to the humans. The idea is that people will train the model until it has enough data to answer things on its own. But that’s not always a sustainable solution, especially as users get more comfortable with bots and expect them to understand more. A handful of customers with unique questions is manageable, but the approach doesn’t scale. You might be surprised to learn that 15 percent of Google searches are unique, which equates to hundreds of millions of unique queries a day. It would be difficult to scale humans up to answer all of them, so you have to be careful about relying too much on human intervention.
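One way to keep human intervention bounded is to hand off only low-confidence messages, and only up to a fixed budget. The sketch below is a rough illustration under stated assumptions, not a description of how Talla or any other bot works: the `HandoffRouter` class, the 0.7 threshold, and the daily cap are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class HandoffRouter:
    """Route low-confidence messages to a human, up to a daily cap."""

    threshold: float = 0.7  # assumed confidence cutoff; tune per model
    daily_cap: int = 50     # assumed limit on human-handled messages
    training_queue: List[Tuple[str, str]] = field(default_factory=list)
    handed_off: int = 0

    def route(self, message: str, intent: str, confidence: float) -> str:
        """Return who should answer: "bot", "human", or "clarify"."""
        if confidence >= self.threshold:
            return "bot"
        # Keep every unclear message as a candidate training example.
        self.training_queue.append((message, intent))
        if self.handed_off < self.daily_cap:
            self.handed_off += 1
            return "human"
        # Over the cap: the bot asks the user to rephrase instead.
        return "clarify"
```

Once the cap is reached, the bot falls back to asking the user to rephrase, so the human workload stays fixed even as message volume grows.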
Lesson 3: Map intent with contextual awareness
Most NLP will be done on small data sets, which limits the context bots can draw on when making assumptions about user intent. Imagine asking a bot the following: “Bot, please send me a report on current traffic.” For a marketer, that could be website traffic. For someone in sales or an exec on the way to a meeting, that could be the traffic on the road. They are completely different intents driven by context. It might seem like a small example that’s easy to solve, but making corrections in a conversational interface isn’t easy, particularly if the conversation has a couple of steps to it. Now imagine this happening all of the time.
Constraining bot types is one solution to this. With a narrower scope, the NLP models can make certain assumptions about what a user is asking for, which lets them be deep and helpful rather than broad and shallow. Constrained use cases have an additional benefit: they let users form a mental model of the types of things the assistant does. Otherwise, we’ve found, users can’t remember what it does, and they end up using just one feature.
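One way to picture mapping intent with context is a lookup keyed on both the ambiguous word and what is known about the user. This is a minimal sketch, not Talla’s implementation: the role names, intent labels, and the `resolve_intent` helper are all hypothetical, and a real system would use a trained model rather than keyword matching.

```python
from typing import Optional

# Hypothetical mapping: (keyword, user role) -> intent label.
CONTEXT_INTENTS = {
    ("traffic", "marketing"): "website_traffic_report",
    ("traffic", "sales"): "road_traffic_report",
    ("traffic", "executive"): "road_traffic_report",
}


def resolve_intent(message: str, role: Optional[str]) -> str:
    """Pick an intent for a message, using the user's role as context."""
    # Lowercase each word and strip surrounding punctuation.
    words = {w.strip(".,!?\"'").lower() for w in message.split()}
    for (keyword, known_role), intent in CONTEXT_INTENTS.items():
        if keyword in words and known_role == role:
            return intent
    # Without enough context, ask the user rather than guess.
    return "ask_clarifying_question"


request = "Bot, please send me a report on current traffic."
print(resolve_intent(request, "marketing"))
print(resolve_intent(request, "sales"))
print(resolve_intent(request, None))
```

The same sentence resolves to a different intent depending on the role, and when the role is unknown the sketch falls back to a clarifying question instead of guessing.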
Lesson 4: Sometimes, it’s the human’s ‘NLP’ that’s the problem
Since Talla has been out in the wild, we’ve seen something strange: People don’t read what she says to them — they gloss right over it. Here’s a common example from our task list functionality. If you want to change the due date on something that’s already on your list, you can ask Talla to reschedule. But people don’t always specify which task they want rescheduled. Those interactions look something like this:
Human: Talla, reschedule a task.
Talla: Okay, which task would you like me to reschedule?
Part of the fix for this comes down to the way you format messages within Slack, by using markdown, buttons, or emojis to emphasize and guide the user to the next actions they need to take.
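As a rough illustration of guiding the user with formatting and buttons, the sketch below builds a message payload with bold text and one button per task. It assumes Slack’s Block Kit message layout (section and actions blocks with button elements). The `build_reschedule_prompt` helper, the task IDs, and the task titles are hypothetical, and nothing is actually sent to Slack.

```python
import json


def build_reschedule_prompt(tasks):
    """Build a Slack message asking which task to reschedule.

    `tasks` is a list of (task_id, title) pairs (hypothetical data).
    """
    buttons = [
        {
            "type": "button",
            "text": {"type": "plain_text", "text": title},
            "value": str(task_id),
            "action_id": f"reschedule_{task_id}",
        }
        for task_id, title in tasks
    ]
    return {
        # Plain-text fallback for notifications and older clients.
        "text": "Which task would you like me to reschedule?",
        "blocks": [
            {
                "type": "section",
                "text": {
                    "type": "mrkdwn",
                    "text": "*Which task* would you like me to reschedule?",
                },
            },
            # One button per task, so the next step is a click, not a sentence.
            {"type": "actions", "elements": buttons},
        ],
    }


payload = build_reschedule_prompt([(1, "Send invoice"), (2, "Book flights")])
print(json.dumps(payload, indent=2))
```

Putting the choices in buttons means the user does not have to read and answer a follow-up question in free text, which is the step people tend to gloss over.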
For all of these challenges, initial user onboarding is crucial. Teaching people to train their bots allows for correction of misunderstandings due to variability in human communication. Managing expectations for bot capabilities keeps users from asking their assistant to do things outside its scope, which cuts back on the need for a human back-end. And getting to know more details about each bot user, like their job title or direct reports, helps build greater contextual awareness. It’s not easy to nail all of these things. But if you can, you’ll provide a powerfully useful tool.
Overall, NLP has the ability to vastly improve user experience, but presents unique technical challenges.