Understanding bots through 4 interface categories

"A rose by any other name would smell as sweet," to paraphrase a famous bard. There is a huge ongoing boom in messaging-based applications coming to market, generically called "bots." Regular readers of technology sections are no doubt also aware that, by most accounts, the performance and adoption of these bots have failed to match the hype.

But why is this so? Is the technology premature, are businesses rushing in too quickly, or are bots inherently limited?

The first step to answering that question is deciding what "bot" even means. It is a travesty how loosely that term is used. In tech and business media, "bot" simultaneously refers to old technology like SmarterChild, new technology enhanced by machine learning, (pre)programmed-to-respond interfaces, and flexible dialogue designed to mirror human conversation. Stating that a bot uses artificial intelligence does not add nuance either, since A.I. encompasses a wide range of capabilities and performance levels.

We cannot continue to generalize the technologies underpinning different types of messaging interfaces as if they're equivalent. They're not. Bots, unlike roses, need more clarification and better categorization. To promote clarity, journalists, technologists, marketers, and business leaders need to choose their words more carefully in the ongoing conversation on chatbots, messenger bots, virtual assistants, conversational UI, and natural language interfaces (just a small sampling of the various labels these phenomena get slapped with).

To that end, we're proposing a concise framework to raise the level of discussion. This framework is grounded in careful analysis of the existing "bot" landscape, as well as years in the trenches hammering away at natural language's toughest puzzles.

In this proposed framework, there are four categories of messaging interfaces, each defined by the types of permitted input. We draw a distinction between different types of "bots" and progressively more complex conversational interfaces that support natural language, large vocabularies, and flexible multipart dialogue. To determine which type of messaging interface to invest in, companies must examine the specifics of their own use case.

Without further ado, here is the breakdown of messaging interface categories, with an infographic at bottom and available here.

Click navigation

This is the simplest form of messaging bot. Dispensing altogether with custom text-field and voice input, these bots use images, radio buttons, and scripted text to traverse a logical and preset navigation flow.

Such bots are well-suited to a smaller library of content, usually curated, that can be navigated using structured menus and prompts. An example would be a bot that lets you explore the latest headlines by clicking on the category of news desired, or a bot that provides buttons for ordering off a menu. For such applications, click navigation bots can be fast, efficient, and intuitive.
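To make the idea concrete, here is a minimal sketch of a click navigation flow in Python. The menu contents, the state names, and the `click` and `render` helpers are all hypothetical and exist only for illustration; a real bot would sit on top of a messaging platform's button API.

```python
# Hypothetical preset navigation flow: every state has scripted text
# and a fixed set of buttons. There is no free-text input at all.
MENU = {
    "start": {
        "text": "Which headlines would you like?",
        "buttons": {"Tech": "tech", "Sports": "sports"},
    },
    "tech": {
        "text": "Top tech headlines (placeholder content)",
        "buttons": {"Back": "start"},
    },
    "sports": {
        "text": "Top sports headlines (placeholder content)",
        "buttons": {"Back": "start"},
    },
}


def render(state):
    """Return the scripted text and the button labels to show for a state."""
    node = MENU[state]
    return node["text"], list(node["buttons"])


def click(state, button_label):
    """Follow a button press; a label that is not on screen leaves the state unchanged."""
    return MENU[state]["buttons"].get(button_label, state)
```

Because every path is spelled out in advance, the bot can never misunderstand the user, but it also cannot do anything the menu's author did not anticipate.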

Keyword search

These are slightly more advanced bots that can search for keywords found within text fields or voice search input. Since natural language is not supported, these bots work well for small data sets with a limited number of potential search queries. In other words, the user flow must be sufficiently predictable to be hard-coded.

A weather bot is a good example of this category. Here, a user speaks or types a query like "what's the weather in Seattle?" or "is it rainy in Seattle?" The system searches for the keyword "Seattle" and then displays the appropriate weather forecast. This example falls under "keyword search" instead of the next category, "structured phrase matching," because the bot looks for keywords without regard to the structure of the sentence or the surrounding words. Within a bounded domain, such keyword-based bots can provide a natural search experience, but only for the specific use cases they were built around.
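The weather example can be sketched in a few lines of Python. The `FORECASTS` table and the `weather_bot` function are assumptions made up for this illustration, and the forecast strings are placeholders rather than real data.

```python
import re

# Hypothetical lookup table: the only "knowledge" the bot has.
FORECASTS = {
    "seattle": "placeholder forecast for Seattle",
    "houston": "placeholder forecast for Houston",
}


def weather_bot(query):
    """Return the forecast for the first known city keyword found in the query."""
    # Split the query into lowercase words; sentence structure is ignored.
    words = re.findall(r"[a-z]+", query.lower())
    for word in words:
        if word in FORECASTS:
            return FORECASTS[word]
    return "Sorry, I don't know that city."
```

Both "what's the weather in Seattle?" and "is it rainy in Seattle?" produce the same answer, because only the keyword matters. The same property is the limitation: a query naming two cities simply returns the first match, and a multi-word city name would need extra handling that this sketch leaves out.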

Structured phrase matching

Structured phrase matching introduces the notion of natural language understanding (NLU) for short phrases with predictable keyword categories. For instance, if a user were to say "I need a flight from San Francisco to Houston," the bot would understand from the sentence structure that San Francisco is the departure city and Houston is the arrival city. The bot would also understand multiple facets in the same query -- flights to a destination, on a certain date, within a given price range.

Structured phrase matching works well for larger data sets where there is a limited number of workable sentence structures within a specific knowledge domain. The bot in the above example, for instance, is only supposed to book flights.

Structured phrase matching bots can appear to understand truly natural language, but they are actually working within predetermined language structures. Since users are likely to deviate from those structures, companies using such bots often hand the conversation to a human assistant when a query doesn't conform.
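A rough sketch of this behavior, assuming a single hard-coded sentence pattern, might look like the following. The regular expression, the slot names, and the `{"handoff": "human"}` return value are all hypothetical; production systems typically support many patterns and richer slot types.

```python
import re

# One predetermined sentence structure: "flight from X to Y [on DATE]".
# The position of each phrase in the sentence decides which slot it fills.
FLIGHT_PATTERN = re.compile(
    r"flight from (?P<origin>[a-z ]+?) to (?P<destination>[a-z ]+?)"
    r"(?: on (?P<date>[a-z0-9 ]+))?[.?!]?$",
    re.IGNORECASE,
)


def parse_flight_request(utterance):
    """Extract flight slots, or signal a human handoff if the structure is unknown."""
    match = FLIGHT_PATTERN.search(utterance.strip())
    if match is None:
        # The query does not conform to the predetermined structure.
        return {"handoff": "human"}
    # Keep only the slots that were actually present in the sentence.
    return {name: value.strip() for name, value in match.groupdict().items() if value}
```

Here "I need a flight from San Francisco to Houston" fills the origin and destination slots from sentence position alone, while a rephrasing such as "Get me to Houston from San Francisco" falls outside the pattern and is routed to a person.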

Unrestricted natural language

This category presents an experience that is much more akin to a human conversation. Unrestricted natural language interfaces understand natural sentences of any structure, whether spoken or typed, and can engage in multipart, dynamically generated dialogue to disambiguate, complete, and resolve queries. Because the list of potential queries can number into the millions (or even billions), these systems require extensive training on large data sets using large-scale machine learning.

This type of interface is conversational in a human-like way, hence its categorization as a conversational interface. Calling such interfaces "bots" conflates simple point-and-click navigation with technology that is at the forefront of machine learning and is rapidly evolving. Conversational interfaces can be experienced with successful products from large companies such as Amazon, Apple, and Google that have committed to this resource-intensive approach. At the same time, white-labeled conversational interface platforms from tech startups (MindMeld included) are giving other companies access to conversational technologies.

The word "bot," unlike "rose," refers to many different things, often without users being aware of it. Hopefully, by referring more specifically to the desired functionality when we discuss "bots," we can raise the level of dialogue and combat the erroneous assumption that all messaging interfaces are roughly equivalent, or can be fit to any requirement. Only then can businesses scope their projects accurately and consumers avoid being soured by hype. While bots are not like roses, let's hope, like roses, bots never go out of fashion.

[Infographic: the four categories of messaging interfaces: click navigation, keyword search, structured phrase matching, and unrestricted natural language]