Amazon is using AI and machine learning to predict context from customers’ queries. In a preprint paper accepted to the ACM SIGIR Conference on Human Information Interaction and Retrieval scheduled to take place this month, Amazon researchers describe a system that predicts activities like “running” from queries like “Adidas men’s pants.” It could help to improve the quality of search results on Amazon.com, which could enhance the overall Amazon shopping experience.

As Adrian Boteanu, contributing author and Amazon Search customer experience applied scientist, explains in a blog post, most product discovery algorithms look for correlations between queries and products. By contrast, the researchers’ AI identifies the best matches depending on the context of use.

To train the system, the team assembled a list of 173 context-of-use categories divided into 112 activities (such as reading, cleaning, and running) and 61 audiences (like child, daughter, man, and professional) based on common product queries. They used standard reference texts to create aliases for the terms denoting the categories. Then, working from a corpus relating millions of products to query strings, they scoured reviews of those products for the category terms and their aliases. If either a category term or one of its aliases turned up in any review of a given product, the product was labeled with the corresponding category term.
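The labeling rule described above can be illustrated with a minimal Python sketch. It is not the researchers' code: the alias table, the function name label_product, and the whole-word matching strategy are all assumptions made for illustration, and the real system draws on 173 categories rather than three.

```python
import re

# Hypothetical category terms and aliases (assumed for illustration only).
CATEGORY_ALIASES = {
    "running": {"running", "jogging"},
    "cleaning": {"cleaning", "scrubbing"},
    "child": {"child", "kid"},
}


def label_product(reviews, category_aliases=CATEGORY_ALIASES):
    """Return every category whose term or alias appears in any review."""
    labels = set()
    for review in reviews:
        # Lowercase and split into whole words so "kid" does not match "kidney".
        tokens = set(re.findall(r"[a-z']+", review.lower()))
        for category, aliases in category_aliases.items():
            if tokens & aliases:
                labels.add(category)
    return labels


reviews = ["Great for jogging in the morning.", "My kid loves them."]
print(label_product(reviews))
```

Because a single matching review is enough to attach a label, this kind of rule needs no manual annotation, which is consistent with the minimal human supervision the researchers describe.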

The above-mentioned corpus correlated query strings with products according to an affinity score (from 1 to 15), where a low score indicates a weak correlation. To train the context-of-use predictor, the researchers produced another data set in which each entry consisted of three data items: a query; a product ID, annotated with context-of-use categories; and the query-product affinity score. This data set was divided into two smaller sets, one annotated according to activity and one according to audience, and used to train six different machine learning models.
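As a rough sketch of how such entries might be assembled, the Python below builds (query, product ID with categories, affinity score) records and splits them into an activity set and an audience set. The class name TrainingExample, the function build_examples, the tiny category lists, and the choice to skip products with no matching label are all assumptions for illustration; the article does not describe the researchers' actual data format.

```python
from dataclasses import dataclass

# Hypothetical subsets of the 112 activities and 61 audiences.
ACTIVITIES = {"running", "reading", "cleaning"}
AUDIENCES = {"child", "man", "professional"}


@dataclass(frozen=True)
class TrainingExample:
    query: str
    product_id: str
    categories: frozenset  # context-of-use labels attached to the product
    affinity: int          # query-product affinity score, 1 (weak) to 15


def build_examples(query_product_affinity, product_labels):
    """Split (query, product_id, affinity) rows into activity and audience sets."""
    activity_set, audience_set = [], []
    for query, product_id, affinity in query_product_affinity:
        labels = product_labels.get(product_id, set())
        activities = frozenset(labels & ACTIVITIES)
        audiences = frozenset(labels & AUDIENCES)
        # Assumption: a row is kept only in the set for which it has a label.
        if activities:
            activity_set.append(
                TrainingExample(query, product_id, activities, affinity))
        if audiences:
            audience_set.append(
                TrainingExample(query, product_id, audiences, affinity))
    return activity_set, audience_set


rows = [("adidas mens pants", "P1", 12), ("story book", "P2", 7)]
labels = {"P1": {"running", "man"}, "P2": {"reading"}}
activity_set, audience_set = build_examples(rows, labels)
print(len(activity_set), len(audience_set))
```

Keeping the affinity score on each record would let a model weight strongly correlated query-product pairs more heavily than weak ones, though how the researchers actually used the score is not stated here.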

Each model was trained to predict context of use on the basis of query strings, and in tests, the best-performing model managed to anticipate product annotations with 97% accuracy for activity categories and 92% for audience categories. When human reviewers were presented with rank-ordered lists of categories generated by the activity models, the reviewers agreed an average of 81% of the time with the system’s per-item predictions.

“This suggests that the contexts of use identified by our system could help product discovery algorithms deliver more-relevant results, improving the customer experience. Moreover, the minimal human supervision required to produce training data means that our method could be expanded to new categories with relatively little effort,” the blog post stated.