The magic of machine learning is its ability to uncover patterns to return astonishing results. Things get dark when suddenly you realize that those sophisticated machine learning-powered recommendation engines are laying bare the secrets you’ve been hiding and revealing your truest, most naked self.

“I used to say I like jazz from the 1930s, 40s — Lester Young, Chet Baker, cool jazz,” says Oscar Celma, head of research at Pandora. “But then you look at my songs, and you’re like, yeah, you say you like that, but you’ve been listening to a lot of pop, female singers, which is great too.”


Head of Research Oscar Celma is a featured speaker at MB 2017, July 11-12 in San Francisco. He’ll be talking about human-in-the-loop, an important but not frequently discussed aspect of AI. He’ll be among dozens of others from iconic brands sharing how companies are using AI to stay ahead.


Music can be incredibly personal. What we listen to is often how we define ourselves, how we express who we are, and, most importantly if we’re trying to impress, how we signal the type of person we hope others see. But Pandora looks at those Taylor Swift songs you’ve got on repeat and can figure out who you really are.

“The way people see themselves in regard to the type of music they listen to is not really aligned with the signals they’re giving,” Celma explains. “We have a better understanding about the music they like than they think about themselves.”

“When you have over 80 billion thumbs, you know a lot about your user base,” he adds.

Thumbprint Radio is an especially good example of how well Pandora understands its users, Celma says, calling it “the biggest radio station on earth.” It has over 30 million engaged users who tune in at least twice a week.

It’s the most personalized station on Pandora, consisting of all the songs you’ve given a thumbs up to over the whole course of your Pandora experience, combined with their unique recommendation algorithms.

There are those explicit signals that feed their recommendation engine — the underground subgenres you check off and the obscure European bands you’re really into. Then you’ve got all the Beyonce and K-Pop you’ve been grooving to at work, plus the implicit data that comes from analyzing the contexts in which you’ve been switching through your playlists (Are you on the treadmill at the gym? In your kitchen whipping up a Sunday brunch?).

The human-in-the-loop factor

That data is then matched with the Music Genome Project’s database, the foundation of Pandora’s success. It’s an effort to “capture the essence of music at the most fundamental level” by analyzing over 450 attributes of every song in their library. Human experts listen to one song at a time and annotate those attributes; Celma’s data scientists then use machine learning to scale those annotations to as many as 50 million songs.

This is where the important human-in-the-loop (HITL) factor comes into play. Some argue that any AI model lacking some sort of human-in-the-loop aspect is flawed.

“Our music analysts have been really careful annotating all these songs, so we use this knowledge as ground truth to train our machine learning algorithm,” says Celma. “The algorithms can discern whether it’s a male or female singer, the type of voice—whether it’s a nasal type of Dylan sound or raspy vocals like Ray LaMontagne.”
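To make the idea of human annotations as ground truth concrete, here is a minimal sketch. It is not Pandora's method: the feature vectors, the labels, and the `predict_label` helper are all hypothetical, and a plain k-nearest-neighbors vote stands in for whatever models the team actually trains. The point is only that a handful of analyst-labeled songs can be used to label songs no human has heard.

```python
from math import dist

# Hypothetical analyst annotations: each entry pairs a made-up two-number
# audio feature vector (scaled 0..1) with a human-assigned vocal label.
ANNOTATED = [
    ((0.20, 0.80), "raspy"),
    ((0.25, 0.75), "raspy"),
    ((0.80, 0.20), "nasal"),
    ((0.85, 0.30), "nasal"),
]

def predict_label(features, annotated, k=3):
    """Label an unannotated song by majority vote of its k nearest annotated songs."""
    # Sort the annotated songs by Euclidean distance to the new song.
    nearest = sorted(annotated, key=lambda item: dist(features, item[0]))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    # Return the label with the most votes among the k neighbors.
    return max(votes, key=votes.get)

# Two songs no analyst has listened to (features are invented for illustration).
print(predict_label((0.30, 0.70), ANNOTATED))
print(predict_label((0.90, 0.25), ANNOTATED))
```

The human work sits entirely in `ANNOTATED`; the algorithm only extends those judgments to new songs, which is the human-in-the-loop relationship Celma describes.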

Add that to the collaborative filtering of 76.7 million monthly active users, where cohorts of listeners similar to you, aka people with the same taste in music, can help identify the songs you’re most likely to thumb up. It should feel like having 76.7 million friends saying “Hey, have you heard this new album?”
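For readers curious what collaborative filtering looks like in miniature, the sketch below shows a simple user-based version. Everything in it is an assumption made for illustration: the listener names, the song IDs, the thumb data, and the `cosine` and `recommend` helpers. Pandora has not published its implementation, and a production system would look very different at 76.7 million users.

```python
from math import sqrt

# Hypothetical thumb history: +1 is a thumbs up, -1 is a thumbs down.
THUMBS = {
    "you":   {"song_a": 1, "song_b": 1, "song_c": -1},
    "alice": {"song_a": 1, "song_b": 1, "song_d": 1},
    "bob":   {"song_a": -1, "song_c": 1, "song_e": 1},
}

def cosine(u, v):
    """Cosine similarity between two listeners' thumb vectors."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[s] * v[s] for s in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user, thumbs, top_n=3):
    """Suggest unheard songs, weighted by how similar each other listener is."""
    mine = thumbs[user]
    scores = {}
    for other, theirs in thumbs.items():
        if other == user:
            continue
        sim = cosine(mine, theirs)
        if sim <= 0:
            continue  # ignore listeners with unrelated or opposite taste
        for song, rating in theirs.items():
            if song not in mine:
                scores[song] = scores.get(song, 0.0) + sim * rating
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [song for song, score in ranked[:top_n] if score > 0]

print(recommend("you", THUMBS))
```

Here "alice" agrees with "you" on two songs, so her extra thumbs up becomes a recommendation, while "bob" disagrees on both shared songs and is ignored.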

But AI can also make you cooler, as if those 76.7 million friends have their 76.7 million thumbs on the pulse of the music industry. Pandora’s thumb is called Next Big Sound, which they acquired in 2015.

The company provides external analytics for online music, and can identify the artists statistically predicted to achieve future success. It analyzes the popularity of musicians in every nook and cranny of the world, from social network chatter and interest in the artist’s Wikipedia page, to the popularity of a video on YouTube and their music across streaming services, radio, and even late night shows.

From those external signals, the algorithm can help determine which new, exciting music to add into Pandora’s radio stations. And the biggest discovery has been that in the context of Thumbprint Radio, users seem to be more open to discovery and novelty, compared to their normal listening on other, less personalized channels.

“Because it’s a nice combination of familiarity with music that you like, combined with songs that we think you might like, people react very positively to these novel songs,” Celma says.

It’s a mix of art and science, Celma says. Human listening plus machine listening, plus the whims of the world when it comes to uncovering that next big superstar in the macro, or introducing you to a little bossa nova group you can name-drop to your friends. And each station, all 30 million of them, is completely unique.

As of the last quarter of 2016, Thumbprint has racked up over 5.5 billion spins and 20 million active daily listeners, demonstrating how well recommendation engines can boost engagement and engender some impressive stickiness.

As algorithms like Pandora’s are refined and AI and machine learning get more sophisticated, customers increasingly expect that kind of granular personalization, the “I’m watching you while you sleep” sort of knowledge about their wants and needs that companies can now demonstrate.

It has powerful implications across industries. Amazon estimates that 35 percent of all sales are triggered by their own recommendation engine; Netflix has claimed its recommendation system saves the company $1 billion a year by helping them identify what customers really want and refine their content buying accordingly.

The way forward seems clear, and the AI and machine learning technology is there: what kind of data does your customer offer you — and what kind of problems can you solve for both your company and for them with it? When you know customers better than they know themselves, your business gets smarter, and users keep coming back for more.