Pandora today announced the launch of the Podcast Genome Project, a podcast recommendation engine akin to Pandora’s Music Genome Project for suggesting songs and creating personalized radio stations.
Both projects rely on labeled datasets to train AI models with supervised learning.
“At the core of it, the Music Genome and the Podcast Genome are both meant to analyze kind of the basic level of what is this song about and what is this episode of a podcast about?” Pandora chief product officer Chris Phillips told VentureBeat in a phone interview. “And specifically on the episode level, we then take that understanding of the content and use both humans for quality and machines and the algorithm to help us scale that understanding. Then we use that with our algorithms to help provide unique, high-quality discovery for our consumer listener base.”
As with the Music Genome Project, giving a thumbs-up or thumbs-down to podcast episodes will help fine-tune and personalize recommendations.
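Pandora has not described how ratings feed into its models. As a rough illustration only, the Python sketch below shows one simple way thumbs-up and thumbs-down signals could shift a listener's preferences. The attribute names, the `update_profile` and `score` helpers, and the fixed step size are all hypothetical.

```python
from collections import defaultdict


def update_profile(profile, episode_attrs, thumbs_up, step=1.0):
    """Nudge per-attribute weights up or down after one rating."""
    delta = step if thumbs_up else -step
    for attr in episode_attrs:
        profile[attr] += delta
    return profile


def score(profile, episode_attrs):
    """Sum the listener's weights over the attributes an episode carries."""
    return sum(profile.get(attr, 0.0) for attr in episode_attrs)


# Hypothetical listener: liked a family cooking show, disliked political satire.
profile = defaultdict(float)
update_profile(profile, {"cooking", "family"}, thumbs_up=True)
update_profile(profile, {"politics", "satire"}, thumbs_up=False)

# Hypothetical candidate episodes and their attribute labels.
candidates = {
    "baking-with-kids": {"cooking", "family", "baking"},
    "brexit-roundup": {"politics", "news"},
}

# Rank candidates from best to worst match for this listener.
ranked = sorted(
    candidates,
    key=lambda name: score(profile, candidates[name]),
    reverse=True,
)
print(ranked)
```

In this toy setup the cooking episode ranks first because it shares two attributes the listener rated positively, while the politics episode inherits a negative weight.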
The Podcast Genome Project will initially be available to 1 percent of app users and will be rolled out to the larger Pandora user base in the coming weeks, a company spokesperson told VentureBeat in an email.
Podcast recommendations will live alongside music recommendations in the Browse section of the Pandora app, and users will also be able to surface them using the Pandora search engine.
The Podcast Genome Project uses speech-to-text transcription and natural language processing so it can understand not just what’s in a podcast host’s biography or an episode description, but also more granular information from what is actually said in an episode. For example, it might let you know an episode of a cooking podcast is about baking pastries with your kids.
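To make the idea concrete, here is a minimal Python sketch of tagging an episode from its transcript once speech-to-text has already produced the text. It uses plain keyword counting as a stand-in for natural language processing and is not Pandora's method. The `TOPIC_KEYWORDS` vocabulary, the `tag_transcript` function, and the `min_hits` threshold are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical topic vocabulary; a production system would use far richer models.
TOPIC_KEYWORDS = {
    "baking": {"bake", "baking", "pastry", "pastries", "dough", "oven"},
    "kids": {"kid", "kids", "children", "family"},
    "politics": {"election", "parliament", "brexit"},
}


def tag_transcript(transcript, min_hits=2):
    """Return the topics whose keywords appear at least min_hits times."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(words)
    topics = []
    for topic, keywords in TOPIC_KEYWORDS.items():
        hits = sum(counts[word] for word in keywords)
        if hits >= min_hits:
            topics.append(topic)
    return sorted(topics)


# Assumed output of a speech-to-text step for a cooking episode.
transcript = (
    "Today we are baking pastries with the kids. "
    "Let the children roll the dough before it goes in the oven."
)
print(tag_transcript(transcript))
```

Even this crude approach picks up detail that a one-line episode description could miss, which is the gap transcript analysis is meant to close.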
“Tens of thousands” of podcast episodes are included in the corpus of data that informs the podcast recommendation engine today, Phillips said, ranging from popular podcast producers like Gimlet, NPR, and WNYC and shows like “The Ringer” and “This American Life” to many lesser-known podcasts.
“It’s not just about the top most popular podcasts getting more listening — that will happen — it’s also about the long tail discovery, and that’s why the science is so critical, because we can democratize discovery way better than any other service because of that,” Phillips said.
Pandora’s curation team labels each episode using a set of more than 1,500 potential attributes. These include the podcast category, whether the episode is a serious news program, has comedic undertones, or can be described as satirical, and whether it mentions a trending topic like Brexit.
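One common way to put labels like these to work is to compare episodes by how many attributes they share. The Python sketch below uses Jaccard similarity over attribute sets. It is an illustrative assumption, not a description of Pandora's engine, and the episode names and attribute labels are invented.

```python
def jaccard(a, b):
    """Share of attributes two episodes have in common, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical labeled episodes using attributes like those described above.
episodes = {
    "ep-news": {"serious-news", "brexit"},
    "ep-satire": {"satirical", "comedic", "brexit"},
    "ep-comedy": {"comedic", "satirical"},
}


def most_similar(target, catalog):
    """Return the name of the episode whose labels best overlap the target's."""
    scored = [
        (jaccard(catalog[target], attrs), name)
        for name, attrs in catalog.items()
        if name != target
    ]
    return max(scored)[1]


print(most_similar("ep-satire", episodes))
```

Here the satirical episode is matched with the comedy episode, with which it shares two of three distinct labels, over the news episode, with which it shares only the Brexit topic.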
The Podcast Genome Project has a long way to go to catch up with the Music Genome Project, which began in 2005 and includes 60 million labeled songs.
SiriusXM agreed in September to acquire Pandora in a $3.5 billion all-stock deal. That acquisition has not yet been finalized, but it could result in the prioritization of SiriusXM content, Phillips said.
Pandora got its start in podcasts years ago when it began to license episodes of “Serial” and “This American Life” and created “Questlove Supreme,” its first original podcast.
The arrival of a recommendation engine could lead to more Pandora original podcasts, though Phillips declined to share any details.