Parse.ly's Currents uses AI to measure real-time audience attention

For web publishers, engagement is a valuable metric. In fact, content optimization platform Parse.ly claims audience attention -- which it defines as the way topics, moments, contexts, locations, devices, and sources interact with each other in real time -- is more predictive of behavior than demographics, social signals, and search queries. Case in point: In a recent study, it found that attention data can accurately predict a movie's box office success several weeks before the premiere.

That's why Parse.ly in June launched Currents, a new feature that pulls back the curtain on attention and its contributing influences. And it's why the company is today making Currents available to all customers -- including those on its free tier.

"If there's one thing the media industry needs, it's transparency. That's been my personal mission since starting Parse.ly with my cofounder several years ago," Parse.ly CTO Andrew Montalenti said. "We think that Currents will shine a bright light on how news and content on the internet really works ... [it's] like a live poll of the internet."

Currents comprises five core data dimensions: Story Clusters, or groupings of closely related articles; Topics; Categories; Traffic Sources; and Geography. A machine learning backend enables it to learn news story topics and categories automatically, and by homing in on the "meaningful" words in text -- that is to say, those related to people, places, things, and ideas -- it's able to suss out the context and subject of articles.

That's accomplished with the help of word embeddings (specifically fastText, Facebook's open source library for learning text representations), which Currents uses to gain an understanding of articles at a semantic level. The NLP engine learns the relationships among articles in an unsupervised way and groups them together automatically.
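
To make the idea of embedding-based grouping concrete, here is a minimal sketch in Python. It is not Parse.ly's implementation, which the company has not published in detail. The three-dimensional word vectors are invented for illustration (real fastText vectors have hundreds of dimensions), and the `embed`, `cosine`, and `cluster` helpers, the greedy clustering strategy, and the 0.8 threshold are all assumptions.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
# Real embeddings learned with a library such as fastText are far larger.
TOY_VECTORS = {
    "tesla": (0.9, 0.1, 0.0),
    "musk": (0.8, 0.2, 0.1),
    "rocket": (0.1, 0.9, 0.0),
    "launch": (0.2, 0.8, 0.1),
    "election": (0.0, 0.1, 0.9),
    "senate": (0.1, 0.0, 0.9),
}


def embed(text):
    """Average the vectors of known words; return None if no word is known."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return None
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster(titles, threshold=0.8):
    """Greedy single-pass clustering with no labels (unsupervised).

    Each title joins the first cluster whose seed vector is similar enough;
    otherwise it starts a new cluster. Titles with no known words are skipped.
    """
    clusters = []  # list of (seed_vector, member_titles)
    for title in titles:
        vec = embed(title)
        if vec is None:
            continue
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(title)
                break
        else:
            clusters.append((vec, [title]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    for group in cluster(["Musk Tesla", "rocket launch", "Tesla", "senate election"]):
        print(group)
```

No category labels are supplied anywhere in the sketch: the groups emerge only from how close the averaged vectors are to one another, which is the sense in which such grouping is unsupervised.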

Additionally, Currents creates knowledge graphs -- ontologies that encompass representations, formal naming, and definitions of categories, properties, and relations between concepts -- and extracts references to important people and places. That's how it can separate, for example, stories about Elon Musk and Tesla from ones about SpaceX.

"The system really [understands] the news -- how all the various narratives, subnarratives, and storylines affected content and attention," Montalenti said, "and it [does] that by using statistics and collective user intelligence, not via manual human curation."

Currents is nuanced enough to comprehend not just topics and categories, but also subtopics and subnarratives. It draws on more than 80 broad categories and several hundred "leaf categories" supplied by the Interactive Advertising Bureau. Moreover, it identifies relationships between articles at a rate of hundreds of thousands of articles a day.

Launching Currents was a considerable engineering challenge, Montalenti said. Parse.ly had to deploy a "petabyte-scale warehouse" that could aggregate data from "billions" of news reading sessions over minutes, hours, and days. More than a billion people read the more than 1 million articles published in Parse.ly's network every month.
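
As a rough illustration of what aggregating reading sessions "over minutes, hours, and days" involves, the following Python sketch counts distinct readers per topic and time bucket. It is a toy under stated assumptions, not a description of Parse.ly's warehouse: the `(timestamp, visitor_id, topic)` event schema, the `rollup` function, and the sample events are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# strftime patterns that truncate a timestamp to the chosen bucket size.
BUCKET_FORMATS = {
    "minute": "%Y-%m-%dT%H:%M",
    "hour": "%Y-%m-%dT%H",
    "day": "%Y-%m-%d",
}

# Hypothetical sample events: (ISO timestamp, visitor id, topic).
EVENTS = [
    ("2018-09-07T10:15:00", "a", "Elon Musk"),
    ("2018-09-07T10:45:00", "a", "Elon Musk"),
    ("2018-09-07T10:50:00", "b", "Elon Musk"),
    ("2018-09-07T11:05:00", "b", "Elon Musk"),
    ("2018-09-07T11:10:00", "c", "Nike"),
]


def rollup(events, granularity="hour"):
    """Count distinct visitors per (topic, time bucket)."""
    fmt = BUCKET_FORMATS[granularity]
    readers = defaultdict(set)
    for ts, visitor, topic in events:
        bucket = datetime.fromisoformat(ts).strftime(fmt)
        readers[(topic, bucket)].add(visitor)  # a set ignores repeat visits
    return {key: len(ids) for key, ids in readers.items()}


if __name__ == "__main__":
    for (topic, bucket), count in sorted(rollup(EVENTS, "hour").items()):
        print(topic, bucket, count)
```

Keeping a set of visitor IDs per bucket is only workable at toy scale. At billions of sessions, a production system would more plausibly rely on approximate distinct counting and distributed storage, though the article does not say which techniques Parse.ly uses.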

Since the launch of Currents in beta, the company has uncovered a few surprising insights:

Democratic-leaning states such as Washington, Oregon, and Minnesota are consuming more news about Donald Trump and Bob Woodward's book, while solidly Republican states such as Texas, Oklahoma, and Tennessee are reading about Colin Kaepernick and Nike.
In the first week of September, Currents showed that of the 7.2 million people who read about Tesla and SpaceX chief Elon Musk, 4 million specifically sought out the 600 stories about his marijuana use on Joe Rogan's podcast.

Twenty-four hours of Currents data is available for free without registration, and seven days of data is available with free registration.

"I'm personally most excited to see how publishers make use of the data to change their content and platform strategy," Montalenti said. "But I'm also really excited to see how it gets used in industries outside of media -- in digital marketing, PR, communications, and even in areas like finance, politics, and entertainment. I imagine a lot of businesses are going to change now that you can know exactly how many people are reading content about any given topic on any given hour, day, week, or month."

New York-based Parse.ly, which was founded in 2009, also offers its core Parse.ly Analytics platform, which is used by 400 publishers and media companies. Despite the cutthroat competition (e.g., Google Analytics, Chartbeat), it has managed to secure clients like the Wall Street Journal, Time, Bloomberg, Condé Nast, Hearst, HelloFresh, and Ben & Jerry’s.

Parse.ly recently raised $6.8 million in a funding round led by Grotech Investors and Blumberg Capital (contributing to a total haul of $12.9 million). The company counts about 72 employees in its workforce and told Poynter last year that it has reached profitability.