SAN FRANCISCO — Ask analytics wonk Tom Davenport what’s changed in the decades since companies started collecting and reporting on data, and he’ll talk about the rise in the number of data sources, the emergence of data scientists, and the need to get more people inside companies analyzing data.

But as Davenport interviewed data scientists (along with DJ Patil, a co-creator of the term “data science”) about what those people actually do for a living, he realized that their jobs weren’t as sexy as some people might imagine.

“People spend a huge amount of time on what they call munging data or extracting, filtering, cleaning data from various kinds of systems,” Davenport, author of the 2014 book Big Data @ Work: Dispelling the Myths, Uncovering the Opportunities, said at VentureBeat’s DataBeat conference today.



Davenport became convinced that, as more people inside companies want to analyze more kinds of data, data scientists need to cut down on the time they spend on this dirty work.

Tools like Trifacta and Paxata have emerged in the past few years to speed up the cleaning work, and more recently, Davenport said he’s been influenced by the approach that startup Tamr is taking to data management.

Tamr incorporates machine learning as well as crowdsourcing, said Davenport, who is also the president’s distinguished professor of information technology and management at Babson College. And with support for NoSQL databases and the Hadoop open-source software for managing lots of different kinds of data, Tamr could be a key part of the latest generation of data analytics.

“I don’t think you can do all this without adopting some new approaches to data integration and curation,” Davenport said. “It’s just not going to happen without that.”

As with all our events, we cover every company that appears onstage at DataBeat without regard to sponsorship.


