

SAN JOSE, Calif. — Some venture capital firms love to invest in big data startups. But they don’t always love to use big data technology themselves.

Bay Area venture capital firm Data Collective is different. It’s all about Hadoop, that open-source software for storing, processing, and analyzing lots of different kinds of data.

“We have Hadoop clusters running,” Matt Ocko, a managing partner at Data Collective, said during a panel discussion on Thursday at the Strata + Hadoop World big data conference. “We have a couple thousand cores working away on interesting stuff. I would say the identifying factors for success and startups are very, very flimsy and stochastic.”

Ocko made the remark after someone in the audience asked if he and others on the panel, including Amplify Partners’ Mike Dauber and CRV’s Max Gazor, were engaged in any sort of “Moneyball for startups.” The phrase refers to Michael Lewis’ book “Moneyball,” later made into a movie, which depicts Oakland Athletics general manager Billy Beane’s use of obscure statistics to improve the performance of his team. The term has since been co-opted by data scientists and vendors who say businesses can improve their own results by carefully examining their data.

From left, Matt Ocko of Data Collective, Arif Janmohamed of Lightspeed Venture Partners, Cack Wilhelm of Scale Venture Partners, Max Gazor of CRV, Mike Dauber of Amplify Partners at Strata + Hadoop World in San Jose, Calif., on Feb. 20.


Image Credit: Jordan Novet/VentureBeat

A few VC firms have taken steps to apply that kind of analysis to their own work. Balderton Capital last year hired data scientist Ferenc Huszár, and Accel Partners, Greylock Partners, and IA Ventures, among others, have brought on data scientists in residence.

At Data Collective, it sounds like Hadoop is helping in several ways. Ocko elaborated on his comments in an email on Friday, while making sure not to give away too much:

Other than that it’s a large hybrid cluster, and it provides competitive advantage in sourcing, evaluating, and managing our companies, we don’t really discuss specifics. I guess you could say that it involves several AI disciplines simultaneously, and that we’ll be integrating more tech over time from some of our stealthy but world-class ML/AI companies.
