Tom Reilly, chief executive of big data company Cloudera, walks a fine line when it comes to the data warehouse technology that companies spend so much money on.
Rather than replacing data warehouses with new-wave systems based on Hadoop, his company’s software — dubbed an enterprise data hub — can add plenty of value by sitting next to data warehouses, he told VentureBeat in an interview after Cloudera announced an impressive $160 million round of funding.
“Most of our customers implement their enterprise data hub adjacent [to] and integrated with their enterprise data warehouse,” Reilly said. “The value we bring to that equation is our enterprise data hub becomes a staging area for transformations of data and then to deliver that into an enterprise data warehouse for operational analytics.”
In other words, the data warehouse (a hardware and software combination taking up space in enterprises’ data centers) still holds the computing and storage resources for analytics, while Hadoop, running on cheaper commodity hardware, takes care of preparing the data before it goes into the warehouse.
Articulating Hadoop’s advantages and its ability to play nicely with data warehousing is like a tightrope performance right now.
Companies have been waking up to the appeal of Hadoop. “I’d say every Fortune 5,000 or Global 5,000 enterprise already has a Hadoop cluster,” Reilly said. So Cloudera needs to address that interest. But it also needs to speak to companies that plan to keep relying on data warehouses for the time being, perhaps because they think Hadoop is too complex or not secure enough.
Cloudera and others have been working on these obstacles to adoption, and as time goes by, Hadoop-based systems for storing and processing data should become more of a standard.
Cloudera’s enterprise data hub includes open-source software like Hadoop’s file system and tools for querying and analyzing data along with proprietary management and data-governance tools. This hub can accept and add structure to unstructured data, and it can also improve performance, Reilly said.
That position is sensible now, because Cloudera needs to maintain good relations with partners in the data warehousing market, like Teradata and Oracle. But as Hadoop becomes more technically viable for reliable business use, the rhetoric could change.
“If they proceed the way they want to, I think you’re going to see their messaging evolve, getting increasingly more, ‘We will replace rather than we will complement [the data warehouse],’” Wikibon big data analyst Jeff Kelly told VentureBeat.
“I think the situation is they can’t replace the enterprise data warehouse right now, but that’s clearly what they want to do at some point. It’s very nuanced messaging.”
Either way, the company needs to post major revenues in some way as it looks to go public, which could happen later this year. Reilly said Cloudera has more than 300 paying customers.
The new funding brought with it an increase in Cloudera’s valuation, which has some significant implications.
“For Cloudera, the practical impact is that the new funding has made the company far more expensive for any potential suitor. With hundreds of millions in the bank and documented market momentum, their valuation can only climb. Money in the bank is a first line defence against acquisition,” Tony Baer, an Ovum analyst covering big data, wrote in an email to VentureBeat.
Cloudera doesn’t have to rush into an initial public offering. Still, Reilly said he’s been focused on getting the company ready to be public.
Analysts believe Cloudera heads up the pack of companies selling distributions of different parts of the Hadoop ecosystem of open-source software for storing and processing lots of different kinds of data. Hortonworks, MapR, and Pivotal do have different strengths and key partnerships, but lately, Cloudera has been targeting other kinds of companies in the data world in order to have a bigger market to tap. Hence the marketing of the bigger-than-Hadoop enterprise data hub, which Cloudera began last year.
“Cloudera is pursuing a broader strategy than before — being a ‘Hadoop distribution vendor’ is limiting — and as the answer to the question ‘what is Hadoop’ has become much broader and more complex, so has their strategy, incorporating more dimensions of information management,” Merv Adrian, a Gartner analyst covering big data, wrote in an email to VentureBeat.
At least Cloudera has been making many public-cloud integration announcements. Kelly said the cloud was the vision for handling big data at Cloudera (just look at the company’s name), but many companies have been slow to put their most valuable data outside of their control in a public cloud. Almost all of Cloudera’s deployments are in on-premises data centers, Kelly said. But if and when the public cloud becomes the default for handling big data, Cloudera should be sitting pretty.
Another challenge facing Cloudera: Hadoop-centric applications for enterprises and startups are still emerging.
“The Hadoop application market is fairly immature today, largely built around BI [business intelligence] and analytics tooling in many cases coming from startups rather than established vendors,” Donnie Berkholz of analyst firm Redmonk wrote in an email to VentureBeat. But outside of those areas, things are still developing. So the alliances with data warehouse companies add up on that front, too.
But Reilly is optimistic about Hadoop and specifically Cloudera’s flavor of it.
“We believe that Hadoop will be adopted into every major enterprise,” he said. “And the way they’ll implement it is as an enterprise data hub.”