Cloudera is embracing the cloud in a big way.
The Palo Alto, Calif.-based company is expanding its partner ecosystem to bring its distribution of Apache Hadoop, the open-source data processing framework, to public cloud services from IBM, Verizon, and Savvis, the company announced today.
“Our customers have, almost without fail, deployed in their own data centers,” Mike Olson, chief strategy officer at Cloudera, told VentureBeat. “But, increasingly, customers have been asking us about deployment in the cloud.”
IBM SoftLayer, Verizon Terremark, and Savvis CenturyLink will resell Cloudera’s Hadoop distribution as an instance that runs on their public cloud offerings. T-Systems already has a partnership with Cloudera through which it offers analytics-as-a-service on top of its existing cloud computing infrastructure.
“Each of those vendors has lots of good enterprise relationships already, and this makes it easier for us to reach them,” said Olson. In particular, “SoftLayer reselling Cloudera is a big coup for us.”
SoftLayer began offering big data solutions about a year ago, partnering first with MongoDB and then Riak and Cloudant. But Cloudera is its first Hadoop partner.
“We feel that Cloudera is a leader in the Hadoop space, and we have a lot of customers today who are already leveraging Hadoop for their workloads,” said Marc Jones, vice president of product innovation at SoftLayer. “Having the capability to spin up a multinode Hadoop cluster on bare metal servers is a powerful proposition for a lot of companies.”
Cloudera’s distribution can already run on cloud services if you load it onto your virtual machines, but these partnerships will simplify the process of setting up a Hadoop cluster on a SoftLayer, Verizon, Savvis, or T-Systems cloud. Cloudera isn’t announcing pricing for any of its partners, but Olson noted that prices will differ, as each vendor is setting its own. IBM said its Cloudera services start at $699 per month, which will nab you an Intel Xeon 5620-based server with 24GB of RAM and two 500GB SATA storage drives.
With its Hadoop distribution, Cloudera helps organizations make sense of their structured and unstructured data. It’s focused on markets that produce (and need to analyze) exceptional amounts of data: telecommunications, government, financial services, retail, energy, and so on.
Cloudera is a major contributor of code to Hadoop, and it also contributes to the OpenStack project, ensuring that the two play nice together. A lot of Cloudera customers prefer to deploy in a private cloud, so OpenStack support is important, said Olson.
“The reason our customers often want to deploy in a private cloud is that all their data is already in the data center — and moving it out of the enterprise can be prohibitive,” he said. “Even if it’s feasible to move it out, some of the customers we work with are concerned about publishing their data on hardware that they don’t own.
“The key goal here has been to let customers run the platform where they want, when they want, and how they want it.”
Cloudera’s Hadoop distribution is popular among enterprises, but not without competition: EMC and Intel offer their own Hadoop distributions.