It’s a new day, and Google has a new cloud service for storing and processing big data. Google Cloud Dataproc, which is being launched in beta today, is a managed service for running Hadoop and Spark.

Independent startups like Qubole, Altiscale, and Xplenty offer commercial software for running open-source Hadoop on top of public clouds, but now there’s an option that’s native to the Google Cloud Platform.

“Cloud Dataproc automation helps you create clusters quickly, manage them easily, and save money by turning clusters off when you don’t need them,” Google Cloud Platform product manager James Malone wrote in a blog post on the new service. “With less time and money spent on administration, you can focus on your jobs and your data.”

Microsoft Azure and Amazon Web Services, two other major public clouds, both have their own first-party services for running Hadoop, with HDInsight and Elastic MapReduce, respectively. Support for Spark — the open-source big data processing framework that’s seen as a successor to the MapReduce engine — has come to both. Now Google will have its own full-fledged tool to compete directly. And that’s important in the growing public cloud market, where ease of use and cost are both critical factors.

Developers can already run batch and streaming jobs with the Google Cloud Dataflow service, but that is a largely proprietary system, not one explicitly built on the widely used open-source Hadoop software.

Google already makes it possible to run core Apache Hadoop on its public cloud with the Cloud Launcher quick-start tool. Cloud Dataproc goes a step further by making clusters easy to manage once they’re running.

Like Google Compute Engine, the new service is priced by the minute after the first 10 minutes.
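
To make that pricing model concrete, here is a rough sketch of how a bill might be worked out. It assumes that "after the first 10 minutes" means a 10-minute minimum charge followed by per-minute increments rounded up, which is how Compute Engine billing was generally described at the time. The function names and the per-minute rate are hypothetical and not taken from Google's price list.

```python
import math


def billed_minutes(runtime_minutes: float, minimum_minutes: int = 10) -> int:
    """Minutes charged for one cluster run.

    Assumption: a run is billed for at least `minimum_minutes`, and any
    partial minute beyond that is rounded up to a whole minute.
    """
    if runtime_minutes <= 0:
        return 0
    return max(minimum_minutes, math.ceil(runtime_minutes))


def run_cost(runtime_minutes: float, rate_per_minute: float) -> float:
    """Cost of one run at a hypothetical flat per-minute rate."""
    return billed_minutes(runtime_minutes) * rate_per_minute


if __name__ == "__main__":
    # The rate below is a made-up illustration, not a published price.
    for minutes in (3, 10.5, 45):
        print(minutes, billed_minutes(minutes), run_cost(minutes, 0.01))
```

Under these assumptions, a three-minute job is billed the same as a 10-minute one, while a 45-minute job is billed for exactly 45 minutes instead of being rounded up to a full hour.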

