

SAN FRANCISCO — Google made a big contribution to the big data world 10 years ago, when it released a paper on MapReduce, a programming model for doing big computing jobs on hefty data sets. But it turns out that, all this time, Google has been working on something far more advanced.

At Google’s annual I/O shindig today, the tech giant announced a service that can do much, much more than MapReduce: Google Cloud Dataflow. It can either run a series of computing jobs, batch-style, or do constant work as data flows in. Engineers can start using the service in Google’s burgeoning public cloud, and Google takes care of managing the underlying infrastructure.

“We handle all the infrastructure and the back-end work required to scale up and scale down, depending on the kind of data needs that you have,” Brian Goldfarb, head of marketing for the Google Cloud Platform, told VentureBeat ahead of Google I/O.

Google Cloud Dataflow is Google’s response to public-cloud market leader Amazon Web Services’ Kinesis stream-processing service, which was first announced in November. The service is one more tool that helps flesh out Google’s cloud offering in a highly competitive business.

The new service draws from technologies Google has developed in recent years, including the FlumeJava library for running data pipelines in parallel and the MillWheel stream-processing framework.

What’s interesting is that Google, like some other companies, has gotten over, or moved on from, the MapReduce technology it pioneered.

“It’s funny — we’ve been doing massive data for a long time here,” Goldfarb said. “We’ve learned a few things, and one of the things we’ve learned is we don’t want to use MapReduce anymore.”

With MapReduce, ingesting data before it gets transformed or analyzed can be tough and time-consuming. And with more connected devices offering up data for immediate analysis, MapReduce wasn’t the best fit. Time would be better spent figuring out the best way to analyze data than kicking off long-running MapReduce jobs while separately tinkering with different code to do stream processing with open-source tools like Storm. Hence the development of new, hybrid tools.
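
To make the hybrid idea concrete, here is a minimal sketch in plain Python of one pipeline definition serving both a finished data set and a live feed. It is not the Cloud Dataflow API, which this article does not describe; every name in it (`parse`, `running_counts`, `pipeline`, `batch_run`, `sensor_feed`) is hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def parse(lines: Iterable[str]) -> Iterator[str]:
    """Transform step: split each raw line into lowercase words."""
    for line in lines:
        for word in line.lower().split():
            yield word


def running_counts(words: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Analysis step: emit an updated count each time a word arrives."""
    counts: Counter = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


def pipeline(source: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """One pipeline definition, written once (hypothetical, not a real API)."""
    return running_counts(parse(source))


def batch_run(lines: Iterable[str]) -> Dict[str, int]:
    """Batch style: drain a bounded input and keep only the final totals."""
    totals: Dict[str, int] = {}
    for word, count in pipeline(lines):
        totals[word] = count
    return totals


def sensor_feed() -> Iterator[str]:
    """Hypothetical stand-in for a live source such as connected devices."""
    yield "temp high"
    yield "temp low"
    yield "temp high"


if __name__ == "__main__":
    # Batch: results are available once the whole input has been read.
    print(batch_run(["temp high", "temp low"]))

    # Streaming: the same pipeline yields results as each record arrives.
    for update in pipeline(sensor_feed()):
        print(update)
```

The point of the sketch is that the transform and analysis code is shared, so only the way the input is fed and the output is consumed differs between the batch and streaming cases.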

And Cloud Dataflow is already showing value at Google.

“It’s the way we do all of our internal analysis,” Goldfarb said.
