It’s been a long, long time since Google came up with the foundational technologies for storing and processing big data. This year, the company developed a new tool for working with data as it comes in, and now Google is keen to see people use it.

The new technology, implemented as a cloud service called Google Cloud Dataflow, was unveiled in June, and Google has begun to accept requests from developers for limited access in an alpha program. But now Google is going further: It’s releasing a Java software-development kit (SDK) for Cloud Dataflow under an open-source license.

“The idea behind the SDK is to allow us to get to other languages, so that people can bring their own language,” Tom Kershaw, director of product management on the Google Cloud Platform, told VentureBeat in an interview.

The open-source move could result in more developers coming around to the approach Google has thrown its weight behind: setting up pipelines to process data as it comes in, instead of, or in addition to, running batch processing jobs that take a while. What’s more, the open-sourcing strategy could increase the usage of Cloud Dataflow on the Google Cloud Platform, which competes with other big and growing public clouds, like Amazon Web Services and Microsoft Azure.


The way Kershaw sees it, Google is already positioned to have a niche in that market.

“Our view is that we are the cloud for big data,” he said. “If you have a big data problem, Google Cloud is the place you want to be. We invented a lot of this technology, we helped make a lot of the technology mainstream, and we want to make it easy to use.”

The question is whether developers will agree en masse that the Google cloud can outperform others. Part of that calculation depends on the pipeline tools available from other clouds, as well as on open-source tools, such as Apache Spark Streaming. But Spark, for instance, is compatible with Cloud Dataflow, Kershaw said.

Ultimately, he said, it’s a matter of getting more people onboard with a different type of computing from what they might be used to.

“Our strong belief is that it’s going to change the approach of data processing,” Kershaw said. “It will make streaming the norm, rather than the exception.”
