At its Google Cloud Next conference in San Francisco today, Google announced the launch of Cloud Dataprep, a service that lets people clean up their data sets before loading them into a service such as BigQuery, Google’s managed data warehouse.
The software uses machine learning to suggest transformations, said Brian Stevens, vice president of cloud platforms at Google.
Stevens didn’t say it onstage, but the software is in fact an embedded version of startup Trifacta’s Wrangler Enterprise app for cleaning up data with an easy-to-use point-and-click interface.
“Instead of leveraging HDFS [Hadoop Distributed File System], [Apache] Hive and [Apache] Spark for deployment (as Trifacta Wrangler Enterprise does), Cloud Dataprep integrates seamlessly with Google Cloud Storage, BigQuery and Cloud Dataflow,” a Trifacta spokesperson told VentureBeat in an email.
It does not appear that the service will be free, as Google says it will announce pricing information later. Cloud Dataprep is currently available in private beta, and customers have to sign up to start using it. For now, private beta users will only have to pay for the BigQuery, Google Cloud Dataflow, and Google Cloud Storage resources that they consume while using Cloud Dataprep.
The service may compete with startup Paxata’s software, and it’s also a response of sorts to Glue, the extract, transform, and load (ETL) service that public cloud market leader Amazon Web Services (AWS) recently announced. Microsoft Azure does not currently have a standalone tool that’s directly competitive with Google Cloud Dataprep.
Trifacta, which now has users at 4,500 companies, announced a $35 million funding round last year.
“As a leading data preparation company, it’s logical for us to follow these trends in the market, and we’re excited to work with Google as we place a strong focus on serving companies investing in cloud-based solutions. Google’s guidance and assistance has significantly accelerated our own cloud roadmap. We’re proud of what we’ve put forth together and extremely excited to see our solution bring new sources of value to Google Cloud’s rapidly expanding customer base,” Trifacta CEO Adam Wilson wrote in a blog post.
Update at 2:23 p.m. Pacific: Added information about the type of Trifacta software that is embedded in Google Cloud Dataprep.