At its re:Invent user conference in Las Vegas today, public cloud infrastructure provider Amazon Web Services (AWS) announced the launch of AWS Glue, a tool that automatically runs jobs to clean up data from multiple sources and prepare it for analysis in other tools, such as business intelligence (BI) software.
This type of work is typically known as extract-transform-load, or ETL. Companies including Informatica and Talend offer software for it. Now AWS has a cloud service for it.
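For readers unfamiliar with the pattern, the three ETL stages can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not how AWS Glue or any vendor product works internally; the sample data, the `customers` table, and the function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw data from two sources with inconsistent formatting.
CRM_CSV = "name,revenue\n Acme ,1200\nGlobex,not available\n"
BILLING_CSV = "name,revenue\ninitech,300.5\n"


def extract(raw_csv):
    """Extract: read rows from one CSV source."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows):
    """Transform: normalize names, parse numbers, drop unusable rows."""
    cleaned = []
    for row in rows:
        try:
            revenue = float(row["revenue"])
        except ValueError:
            continue  # skip rows whose revenue is not numeric
        cleaned.append((row["name"].strip().title(), revenue))
    return cleaned


def load(conn, rows):
    """Load: write cleaned rows into the destination table."""
    conn.execute("CREATE TABLE IF NOT EXISTS customers (name TEXT, revenue REAL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()


def run_etl(conn, sources):
    """Run extract, transform, and load for each source in turn."""
    for raw in sources:
        load(conn, transform(extract(raw)))


# An in-memory SQLite database stands in for the analytics destination.
conn = sqlite3.connect(":memory:")
run_etl(conn, [CRM_CSV, BILLING_CSV])
result = conn.execute("SELECT name, revenue FROM customers ORDER BY name").fetchall()
```

After the run, `result` holds the cleaned rows from both sources, ready for a BI tool to query. The row with a non-numeric revenue is dropped during the transform step.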
It has long been possible to do ETL work on AWS infrastructure, with services like EMR (Elastic MapReduce). The other big public clouds have Hadoop-based tools for this sort of thing, too. But AWS Glue is meant to make the work easier.
And with the help of JDBC connectors, it will be able to connect with data in on-premises services, making AWS Glue another proof point that AWS is interested in working with organizations that still retain their own on-premises data center infrastructure.
When data changes at its original source, “jobs can be triggered again to make sure you always have access to the latest information,” Amazon vice president and chief technology officer Werner Vogels said.
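The idea of re-running a job when source data changes can be sketched generically in Python. This is an assumption-laden illustration, not a description of how AWS Glue detects changes or schedules jobs; the `ChangeTrigger` class and the content-hash approach are hypothetical.

```python
import hashlib


class ChangeTrigger:
    """Re-runs a job whenever a source's content fingerprint changes."""

    def __init__(self, job):
        self.job = job
        self.seen = {}  # source name -> last fingerprint

    def check(self, name, content):
        """Run the job if content differs from the last check; report whether it ran."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if self.seen.get(name) == digest:
            return False  # unchanged, nothing to do
        self.seen[name] = digest
        self.job(name, content)
        return True


# Hypothetical job that only records which source it was run for.
runs = []
trigger = ChangeTrigger(lambda name, content: runs.append(name))
first = trigger.check("orders", "id,total\n1,10\n")          # new source
second = trigger.check("orders", "id,total\n1,10\n")         # same content
third = trigger.check("orders", "id,total\n1,10\n2,25\n")    # content changed
```

Here the job runs on the first check and again after the content changes, but not when the source is unchanged.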
“AWS Glue simplifies and automates the difficult and time consuming data discovery, conversion, mapping, and job scheduling tasks,” AWS wrote in a blog post. “AWS Glue guides you through the process of moving your data with an easy to use console that helps you understand your data sources, prepare the data for analytics, and load it reliably from data sources to destinations.”