SAN FRANCISCO — Microsoft today unveiled two new services for working with data in its Azure public cloud and announced useful enhancements to its existing Azure SQL Database service at the Build conference.
The new Azure SQL Data Warehouse will be able to store companies’ valuable relational data in the cloud, while unstructured data can go in the new Azure Data Lake for the purpose of doing big-data analytics.
Microsoft is also introducing a new level of abstraction for operating multiple databases through its Azure SQL Database service. Companies will be able to pool resources and control price and performance across several databases running on the cloud service, all within a fixed recurring budget. Microsoft calls this new feature “elastic databases.”
“We are creating this concept of elastic database tools that enable the SaaS [software as a service] vendor to aggregate large numbers of databases and be able to have a predictable business on top of this unpredictable sort of workflow,” T.K. “Ranga” Rengarajan, corporate vice president of data in Microsoft’s cloud and enterprise division, told VentureBeat in an interview this week.
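To see why pooling can make an unpredictable workload more predictable, consider a minimal sketch. It is not Microsoft's implementation or pricing model: the tenant names, the demand figures, and both helper functions are hypothetical, and the units are arbitrary. It compares the capacity needed when each database is sized for its own peak with the capacity needed when all of them share one pool sized for the combined peak.

```python
# Hypothetical demand per time slot, in arbitrary throughput units,
# for three tenant databases whose spikes occur at different times.
demand = {
    "tenant_a": [5, 40, 5, 5],
    "tenant_b": [5, 5, 45, 5],
    "tenant_c": [30, 5, 5, 5],
}


def separate_capacity(demand):
    """Total capacity if every database is provisioned for its own peak."""
    return sum(max(series) for series in demand.values())


def pooled_capacity(demand):
    """Capacity if all databases share one pool sized for the combined peak."""
    return max(sum(slot) for slot in zip(*demand.values()))


print("separate:", separate_capacity(demand))
print("pooled:", pooled_capacity(demand))
```

Because the spikes in this made-up data do not coincide, the shared pool needs far less headroom than the sum of the individual peaks. If every tenant spiked at the same moment, the two figures would be equal.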
The elastic databases and the two new services aren’t earth-shattering, but taken together, they should help Microsoft Azure compete with public cloud market leader Amazon Web Services, growing challenger Google Cloud Platform, and the other players.
The Azure SQL Data Warehouse, which will become available in public preview in June, comes more than two years after the launch of Amazon’s cloud data warehouse Redshift. And last year, former Microsoft executive Bob Muglia came forward to talk about his own cloud data warehousing startup, Snowflake Computing.
After announcing the Azure SQL Data Warehouse during the keynote today, Scott Guthrie, executive vice president of the Microsoft Cloud and Enterprise group, took time out to directly compare the new service with Amazon’s Redshift data warehouse. Here’s a screen grab of the chart Guthrie showed:
Microsoft, of course, thinks its new service is quite respectable. Based on the massively parallel processing architecture in Microsoft’s widely used SQL Server on-premises database software, it builds on the new elasticity of Azure SQL Database.
Indeed, each one of the nodes inside the new Azure SQL Data Warehouse is a node in Azure SQL Database, Rengarajan said.
“We believe it’s the first elastic cloud data warehouse that’s able to grow and shrink dynamically,” he said.
And of course, Azure SQL Data Warehouse can integrate with other Microsoft technologies, like Azure Machine Learning and Power BI for business intelligence.
The announcement of Azure Data Lake comes two years after Microsoft embraced the Hadoop open-source big data software and came out with its Azure HDInsight service, in partnership with Hadoop vendor Hortonworks. More recently Microsoft has partnered with another Hadoop vendor, Cloudera.
But Azure Data Lake, which is now in private preview, is different from Hadoop on its own, in the sense that it’s automatically geographically distributed from the get-go, and it’s “optimized for analytic workloads,” Rengarajan said. Still, it is compatible with the Hadoop Distributed File System, as well as Microsoft’s own HDInsight and open-source tools like Spark, Storm, and Kafka.
The new service would be a good match for applications with many sources sending in little bits of data all at once — like analytics for the use of sensor-laden Internet-connected devices, for example.
From a competitive standpoint, several startups, like Altiscale and Qubole, already provide Hadoop-based systems for storing and serving up unstructured data. And some cloud providers, like Amazon, have their own implementations. Azure Data Lake could cause some of these vendors to make product changes in order to stay competitive.
The pricing model for the new elastic databases in Azure SQL Database is fascinating too. Microsoft has devised what it calls elastic database throughput units, or eDTUs, that can be pooled together. The metric gives a sense of overall database performance, reflecting computing power, memory, and read and write rates. Here’s a chart of the exact pricing for the new feature:
In the coming weeks, the elastic databases feature will get a querying function that works with data across all databases in a given pool, a Microsoft executive said during today’s keynote.