Hewlett Packard Enterprise (HPE) this week revealed that it has acquired Ampool, a provider of a distributed SQL engine based on the open source Presto project that allows users to access data stored in multiple databases.
HPE plans to incorporate the distributed SQL engine it has gained into the HPE Ezmeral container platform, based on Kubernetes clusters. Ezmeral offers IT organizations a range of data services that include support for the Apache Spark framework, frameworks for machine learning operations (MLOps), and now SQL platforms.
The Presto distributed SQL engine is already available in a container format. Ampool is currently in the process of adding support for that format to its distribution of Presto, said Anant Chintamaneni, general manager for HPE Ezmeral.
Ampool has also developed the ability to create a meta store for accessing data stored in multiple databases. Once joins across those databases are created, Ampool allows customers to store them in cache memory to boost overall performance, using a tool based on open source Apache Geode software.
The overall goal is to reduce the overhead of providing access to multiple data sources by building a data federation layer on top of an acceleration engine that speeds up analytical query processing at scale, Chintamaneni said. Ampool also eases the workload of the database administrator (DBA), he added. “It reduces the effort required on the part of the DBA to manage multiple databases,” he said.
HPE also plans to integrate Ampool with the managed HPE GreenLake service it provides for its servers running in on-premises IT environments. The company last month announced a range of containerized services that will be delivered using an instance of the HPE Ezmeral platform accessed via the HPE GreenLake service.
In effect, HPE is creating a platform that enables an internal IT team to manage data as a service, regardless of where it is stored. HPE GreenLake will extend that platform into a service through which HPE manages on-premises IT environments using a cloud-like operating model, said Chintamaneni.
The level of maturity of managing data holistically varies widely across organizations. Most data today is still managed within the context of the application used to create it. Many organizations are just now starting to aggregate that data in so-called data lakes that enable applications to access data regardless of how it was created or ultimately stored. Loosely coupled query engines that can support multiple analytics and business intelligence tools accessing a range of backend data sources have become a critical requirement.
As organizations increasingly recognize that data is a business asset that needs to be managed, the processes for managing it are becoming more structured. The way SQL requests are being generated has also transformed. Canned SQL queries launched by business intelligence and reporting tools are giving way to ad-hoc queries that are much less predictable. As such, the level of processing horsepower that needs to be available on demand has steadily increased.
It may be a while before data is truly managed as a service within most organizations. However, some (arguably long overdue) progress is being made. The difficulty is that melding all the data science, engineering, and management expertise required to realize that goal involves a range of technology and cultural challenges that are not easily overcome. Business leaders often assume that data should be easily accessible whenever required. Having to explain why it’s not is one of the primary reasons the divide between IT and the rest of the business remains as wide as it is.