

Of course, Google has been busy. In the past few years, researchers have cooked up a complex and powerful new data-warehousing system called Mesa.

Mesa was born out of Google’s core business: Internet advertising. To serve its advertising customers and internal needs, Google collects detailed information about each ad, and it has to record and process that data in near real time, according to a new research paper on the system.

Mesa operates at great scale, taking in this data in near real time. It “handles petabytes of data, processes millions of row updates per second, and serves billions of queries that fetch trillions of rows per day,” the paper’s authors wrote. Moreover, they wrote, Mesa is built to withstand datacenter failures, as it’s “geo-replicated across multiple datacenters.”

It’s possible that Mesa will lead to a new cloud service available on the Google Cloud Platform. That could help the company further distinguish itself in its cloud warfare against Amazon Web Services, which has a data-warehousing service called Redshift, and Microsoft Azure. Both rivals, just like Google, regularly drop cloud prices and release new cloud services.

Such a development wouldn’t be too farfetched. After Google introduced the Dremel query system in a research paper, the company created BigQuery based on Dremel and made it available as a cloud service on the Google Cloud Platform.

Architecturally, Mesa’s developers made some important decisions about what to optimize for, and those decisions make it different from, say, Dremel. As the paper puts it:

“Mesa explores a new point in the design space with high scalability, strong consistency, and transactional guarantees by restricting the system to be only available for batched and controlled updates that are processed in near real-time.”
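One way to picture the trade-off in that passage is a toy store that accepts changes only as whole batches and lets readers query a specific committed version. The Python sketch below is purely illustrative and is not Mesa's implementation; `BatchedStore`, `commit_batch`, and `query` are hypothetical names, and the simple version-numbering scheme is an assumption made for the example.

```python
from collections import defaultdict


class BatchedStore:
    """Toy aggregate store: updates land only as whole, versioned batches."""

    def __init__(self):
        # Committed batches; the batch at index i corresponds to version i + 1.
        self._batches = []

    @property
    def version(self):
        """Latest committed version (0 means nothing has been committed)."""
        return len(self._batches)

    def commit_batch(self, rows):
        """Commit a batch of (key, delta) rows as one unit; return its version."""
        batch = defaultdict(int)
        for key, delta in rows:
            batch[key] += delta
        # The batch becomes visible to readers only once it is fully built.
        self._batches.append(dict(batch))
        return self.version

    def query(self, key, at_version=None):
        """Sum the deltas for `key` across all batches up to `at_version`."""
        if at_version is None:
            at_version = self.version
        if not 0 <= at_version <= self.version:
            raise ValueError("unknown version")
        return sum(b.get(key, 0) for b in self._batches[:at_version])


store = BatchedStore()
v1 = store.commit_batch([("ad-1", 3), ("ad-2", 1), ("ad-1", 2)])
v2 = store.commit_batch([("ad-1", 4)])

print(store.query("ad-1", at_version=v1))  # 5
print(store.query("ad-1"))                 # 9
```

Because a reader names a version, repeating the same query at that version returns the same answer even while newer batches are being committed. The cost of that guarantee in this sketch is that individual row updates cannot be written on their own.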

The system also helps Google in ways that, for instance, the open-source Hive data-warehousing tool for Hadoop couldn’t. It also likely stands out from the Presto query engine, which Facebook developed in-house to meet latency challenges that Hive couldn’t deal with. Facebook recently released Presto under an open-source license.

For one thing, Mesa might be particularly well suited to deployment across data centers around the world.

“The cloud computing paradigm in conjunction with a decentralized architecture has proven to be very useful to scale with growth in data and query load,” the paper’s authors wrote.

Jordan Novet contributed reporting.

VentureBeat
