Starburst, the company behind the open source Presto-based SQL query engine Trino, has raised $250 million in a round of funding as it looks to meet what it calls “growing demand for faster analytics” on decentralized data.
Trino, in a nutshell, helps companies carry out complex analytics on data spread across disparate sources, wherever that data resides, which saves engineers from having to move, copy, or combine it in a centralized location. Trino can serve numerous purposes, from powering companies’ digital autonomy and data sovereignty efforts to giving them greater cost control by separating compute and storage.
With Trino, companies ultimately don’t have to worry about pooling their data in a single data warehouse, which is not only time-consuming but often costly due to so-called “egress” fees. As an open-source entity, Trino is also flexible and extensible, supporting any number of third-party integrations, which is partly why Trino claims thousands of users from companies such as Salesforce, Netflix, Shopify, and Comcast. But just like other open source products, Trino requires resources and ongoing technical support to deploy at scale — which is where Starburst enters the fray.
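To make the "query data where it lives" idea concrete, here is a minimal sketch of what a cross-source query can look like. Trino addresses tables as `catalog.schema.table`, with each catalog mapped to a connected data source. The catalog, schema, table, and column names below are hypothetical, and the snippet only builds the SQL text; it does not connect to a cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    """A table addressed in Trino's catalog.schema.table form."""
    catalog: str  # hypothetical catalog name, one per connected data source
    schema: str
    table: str

    def qualified(self) -> str:
        return f"{self.catalog}.{self.schema}.{self.table}"


def build_federated_query(customers: TableRef, orders: TableRef) -> str:
    """Return one SQL statement that joins tables living in two different sources."""
    return (
        "SELECT c.region, sum(o.total) AS revenue\n"
        f"FROM {customers.qualified()} AS c\n"
        f"JOIN {orders.qualified()} AS o ON o.customer_id = c.id\n"
        "GROUP BY c.region"
    )


# Hypothetical catalogs: an operational PostgreSQL database and a data lake.
customers = TableRef("postgresql", "public", "customers")
orders = TableRef("hive", "sales", "orders")
query = build_federated_query(customers, orders)
print(query)
```

The point of the sketch is that a single statement references both sources by name, so neither table has to be copied into a central warehouse before the join runs.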
The Starburst story
Starburst’s foundations can be traced back to 2012, when a group of Facebook engineers developed a distributed SQL query engine called Presto so that the company’s data scientists and analysts could run faster queries on massive data sets. The social networking giant open-sourced Presto in 2013, but due to disagreements with the upper echelons at Facebook, Presto’s creators jumped ship and created a Presto fork called PrestoSQL, which was later rebranded as Trino to circumvent trademark issues.
Amidst this, Presto’s creators launched a commercial entity called Starburst, which offers a fully supported, production- and enterprise-grade Trino distribution with dozens of pre-built connectors, security, and services thrown into the mix. Following a $100 million fundraise last year, the company launched Starburst Galaxy, a fully managed, multi-cloud analytics SaaS offering, designed to minimize infrastructure management resources for data teams.
Starburst Galaxy meant that its customers, which include global biotech giant Sophia Genetics, could now query data hosted on the infrastructure of any of the “big three” cloud providers without moving the data from its original location, all wrapped up in a fully managed product.
This runs contrary to a more centralized data approach popularized by the likes of Teradata over the past few decades.
“This [centralized] approach requires all data to move to a central location prior to analytics — it is expensive, slow, and relatively impossible to achieve,” Starburst’s director of engineering Colleen Tartow told VentureBeat. “Moving data requires copy management, and takes significant time, which delays decisions. To top it off, with the pace of data creation showing no signs of slowing down, it is simply not achievable to move all of your data in one place.”
Similar to just about every other company operating in the cloud realm, Starburst has benefited from pandemic-driven digital transformation efforts — the company said that it has tripled its customer count and annual recurring revenue (ARR) growth over the past 12 months.
“While accessing and analyzing distributed data was a pain before the pandemic, the rapid shift to digital drove an increased demand for accessing newer data sets, typically in cloud systems,” Tartow explained. “This demand required companies to solve for a better path to accessing and analyzing distributed data — so that new data can be brought rapidly into analytics workstreams.”
Prior to now, Starburst had raised around $164 million from notable backers such as Andreessen Horowitz, Salesforce Ventures, Coatue, and Index Ventures. Its latest cash injection, which values the company at $3.35 billion (roughly triple its valuation last year), was spearheaded by Alkeon Capital, with participation from Altimeter Capital, B Capital Group, and most of its previous investors.
Alongside its funding, Starburst also introduced a “suite of new capabilities” to its enterprise product, designed to help businesses “build and share data products.” Specifically, this means a new module within the Starburst Enterprise web interface that enables data producers to create and maintain an array of data products for downstream users.
“It blends the power of Starburst’s analytical query engine with the discoverability and user-friendly capabilities of a data catalog, and seamlessly integrates with any governance tools to provide a secure, high-performance solution,” Tartow explained.
With another $250 million in the bank, Starburst is well-financed to push Trino as the default SQL query engine for major enterprises, at a time when commercial companies such as Ahana are also raising funds to commercialize the original Presto-branded open source project.
But more than that, it’s clear that Starburst is going all-in on the notion that decentralized data is the future. “Data mesh” is the name of the game, with companies able to analyze distributed sets of data across domains, clouds, and geographic locations.
“It’s time to move forward with decentralized data,” Tartow said. “Over the next ten years, I anticipate that data mesh will become widely adopted across enterprises. It’s a better path to analyzing distributed data at scale, and lives with every company’s data reality today.”