Databricks, a big data analytics software provider, today announced that it raised $1.6 billion in a series H financing round led by Counterpoint Global, with participation from BNY Mellon and ClearBridge. Andreessen Horowitz, Fidelity Management & Research, and Franklin Templeton also contributed, bringing the company’s total raised to $3.5 billion at a $38 billion post-money valuation.
Cofounder and CEO Ali Ghodsi says that the capital will be used to support Databricks’ product development, customer adoption, and the evangelization of the “data lakehouse.” Data lakehouses, a term that came into vogue in 2020, are data management architectures that combine data lakes, which store structured and unstructured data, with data warehouses, which perform queries and analysis. The goal is to unify data, analytics, and AI in one place, leveraging technologies that support large-scale data workloads.
“It is becoming increasingly clear that the data lakehouse is the architecture of the future. Lakehouse succeeds because it dramatically simplifies customers’ data platform, supporting business intelligence, data engineering, and AI,” Ghodsi told VentureBeat via email. “Instead of making enterprises move data between different systems, create many siloed copies of data, and enforce a lot of complex operations on the organization, we’re making that data more useful where it actually is. The lakehouse is the key to making it simple to unify all data workloads.”
Enterprises are increasingly adopting AI and automation as the pandemic transforms the way they do business. In an MIT Technology Review survey commissioned by Databricks, 83% of CEOs say that AI is a strategic priority for their company. Despite deployment challenges like talent gaps and training data prep, AI is projected to create $3.9 trillion in business value by the end of next year, according to Gartner.
Alongside C3.ai and Snowflake, which filed for IPOs in 2020, Databricks is one of the latest startups focused on analytics and AI to experience rapid growth. The San Francisco, California-based company was founded in 2013 by seven researchers at UC Berkeley’s AMPLab, who realized that a service for AI-powered analytics could be built with open source tools like Apache Mesos, Alluxio, and Apache Spark (which they created).
Databricks develops and maintains the AI lifecycle management platform MLflow, the data analysis tool Koalas, and Delta Lake, an open source storage layer for data lakes. Its core service for working with Spark provides automated cluster management and programming notebooks for analytics. In June 2020, the company launched a new product, Delta Engine, that layers on top of Delta Lake to boost query performance. And in November 2020, Databricks introduced Databricks SQL, which allows customers to run business intelligence and analytics reporting directly on data lakes.
“[T]he market is split into a ‘data’ bucket and an ‘AI’ bucket, largely for historical reasons,” Ghodsi said. “On one hand, there are vendors that do data management and data processing. It is great for data processing, but those companies have no significant AI or machine learning capabilities. There are startups, on the other hand, that do machine learning and AI. These companies are great for machine learning algorithms, but they actually are not in the business of processing massive petabytes of data. We’re the only vendor that combines those two into one product.”
Today, Databricks hosts millions of virtual machines for brands including Comcast, Condé Nast, H&M, and over 5,000 other organizations across health and life sciences, financial services, media and entertainment, retail, manufacturing, and public sector segments. For transportation company JB Hunt, Databricks helped migrate the company’s data warehouse to a Delta Lake instance on Google Cloud Platform, leading to a 99.8% speedup in freight recommendations delivered through JB Hunt’s digital marketplace. And for ABN AMRO, a European bank, Databricks launched a Microsoft Azure-hosted analytics environment, enabling the firm to deploy 50 different production use cases.
“Multiple sources of data are locked in silos across organizations: in applications, in relatively static data warehouses, in ill-defined data lakes, in open data marketplaces and flowing through event-driven systems. Organizations are struggling to take advantage of this often untapped wealth of useful information for new analytics methods, machine learning tools, and predictive decision systems,” Merv Adrian, Gartner Research VP, told VentureBeat via email. “Fully exploiting the promise of the new data assets combined with the existing ones, applying new tools and methods, and empowering both data scientists and business analysts is a key result of adopting the economics and operational model of the cloud.”
Ghodsi says that the pandemic accelerated Databricks’ momentum in three key areas: the cloud, open source, and machine learning. Recently, the company worked with several health care organizations and government agencies to analyze large volumes of data and predict outcomes to improve their operations. “Right now, companies are eager to migrate their data and data pipeline processes to the cloud faster, and we’re seeing interest from companies that have historically leveraged legacy on-premises vendors,” he added. “We’ve been working with customers to change contracts to fit their needs during the pandemic.”
Databricks’ annual recurring revenue currently stands at $600 million, up from $425 million at the end of the 2020 fiscal year. The company expects to grow its workforce of 2,300 employees to more than 3,000 by 2022, roughly a year after Databricks acquired data visualization startup Redash for an undisclosed amount.
Ghodsi previously told VentureBeat that future funding would fuel a merger and acquisition strategy with a focus on machine learning and data startups, as well as expanded partnerships with cloud companies. While he was mum on the timing of an IPO, Ghodsi said in an interview with The Register this summer that Databricks aims to be “IPO-ready” this year.
“By running simple AI algorithms on massive amounts of data … [customers can] find success,” Ghodsi told VentureBeat. “[Large tech] companies spend millions on talent and infrastructure to build their own proprietary data and AI systems that would ultimately lead to much of their success. Databricks was started to do the same for any company.”
Additional investors backing Databricks’ series H included the Regents of the University of California, funds and accounts managed by BlackRock, the Canada Pension Plan Investment Board, Coatue Management, GIC, Greenoaks Capital, Octahedron Capital, funds and accounts managed by T. Rowe Price Associates, Whale Rock, Alta Park Capital, Amazon Web Services (AWS), Arena Holdings, CapitalG, Discovery Capital, Dragoneer Investment Group, Gaingels, Geodesic, Green Bay Ventures, Insight Partners, Microsoft, and New Enterprise Associates.