Anyscale, a company promising to let application developers more easily build so-called “distributed” applications that are behind most AI and machine learning efforts, has raised $20.6 million from investors in a first round of funding.
The company has some credibility off the bat because it’s cofounded by Ion Stoica, a professor of computer science at the University of California, Berkeley, who played a significant role in building out some successful big data frameworks and tools, including Apache Spark and Databricks.
The new company is based on an open source framework called Ray — also developed in a lab that Stoica co-directs — that focuses on allowing software developers to more easily write compute-intensive applications by simplifying the hardware decisions made underneath.
Ray’s emergence is significant because it aims to solve a growing problem in the industry, Stoica said in an interview with VentureBeat. On one hand, developers are writing more and more applications — for example AI- and ML-driven applications — that are increasingly intensive in their number-crunching needs. The amount of compute for the largest AI applications has doubled every three to four months since 2012, according to OpenAI — an astonishing exponential rate.
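To put that doubling rate in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration of the three-to-four-month range quoted above; the function name `growth_factor` is made up for this example and does not come from OpenAI's analysis.

```python
def growth_factor(months_elapsed, doubling_months):
    """Return how many times larger a quantity becomes after
    `months_elapsed` months if it doubles every `doubling_months` months."""
    return 2 ** (months_elapsed / doubling_months)

# Yearly growth at both ends of the quoted "three to four months" range.
fast = growth_factor(12, 3)  # doubling every three months
slow = growth_factor(12, 4)  # doubling every four months

print(f"Doubling every 3 months: {fast:.0f}x per year")
print(f"Doubling every 4 months: {slow:.0f}x per year")
```

Under those assumptions, compute demand grows somewhere between eightfold and sixteenfold every year.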
On the other hand, the processing hardware needed to do this number-crunching is falling behind. Application developers are thus being forced to “distribute” their applications across thousands of CPU and GPU cores, splitting up the processing workload so that the hardware can keep up with their needs. That process is complex and labor-intensive. Companies have to hire specialized engineers to build this architecture, linking things like AWS or Azure cloud instances with Spark and orchestration tools like Kubernetes.
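The basic idea behind “distributing” a workload can be sketched with Python's standard library alone. The example below is a simplified stand-in, not Ray's API and not how Anyscale's users actually deploy: it splits a list into chunks, hands each chunk to a worker in a local thread pool, and combines the partial results. The helper names (`split`, `sum_of_squares`, `distributed_sum_of_squares`) are hypothetical, and a real system would spread the chunks across processes or machines rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """The per-worker task: process one independent slice of the data."""
    return sum(n * n for n in chunk)


def split(data, parts):
    """Cut `data` into at most `parts` contiguous chunks of similar size."""
    size = max(1, (len(data) + parts - 1) // parts)
    return [data[i:i + size] for i in range(0, len(data), size)]


def distributed_sum_of_squares(data, workers=4):
    """Fan the chunks out to a pool of workers, then combine the partial sums."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


numbers = list(range(1_000))
total = distributed_sum_of_squares(numbers)
print(total)
```

Even in this toy form, the developer has to decide how to cut up the data, how many workers to use, and how to merge the results. Those decisions multiply once the workers are separate machines that can fail independently, which is the complexity described above.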
“The tools required for this have been kind of jerry-rigged in a way they shouldn’t be,” said Ben Horowitz, a partner at venture firm Andreessen Horowitz, which led the round of funding. That’s effectively meant large barriers to entry for building scaled applications, and it’s kept companies from reaping the promised benefits of AI.
Ray was developed at UC Berkeley in the RISELab, the successor to the AMPLab, which created Apache Spark and gave rise to Databricks. Stoica was a cofounder of Databricks, a company that helped commercialize Apache Spark, a dominant open source framework that helps data scientists and data engineers process large amounts of data quickly. Databricks was founded in 2013 and is already valued at $6.2 billion. Whereas Spark and Databricks targeted data scientists, Ray is targeting software developers.
“From a developer standpoint, you write the code in a way that it talks to Ray,” said Horowitz, “and you don’t have to worry about a lot of that [infrastructure].”
“Ray is one of the fastest-growing open source projects we’ve ever tracked, and it’s being used in production at some of the largest and most sophisticated companies,” Horowitz added. Intel has used Ray for things like AutoML, hyperparameter search, and training models, whereas startups like Bonsai and Skymind have used it for reinforcement learning projects. Amazon and Microsoft are also users.
Another Anyscale cofounder, Robert Nishihara, who is also the CEO, likens Anyscale’s mission with Ray to what Microsoft did when it built Windows. The operating system let developers build applications much more rapidly. “We want to make it as easy to program clusters [or thousands of cores] and scalable applications as it is to program on your laptop,” he said.
Stoica and Nishihara say applications built with Ray can easily be scaled out from a laptop to a cluster, eliminating the need for in-house distributed computing expertise and resources.
To be sure, developing a company around an open source framework can be challenging. There’s no guarantee that the company can make money from an open framework that other companies can build around, too. Witness what happened with Docker, the company built around the open source container technology of the same name, which has struggled to commercialize it. Other companies stepped in and did it instead.
Stoica and Nishihara said they are confident they will avoid Docker’s fate, given Stoica’s background with Databricks, which he cited as an example of knowing how to commercialize smartly and aggressively. They said that they know more about Ray than anyone else and so are in the best position to build a company around it.
Moreover, the pair said they aren’t afraid of other companies that have been building so-called “serverless” computing offerings (for example, Google with Cloud Functions and Amazon with AWS Lambda) that are tackling the same problem of letting people develop scalable applications without thinking about infrastructure. “That’s a very different approach, a very limited programming model, and restricted in terms of the things you can do,” Nishihara said of serverless. “What we’re doing is much more general.”
“These serverless platforms are notoriously bad at supporting scalable AI,” added Stoica. “We are excelling in that aspect.”
The two founded the company in June alongside Philipp Moritz and UC Berkeley professor Michael Jordan. Anyscale has no product or revenue yet. Besides Andreessen Horowitz, investors in the round include Intel Capital, Ant Financial, Amplify Partners, and The House Fund. With the funding, Anyscale’s founders said, they will expand the company’s leadership team (the company has 12 employees) and continue to expand Ray.