Munich-based ZenML, a startup providing an extensible and open source MLOps framework to accelerate and simplify the delivery of machine learning models across research and production, today announced it has raised $2.7 million in a seed round of funding. The company plans to use the investment, which was led by Crane Venture Partners and multiple notable AI researchers, to strengthen its technology team and further build out its tooling suite for data scientists.
Despite the ever-evolving MLOps landscape, taking a machine learning project to production or live environments remains hard. Unlike traditional applications, ML systems depend on both code and data, which adds considerable complexity. Data, in particular, is hard to wrangle and can change in unexpected ways, affecting the performance of the model. As a result, data science teams have to handle a deluge of tooling options and processes to ship their model, which not only adds to the confusion and fragmentation but also requires multiple skill sets.
“Most tools separate workflows into islands that mainly concentrate on the early development phase for data scientists, or the later deployment phase, which is largely owned by engineering. This causes systemic failures in the entire system like a lack of reproducibility or provenance across the pipeline,” Hamza Tahir, cofounder of ZenML, told VentureBeat.
A standardization layer for MLOps
To solve this particular problem, Tahir started ZenML with Adam Probst in July 2021. The startup offers a tooling- and infrastructure-agnostic framework that acts as a standardization layer and allows data scientists to iterate on promising ideas and create production-ready machine learning pipelines.
Available as a lightweight Python library, ZenML’s framework enables data scientists to express their ML workflows as pipelines. The steps within can be defined as simple Python functions that handle arbitrary tasks such as preprocessing data or training a model. Teams can then plug and play their infrastructure and tooling needs right into their ML pipeline with a few simple configuration changes.
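To make the idea concrete, here is a minimal sketch in plain Python of a pipeline whose steps are ordinary functions and whose execution backend can be swapped without touching the steps. It does not use ZenML's actual API; every name in it (`load_data`, `preprocess`, `train`, `run_pipeline`, `local_runner`, `RecordingRunner`) is a hypothetical chosen for illustration, and the "model" is just a one-parameter least-squares fit.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple[float, float]


# NOTE: illustrative names only; this is not ZenML's API.
def load_data() -> List[Row]:
    """Step 1: return a tiny (x, y) dataset that follows y = 2x."""
    return [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]


def preprocess(rows: List[Row]) -> List[Row]:
    """Step 2: keep only rows with a positive x value."""
    return [(x, y) for x, y in rows if x > 0]


def train(rows: List[Row]) -> float:
    """Step 3: fit y = w * x by least squares through the origin."""
    numerator = sum(x * y for x, y in rows)
    denominator = sum(x * x for x, _ in rows)
    return numerator / denominator


def local_runner(step: Callable, *args):
    """Default backend: call the step in the current process."""
    return step(*args)


class RecordingRunner:
    """Alternative backend: runs steps locally and logs their names."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def __call__(self, step: Callable, *args):
        self.log.append(step.__name__)
        return step(*args)


def run_pipeline(steps: Sequence[Callable], runner: Callable = local_runner):
    """Run steps in order, feeding each step's output to the next one."""
    result = None
    for index, step in enumerate(steps):
        result = runner(step) if index == 0 else runner(step, result)
    return result


steps = [load_data, preprocess, train]
weight = run_pipeline(steps)            # default backend
recorder = RecordingRunner()
run_pipeline(steps, runner=recorder)    # same steps, different backend
```

The step functions never change between the two runs; only the `runner` argument does. That is the sense in which infrastructure becomes a configuration choice in this sketch, though a real framework would handle far more, such as remote execution and artifact storage.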
“With ZenML, every ML project will have the same user experience as a simple Python project. The only difference is that you’re working on real machine learning use cases that instantly can be brought into production. Nobody will need to do the heavy lifting of setting up infrastructures or coordinating between DevOps teams and data scientists,” Tahir said.
While workflow automation tools are available to let users define workflows as pipelines, including players like Airflow, Prefect, and Luigi, ZenML claims to set itself apart by treating ML-specific artifacts like models, data drift, and feature statistics as first-class citizens. The framework then offers data scientists a path to solve complex problems such as reproducibility and versioning of data, code, and models.
“These tools are built on a hard-to-understand syntax, which often can be scary to the data scientist persona. We aim to do the exact opposite (with a unified syntax in familiar language) so our users can become more invested in working on their native solutions rather than learning how to use the tool they are using,” Tahir said.
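One way to picture what versioning data, code, and models together might involve is a run record that fingerprints each input by its content. The sketch below is an assumption-laden illustration using only Python's standard library, not a description of how ZenML implements it; `fingerprint` and `record_run` are hypothetical helpers, and the code version is simply passed in as a string.

```python
import hashlib
import json
from typing import Any, Dict


def fingerprint(obj: Any) -> str:
    """Return a SHA-256 hex digest of a JSON-serializable object.

    Keys are sorted so that equal content always yields the same digest.
    """
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def record_run(data: Any, code_version: str, model_params: Any) -> Dict[str, str]:
    """Hypothetical run record tying data, code, and model together."""
    return {
        "data": fingerprint(data),
        "code": code_version,
        "model": fingerprint(model_params),
    }


run_a = record_run([[1, 2], [2, 4]], "abc123", {"w": 2.0})
run_b = record_run([[1, 2], [2, 4]], "abc123", {"w": 2.0})
run_c = record_run([[1, 2], [2, 5]], "abc123", {"w": 2.0})
```

Here `run_a` and `run_b` are identical because their inputs are, while `run_c` differs only in its `data` fingerprint, which shows at a glance that the dataset changed while the code and model parameters did not.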
Though ZenML is still in the early stages of development, the company claims to have seen a tremendous response, with over 1,000 GitHub stars and downloads growing 20% to 40% every week. It has also successfully handled paid projects from Airbus Defence and Space, focusing on object detection on new high-resolution satellite images.
“In the last few months, we have rewritten the ZenML codebase to be more robust and user-friendly,” Tahir said. “We have also tripled our team in the space of a few months and released ZenML 0.5 that includes support for writing pipelines with standard artifacts like TensorFlow or PyTorch models with Kubeflow.”
The company plans to grow its team of MLOps technologists and expand the framework by integrating more tooling libraries to match the needs of data science teams across organizations. This would include libraries such as Evidently, whylogs, and Great Expectations for validation, and BentoML, Seldon, and KServe for deployment.