WhyLabs raises $4 million to grow AI and data monitoring platform

WhyLabs is launching out of stealth today with $4 million to grow its platform for data scientists who need help monitoring and troubleshooting problems they encounter with datasets or AI models. The goal is to help teams managing machine learning models save time and catch problems before they make trouble for businesses or customers. Though more businesses are finding ways to apply AI to their operations, many still encounter issues when trying to deploy machine learning in the wild. A 2019 IDC report, for example, found that for one in four companies, half of all AI projects fail.

The seed round was led by Madrona Venture Group, with participation from Bezos Expeditions, Defy Partners, Ascend VC, and the Allen Institute for Artificial Intelligence. The funding will be used to hire engineers and for product development, COO and cofounder Maria Karaivanova told VentureBeat in a phone interview. WhyLabs was founded in December 2019 and is based in Seattle. The company currently has nine employees and emerges from stealth after initial incubation at the Allen Institute for Artificial Intelligence.

WhyLabs CEO and cofounder Alessya Visnjic created the company after fixing machine learning issues that arose for Amazon's retail website and its machine learning team during demand forecasting. WhyLabs is launching with an open source library in Python and Java that connects to datasets and generates statistical summaries, or fingerprints, for tracking AI and data performance. Those statistics allow the open source library to catch data quality problems such as missing values, data type shifts, data drift, and distribution bias. Depending on the model, the library can track hundreds or thousands of features every hour or once a day, Visnjic said.
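
To make the idea of statistical fingerprints concrete, here is a minimal Python sketch of how a per-column summary could be compared against a baseline to flag missing values, type shifts, and drift. It is an illustration only and not the WhyLabs library or its API: the function names `profile` and `compare`, the tolerance parameters, and the mean-shift drift rule are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean, pstdev


def profile(rows):
    """Summarize a batch of records (a list of dicts) column by column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)

    summary = {}
    for name, values in columns.items():
        # Treat ints and floats as numeric, but not booleans.
        numbers = [
            v for v in values
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        summary[name] = {
            "count": len(values),
            "missing": sum(v is None for v in values),
            "types": Counter(type(v).__name__ for v in values if v is not None),
            "mean": mean(numbers) if numbers else None,
            "std": pstdev(numbers) if numbers else None,
        }
    return summary


def compare(baseline, current, missing_tol=0.1, drift_tol=3.0):
    """Return (column, issue) pairs where `current` departs from `baseline`.

    The thresholds are hypothetical defaults chosen for this sketch.
    """
    alerts = []
    for name, base in baseline.items():
        cur = current.get(name)
        if cur is None:
            alerts.append((name, "column missing"))
            continue
        # Missing values: the share of None entries rose noticeably.
        if cur["missing"] / cur["count"] - base["missing"] / base["count"] > missing_tol:
            alerts.append((name, "missing values"))
        # Type shift: a value type appears that the baseline never saw.
        if set(cur["types"]) - set(base["types"]):
            alerts.append((name, "type shift"))
        # Drift: the mean moved by more than drift_tol baseline std deviations.
        if base["mean"] is not None and cur["mean"] is not None:
            spread = base["std"] or 1.0
            if abs(cur["mean"] - base["mean"]) / spread > drift_tol:
                alerts.append((name, "drift"))
    return alerts
```

Because only the compact summaries are compared, a monitoring job along these lines would not need to retain the raw records. Calling `compare(profile(yesterday), profile(today))` on two batches returns a list of flagged columns that could then be routed to an alerting channel.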

Also out today is a WhyLabs data monitoring dashboard. AI practitioners can get anomaly detection alerts so they're notified via apps like Slack, Microsoft Teams, or PagerDuty when a model deviates from the norm or an event occurs that negatively impacts data quality.

The open source library is meant to support the data science community and encourage the use of automation for data health and monitoring, regardless of whether a process is in preproduction or already being used by customers. Karaivanova said WhyLabs attempts to differentiate itself from competing services offered by companies like Amazon Web Services by focusing on open source.

"I think the key to remember is that the big platforms are developing those tools, and some of them already have it -- like AWS has a model monitoring and experiments offering. But you are very much siloed within the SageMaker environment. What we think is important is to provide this multi-model single pane of glass so no matter what a model is built on or where it resides, you can get the visibility of the entire ML operation," she said.

In other news in the emerging area of enterprise "MLOps," in June Domino Data Lab raised $43 million for its service that helps companies keep machine learning models up to date. In July, former Google and AWS engineers teamed up to launch Abacus.ai and help enterprise customers deploy AI at scale.
