The power of MLOps to scale AI across the enterprise

To say that it's challenging to achieve AI at scale across the enterprise would be an understatement.

An estimated 54% to 90% of machine learning (ML) models don’t make it into production from initial pilots for reasons ranging from data and algorithm issues, to defining the business case, to getting executive buy-in, to change-management challenges.

In fact, promoting an ML model into production is a significant accomplishment for even the most advanced enterprise that's staffed with ML and artificial intelligence (AI) specialists and data scientists.

Enterprise DevOps and IT teams have tried modifying legacy IT workflows and tools to increase the odds that a model will be promoted into production, but have met with limited success. One of the primary challenges is that ML developers need new process workflows and tools that better fit their iterative approach to coding, testing and relaunching models.

The power of MLOps

That’s where MLOps comes in: The strategy emerged as a set of best practices less than a decade ago to address one of the primary roadblocks preventing the enterprise from putting AI into action — the transition from development and training to production environments.

Gartner defines MLOps as a comprehensive process that "aims to streamline the end-to-end development, testing, validation, deployment, operationalization and instantiation of ML models. It supports the release, activation, monitoring, experiment and performance tracking, management, reuse, update, maintenance, version control, risk and compliance management, and governance of ML models."

Delivering more ML models into production depends on how efficient preproduction is at integrating and validating data, systems and new processes specific to MLOps, combined with an efficient retrain feedback loop to ensure accuracy. Source: LinkedIn post, "MLOps, Simplified!" by Rajesh Dangi, Chief Digital Officer (CDO), June 20, 2021.

Managing models right to gain scale

Verta AI cofounder and CEO Manasi Vartak, an MIT graduate who led mechanical engineering undergraduates at MIT CSAIL to build ModelDB, co-created her company to simplify AI and ML model delivery across enterprises at scale.

Her dissertation, “Infrastructure for model management and model diagnosis,” proposes ModelDB, a system to track the provenance and performance of ML-based workflows.

“While the tools to develop production-ready code are well-developed, scalable and robust, the tools and processes to develop ML models are nascent and brittle,” she said. “Between the difficulty of managing model versions, rewriting research models for production and streamlining data ingestion, the development and deployment of production-ready models is a massive battle for small and large companies alike.”

Model management systems are core to getting MLOps up and running at scale in enterprises, she explained, and they increase the probability that modeling efforts will succeed. Iterations of models can easily get lost, and a surprising number of enterprises don't version their models despite having large teams of AI and ML specialists and data scientists on staff.
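To make the idea of model versioning concrete, here is a minimal sketch of what recording each iteration could look like. It is not Verta AI's or ModelDB's API; the `ModelRegistry` class, the `lead_scorer` model name and the parameter and metric values are all hypothetical, and a real system would persist records rather than hold them in memory.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """Hypothetical in-memory registry, used here only for illustration."""

    versions: dict = field(default_factory=dict)  # model name -> list of records

    def register(self, name, params, metrics):
        # Fingerprint the hyperparameters so identical runs are recognizable.
        fingerprint = hashlib.sha256(
            json.dumps(params, sort_keys=True).encode("utf-8")
        ).hexdigest()[:12]
        history = self.versions.setdefault(name, [])
        record = {
            "version": len(history) + 1,  # versions count up from 1 per model
            "fingerprint": fingerprint,
            "params": params,
            "metrics": metrics,
        }
        history.append(record)
        return record

    def best(self, name, metric):
        # Return the recorded iteration with the highest value of `metric`.
        return max(self.versions[name], key=lambda r: r["metrics"][metric])


# Made-up iterations of a hypothetical model.
registry = ModelRegistry()
registry.register("lead_scorer", {"depth": 4, "lr": 0.1}, {"auc": 0.81})
registry.register("lead_scorer", {"depth": 6, "lr": 0.05}, {"auc": 0.86})
registry.register("lead_scorer", {"depth": 8, "lr": 0.05}, {"auc": 0.84})

best = registry.best("lead_scorer", "auc")
print(best["version"], best["params"])
```

Because every iteration is kept with its parameters and results, the second run in this made-up history can still be found after a later, weaker run has been recorded.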

Getting a scalable model management system in place is core to scaling AI across an enterprise. AI and ML model developers and data scientists tell VentureBeat that the potential to achieve DevOps-level yields from MLOps is there; the challenge is iterating models and managing them more efficiently, capitalizing on the lessons learned from each iteration.

VentureBeat is seeing strong demand on the part of enterprises experimenting with MLOps. That observation is supported by IDC's prediction that 60% of enterprises will have operationalized their ML workflows using MLOps by 2024. Deloitte, for its part, predicts that the market for MLOps solutions will grow from $350 million in 2019 to $4 billion by 2025.

Increasing the power of MLOps

Supporting MLOps development with new tools and workflows is essential for scaling models across an enterprise and gaining business value from them.

For one thing, improving version control in model management is crucial to enterprise growth. MLOps teams need model management systems that integrate with, or scale out to cover, model staging, packaging, deployment and operation in production. What's needed are platforms that provide extensibility across ML models' life cycles at scale.

Also, organizations need a more consistent operationalization process for models. How an MLOps team and business unit work together to operationalize a model varies by use case and team, reducing how many models an organization can promote into production. The lack of consistency drives MLOps teams to adopt a more standardized approach to MLOps that capitalizes on continuous integration and delivery (CI/CD). The goal is to gain greater visibility across the life cycle of every ML model by having a more thorough, consistent operationalization process.

Finally, enterprises need to automate model maintenance to increase yield rates. The more automated model maintenance becomes, the more efficient the entire MLOps process will be, and the higher the probability that a model will make it into production. MLOps platform and data management vendors need to accelerate their persona-based support for a wider variety of roles to provide customers with a more effective management and governance framework.
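One small piece of automated maintenance is deciding when a production model should be retrained. The sketch below assumes a simple rule: flag the model when its recent average accuracy falls more than a set tolerance below the accuracy measured at deployment. The function name `needs_retraining`, the 0.05 tolerance and the sample accuracy figures are illustrative assumptions, not values taken from any vendor's platform.

```python
from statistics import mean


def needs_retraining(baseline_accuracy, recent_accuracies, tolerance=0.05):
    """Return True when recent average accuracy has dropped more than
    `tolerance` below the accuracy measured at deployment time."""
    if not recent_accuracies:
        return False  # no production evidence yet, so nothing to act on
    return baseline_accuracy - mean(recent_accuracies) > tolerance


# Hypothetical weekly accuracy readings for a deployed model.
stable = needs_retraining(0.90, [0.89, 0.88, 0.90])
drifting = needs_retraining(0.90, [0.84, 0.82, 0.83])
print(stable, drifting)
```

With these made-up readings, the first model stays within tolerance and the second is flagged. In practice, a check like this would run on a schedule and feed the retrain loop rather than rely on someone noticing the decline.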

MLOps vendors include public cloud-platform providers, ML platforms and data management vendors. Public cloud providers AWS, Google Cloud and Microsoft Azure all provide MLOps platform support.

DataRobot, Dataiku, Iguazio, Cloudera and Databricks are leading vendors competing in the data management market.

How LeadCrunch uses ML modeling to drive more client leads

Cloud-based lead generation company LeadCrunch uses AI and a patented ML methodology to analyze B2B data to identify prospects with the highest probability of becoming high-value clients.

However, ML model updates and revisions were slow, and the company needed a more efficient approach to regularly updating models to provide customers with better prospect recommendations. LeadCrunch's data science team regularly updates and refines ML models, but with 10-plus submodels and an ever-evolving stack, implementation was slow. Deployment of new models only occurred a few times a year.

It was also challenging to get an overview of experiments. Each model was managed differently, which was inefficient. Data scientists had difficulty gaining a holistic view of all the experiments being run. This lack of insight further slowed the development of new models.

Deploying and maintaining models often required large amounts of time and effort from LeadCrunch's engineering team. But as a small company, LeadCrunch often didn't have those hours available. The company evaluated a series of MLOps platforms while also looking at how it could streamline model management. After an extensive search, it chose Verta AI to streamline every phase of ML model development, versioning, production and ongoing maintenance.

Verta AI freed LeadCrunch's data scientists from tracking versions and keeping so many models organized, which allowed them to do more exploratory modeling. During the initial deployment, LeadCrunch also had 21 pain points that needed to be addressed, and Verta AI resolved 20 of them immediately following implementation. Most importantly, Verta AI increased model production speed by 5X and helped LeadCrunch achieve one deployment a month, up from two a year.

Source: Verta AI.

The powerful potential of MLOps

The potential of MLOps to deliver models at the scale and speed of DevOps is the main motivator for enterprises that continue to invest in this process. Improving model yield rates starts with an improved model management system that can "learn" from each retraining of a model.

There needs to be greater standardization of the operationalization process, and the CI/CD model needs to be applied not as a constraint, but as a support framework for MLOps to achieve its potential.