As enterprises increasingly adopt AI technologies, they face a critical challenge: how to automatically select the best AI model for each task while optimizing performance and cost. Enter model routing, a cutting-edge approach that's quickly becoming a secret weapon for maximizing AI efficiency in the enterprise.

Model routing technology allows companies to dynamically choose the most appropriate AI model on a query-by-query basis, potentially revolutionizing how businesses leverage their AI resources. This approach not only enhances performance but also significantly reduces costs compared to relying on a single, all-purpose model.

One startup at the forefront of this technology is Martian, which has developed a large language model (LLM) router that's catching the attention of major players in the tech industry. In fact, Accenture, a global professional services company, recently announced an investment in Martian, highlighting the growing importance of model routing in enterprise AI strategies.

Accenture is set to integrate Martian into its switchboard services, which help enterprises select models. Martian emerged from stealth in November 2023 and has been steadily developing its technology over the past year. Alongside the Accenture deployment, the company is also rolling out a new AI model compliance feature as part of its router platform.

To date, Accenture's switchboard has helped organizations select models for enterprise deployment. What Martian adds to the mix is the ability to dynamically route each query to the best model.

"We can automatically choose the right model, not even on a task-by-task basis, but a query-by-query basis," Shriyash Upadhyay, co-founder of Martian, told VentureBeat. "This allows for lower costs and higher performance because it means that you don't always have to use a single model."

In a statement, Lan Guan, chief AI officer at Accenture, commented that many of Accenture's clients are looking to reap the benefits of generative AI in a way that considers requirements, performance and cost.

“The capabilities of Accenture’s switchboard services and Martian’s dynamic LLM routing simplify the user experience and will allow enterprises to experiment with generative AI and LLMs in order to find the perfect fit for their business needs,” Guan stated.

How Martian routes enterprise AI queries to the best model

Martian builds model routers that can dynamically select the best model to use for a given query. 

The core technology behind the router focuses on predicting model behavior.

"We take a relatively unique approach in doing this, where we focus on trying to understand the internals of what's going on inside of these models," Upadhyay said. "A model contains enough information to predict its own behavior because it does that behavior."

The approach allows Martian to select the single best model to run, optimizing for factors like cost, quality of output and latency. Martian uses techniques like model compression, quantization, distillation and specialized models to make these predictions without needing to run the full models. The Martian routing system can be integrated into applications that use language models, allowing it to dynamically choose the optimal model to use for each query, rather than relying on a single pre-selected model. This helps improve performance and reduce costs compared to static model selection.
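To make the idea concrete, here is a minimal sketch of per-query routing. This is a hypothetical illustration, not Martian's actual implementation: the model names, pricing, and the `predicted_quality` heuristic are all invented stand-ins for the learned behavior predictors the company describes.

```python
# Hypothetical per-query model routing sketch (NOT Martian's real system):
# score each candidate model on predicted quality, cost, and latency,
# then pick the highest-scoring model for the query at hand.
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # USD, assumed pricing for illustration
    latency_ms: float          # assumed typical response latency

def predicted_quality(model: Model, query: str) -> float:
    """Toy stand-in for a learned predictor of output quality.
    Assumption: larger/costlier models do better on complex queries,
    with query length used as a crude proxy for complexity."""
    complexity = min(len(query) / 100, 1.0)
    capability = min(model.cost_per_1k_tokens * 20, 1.0)
    return 1.0 - abs(complexity - capability)

def route(query: str, models: list[Model],
          cost_weight: float = 0.3, latency_weight: float = 0.1) -> Model:
    """Pick the model that best trades off quality against cost and latency."""
    def score(m: Model) -> float:
        return (predicted_quality(m, query)
                - cost_weight * m.cost_per_1k_tokens
                - latency_weight * m.latency_ms / 1000)
    return max(models, key=score)

models = [
    Model("small-fast", 0.002, 300),
    Model("large-accurate", 0.03, 1500),
]
print(route("What is 2 + 2?", models).name)  # -> small-fast
print(route("Draft a 2,000-word analysis of EU AI regulation for our compliance team",
            models).name)                    # -> large-accurate
```

The key design point the sketch captures is that routing happens per query: a trivial question goes to a cheap, fast model, while a demanding one goes to a more capable model, rather than every request paying for the largest model.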

Why model routing should be an enterprise AI imperative

Using the best tool for the job is a common business idiom, but what is less common is awareness within organizations that there is a wide range of highly specific AI models to choose from.

"Often these large companies might have different organizations where some part of the org doesn't even know about the fact that there is this whole world of different models out there," Upadhyay said.

To use AI models effectively, Upadhyay emphasized that defining success metrics is critical. Organizations need to determine which metrics actually define success for a given application and what the organization truly cares about.

Cost optimization and return on investment are also critical. Upadhyay noted that organizations need to optimize costs and demonstrate some form of return on investment for model deployments. In his view, model routing is essential because it serves both purposes.

Compliance is a perennial enterprise concern, and it's an area Martian is now taking on with its model router. The new compliance feature helps companies vet and approve AI models for use in their applications. Upadhyay said the feature will allow companies to automatically set up policies for compliance.

Enterprise AI model router could be a boon for agentic AI

One of the driving use cases for AI model routing in the enterprise is the growing area of agentic AI.

With agentic AI, an AI agent will chain together multiple models and actions in order to achieve a result. Each step in an agent workflow depends on the previous steps, so errors can compound exponentially. Martian's routing helps ensure the best model is used for each step to maintain high accuracy.
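The compounding effect is easy to quantify with a back-of-the-envelope calculation. Assuming each step in an agent chain succeeds independently with probability p (an idealized assumption, not a claim from Martian), the chance the whole chain succeeds falls off exponentially with chain length:

```python
# Back-of-the-envelope illustration of compounding errors in agent chains:
# if each step succeeds independently with probability p, an n-step
# chain succeeds with probability p**n.
def chain_success(p: float, n: int) -> float:
    """Probability that all n independent steps succeed."""
    return p ** n

for p in (0.99, 0.95, 0.90):
    print(f"per-step {p:.0%} -> 20-step chain {chain_success(p, 20):.1%}")
# per-step 99% -> 20-step chain 81.8%
# per-step 95% -> 20-step chain 35.8%
# per-step 90% -> 20-step chain 12.2%
```

Even a seemingly strong 95% per-step accuracy leaves a 20-step workflow succeeding barely a third of the time, which is why squeezing out the best possible model at every step matters so much for agents.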

"Agents are like the killer use case for routing," Upadhyay said. "It's a case in which you really, really care about getting steps right, otherwise you have this cascade of failures afterwards."