Is it just hype? How investors can vet a company's AI claims

Almost every confidential investment memorandum (CIM) for a tech-driven enterprise mentions the company’s artificial intelligence (AI) or machine learning (ML) capabilities. But as with other investment buzzwords, such as "subscription revenue," AI and ML are often invoked to suggest complex, proprietary, business-enabling technology that sets the offering apart as technologically superior. The aim is usually a higher valuation.

We’ve all heard examples of AI failures that make for good headlines and provide interesting cautionary tales. But as an investor, it can be just as frightening to learn that the AI capability that drove an above-market valuation is not much more than a spreadsheet with some marketing spin.

In our role as advisors to technology investors and management teams, we often encounter a question central to the investment thesis: Is the AI/ML the real deal? Here’s how to find the answer.

Make sure everyone’s speaking the same language

Varying interpretations of “artificial intelligence,” “machine learning” and “deep learning” can create confusion and misunderstandings, as the terms are often misused or used interchangeably. Think of the concepts this way:

Artificial intelligence is any system that mimics human intelligence. With this definition, AI could refer to any rules-based system or algorithm, as long as it’s being used to simulate intelligence. A chatbot that follows scripted rules is a good example.

Machine learning is a subset of AI. It relies on a mathematical model created using a large dataset and a training algorithm that allows the model to learn and evolve. For example, in Google Photos, you can tag pictures with the names of the people in them, and over time, Google gets better and better at identifying people on its own. This is a good example of machine learning.

Deep learning is a subset of ML that involves highly sophisticated models resembling the structure of the human brain. These models require millions of records to train but can often equal or outperform humans at specific tasks. For example, in a 2017 match the AlphaZero deep learning program beat Stockfish, then one of the strongest chess engines, without losing a game.

Digging deeper

To see how legitimate a company’s AI/ML technology is, you need to look past these broad, general terms and understand four things: What problem is being solved? What AI/ML technologies are used to solve it? How and why does this solution work? Does the solution provide a competitive edge over other approaches?

Let’s say you’re looking at investing in a new company in the hypothetical LawnTech space.

If the CIM describes the company’s HornetNest app as an “AI system for hornet eradication,” you’d want to dig more deeply with the technical product team to understand the underlying components and process. Ideally, you’ll end up with an explanation that sounds more like this:

“We use a YOLO-based object detector with a Kalman filter to identify, count, and track hornets in real time. Data is fed into an anomaly detector that automatically alerts customers when we see behavior that suggests a new nest may be present within a 50-yard radius. Through an exclusive partnership with Orkin, we have compiled the world's largest training set of images, allowing us to predict the presence and location of new hornet nests more accurately than anyone else.”

This level of detail is needed to understand the sophistication, value, and defensibility of a company’s AI/ML assets.
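To make the last stage of that hypothetical pipeline concrete, here is a minimal sketch of an anomaly detector. It assumes an upstream detector and tracker already produce a hornet count per time interval. The function name, the rolling z-score rule and the threshold are illustrative assumptions, not a description of any real product.

```python
from statistics import mean, stdev

def flag_anomalies(counts, window=5, z_threshold=3.0):
    """Return indices whose count deviates sharply from the preceding window.

    counts: per-interval hornet counts from an upstream tracker (assumed).
    """
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as notable.
            if counts[i] != mu:
                flagged.append(i)
            continue
        if abs(counts[i] - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged

# Made-up counts with one obvious spike at index 6.
hourly_counts = [4, 5, 3, 4, 5, 4, 21, 5, 4]
print(flag_anomalies(hourly_counts))
```

A technical team with a real system should be able to explain which of these pieces they built, which they bought, and why the thresholds are set where they are.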

Evaluate the whole picture

AI isn’t just one thing. Its value rests on six critical components. The degree to which these components operate effectively together can help you separate the highest-value AI from the less credible.

The team

This is perhaps the most valuable asset and the strongest determinant of long-term success. In particular, having a strong data science team led by a seasoned chief data scientist opens the door to best-in-class AI.

The data

ML relies on training data to build its models. High volumes of data, especially proprietary data that competitors can’t access, create a significant competitive advantage and barrier to entry. As a very rough rule of thumb, you need tens of thousands of training records for traditional ML and millions for deep learning.

The training process

There are basic training processes and advanced techniques, including automated machine learning (AutoML), hyperparameter tuning, active learning and weak supervision. A company’s ability to use these advanced techniques leads to reduced costs and improved quality.
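As an illustration of what hyperparameter tuning means in practice, the sketch below runs a grid search, the simplest form of tuning, over the penalty of a one-variable ridge regression. The data, the candidate penalties and all function names are made up for the example. Production teams typically rely on library tooling rather than hand-written loops.

```python
def fit_ridge_slope(xs, ys, penalty):
    """Closed-form slope for y ~ w * x with an L2 penalty on w."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)

def mse(xs, ys, w):
    """Mean squared error of the line y = w * x on the given points."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grid_search(train, valid, penalties):
    """Try each penalty, score on held-out data, keep the best one."""
    best = None
    for p in penalties:
        w = fit_ridge_slope(train[0], train[1], p)
        score = mse(valid[0], valid[1], w)
        if best is None or score < best[1]:
            best = (p, score, w)
    return best  # (penalty, validation error, fitted slope)

# Illustrative data: noisy training points, clean validation points.
train = ([1, 2, 3], [2.2, 3.9, 6.3])
valid = ([4, 5], [8, 10])
best_penalty, best_error, best_slope = grid_search(train, valid, [0, 0.5, 1, 5])
print(best_penalty, round(best_error, 4))
```

The point for diligence is the separation of training data from validation data: a team that tunes and scores on the same records will report accuracy that does not hold up in production.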

Operational excellence

Beyond training the AI, it’s important to understand its overall care and feeding. You’ll want to understand the quality assurance, testing and error decomposition processes. When weaknesses are identified, how is supplemental training data gathered? Additionally, suppose a strength of the AI is incorporating real-time feedback to enable reinforcement learning, or compiling a knowledge base to support decision-making. In these cases, processes must be actively managed to ensure optimal performance.

The models

Models are the product of the team, the data and the training process. They still count as an asset in their own right, because they take appreciable time to create and optimize. The value of this component is determined by the number of models a company has and how sophisticated they are.

The AI development infrastructure

There is a difference between a company that has thrown together a few ML models and one with the infrastructure to automatically create, retrain, test and deploy models.
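One way to picture that kind of infrastructure is a promotion gate: a retrained candidate replaces the live model only if it scores better on held-out data. The sketch below is a toy version under stated assumptions. The threshold "model", the function names and the data are all hypothetical stand-ins for a real training and deployment pipeline.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Model:
    name: str
    predict: Callable[[float], int]

def accuracy(model, holdout):
    """Share of held-out (feature, label) pairs the model gets right."""
    correct = sum(1 for x, label in holdout if model.predict(x) == label)
    return correct / len(holdout)

def train_threshold_model(name, data):
    """Stand-in for training: pick the cutoff that best separates the labels."""
    candidates = sorted(x for x, _ in data)
    best_t = max(candidates, key=lambda t: sum((x >= t) == bool(y) for x, y in data))
    return Model(name, lambda x, t=best_t: int(x >= t))

def retrain_and_maybe_deploy(current, data, holdout, min_gain=0.0):
    """Retrain, test on held-out data, and promote only on improvement."""
    candidate = train_threshold_model("candidate", data)
    if accuracy(candidate, holdout) > accuracy(current, holdout) + min_gain:
        return candidate
    return current

current = Model("current", lambda x: int(x >= 10))
data = [(1, 0), (2, 0), (3, 1), (4, 1)]
holdout = [(1.5, 0), (3.5, 1), (5, 1), (0.5, 0)]
deployed = retrain_and_maybe_deploy(current, data, holdout)
print(deployed.name)
```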

Understand where the company falls on the AI maturity scale

Based on a sample from the more than 2,500 tech companies our team has diligenced over the last two years, we’ve noted some fairly consistent indicators of AI maturity.

Around 10% of these companies fall into the category of "No AI." Despite what they say, it’s not AI. For example, software that optimizes container routing may not be AI but just a sophisticated traditional algorithm.

A further 10% fall into the category of "Non-proprietary AI." In these instances, the company is using only public domain models, or MLaaS cloud APIs, to leverage AI. An example would be using Amazon’s AI-based Textract API to recognize text or the public domain ResNet model to detect objects in images. This approach can be considered AI-based but does not require training data, a training process, data scientists or even a lot of knowledge about AI to implement. There would also be no competitive differentiator in this approach since any company can use the same public-domain assets.

The vast majority, about 75%, fall into the category of "Standard AI." What we see most often are companies that are training proprietary ML models using their own training data in combination with standard training algorithms. There is a broad range of sophistication in this class. At the simpler end of the range are companies that create linear regression models using a library like Python’s sklearn. At the more complex end are companies that design and create multiple deep learning models using TensorFlow and use advanced optimization techniques like hyperparameter tuning, active learning and weak supervision to maximize accuracy.
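For a sense of how little code the simpler end of this range involves, here is a minimal sketch of a linear regression fitted by ordinary least squares. It uses only the Python standard library to stay self-contained, and the data is invented. A library such as scikit-learn wraps the same idea behind a fit-and-predict interface.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Apply the fitted line to a new input."""
    return slope * x + intercept

# Invented training data that lies exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept, predict(10, slope, intercept))
```

A model like this is proprietary only to the extent that the training data is. The algorithm itself is textbook material.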

The final 5% fall into the category of "Leading-edge AI." These companies have gone beyond standard AI techniques and developed their own model types and training algorithms to push AI in new directions. This represents unique and patentable IP that has value in itself, and the models created by these companies can outperform competitors that have access to the same dataset.
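The four categories can be read as a short decision sequence. The sketch below encodes it under three yes/no questions that are our simplification for illustration, not a formal rubric. The shares are the approximate figures quoted above.

```python
def maturity_category(uses_learned_models, trains_own_models, has_novel_methods):
    """Map three diligence answers (assumed questions) to a maturity category."""
    if not uses_learned_models:
        return "No AI"               # e.g. a sophisticated traditional algorithm
    if not trains_own_models:
        return "Non-proprietary AI"  # public models or MLaaS cloud APIs only
    if not has_novel_methods:
        return "Standard AI"         # own data, standard training algorithms
    return "Leading-edge AI"         # own model types or training algorithms

# Approximate shares quoted in the text, in percent.
OBSERVED_SHARE = {
    "No AI": 10,
    "Non-proprietary AI": 10,
    "Standard AI": 75,
    "Leading-edge AI": 5,
}

category = maturity_category(True, True, False)
print(category, OBSERVED_SHARE[category])
```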

It looks like the real deal — but is it right for you?

Once you understand the details of the AI itself, you’re better positioned to understand its impact on the investment thesis. There are two factors to consider here.

First, what is the value of the AI? Because “AI” can have widely varying definitions, it’s important to take a holistic view. The value of a company’s AI assets is the sum of the six critical parts noted above: the team, data, training process, operational excellence, models, and development infrastructure.

Another way to look at AI’s value in a company is to ask how it impacts the bottom line. What would happen to revenues and costs if the AI were to disappear tomorrow? Does it actually drive revenue or operating leverage? And conversely, what costs are required to maintain or improve the capability? You’ll find AI can be anything from an empty marketing slogan to technology essential for a company’s success.

Second, what risks does the AI introduce? Unintentional algorithmic bias can produce sexist, racist or otherwise discriminatory outcomes, which pose reputational and legal risks to the business. In credit, law enforcement, housing, education and healthcare, this type of bias is prohibited by law and difficult to defend against, even when it occurs unwittingly. Make sure you understand how the target has guarded against algorithmic bias and the steps you would need to take to prevent bias moving forward.
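One simple first check a diligence team might ask to see is a comparison of outcome rates across groups. The sketch below computes the ratio of the lowest to the highest approval rate. A ratio below 0.8 is sometimes used as a screening signal (the "four-fifths" heuristic from US employment guidance). It is a rough indicator only, not a legal test for the domains listed above, and the data and function names here are invented.

```python
from collections import defaultdict

def selection_rates(decisions):
    """decisions: iterable of (group, approved) pairs. Returns rate per group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}

def disparate_impact_ratio(decisions):
    """Lowest group approval rate divided by the highest."""
    rates = selection_rates(decisions)
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so there is no disparity to measure
    return min(rates.values()) / highest

# Invented model outputs: group A approved 8 of 10, group B approved 5 of 10.
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)
ratio = disparate_impact_ratio(decisions)
print(round(ratio, 3), "needs review" if ratio < 0.8 else "no flag")
```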

Privacy is another concern, with AI often necessitating new layers of privacy and security protocols. You need to understand how biometric data (considered personally identifiable information protected by data privacy laws) and sensitive images, such as faces, license plates and computer screens, are collected, used and safeguarded.

The true value of AI

The reality is that, in today’s tech landscape, most companies can legitimately claim some AI capabilities. The majority of the time, the AI fits our definition for “standard” maturity and performs as we expect it to. But when we looked more deeply into the “standard AI” category, we found that only about half of these companies were using best practices or creating a competitive differentiator that would be difficult for competitors to outperform. The other half had room for improvement.

Determining the value of AI requires both an in-depth look under the hood and a nuanced understanding of the AI's specific role in the business. Tech diligence, done by experts who’ve directly led AI teams, can help demystify AI for investors. The goal is to help investors understand exactly what they’re buying, what it can and cannot do for the business, what risks it introduces, and, ultimately, to what extent it supports the investment strategy.

Brian Conte is lead practitioner for Crosslake. Jason Nichols is a Crosslake practitioner and former director of AI at Walmart. Barr Blanton is Crosslake CEO.
