Artificial intelligence (AI) models are increasingly finding their way into critical enterprise use cases and seeing broader adoption throughout the world.
One area where AI is finding a home is the Transportation Security Administration (TSA), whose responsibilities include screening baggage at airports across the U.S. Screening at Speed (SaS), a program currently underway within the Department of Homeland Security’s (DHS) Science and Technology Directorate, will, among other efforts, implement AI to help accelerate the baggage screening process.
Part of developing this screening system is testing and validating the AI models’ integrity, in terms of both reliability and the ability to withstand potential adversarial AI attacks. That’s where DHS is making use of CalypsoAI’s VESPR Validate technology.
“We’re really focused on testing and validation of machine learning (ML) models so that they can get safely and securely deployed,” CalypsoAI CEO Neil Serebryany told VentureBeat.
The challenges of testing AI models for production use
A Gartner research survey released in August found that only half of all AI models built actually make it into production.
There are multiple reasons for this, including issues with testing and validation. According to Serebryany, testing and validation for AI models must consider both human and technical factors. On the human side, people need confidence before they will deploy a model into production, and that depends on clearly communicating where the model works, where it doesn’t work, and what its vulnerabilities are. On the technical side, Serebryany said that there is a need to help make the models as resilient and robust as possible.
Before starting CalypsoAI, Serebryany had worked in the government, where he noticed a growing focus on machine learning (ML). What he saw time and again were security challenges, including the need to make sure that adversarial machine learning attacks don’t negatively impact a model. Adversarial ML attacks use various techniques to deceive models into generating the outcome an attacker wants.
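To make the idea of an adversarial attack concrete, here is a minimal sketch of one well-known technique, the fast gradient sign method, applied to a toy logistic-regression classifier. The weights, input, and epsilon value are invented for illustration; nothing here reflects CalypsoAI's products or the models used by DHS.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def sign(value):
    return (value > 0) - (value < 0)

def fgsm_perturb(weights, bias, x, label, epsilon):
    """Shift each feature by epsilon in the direction that increases the log loss."""
    p = predict_proba(weights, bias, x)
    # For logistic regression, d(log loss)/dx = (p - label) * weights.
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model parameters and input (assumptions, not real data).
WEIGHTS = [2.0, -1.0]
BIAS = 0.0
x_clean = [0.5, 0.2]   # true label is 1
x_adv = fgsm_perturb(WEIGHTS, BIAS, x_clean, label=1, epsilon=0.5)

p_clean = predict_proba(WEIGHTS, BIAS, x_clean)
p_adv = predict_proba(WEIGHTS, BIAS, x_adv)
print(f"clean p(class 1) = {p_clean:.2f}, adversarial p(class 1) = {p_adv:.2f}")
```

In this toy setup the perturbation is enough to push the prediction across the 0.5 decision threshold, so the model's answer flips even though each feature moved by a bounded amount.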
Strong enterprise demand for AI testing and validation
The need for AI testing and validation as well as protection against adversarial AI extends beyond government use cases.
Serebryany said that his firm has seen growing enterprise demand recently. In fact, in July, Gartner named the company a “Cool Vendor” in the category of scaling AI in the enterprise.
“Organizations are trying to systematize how they understand the risk of their machine learning models and have a way of actually testing that risk in order to be able to validate their models,” Serebryany said.
He expects that the need to test and validate AI will become part of many organizations’ audit practices to help ensure regulatory compliance. For example, he noted that the EU is starting to introduce regulations for AI compliance that enterprises will need to deal with.
Serebryany said that his company is also seeing insurance companies that want to start insuring AI models. To do so, insurers need to understand how those models perform against a real-world set of test conditions.
How CalypsoAI tests models for resilience
Serebryany explained that his company’s technology can fit into different parts of an AI development workflow.
CalypsoAI has a software development kit (SDK) that can work with an organization’s Jupyter notebook-driven machine learning processes. Alternatively, CalypsoAI can be just an independent testing and validation step along the way.
Serebryany explained that CalypsoAI can test a model without all the training data. Using a subset of data, CalypsoAI runs the model through a series of adversarial attacks and real-world scenarios.
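As a rough illustration of that kind of workflow, the sketch below scores a stand-in model on a small labelled subset under three scenarios: unmodified inputs, random sensor-style noise, and a worst-case adversarial shift. Every name, dataset value, and perturbation here is hypothetical; this is not the CalypsoAI SDK, only a generic example of scenario-based robustness testing.

```python
import random

def model(x):
    """Hypothetical stand-in classifier: predicts 1 when the feature sum is positive."""
    return 1 if sum(x) > 0 else 0

def accuracy(predict, samples):
    correct = sum(1 for x, y in samples if predict(x) == y)
    return correct / len(samples)

def add_noise(x, rng, scale):
    """Simulate a noisy real-world condition with Gaussian jitter on each feature."""
    return [xi + rng.gauss(0.0, scale) for xi in x]

def shift_against_label(x, y, epsilon):
    """Worst-case shift for this stand-in model: push every feature toward the wrong class."""
    direction = -1.0 if y == 1 else 1.0
    return [xi + direction * epsilon for xi in x]

def robustness_report(predict, samples, seed=0):
    """Return accuracy per scenario, using only the supplied subset of labelled data."""
    rng = random.Random(seed)
    scenarios = {
        "clean": lambda x, y: x,
        "sensor_noise": lambda x, y: add_noise(x, rng, 0.5),
        "adversarial_shift": lambda x, y: shift_against_label(x, y, 0.6),
    }
    report = {}
    for name, perturb in scenarios.items():
        perturbed = [(perturb(x, y), y) for x, y in samples]
        report[name] = accuracy(predict, perturbed)
    return report

# A small labelled subset (invented values), standing in for a sample of real data.
SUBSET = [([1.0, 0.5], 1), ([0.3, 0.2], 1), ([-1.0, -0.4], 0), ([-0.2, -0.1], 0)]
report = robustness_report(model, SUBSET)
for scenario, score in report.items():
    print(f"{scenario}: accuracy {score:.2f}")
```

A per-scenario report like this makes it easier to communicate where a model holds up and where it degrades, which is the human-side concern the article describes.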
“A lot of the testing challenges are around identifying the exact conditions folks want to deploy into and helping them understand what model will actually be performing in that condition set,” he said.