FICO, which most people likely know as the company that determines credit scores, has been working with machine learning and AI models for years. However, the company decided to wait before releasing its own foundation models, opting instead to establish a level of trust and compliance for its financial services clients.

Now, FICO has announced two foundation models: FICO Focused Language (FICO FLM) and FICO Focused Sequence (FICO FSM). The company built both models entirely from scratch on its own GPU cluster, drawing on decades of expertise in financial data and algorithms for training.

FICO Focused Language handles the conversational and language side of finance, such as detecting fraud and processing loan documentation. FICO Focused Sequence, on the other hand, is built for transaction analytics.

Scott Zoldi, chief analytics officer at FICO, told VentureBeat in an interview that, despite years of building models—Zoldi and FICO hold several patents related to AI models—the company knew its offerings had to meet the strict trust and compliance needs of its main customers: banks and lenders. 

“We needed to make sure that we were comfortable with the technology that would meet the responsible AI standard,” Zoldi said. “It had to be auditable, it had to be transparent and explainable, and we needed to come up with the technology that would monitor the outputs.”

And so FICO created its Trust Score, which ranks model responses based on specific criteria to ensure the output remains accurate and grounded.

The trust layer

Zoldi said the Trust Score is a key component of what makes the two FICO models effective for heavily regulated industries, such as finance. 

The Trust Score serves as a guardrail that indicates how closely a response aligns with its training data. It is similar to how some hyperscalers, such as AWS, offer guardrails and contextual grounding for larger, general-purpose models.

“So the Trust Score basically says we're gonna have an independent generative AI algorithm and that algorithm will be based on the data that was used to build the model,” Zoldi said. 

Since FICO built the models from scratch, it owns and has access to the exact datasets used to train them. FICO also works closely with its customers to integrate their data, allowing them to tailor the models to their specific use cases. The Trust Score also takes into account the context found in the data: if the model is used to read through documentation about European financial instruments, for example, the Trust Score can assess whether the response is relevant.

“We have a concept called a knowledge anchor, which says that unless you’re an expert, you usually don’t know whether the model hallucinates. So our approach here was to work with experts who would define which questions the model should respond to and the ways it should respond,” said Zoldi.

The Trust Score also takes that into account when ranking the response. A response with a high score indicates that it is accurate in terms of its data coverage and is not misleading. Responses with low scores may prompt the bank to review its data or refine how the model responds.

Of course, it is up to the financial organization to determine its risk tolerance, as some companies may be willing to accept outputs that do not score perfectly. 
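FICO has not published how the Trust Score is actually computed. As a rough illustration of the general idea — score a response against the grounding data, then apply an institution-specific threshold — the following sketch approximates "trust" with simple word overlap; all function names and the threshold value are hypothetical, and a real system would use far more sophisticated scoring.

```python
# Illustrative sketch only: FICO's Trust Score algorithm is proprietary.
# Here "trust" is approximated by the fraction of response words that
# appear in the grounding documents the model was built on.

def trust_score(response: str, grounding_docs: list[str]) -> float:
    """Fraction of response words covered by the grounding data."""
    response_words = set(response.lower().split())
    grounding_words = set()
    for doc in grounding_docs:
        grounding_words.update(doc.lower().split())
    if not response_words:
        return 0.0
    return len(response_words & grounding_words) / len(response_words)

def accept_response(response: str, grounding_docs: list[str],
                    risk_tolerance: float = 0.8) -> bool:
    """Each institution sets its own threshold (risk tolerance)."""
    return trust_score(response, grounding_docs) >= risk_tolerance

docs = ["the loan term is fixed at five years with quarterly payments"]
print(accept_response("the loan term is five years", docs))            # True: grounded
print(accept_response("rates will definitely drop next month", docs))  # False: not grounded
```

Lowering `risk_tolerance` corresponds to an institution accepting outputs that do not score perfectly, as the article notes some companies may choose to do.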

FICO Focused 

FICO stated that the two models, Language and Sequence, can be considered small language models, part of an ongoing trend in which enterprises prefer small models over larger general-purpose ones. FICO FLM has fewer than 10 billion parameters, while FSM is smaller at fewer than 1 million parameters.

Since these models are small, Zoldi said, they can be used for more agentic purposes down the line. 

FICO FLM works best at understanding the language used in transactions. Zoldi said it has two general use cases. The first is compliance and communications: the model understands the rules governing how financial institutions can and should provide information to customers, and it can extract information from conversations. What makes FLM special, Zoldi said, is that because it monitors the back and forth between a bank and a person, it can detect if a customer is undergoing financial hardship. The bank can then tailor how it provides information, taking the customer's economic position into account.

The second use of FLM involves underwriting, which is the act of offering a loan or capital to an individual or a business. The model can take into account the person’s interactions with the bank and review loan documentation. 

FICO FSM deals with transaction data. Zoldi said it remembers the totality of a consumer's or organization's transactions and purchases so it can establish a pattern. It can also determine whether a change in buying behavior, such as a run of large purchases, indicates that someone has stolen a credit card or simply reflects a life change, such as a recent move.

Zoldi said traditional fraud-monitoring models often forget details after some time. He used the example of international travel: someone who regularly visited Kazakhstan for several years would have their transactions there approved, because the bank understood the pattern. But if they return to the country after a long gap, a traditional model, having forgotten the previously established pattern, begins flagging transactions as fraud again. Zoldi said that is not the case with FSM.

“The architecture is different; it has something called a contrastive head and a supervised head,” Zoldi said. “The contrastive head says, is this transaction in or out of pattern, while the supervised head says, is this change in behavior fraud or not. The supervised task knows the probability of fraud, the fact that she has hardship, and we have to intervene.”
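The actual FSM architecture is proprietary, but the dual-head idea Zoldi describes — one head asking "is this in pattern?" and a second asking "is this change fraud?" — can be sketched in a toy form. Everything below (the `Profile` type, the scoring formulas, the weights) is hypothetical and purely illustrative.

```python
# Hypothetical sketch of the dual-head idea described by Zoldi; the real
# FSM is a learned sequence model, not these hand-written heuristics.

from dataclasses import dataclass

@dataclass
class Profile:
    mean_amount: float      # typical spend learned from transaction history
    usual_countries: set    # countries that are part of the long-term pattern

def contrastive_head(amount: float, country: str, profile: Profile) -> float:
    """Out-of-pattern score in [0, 1]: 0 means fully in pattern."""
    amount_dev = min(abs(amount - profile.mean_amount) / max(profile.mean_amount, 1.0), 1.0)
    country_dev = 0.0 if country in profile.usual_countries else 1.0
    return (amount_dev + country_dev) / 2

def supervised_head(out_of_pattern: float, hardship: bool) -> float:
    """Fraud probability given the deviation plus customer context."""
    p = out_of_pattern * 0.9
    if hardship:            # hardship context can shift the decision toward intervention
        p = min(p + 0.05, 1.0)
    return p

# Because "KZ" lives in the long-term profile, a purchase there stays in
# pattern even after a gap — the pattern is remembered, not forgotten.
profile = Profile(mean_amount=80.0, usual_countries={"US", "KZ"})
print(supervised_head(contrastive_head(85.0, "KZ", profile), hardship=False))
```

The design point the sketch tries to capture is the separation of concerns: the contrastive head only measures deviation from the remembered pattern, while the supervised head turns that deviation, plus context such as hardship, into a fraud decision.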

Zoldi added that the company created synthetic training data in which personally identifiable information is masked.
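FICO has not described its synthetic-data pipeline, but masking PII before data reaches a training set is a standard technique. A minimal sketch, with illustrative regex patterns for a few common identifier types (a production pipeline would cover far more):

```python
import re

# Minimal PII-masking sketch; FICO's actual synthetic-data pipeline is not public.
# Replace common identifier patterns with placeholder tokens before training.

PII_PATTERNS = {
    "CARD": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),   # 16-digit card numbers
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),         # US Social Security numbers
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def mask_pii(text: str) -> str:
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(mask_pii("Card 4111 1111 1111 1111 billed, contact jane.doe@example.com"))
# → Card [CARD] billed, contact [EMAIL]
```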

Domain-specific models

FICO’s thesis is that some industries would be best served with domain-specific models rather than repurposing a large language model that has a more general understanding of data. Financial institutions, Zoldi said, could be running several small, domain-specific models for different use cases, but all of these would focus on only one aspect of the business. 

Zoldi noted that a niche model would contain only the information and knowledge it needs, meaning it will not tap into anything that could introduce hallucinations.

However, developing a foundation model is often challenging and costly, so most companies opt to fine-tune an LLM from OpenAI or Anthropic instead. Some companies, like Intuit, have released finance-specific models.