Apixio is trying to help some insurers get more accurate reimbursement for patient care with the assistance of machine learning.

The company’s new service, called the Code Compliance Auditor, applies Apixio’s existing text mining of medical records to help insurance providers classify which of their patients have which diseases. That classification process, usually done manually by people reading through medical records, is known as “coding,” since it involves matching written diagnoses to a set of standardized diagnosis codes.

The service is designed for companies that offer Medicare Advantage plans, which are private health plans that deliver Medicare benefits in place of the traditional government-run program. Those plans receive lump-sum payments from the U.S. government that are adjusted based on how many chronic diseases their patients have and how severe those conditions are. Getting those payments and adjustments requires documentation in the form of coding.

Right now, health plan providers usually outsource their coding work to a third party, but Apixio says that it can let those companies keep coding in-house with the assistance of machine learning. There’s a lot of risk involved, since incorrect coding can lead to fines.

“Here’s the thing, the Department of Justice has recently filed billion dollar lawsuits against [health plans like UnitedHealth Group] because these health plans are not doing the kind of due diligence and quality assurance that they should on these diagnoses they’re submitting from humans reading the medical records,” said Darren Schulte, the CEO of Apixio.

Apixio’s new feature will go through each medical record and extract the codes that it thinks are relevant to a particular patient, as well as the evidence for those diagnoses. The system then asks human employees to sign off on the resulting classifications.
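As a rough illustration of that workflow (suggest a code, attach the supporting evidence, and wait for a human to sign off), here is a minimal Python sketch. It is not Apixio’s implementation: the keyword table, the `Suggestion` class, and the function names are all hypothetical, and simple phrase matching stands in for the machine learning models the company describes.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical phrase-to-code table, for illustration only. A production
# system would rely on trained models rather than a keyword list.
PHRASE_TO_CODE = {
    "type 2 diabetes": "E11.9",
    "heart failure": "I50.9",
}


@dataclass
class Suggestion:
    code: str
    evidence: str           # sentence from the record that supports the code
    approved: bool = False  # stays False until a human reviewer signs off


def suggest_codes(record_text: str) -> List[Suggestion]:
    """Return one suggestion per matched phrase, each with its evidence."""
    suggestions = []
    # Split into rough sentences so every suggestion carries its source text.
    for sentence in re.split(r"(?<=[.!?])\s+", record_text.strip()):
        lowered = sentence.lower()
        for phrase, code in PHRASE_TO_CODE.items():
            if phrase in lowered:
                suggestions.append(Suggestion(code=code, evidence=sentence))
    return suggestions


def sign_off(suggestion: Suggestion, reviewer_agrees: bool) -> Suggestion:
    """Record the human reviewer's decision on a single suggestion."""
    suggestion.approved = reviewer_agrees
    return suggestion


def approved_codes(suggestions: List[Suggestion]) -> List[str]:
    """Only codes a human has approved count as final."""
    return sorted({s.code for s in suggestions if s.approved})
```

The point of the design is that nothing reaches the final code list without an explicit reviewer decision, and the reviewer always sees the sentence that triggered the suggestion.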

Eighty percent of the records that Apixio handles come in as scanned or faxed documents, while 20 percent can be accessed directly through digital means. That presents a special challenge for an automated system, since the company must first run optical character recognition (OCR) to extract the text from those documents before any machine analysis can happen.

That extraction is often imperfect, because of markings on the scanned documents, typos in the original source, and possible malfunctions of the OCR system. To help combat that, Apixio trains its models on text that was ingested in the same way, so they learn to work around those types of errors.
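Apixio describes training on text that actually went through its ingestion pipeline. A related technique, swapped in here because it is easy to show, is to inject synthetic OCR-style errors into clean training text. The Python sketch below does that. The confusion table, the function names, and the noise rate are assumptions made for this example, not anything the company has disclosed.

```python
import random

# Assumed character confusions of the kind OCR engines sometimes produce.
# The table is illustrative, not measured from real scans.
OCR_CONFUSIONS = {"l": "1", "O": "0", "S": "5", "m": "rn"}


def add_ocr_noise(text, rate, rng):
    """Replace confusable characters with probability `rate`."""
    out = []
    for ch in text:
        if ch in OCR_CONFUSIONS and rng.random() < rate:
            out.append(OCR_CONFUSIONS[ch])
        else:
            out.append(ch)
    return "".join(out)


def augment(examples, rate=0.1, seed=0):
    """Return noisy copies of (text, label) pairs; labels are unchanged."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    return [(add_ocr_noise(text, rate, rng), label) for text, label in examples]
```

A model trained on a mix of clean and noisy copies sees strings such as “norrna1” alongside “normal” with the same label, which is the general idea behind learning to work around recognition errors.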

Coding medical records presents a number of challenges for machine learning systems, according to Schulte. Patients’ files often aren’t organized with a table of contents or index, which means it’s up to the machine learning systems to determine what’s relevant. The system is designed to figure out which documents are test results and which record in-person visits with a medical provider, a distinction that is key to generating accurate codes.

On top of all that, Schulte said that Apixio’s system also works to make sure that humans reviewing the resulting codes aren’t just rubber-stamping the decisions of a computer that may not be accurate.

All told, Apixio says that a team of four people using the new system can audit the codes for 50,000 patients in as little as a week, which is a vast improvement in speed.
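To put that claim in per-person terms, the short Python calculation below divides the workload evenly across the team. The even split and the five-day work week are assumptions for the sake of the arithmetic. The article gives only the team size, the patient count, and the one-week timeframe.

```python
def patients_per_reviewer(total_patients, reviewers):
    """Even split of the audit workload across the team (an assumption)."""
    return total_patients / reviewers


per_week = patients_per_reviewer(50_000, 4)  # 12,500 patients per person
per_day = per_week / 5                       # 2,500, assuming a 5-day week

print(f"{per_week:,.0f} patients per reviewer per week")
print(f"{per_day:,.0f} patients per reviewer per day")
```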