Google proposes applying AI to patent application generation and categorization

Google asserts that the patent industry stands to benefit from AI and machine learning models like BERT, a natural language processing algorithm that attained state-of-the-art results when it was released in 2018. In a whitepaper published today, the tech giant outlines a methodology for training a BERT model on over 100 million patent publications from the U.S. and other countries using open-source tooling. The trained model can then be used to determine the novelty of patents and generate classifications to assist with categorization.

The global patent corpus is large, with millions of new patents issued every year. It's complex as well. Patent applications average around 10,000 words and are meticulously wordsmithed by inventors, lawyers, and patent examiners. Patent filings are also written in language that can be unintelligible to lay readers and is highly context-dependent; many terms mean completely different things in different patents.

For all these reasons, Google believes that the patents domain is ripe for the application of algorithms like BERT. Patents, the company notes, represent tremendous business value to a number of organizations, including corporations, which spend tens of billions of dollars a year developing patentable technology and transacting the rights to use it, and patent offices.

"We hope that our [proposal] will help the broader patent community in its application of machine learning, including corporate patent departments looking to improve their internal models and tooling with more advanced machine learning techniques, patent offices interested in leveraging state-of-the-art machine learning approaches to assist with patent examination and prior art searching, machine learning and natural language processing researchers and academics who might not have considered using the patents corpus to test and develop novel natural language processing algorithms," Google data scientists Rob Srebrovic and Jay Yonamine wrote in a blog post. "Patent researchers and academics who might not have considered applying the BERT algorithm or other transformer based approaches to their study of patents and innovation."

As VentureBeat recently reported, businesses aren't the only ones that stand to benefit from AI with regard to patent processing. The U.S. Patent and Trademark Office (USPTO) built AI models for different categories of patents and then trained the models on text from patent abstracts. Separately, the USPTO's staff is using AI to more efficiently process patent applications. According to a spokesperson, the agency is now using a "leading RPA provider" to centralize its bot efforts and ensure a proper process and governance model that includes use cases, development, testing, and security before bots are deployed.

"We are working on adding AI tools to help route applications to examiners more quickly and to help examiners search for prior art," Andrei Iancu, U.S. Under Secretary of Commerce for intellectual property, said in an emailed response to VentureBeat in October. "We've also been active on the Trademarks side, exploring the use of AI to help find prior similar images and to identify what we call fraudulent specimens. We are exploring using AI to improve the accuracy and integrity of the trademark register."
