Some studies suggest that the number of scientific papers published in English each year exceeds 3 million, which amounts to thousands per day. Perhaps unsurprisingly, it’s estimated that only half of those papers are ever read by anyone other than the author (or coauthors) and the publishing journal’s editors.
To help sift through the deluge, the Allen Institute for Artificial Intelligence, the research organization founded by the late Microsoft cofounder Paul Allen, released Semantic Scholar in 2015. The public search engine uses a combination of machine learning, natural language processing, and machine vision to highlight figures from, and identify connections among, computer science and biomedicine journal papers. Over two million users have adopted Semantic Scholar to date to analyze the academic literature, surfacing phenomena from male bias in clinical studies to the accelerating pace of China’s AI research. And now, the Allen Institute hopes to lay the semantic groundwork for the next few million users.
Semantic Scholar previously spanned 40 million total research papers, plus associated blog items, news reports, videos, and other resources. But starting this week, it’s more than tripling its reach to over 175 million papers in all fields of science, including natural sciences like biology, chemistry, geology, materials science, medicine, and physics; social sciences including art, business, economics, geography, history, philosophy, political science, psychology, and sociology; and formal/interdisciplinary sciences such as computer science, engineering, environmental science, and mathematics. According to the Allen Institute, the expansion makes Semantic Scholar the world’s most comprehensive search engine for locating academic content.
“Scientific research is not advanced by search engines that operate in the same way that we use them today to shop for goods, find restaurants or look up a news article,” said Allen Institute general manager Doug Raymond. “Successful scientific search must utilize AI to understand scientific papers and then enable researchers to go far beyond keywords to find the right information. This is what we have built with Semantic Scholar. We are now at a critical point in history where every scientist now has a powerful, free AI search engine at their fingertips.”
By way of background, Raymond, an Army veteran who developed machine learning initiatives for Amazon’s Alexa platform and Google’s search monetization division in the Asia-Pacific region, was hired to lead the Semantic Scholar project in March 2018. In December, the Allen Institute announced that it would partner with Microsoft to connect Semantic Scholar with the latter’s Academic Graph, a heterogeneous graph containing scientific publication records and citation relationships as well as authors, institutions, journals, conferences, and fields of study.
Thanks to the addition of the Academic Graph records, the number of papers included in Semantic Scholar grew to more than 173 million by August 2019. The records also indirectly supported the development of Supp AI, a web portal built atop Semantic Scholar that lets consumers of supplements like vitamins, minerals, enzymes, and hormones identify the products or pharmaceutical drugs with which those supplements might adversely interact.
“All universities, all scientists today should be looking at the many ways that AI and deep learning can advance research and scientific progress,” said Allen Institute CEO Oren Etzioni. “We have proven that Semantic Scholar is capable of not only deep semantic search, but wide-scale research on everything from male bias in clinical studies to gender parity, or lack thereof, of published research across any scientific domain.”
According to the Allen Institute, Semantic Scholar now has over 6 million monthly active users.