Data extraction is a time-consuming business. Case in point: According to a recent IDC survey, data professionals spend approximately 75 percent of their time gathering and cleaning data and only about 25 percent finding insights from the data.
That’s where Import.io comes in. The Los Gatos, California startup, which uses machine learning to automate the extraction and processing of web data, today announced $15.5 million in Series B financing. London-based Talis Capital led the round, with participation from existing investors IP Group, OpenOcean, Oxford Capital, and Wellington Partners.
That follows a $13 million Series A round in 2016 and a $4.5 million seed round in 2013, bringing Import.io’s total haul to $33 million.
CEO Gary Read said the capital would be used to accelerate global growth and expand Import.io’s product offerings. He said that since the company’s incorporation in London in 2012, it has attracted more than 800 enterprise customers, who receive data from millions of web sources via its proprietary platform. In 2015 alone, it extracted data from over 5.5 billion web pages.
“Businesses around the world are losing trillions of dollars due to lack of timely access to high-quality data. In fact, IBM estimates that poor-quality data costs businesses in the U.S. more than $3 trillion annually,” Read said. “Import.io is committed to providing timely, high-quality data with little to no customer resource requirements. We empower our customer base of more than 800 companies to make business-critical decisions based on the data we provide every day, and we back that up with an aggressive service-level guarantee.”
Import.io’s novel machine learning solution not only extracts data but also prepares it and integrates it into clients’ analytics platforms and business applications, effectively transforming websites into APIs. For instance, startup StoryFit uses it to cull related book, movie, and television data from hundreds of thousands of web pages to generate predictive analytics for movie studios and book publishers. AudioLock, another customer, taps it to scan the web for unlicensed music content.
Import.io’s data-crawling suite offers clients more sophisticated features, too, such as the ability to merge info from multiple sources and create a common schema. It also offers reporting and visualization functions, including a comparison audit tool that shows how things have changed over time.
This approach sets it apart from competitors like Webhose.io, DeepCrawl, and others, according to Talis Capital’s Matus Maar.
“When we saw what Import.io was doing, we immediately understood the importance and recognized the game-changing capabilities of the solution,” he said. “We spoke to multiple Import.io customers who explained how important Import.io had become to their business and raved about the solution, support, and quality of the data provided.”