This sponsored post is produced by Tamr.
A recent Forbes Insight/Teradata survey of 316 large global company executives found that 47 percent “do not think that their companies’ big data and analytics capabilities are above par or best of breed.” Given that “90 percent of organizations report medium to high levels of investment in big data analytics,” the executives’ self-criticism raises the question: Why, with so many urgent questions to answer with analytics every day, are so many companies still falling short of becoming truly data-driven?
Here’s a look at the quandary, and some thoughts about what’s needed to liberate businesses from its effects.
The problems
1. Analytics projects start from the wrong place
Many analytics projects start with a look at some primary data sources and an inference about what kinds of insights they can provide. In other words, they take the available sources as a constraint, then go from there. That is understandable, but running a project this way skips a crucial step.
Analytics projects must start with the business questions you’re trying to answer, and then move into the data. Leading with your data necessarily limits the number and types of problems you can solve to those that the data you perceive to be available can address. Stepping back and leading with your questions, however, liberates you from such constraints, allowing your imagination to run wild about what you could learn about customers, vendors, employees and so on.
2. Analytics projects end too soon
Through software, services, or a combination of both, most analytics projects can indeed get answers to the questions they’re asking at any given time. But I’d argue that a successful analytics project shouldn’t stop with the delivery of its answers. For all the software and services money they’re spending, businesses should expect every analytics project to arm them with the knowledge and infrastructure to ask, analyze, and answer future questions with more efficiency and independence.
3. Analytics projects take too long … and still fall short
Despite improved methods and technologies, many analytics projects still get gummed up in complex data preparation, cleaning, and integration efforts. Conventional industry wisdom holds that 80 percent of analytics time is spent on preparing the data, and only 20 percent on actually analyzing the data. In the Big Data Era, wisdom’s hold feels tighter than ever. Massive reserves of enterprise data are scattered across variable formats and hundreds of disparate silos. Integrating information for analysis through manual methods can significantly delay attempts to answer mission-critical questions.
Or worse. It can significantly diminish the quality and accuracy of the answers, with incomplete data risking incorrect insights and decisions. Faced with a long, arduous integration process, analysts may be compelled to take what they can (i.e., the cleanest data from the closest sources) -- leaving the rest for another day, and leaving the questions without the benefit of the full variety of relevant data.
The solutions
1. Getting better answers, faster
So what can companies like mine (Tamr) do for businesses awash in data and the tools to analyze it, but continuously frustrated by incomplete, late, or useless answers to critical business questions?
We can create human-machine analytics solutions designed specifically to get businesses more and better answers, faster and continuously. In other words:
- Speed/Quantity -- get more answers faster, by spending less time preparing data and more time analyzing it
- Quality -- get better answers to questions, by finding and using more relevant data in analysis -- not just what’s most obvious/familiar
- Repeatability -- answer questions continuously by leaving customers with a reusable analytic infrastructure
2. Data preparation platforms
Data preparation platforms from the likes of Informatica, OpenRefine, and Tamr have evolved tremendously over the last few years, becoming faster, nimbler, and lighter-weight than traditional ETL and MDM solutions. These automated platforms help businesses embrace, not avoid, data variety by quickly pulling data from many more sources than was historically possible. As a result, businesses get faster and better answers to their questions, since so much valuable information resides in “long tail” data. To ensure both speed and quality of preparation and analysis, we need solutions that pair machine-driven platforms for discovering, organizing, and unifying long tail data with the advice of business domain and data science experts.
3. Cataloging software
Cataloging software like Enigma, Socrata, and Tamr can identify much more of the data relevant for analysis. The success of my recommended Question First approach, of course, depends on whether you can actually find the data you need for answers. That’s a formidable challenge for enterprises in the Big Data Era, as IDC estimates that 90 percent of big data is “dark data”: data that has been processed and stored but is hard to find and rarely used for analytics. This is an enormous opportunity for tech companies: building software that quickly and easily locates and inventories all data that exists in the enterprise and is relevant for analysis -- regardless of type, platform, or source.
4. Data engineering infrastructures
Finally, we need to build persistent and reusable data engineering infrastructures that allow businesses to answer questions continuously, even as new sources are added or their data changes. A business can do everything right -- from starting with the question, to identifying and unifying all available data, to reaching a strong, analytically fueled answer. And it can still fall short of optimizing its data and analytic investment if it has not built, along the way, an infrastructure that enables repeatable analytics without starting from scratch.
As the Forbes/Teradata survey implies, businesses and analytics providers collectively have a substantial gap to close between “analytics-invested” and “data-driven.” If we follow the simple design vision of getting businesses more and better answers faster, and giving them the infrastructure to keep answering new questions continuously in the future, we’ll close that gap much faster than we’d think.
Nidhi Aggarwal is Global Lead - Operations, Strategy, and Marketing at Tamr.
Sponsored posts are content that has been produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. The content of news stories produced by our editorial team is never influenced by advertisers or sponsors in any way. For more information, contact sales@venturebeat.com.
