As storage has become cheaper, those who generate data have grown used to hanging on to it, even if they (and their IT teams) don’t know what, if anything, they’ll be able to do with it later. Sales notes? Store them. Customer information? Store it. Multiple versions of last quarter’s marketing presentation? Financial spreadsheets? Audio files? Store it, store it, store it.
The ability to house virtually all the data employees create has sparked a growing problem most frequently known as “dark data.” With massive data stores they can’t effectively analyze, enterprises are opening themselves up to all kinds of risk, as well as losing out on opportunities to use that data to advance their businesses.
What is dark data?
Dark data refers mostly to unstructured information, which includes text documents, multimedia files, PowerPoint decks, spreadsheets, and more, and makes up about 80 percent of the data most companies create. When data is “dark,” it’s often because the organizations that own it lack the tools, infrastructure, or skills to effectively leverage it.
When data is dark, neither IT teams nor business users can, for example, analyze myriad sales and customer call notes to assess customer responses to a marketing campaign. And because that analysis isn’t possible, the marketing department can’t make informed adjustments to improve the results of its next campaign, the sales team gets fewer qualified leads, and the business sees the negative effect on its bottom line.
Dark data represents missed opportunities for companies to learn more about their employees, customers, and businesses to decrease costs and increase productivity and profits. More concerning, however, might be the potential liabilities lurking in that dark data.
What are the risks of leaving data in the dark?
Whether an enterprise has the means to analyze or use its dark data is irrelevant when it comes to risk. If, for example, a company exposes personally identifiable customer information it didn’t realize was stored in its systems, “we didn’t know we had it” is an excuse that won’t mitigate legal responsibility (and will only exacerbate reputational damage). Legal and regulatory risks, however, are not the only issues organizations need to be concerned about when their data is in the dark.
Hidden proprietary data or intellectual property can cause problems when it is inadvertently disclosed, so enterprises need to know which files hold such information, who is accessing those files, and how often. The capability to access that information is not just about avoiding risk; it’s also a matter of opportunity cost. Imagine what line-of-business users in human resources, finance, marketing, sales, and virtually every other department could achieve if they had the means to shed light on dark data.
How can companies illuminate dark data?
Alleviating the dark data challenge begins with ensuring that newly created data doesn’t go dark in the first place. Anytime an employee stores data, she (or the IT team charged with supporting her) should ask three questions:
- Is it accessible?
- Is it searchable?
- Is it usable?
IT can help ensure “yes” answers to those questions by putting enough structure in place to analyze data as it is created, so users can surface actionable insights and spot issues before they become problems.
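As a rough illustration of what "enough structure" might mean in practice, the sketch below catalogs the files in a shared folder and records a few basic facts about each one, so that someone can later ask what exists and whether it can be read. It is a minimal example under stated assumptions, not a description of any particular product: the function names `catalog_files` and `find_by_type` and the record fields are hypothetical, and a real deployment would also need to capture ownership, access history, and file contents.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def catalog_files(root):
    """Walk `root` and record basic metadata for every file found.

    Hypothetical helper: the record fields below are illustrative only.
    """
    records = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        stat = path.stat()
        records.append({
            "path": str(path),
            # Lowercased extension, used as a crude stand-in for file type.
            "type": path.suffix.lower() or "(none)",
            "size_bytes": stat.st_size,
            "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
            # "Accessible" here only means the current user can read the file.
            "accessible": os.access(path, os.R_OK),
        })
    return records


def find_by_type(records, suffix):
    """Return catalog entries whose file extension matches `suffix`."""
    return [r for r in records if r["type"] == suffix.lower()]


if __name__ == "__main__":
    # Catalog the current directory and list any spreadsheets found.
    catalog = catalog_files(".")
    for entry in find_by_type(catalog, ".xlsx"):
        print(entry["path"], entry["size_bytes"], entry["modified"].date())
```

Even a simple catalog like this speaks to the first two questions (is it accessible, is it searchable). Whether the data is usable still depends on analyzing what the files actually contain.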
Bringing dark data into the light is a critical issue for businesses. When companies don’t know what they have, who created it, who accesses it, what it is used for, or when, they can’t protect against risk, and they can’t identify potentially valuable opportunities.
In an age when information is advantageous in every industry, it makes little sense for organizations to settle for doing business in the dark.
John Joseph is the president and a cofounder of DataGravity. Previously, he was vice president of marketing for storage solutions at Dell following its acquisition of EqualLogic, which he had joined in 2003.