Navigating the data quality conundrum: How to ensure your data meets organizational needs

Presented by BMC

Poor data quality costs organizations an average of $12.9 million a year. Organizations are beginning to recognize that not only does it have a direct impact on revenue over the long term, but poor data quality also increases the complexity of data ecosystems, and directly impacts the success of AI, machine learning and analytics efforts. Turning insight into action can go sideways when that insight is based on bad information, causing a reputational risk and a hit on customer engagement.

"Organizations have multiple initiatives, all competing for the same budgetary pool for investment, so they often struggle with how much to invest in data quality, and then what aspects of data quality to invest in," says Ram Chakravarti, CTO at BMC Software. "The question then becomes, what is good enough? That’s the data quality conundrum."

But high-quality data is a critical differentiating factor in a global business and tech landscape that is endlessly evolving. It improves operational efficiencies, drives strategic decision-making, transforms customer satisfaction, powers sustainability initiatives, reduces risk and more. By removing hurdles to self-service data, IT professionals can clear a space for real innovation.

Unfortunately, a significant number of data and analytics programs fail due to poor data.

The hurdles to high-quality data

"Data quality is a killer when companies don’t take adequate time to ensure its quality before their models are trained," says Chakravarti. "There are multiple reasons why companies are sitting on what I’d call a considerable amount of poor data."

The top reasons come down to inadequate data strategy, poor governance and a lack of collaboration between teams. Insufficient investment in data as an asset, and no business oversight on real-world data, are challenges that can often lead to shiny toy syndrome, in which decision-makers are overly focused on the latest technology at the expense of delivering value or organizational readiness. This can also result in an unclear approach to measuring success, hampering a team's trajectory and goal-setting.

Technical challenges abound, like the erroneous transformation of data as it moves from systems of record to systems of engagement and systems of insight. Projects face obstacles in deploying to production at scale while maintaining flexibility, speed and customization, especially when the limited talent pool in the industry often leaves data analytics teams scrambling to find members with the crucial combination of internal domain knowledge and deep technical knowledge.

Perhaps the most relevant issue in terms of AI and machine learning is that the established, traditional approaches to data quality do not lend themselves well to newer types of data, such as big data, streaming data, and the types of data used as inputs for advanced analytics. The data pipelines required to feed data and analytics projects have become too complex to manage manually. But the market for data analytics tools is fragmented, and the holy grail for data quality issues doesn't exist, making data quality a continuous, ongoing investment.

Operationalizing and monetizing data requires strategy and spend, and organizations that want to accelerate data-driven activities must also be able to estimate the potential business impact of—and justify the long-term investment in—data-supporting technologies, processes, and staffing.

How DataOps is eliminating obstacles to data quality

The struggle to refine, define, ensure reliability and suitability, and make use of the volumes of data collected, is growing. To kneecap the Goliath that is data quality challenges, companies need to address three areas: connecting the exploding number of data sources; addressing challenges in deploying to production at scale; and improving collaboration among teams.

It comes down to managing increasingly complex data pipelines so engineering and data teams can deliver reliable, high-quality data to the people who need it, when they need it. That's where DataOps, the application of agile engineering and DevOps best practices to data management, comes in to support and advance data-driven business outcomes. It helps organizations rapidly turn new insights into the kind of fully operationalized production deliverables that unlock business value from data. DataOps is critical for holistic, enterprise-wide use of data and its analysis for business insights and, ultimately, to deliver value back to the organization.

Automation, observability and orchestration are key to the success of DataOps. Automation reduces or eliminates human errors and is necessary to scale up efforts around data use in an enterprise-wide, democratized data scenario. Meanwhile, observability and monitoring help catch and resolve any issues that make their way into the pipeline. Orchestrators facilitate the ingestion, transformation, and delivery of accurate, reliable data, faster than ever.
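As a rough illustration of what automating a quality check inside a pipeline can look like, here is a minimal Python sketch. It is not a BMC product feature or any specific DataOps tool; the field names (`customer_id`, `amount`), the rules, and the 95% threshold are all assumptions chosen for the example.

```python
# Hypothetical schema: these field names are assumptions for illustration only.
REQUIRED_FIELDS = ("customer_id", "amount")


def check_record(record: dict) -> list[str]:
    """Return the names of the quality rules this record violates."""
    problems = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing:{field}")
    # Validity: a numeric amount must not be negative (assumed business rule).
    amount = record.get("amount")
    if isinstance(amount, (int, float)) and amount < 0:
        problems.append("negative:amount")
    return problems


def run_quality_gate(records: list[dict], min_pass_rate: float = 0.95) -> dict:
    """Check a batch and report whether it may continue down the pipeline."""
    failures = {}
    for index, record in enumerate(records):
        problems = check_record(record)
        if problems:
            failures[index] = problems  # kept for observability and debugging
    total = len(records)
    pass_rate = 1.0 if total == 0 else (total - len(failures)) / total
    return {
        "total": total,
        "failed": len(failures),
        "pass_rate": pass_rate,
        "passed": pass_rate >= min_pass_rate,
        "failures": failures,
    }


if __name__ == "__main__":
    batch = [
        {"customer_id": "a", "amount": 10},
        {"customer_id": "", "amount": 5},
        {"customer_id": "c", "amount": -1},
        {"customer_id": "d", "amount": 3},
    ]
    print(run_quality_gate(batch))
```

In this sketch an orchestrator would call `run_quality_gate` after each ingestion step and halt the batch when `passed` is false, while the `failures` detail feeds monitoring.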

Implementing successful data quality management

For DataOps to succeed, you first need executive buy-in and support, Chakravarti says. Without it, your initiative is already sunk. Second, it cannot be an IT-led effort. It requires business ownership of, and accountability for, data assets: executive-level business owners, business data stewards, and a clear governance structure for each of your necessary data sets, because not all data is alike.

Data assets should be classified based on how valuable they are, or in other words, how critical they are to targeted initiatives. That forms the basis, then, for mapping the business value of data, as well as the business impact of data quality improvement, for each of the data assets under consideration.

Once this is accomplished, form a list of initiatives aimed at improving data quality for each different data set, and set up a multi-horizon road map to systematically institute data quality improvements.

Execute on one or two quick wins in order to show value quickly, or if there's no low-hanging fruit, choose a challenge that could have a surprising amount of impact when it's solved. Once you’ve shown the quick wins, prioritize the use cases with the highest return on investment.
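The sequencing described above, a couple of quick wins first and then the highest-return use cases, can be sketched in a few lines of Python. This is only one possible reading of the approach: the `Initiative` fields, the simple ROI formula, and the four-week cutoff for a "quick win" are assumptions made for the example, and the estimates themselves would have to come from the business owners.

```python
from dataclasses import dataclass


@dataclass
class Initiative:
    """A candidate data quality improvement (all figures are estimates)."""
    name: str
    data_asset: str
    estimated_benefit: float  # assumed annual benefit, same currency as cost
    estimated_cost: float     # assumed to be greater than zero
    effort_weeks: int

    @property
    def roi(self) -> float:
        # Simple return on investment: net benefit divided by cost.
        return (self.estimated_benefit - self.estimated_cost) / self.estimated_cost


def build_roadmap(initiatives, quick_win_weeks=4, max_quick_wins=2):
    """Order initiatives: a few quick wins first, then the rest by ROI."""
    by_roi = sorted(initiatives, key=lambda item: item.roi, reverse=True)
    # A quick win here means low effort and a positive estimated return.
    quick_wins = [
        item for item in by_roi
        if item.effort_weeks <= quick_win_weeks and item.roi > 0
    ][:max_quick_wins]
    remaining = [item for item in by_roi if item not in quick_wins]
    return quick_wins + remaining


if __name__ == "__main__":
    candidates = [
        Initiative("Deduplicate customers", "crm", 300, 100, 20),
        Initiative("Fix address formats", "crm", 150, 100, 2),
        Initiative("Validate order totals", "orders", 400, 100, 3),
    ]
    for item in build_roadmap(candidates):
        print(item.name, round(item.roi, 2))
```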

"Execute in small steps, because different parts of the organization have different levels of readiness and acceptance," Chakravarti says. "Estimate the value at risk for the organization if something were to go wrong on account of poor data quality. If you put it in those terms, such as reputational risk, customer risk or revenue impact, it becomes relatively easy to get executive support. Taking the time to illustrate that in use cases for specific types of data can go a long way in securing executive support and buy-in."

Dig deeper: Learn more here about overcoming the challenges to high-quality data.

Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.
