3 trends driving data observability

Enterprise "data observability" is a hot space right now.

Over the past couple of months, investors have pumped $200 million into each of Cribl and Grafana Labs, two data observability startups, and lesser amounts into related companies like Acceldata and DeepFactor.

What's behind this frenzy?

Well, enterprise data systems are like a busy family household. From room to room, you have a complex ebb and flow of activity, with people coming and going, and doors opening and closing. Various inbound streams from utilities make it all go: water flowing through pipes, electricity and Wi-Fi powering everything, and warm or cool air flowing through the vents.

The difference is that in the enterprise, the data deluge is increasing at an unprecedented rate.

At home, as in the enterprise, it's easy to take this complexity for granted day-to-day, but when something goes haywire, life can instantly grind to a halt. At home, this is why we have modern conveniences such as smart thermostats, connected appliances, and webcam security systems. These gadgets let us monitor what's going on in the home, be it a dead lightbulb or an unwanted intruder -- and then try to rectify the problem.

This ability to monitor and understand the system is the reason why data observability is one of the hottest topics in enterprise IT at the moment. To be clear, here is what we're discussing:

Monitoring: solutions that allow teams to watch and understand what is happening in their data systems, based on gathering predefined sets of metrics or logs.
Observability: solutions that allow teams to understand why changes are happening in their systems, including answering questions that may not have been previously asked or thought of.

The home analogy is what Clint Sharp, cofounder and CEO of data observability company Cribl, sometimes uses while trying to explain data observability in relatable terms.

"Observability is the ability to ask and answer questions of complex systems, including questions I may not have planned in advance," Sharp said, likening observability tools to a thermostat that will notify you if the temperature in your home suddenly goes dramatically higher or lower than expected.

"A harder question to answer is: Why did the temperature go awry?" Sharp said. "That can be a difficult thing to diagnose, especially if I'm doing it on a modern application with dozens of developers working on it and all kinds of complex interactions."

Data observability is about the 'why'

The "why" part is what data observability is all about, and it's what sets it apart from simply monitoring for problems -- meaning the "what" -- within IT infrastructure and data systems. During the last few years, enterprises have begun shifting from mere data monitoring to data observability, and the trend is only now beginning to pick up steam.

By 2024, enterprises will increase their adoption rate of observability tools by 30%, according to research firm Gartner. And 90% of IT leaders say that observability is critical to the success of their business, with 76% saying they expect to see their observability budgets increase next year, according to New Relic's 2021 Observability Forecast.

This is good news for people such as Cribl's Sharp, whose startup is just one of many players in this fast-growing ecosystem. For its part, Cribl offers a centralized observability infrastructure that can plug into a vast array of data sources and observability tools. There are plenty of them out there: Splunk, Acceldata, Monte Carlo, Bigeye, and Databand are just a handful of the companies focused on data observability at the moment.

Data observability is a fast-growing discipline in the world of enterprise technology that seeks to help organizations answer one question: How healthy is the data in their system? With all the disparate (and often differently formatted) data flowing into, within, and out of enterprises, where are the potential weaknesses -- such as missing, broken, or incomplete data -- that could lead to a business-crippling outage?

Observability consists of five pillars

Good data observability includes:

Freshness, or how up-to-date the data tables are;
Distribution, or whether the data covers the correct range;
Volume, or the amount and completeness of data;
Schema, which monitors changes to data's structure;
Lineage, which identifies where data breaks and tells you which sources were impacted.
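
To make the five pillars concrete, here is a minimal Python sketch of what one check per pillar might look like. It is an illustration only, not how any vendor named in this article implements these checks: the table contents, column names, thresholds, and helper functions (check_freshness, check_volume, check_schema, check_distribution, upstream_sources) are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical expected structure of an "orders" table.
EXPECTED_SCHEMA = {"order_id": int, "amount": float}


def check_freshness(last_loaded, now, max_age):
    """Freshness: was the table updated recently enough?"""
    return now - last_loaded <= max_age


def check_volume(rows, min_rows):
    """Volume: did at least the expected number of rows arrive?"""
    return len(rows) >= min_rows


def check_schema(rows, expected):
    """Schema: does every row have exactly the expected columns and types?"""
    return all(
        set(row) == set(expected)
        and all(isinstance(row[col], typ) for col, typ in expected.items())
        for row in rows
    )


def check_distribution(rows, column, low, high):
    """Distribution: do all values in a column fall inside the accepted range?"""
    return all(low <= row[column] <= high for row in rows)


def upstream_sources(lineage, table):
    """Lineage: walk a table -> parents mapping to find every upstream source."""
    seen = set()
    stack = [table]
    while stack:
        for parent in lineage.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Made-up sample data: the second row has a negative amount.
rows = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": 2, "amount": -5.0},
]
now = datetime(2022, 1, 2, 12, 0)
report = {
    "freshness": check_freshness(datetime(2022, 1, 2, 9, 0), now, timedelta(hours=6)),
    "volume": check_volume(rows, min_rows=2),
    "schema": check_schema(rows, EXPECTED_SCHEMA),
    "distribution": check_distribution(rows, "amount", 0.0, 10_000.0),
}
lineage = {"daily_revenue": ["orders"], "orders": ["checkout_events"]}

print(report)
print(upstream_sources(lineage, "daily_revenue"))
```

In this made-up example only the distribution check fails, because of the negative amount, and the lineage walk shows that a problem in daily_revenue could originate in either orders or checkout_events.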

The cost of data outages can be enormous. From lost revenue and eroded customer confidence to reduced team productivity and morale, enterprises have a lot to lose when data pipelines break. As enterprise data systems grow more complex and multi-layered — with data flowing from a wide variety of sources and more people interacting with it — the need for observability is becoming increasingly urgent.

Good data observability is about more than just preventing a catastrophe. By applying observability best practices to their data stacks, enterprises can boost efficiency, speed up innovation, and even reduce IT costs by making it easier to optimize their data infrastructure and avoid unnecessary over-provisioning. It can even help with talent retention, as a well-oiled and problem-free environment keeps engineers and other team members happy.

It's no wonder enterprises are starting to take data observability seriously. So what's next for this up-and-coming space? Here are three major trends shaping the future of data observability.

Trend No. 1: AI supercharges data observability

Like many aspects of modern life, artificial intelligence is making its mark on enterprise data observability. In fact, many would argue that AIOps, the use of AI to automate and enhance IT operations, is an essential requirement for true observability. At a high level, machine learning and other AI technologies can help teams more easily analyze large, interconnected sets of data, automatically detecting problematic patterns and zeroing in on the root of issues when they do occur.

Observability platform company Monte Carlo, for example, uses AI models to identify patterns in query logs, trigger investigative follow-up results, and look for upstream dependency changes to determine the cause of a given issue. Observe.ai, an observability tool for call centers, uses natural language processing and automatic speech recognition to transcribe and analyze customer service phone calls, while automatically flagging repetitive patterns, data shifts, and anomalies.
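
As a rough illustration of automatic anomaly flagging, the Python sketch below marks any day whose row count sits far from the rest. It is a simple statistical stand-in, not the AI models that Monte Carlo or Observe.ai actually use; the function name find_anomalies, the threshold, and the sample counts are assumptions made for the example.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=3.0):
    """Return the indexes of values lying more than `threshold` standard
    deviations from the mean of the other points (leave-one-out)."""
    anomalies = []
    for i, value in enumerate(values):
        others = values[:i] + values[i + 1:]
        mu, sigma = mean(others), stdev(others)
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up daily row counts for one table; day 5 is a sudden drop.
daily_row_counts = [1000, 1020, 980, 1010, 995, 120, 1005]
print(find_anomalies(daily_row_counts))  # prints [5]
```

Comparing each point against the others, rather than against a mean that includes it, keeps a single extreme value from masking itself. Production systems typically also account for seasonality and trends, which this sketch ignores.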

Trend No. 2: data standardization helps observability evolve

There's a reason that the schema of data is one of the five pillars of observability. With data coming from so many sources and in different formats, it's no wonder that variances in the structure of those datasets can cause mismatches and other data problems.

So enterprises are pushing for standardization. For example, OpenTelemetry is a new, open source framework that aims to tame some of the data chaos and make observability easier across different platforms, pipelines, and data sources. Through its collection of open, vendor-neutral tools, SDKs, and APIs, OpenTelemetry gives organizations a standardized way to collect telemetry data — the metrics, traces, and logs that make up the heart of data observability — and easily route that data between various services and data analysis tools.
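
The general idea behind standardization can be sketched in a few lines of Python: adapters convert differently formatted inputs into one shared record shape that any downstream tool can read. This is not OpenTelemetry's API or data model. The TelemetryRecord class, its field names, and both input formats are hypothetical and exist only to illustrate the concept.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TelemetryRecord:
    """Hypothetical common shape; not OpenTelemetry's actual data model."""
    source: str
    kind: str          # "metric", "trace", or "log"
    name: str
    timestamp: float   # seconds since the Unix epoch
    value: object


def from_app_log(line):
    """Adapter for a made-up 'epoch|level|message' log line."""
    ts, level, message = line.split("|", 2)
    return TelemetryRecord("app", "log", level.lower(), float(ts), message)


def from_db_metric(payload):
    """Adapter for a made-up JSON metric with a millisecond timestamp."""
    data = json.loads(payload)
    return TelemetryRecord(
        "db", "metric", data["metric"], data["ts_ms"] / 1000, data["val"]
    )


records = [
    from_app_log("1640995200|ERROR|payment service timed out"),
    from_db_metric('{"metric": "rows_written", "ts_ms": 1640995260000, "val": 5120}'),
]

# Both sources now share one format that can be routed to any tool.
for record in records:
    print(json.dumps(asdict(record)))
```

Once every source emits the same shape, adding a new analysis tool means writing one consumer instead of one parser per source.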

Trend No. 3: data observability shifts further into the cloud

With more and more aspects of enterprise tech and operations happening in the cloud, it's no surprise that data observability would be shifting in that direction as well. Increasingly popular cloud data architectures such as Snowflake allow enterprises to store and use their data in the cloud, while data virtualization and visualization tools make it easier for teams to make sense of that data.

The cloud is also becoming a friendlier place for data observability itself. Cribl, for example, recently announced a new feature called LogStream Cloud Enterprise, which allows companies to move sensitive data processing to the cloud in a way that protects the security of local data using cryptographically secured, zero trust tunnels.