How open, trusted edge can help improve data sharing and monetization

This article is part of the Technology Insight series, made possible with funding from Intel.

Data is valuable only insofar as you can trust it. If you can’t be confident about its origin or contents, then the information isn’t worth much. That's a big problem for businesses eyeing the 5G future and developing strategies for monetizing data generated at the edge. Project Alvarium, formed under the Linux Foundation, aims to help organizations disrupt today's edge business model by quantifying the privacy, accuracy, and security of data flowing into their networks using trust fabrics.

What's the problem?

Applications analyzing data from within a datacenter rely on certain hardware and software technologies as they move information around. The technologies are organized into trusted stacks and overseen by trained security experts. “Those stacks are highly proprietary and homogeneous,” explained Steve Todd, a Fellow in Dell’s IoT and edge computing solutions division. “They’re in one location and often run on equipment provided by one vendor.”

In contrast, the edge is immensely heterogeneous. Data crosses multiple networks from multiple vendors across multiple locations with different firewalls and configurations. So, an enterprise stack isn’t able to deliver the same level of trustworthiness to data from the edge. But its underlying principles can still be used to deploy applications against trusted data coming from edge devices.

Key points

Before data generated at the edge can be shared and monetized, it must be trustworthy.
Project Alvarium focuses on building trust fabrics to quantify the trust and confidence in data spanning heterogeneous systems.
The LF Edge umbrella includes other projects to enable an open, trusted edge.

Understanding data confidence fabrics

Some of the trust principles borrowed from enterprise storage systems are bound together in an open framework to become a data confidence fabric (DCF). Each trust insertion technology contributes to an overall confidence score which in turn allows organizations to act with measured risk across heterogeneous systems.
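
As a rough illustration of how individual trust insertion steps could roll up into one confidence score, the sketch below uses a simple weighted average. The class name, the weights, and the formula are assumptions made for this example; they are not taken from Project Alvarium, whose scoring algorithms are not detailed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustInsertion:
    """One trust insertion step applied to a record (hypothetical model)."""
    name: str
    weight: float   # relative importance, chosen by the organization
    passed: bool    # whether the check succeeded for this record


def confidence_score(insertions: list[TrustInsertion]) -> float:
    """Return a score in [0, 1]: the weighted share of checks that passed."""
    total = sum(i.weight for i in insertions)
    if total == 0:
        return 0.0
    earned = sum(i.weight for i in insertions if i.passed)
    return earned / total


# Illustrative checks; names and weights are invented for the example.
record_checks = [
    TrustInsertion("hardware_signature", weight=3.0, passed=True),
    TrustInsertion("authenticated_ingest", weight=2.0, passed=True),
    TrustInsertion("provenance_metadata", weight=1.0, passed=True),
    TrustInsertion("immutable_storage", weight=2.0, passed=False),
    TrustInsertion("ledger_registration", weight=2.0, passed=False),
]

score = confidence_score(record_checks)
print(f"confidence: {score:.2f}")  # 6.0 / 10.0 -> confidence: 0.60
```

An application could then compare the score against a threshold that matches its risk tolerance, accepting lower-scored data for dashboards but requiring a higher score before automated decisions.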

“The concept of a trust fabric will increasingly become critical in order to make reliable and non-damaging business decisions due to the ever-increasing volume and velocity of edge data, as well as the increasing risk of tainted data going undetected,” said Michael Morton, chief technology officer at Boomi.

Project Alvarium doesn’t reinvent the trust insertion technologies that make up a DCF. Rather, the project focuses on system-level trust, unifying existing and emerging technologies under a framework with open APIs to create refined confidence scoring algorithms.

Why does the industry need open trust fabrics?

Diversity at the edge seems to be the biggest reason to adopt an open solution. Today, you can deploy a Microsoft, AWS, or Google IoT stack and, from the sensor all the way up to their clouds, build a single platform. But then you’re locked in. Gartner says 75% of enterprise-generated data will be processed outside the traditional datacenter by 2022 -- up from less than 10% today -- reflecting the move to more distributed data. Given the amount of information we’re talking about, it’s hard to tell if any one solution is the best place to store it all permanently. An open edge makes it possible to normalize across a common infrastructure in a way that all parties can trust.

“We can give you open, interoperable data ingestion frameworks like EdgeX Foundry,” said Todd. “You can use open source storage systems like IPFS. And you can inexpensively build a data pool you keep close to your sensors.”

If it makes sense to move data over to Azure or AWS or Google and run analytics or some special application down the road, you can always do that. Otherwise, storing valuable data close to where it’s generated with open technologies gives you more options in the future.

An open trust fabric makes it easier for you to share or even monetize data across heterogeneous systems. You can define a policy for sharing data and services without sacrificing privacy. And because the DCF can include provenance metadata, the owner of a distributed record can trigger its deletion, helping satisfy compliance requirements.

Inserting trust along the way

Dell Technologies built the first DCF for inserting trust and calculating confidence in edge data’s credibility, and the code for that prototype will be used to seed Project Alvarium. The company’s model is representative of a robust DCF, though there’s nothing stopping you from rolling out a DCF using different ingredients. In fact, the project’s overview makes it clear there is no one DCF to rule them all; each organization can build its own fabric with technologies from the Alvarium framework.

Ideally, you want to start trust insertion as close to the data source as possible. The first level in Dell’s example was a hardware root of trust on the edge device itself. Specifically, Dell used a Trusted Platform Module (TPM) on its Intel Atom-powered Edge Gateway 3000 to sign simulated data, which was then validated by a modified version of the EdgeX Foundry platform. At the second level, EdgeX was configured to reject all requests for data except for those coming from the DCF client software.
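
The sign-then-validate pattern at the first level can be sketched in a few lines. This is only a stand-in: it uses an HMAC with a shared software key from Python's standard library, whereas a TPM would typically sign with a private key that never leaves the hardware. The key, function names, and sample reading are all assumptions for illustration.

```python
import hashlib
import hmac
import json

# Stand-in for a key that would stay inside the TPM in a real deployment.
DEVICE_KEY = b"demo-key-not-for-production"


def sign_reading(reading: dict, key: bytes = DEVICE_KEY) -> str:
    """Sign a sensor reading over its canonical JSON form."""
    payload = json.dumps(reading, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_reading(reading: dict, signature: str, key: bytes = DEVICE_KEY) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_reading(reading, key)
    return hmac.compare_digest(expected, signature)


reading = {"sensor": "temp-01", "value": 21.5}
signature = sign_reading(reading)

print(verify_reading(reading, signature))   # True: data matches its signature
tampered = {**reading, "value": 99.9}
print(verify_reading(tampered, signature))  # False: payload changed after signing
```

The ingestion layer plays the role of the verifier here: a reading that fails validation would either be dropped or passed along with a lower confidence score.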

The third level of trust appended metadata to sensor information coming from the edge. That metadata could include policies for sharing the data, a history of where the data came from, and a confidence score.
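
One way to picture that metadata is as an envelope wrapped around the sensor payload. The schema below is hypothetical: the field names, the policy string, and the way the score accumulates are assumptions for this sketch rather than the format Dell's prototype used.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AnnotatedReading:
    """Sensor payload plus metadata a trust fabric might attach (hypothetical schema)."""
    payload: dict
    sharing_policy: str
    provenance: list[str] = field(default_factory=list)
    confidence: float = 0.0

    def record_hop(self, handler: str, confidence_gain: float) -> None:
        """Log which component handled the data and raise the score, capped at 1.0."""
        self.provenance.append(handler)
        self.confidence = min(1.0, self.confidence + confidence_gain)


annotated = AnnotatedReading(
    payload={"sensor": "temp-01", "value": 21.5},
    sharing_policy="internal-only",
)
annotated.record_hop("edge-gateway", 0.25)
annotated.record_hop("ingest-service", 0.25)

# The envelope travels with the data, so downstream applications can read
# the policy, the handling history, and the score alongside the payload.
print(json.dumps(asdict(annotated), indent=2))
```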

Next, the DCF was configured to store data locally using the open source InterPlanetary File System (IPFS). Keeping data closer to its source enables lower-latency real-time analytics. An immutable storage system also ensures data isn’t tampered with, increasing its trust score. A record of the edge data was then registered into VMware’s open source blockchain for tracking. The multi-cloud distributed ledger ensured no single entity owned the trust.
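
The two ideas in this step, content-addressed storage and an append-only ledger, can be approximated with plain hashing. The sketch below is a simplified model: the identifier is a bare SHA-256 digest rather than a real IPFS content identifier, and the in-memory hash chain only hints at what a distributed ledger does. All class and function names are assumptions.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the content itself, so any change yields a new ID."""
    return hashlib.sha256(data).hexdigest()


class Ledger:
    """Minimal in-memory hash chain; each entry commits to the one before it."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def register(self, cid: str) -> dict:
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else self.GENESIS
        entry_hash = hashlib.sha256((prev_hash + cid).encode()).hexdigest()
        entry = {"cid": cid, "prev_hash": prev_hash, "entry_hash": entry_hash}
        self.entries.append(entry)
        return entry

    def is_intact(self) -> bool:
        """Walk the chain and confirm every link still matches its recorded hash."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            expected = hashlib.sha256((prev_hash + entry["cid"]).encode()).hexdigest()
            if entry["prev_hash"] != prev_hash or entry["entry_hash"] != expected:
                return False
            prev_hash = entry["entry_hash"]
        return True


ledger = Ledger()
ledger.register(content_id(b'{"sensor": "temp-01", "value": 21.5}'))
ledger.register(content_id(b'{"sensor": "temp-01", "value": 21.7}'))
print(ledger.is_intact())  # True

# Rewriting an earlier record breaks every hash that depends on it.
ledger.entries[0]["cid"] = content_id(b'{"sensor": "temp-01", "value": 99.9}')
print(ledger.is_intact())  # False
```

In a real deployment the chain would be replicated across several parties, which is what prevents any single one of them from quietly rewriting history.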

One step closer to an open, trusted edge

The quest to make data sharable across boundaries is fraught with challenges. Trust is only one of them. Project Alvarium sets forth to annotate and label data as soon as it’s born so your application knows where it came from and how it was handled. But there are several other projects under the LF Edge umbrella formed to facilitate a more open, secure edge.

EdgeX Foundry, for instance, was used in the DCF prototype for data ingestion. According to Todd, “EdgeX Foundry gives us an open way to write the plug-ins that you need to connect to different sensors and protocols, to normalize data, and to treat it as an asset. We call that the open edge.”

Other LF Edge projects include the Akraino Edge Stack, Baetyl, EVE, and Fledge. Each technology has a slightly different focus and is backed by a community pushing that area of functionality forward. Because their APIs are open, you can mix LF Edge projects with commercial offerings if you find a solution that works better for your application.

It’s probably safe to assume that many organizations will want to use a combination of open and commercial elements for now. Dell says its DCF seed code will be donated in early 2020. And Alvarium still has to make ties to other Linux Foundation projects (like the Confidential Computing Consortium). But the project’s trust insertion technologies can already be deployed to improve the security and privacy of your heterogeneous edge environment, so now’s a good time to dig into the project’s mission and scope.

Resource: The Linux Foundation's Open Glossary of Edge Computing