This article is part of the Technology Insight series, made possible with funding from Intel.
Using the public cloud is like swimming in the ocean. Both are vast resources filled with potential and peril. Without proper precautions, even experts can be attacked and drown.
Despite these dangers, organizations increasingly rely on the public cloud to integrate multiple data sources for analytics. One big draw: seemingly bottomless trenches of data to help develop and train machine learning systems.
Old challenge, new twist
Many CISOs, CSOs, and CIOs continue to struggle to protect data from sophisticated cross-cloud orchestration and cross-tenant attacks, among others. It’s a modern variation of a familiar challenge: balancing security and privacy with usability. While placing and processing intellectual property on shared servers is fraught, experts say the risk can and must be managed.
That’s the aim of a new cross-industry effort, the Confidential Computing Consortium. Founded in 2019, the collaboration operates within The Linux Foundation. Its mission is defining and promoting adoption of confidential computing, which protects sensitive data within system memory, a new favored target for attackers. Backers include industry heavyweights Alibaba, ARM, Baidu, Google Cloud, IBM, Intel, Microsoft, Red Hat, and Tencent.
- Greater reliance on public clouds and edge for analytics and AI is driving the need for new, stronger security.
- The new Linux Foundation Confidential Computing Consortium is working to devise secure computing enclaves for in-use data.
- Microsoft, Google, Red Hat, and Intel are building confidential computing compatibility and tools.
New battleground: Data in use
Supporters say confidential computing helps keep data useful without sacrificing privacy and regulatory compliance.
Consider genomics, where researchers must process genome databases of well over 1TB. That data likely arrives encrypted, containing DNA information and the patient’s personal data. If the analytics application runs in a secure enclave, data can be decrypted safely. Personal metadata remains unviewable, even as needed data is processed.
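The genomics flow described above can be pictured with a small simulation. This is only a sketch under loud assumptions: the "enclave" is an ordinary Python closure (no hardware isolation is involved), the cipher is a toy SHA-256 keystream that stands in for real encryption and must not be used for actual data, and the record fields, variant labels, and function names are all hypothetical.

```python
import hashlib
import json
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy SHA-256 counter-mode keystream. Illustration only, NOT real cryptography."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Prefix a random nonce, then XOR the plaintext with the keystream."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Inverse of toy_encrypt: split off the nonce and XOR again."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def make_enclave(key: bytes):
    """Return the single entry point of a simulated enclave that holds `key`."""

    def count_variant_carriers(encrypted_records: list[bytes], variant: str) -> int:
        carriers = 0
        for blob in encrypted_records:
            # Decryption happens only inside this function (the "enclave").
            record = json.loads(toy_decrypt(key, blob))
            if variant in record["variants"]:
                carriers += 1
        # Only the aggregate count leaves; names and raw records do not.
        return carriers

    return count_variant_carriers


# Data owner side: encrypt hypothetical patient records before upload.
key = secrets.token_bytes(32)
patients = [
    {"name": "Patient A", "variants": ["VAR_1", "VAR_2"]},
    {"name": "Patient B", "variants": ["VAR_2"]},
    {"name": "Patient C", "variants": ["VAR_3"]},
]
encrypted = [toy_encrypt(key, json.dumps(p).encode()) for p in patients]

# Cloud side: the host sees only ciphertext plus the aggregate answer.
enclave_count = make_enclave(key)
result = enclave_count(encrypted, "VAR_2")
print(result)  # 2
```

In a real deployment the key would be released to the enclave only through the TEE's own provisioning mechanisms, which this sketch does not model.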
Similar treatment might be given to stock trading data, banking transactions, blockchain transactions (as opposed to group validation), and healthcare information. Any data in which privacy must be maintained during aggregation can benefit.
According to proponents, confidential computing offers great promise for safely running applications on public clouds and on the edge.
Confidential computing lets untrusted third parties collaborate with data without providing visibility into it. Proponents say that could enable much broader and deeper partnerships between companies and institutions worldwide.
First areas of focus
The Confidential Computing Consortium “will bring together hardware vendors, cloud providers, developers, open source experts and academics to accelerate the confidential computing market; influence technical and regulatory standards; and build open source tools that provide the right environment for TEE development.” The organization will also anchor industry outreach and education initiatives.
Key projects include:
- Intel Software Guard Extensions (Intel SGX) Software Development Kit, designed to help application developers protect select code and data from disclosure or modification at the hardware layer using protected enclaves.
- Microsoft Open Enclave SDK, an open source framework that allows developers to build Trusted Execution Environment (TEE) applications using a single enclave abstraction. Developers can build applications once that run across multiple TEE architectures.
- Red Hat Enarx, a project providing hardware independence for securing applications using TEEs.
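The "single enclave abstraction" idea in the list above, where an application is written once and runs on several TEE architectures, can be sketched as one interface with interchangeable backends. The sketch below is an assumption-laden illustration, not the Open Enclave or Enarx API: `EnclaveBackend`, both backend classes, and `sum_readings` are hypothetical names, and the backends merely call the function in-process instead of entering real hardware enclaves.

```python
from typing import Callable, Protocol


class EnclaveBackend(Protocol):
    """Hypothetical common interface that every TEE backend would implement."""

    name: str

    def run(self, trusted_fn: Callable[[bytes], bytes], payload: bytes) -> bytes:
        ...


class SimulatedSgxBackend:
    """Stand-in for an SGX-style backend; no real enclave is created."""

    name = "simulated-sgx"

    def run(self, trusted_fn: Callable[[bytes], bytes], payload: bytes) -> bytes:
        return trusted_fn(payload)


class SimulatedOtherTeeBackend:
    """Stand-in for a second, different TEE architecture."""

    name = "simulated-other-tee"

    def run(self, trusted_fn: Callable[[bytes], bytes], payload: bytes) -> bytes:
        return trusted_fn(payload)


def sum_readings(payload: bytes) -> bytes:
    """Application logic, written once against the abstraction."""
    values = [int(v) for v in payload.decode().split(",")]
    return str(sum(values)).encode()


def run_everywhere(backends: list[EnclaveBackend], payload: bytes) -> dict[str, bytes]:
    """Run the same trusted function on every available backend."""
    return {b.name: b.run(sum_readings, payload) for b in backends}


results = run_everywhere([SimulatedSgxBackend(), SimulatedOtherTeeBackend()], b"3,4,5")
print(results)  # {'simulated-sgx': b'12', 'simulated-other-tee': b'12'}
```

The point of the design is that `sum_readings` never mentions a specific architecture; only the backend objects would change between platforms.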
No place for data to hide
It’s hardly news that the public cloud remains beset with predators hungry for data at rest, in motion, and in use. Yahoo suffered major security breaches in 2013 and 2014, with more than one billion user records stolen. Apple’s iCloud hack exposed private celebrity files to public scrutiny in 2014. And of course, Cambridge Analytica illicitly scraped more than 80 million Facebook profiles prior to the 2016 election without users’ consent.
More recently, F5 Networks noted a 40% uptick in attacks, including campaigns against vBulletin servers and Oracle WebLogic servers. Moreover, threats to supervisory control and data acquisition (SCADA) systems in industrial settings as well as Internet of Things (IoT) device exploits are also rising, the firm says.
So why this effort now? The short answer: Current protective measures need to keep evolving for a cloud-converged, data-hungry world. With as-a-service options for applications and infrastructure continuing to gain popularity, more organizations need to protect more public data and intellectual property. Stringent new data privacy regulations like GDPR are another big factor. No matter how secure the application, data can still land in inquiring hands.
Consider how, in 2018, the U.S. enacted the Clarifying Lawful Overseas Use of Data (CLOUD) Act. It required U.S. data providers to preserve and provide any data subpoenaed by U.S. courts, even if that data is located abroad. The law works both ways; providers like Google and Microsoft must detail how they adhere to treaties that provide user data to governments outside the United States. Yet subpoenas aside, rogue administrators can still expose confidential data.
Hardware locks the castle
Consensus is growing that software alone cannot handle the growing complexity of these modern threats and demands. The thinking is this: If hardware is the ground under the server’s castle, security-hardened hardware presents attackers with tunnel-proof bedrock.
Hardware-based security also enables assistance from silicon-level accelerators to relieve the CPU from having to shoulder such burdens through software, thus improving system efficiency.
The industry has worked to enable hardware-based security for many years. The Trusted Computing Group’s Trusted Platform Module (TPM) specification, for example, became an international standard (ISO/IEC 11889) in 2009. The next year, Intel debuted AES-NI encryption and decryption instructions on Westmere-generation processors. Such measures have done a good job of protecting data at rest and data in flight from one location to another.
However, one serious weakness remained: data in use, meaning data being handled in system memory. That’s where the Confidential Computing Consortium is focusing major efforts.
Establishing trusted execution spaces
SGX (Software Guard Extensions) provides a framework for creating secure “enclaves” within RAM. These are invisible to the rest of the system (and thus to any other users or attackers), so data can be handled without risk of outside exposure. The capability is important because data in use, sitting in RAM, is almost always unencrypted, which leaves it vulnerable.
Much of today’s infrastructure stack is prone to attack from nefarious agents. SGX and trusted enclaves operate within system memory, out of sight and beyond access from would-be intruders.
With trusted enclaves, “data and operations are isolated and protected from any other software, including the operating system and cloud service stack,” explained Lorie Wigle, a VP in Intel’s architecture, graphics and software group, in a blog post. “Combined with encrypted data storage and transmission methods, TEEs can create an end-to-end protection architecture for your most sensitive data.”
SGX enclaves provide a secure zone within RAM for handling this otherwise unencrypted data. However, using SGX directly requires applications to be custom-coded for it.
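The custom coding mentioned above typically means splitting an application into a small trusted part that holds the secrets and an untrusted host that can only call a few narrow entry points (the Intel SGX SDK calls such entry points ECALLs). The following is a minimal sketch of that partitioning, assuming a Python closure as a stand-in for the enclave boundary. A closure offers no real isolation, and `load_trusted_part` and the entry-point names are hypothetical.

```python
import hashlib
import hmac
import secrets


def load_trusted_part():
    """Simulated enclave: the key is created inside and never returned."""
    signing_key = secrets.token_bytes(32)

    def ecall_sign(message: bytes) -> bytes:
        # The host gets back only the tag, never the key itself.
        return hmac.new(signing_key, message, hashlib.sha256).digest()

    def ecall_verify(message: bytes, tag: bytes) -> bool:
        # Constant-time comparison against a freshly computed tag.
        return hmac.compare_digest(ecall_sign(message), tag)

    # Only these two entry points cross the trust boundary.
    return {"sign": ecall_sign, "verify": ecall_verify}


# Untrusted host code: it works with messages and tags, not with the key.
enclave = load_trusted_part()
tag = enclave["sign"](b"transfer 100 units")
ok = enclave["verify"](b"transfer 100 units", tag)
tampered = enclave["verify"](b"transfer 900 units", tag)
print(ok, tampered)  # True False
```

Deciding which code and data belong on the trusted side is the redesign work that SGX-ready runtimes aim to spare unmodified applications.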
Confidential computing provides trusted execution environments (TEEs), here meaning secure enclaves via SGX. But it does so in a way that lets unmodified applications run inside SGX by using certain SGX-ready containers (such as Graphene, SCONE, or Haven).
Open Enclave, open source
As the Open Enclave group defines it on their site, “Open Enclave SDK is an open source SDK targeted at creating a single unified enclaving abstraction for developers to build Trusted Execution Environment (TEEs) based applications.” Google’s Asylo and Red Hat’s Enarx provide similar frameworks and SDKs. The common aim of all these projects is to make the cloud more secure.
“Software developed through this consortium is critical to accelerating confidential computing practices built with open source technology and Intel SGX,” said Intel’s Imad Sousou, corporate vice president and general manager, system software products, in a Linux Foundation statement. “Combining the Intel SGX SDK with Microsoft’s Open Enclave SDK will help simplify secure enclave development and drive deployment across operating environments.”
Azure shows the way
To get a sense of how confidential computing is already impacting the cloud and application development, look to Microsoft’s Azure confidential computing efforts.
Even before the Confidential Computing Consortium kicked off, Microsoft CTO Mark Russinovich noted in a May 9, 2018 blog post how his company deployed SGX-enabled Xeon processors in its East US Azure region for customers needing trusted execution enclaves.
Microsoft enclave APIs allowed developers to build and deploy C/C++ applications for trusted execution. And Microsoft Research worked with Azure to prevent any possible trusted execution data leaks. Microsoft has also announced Confidential Computing for Kubernetes, IPv4/IPv6 dual-stack, and KEDA 1.0.
Clearly, these are still early days for trusted execution, but the building blocks are cementing into place.
Organizations that need cloud-based data aggregation and collaboration, or the ability to trust the cloud as a platform for secure computing, seem likely to embrace new confidential computing technologies. That will enable participants in the coming waves of analytics and AI to advance with less worry about the safety and privacy of public data and intellectual property.