Confidential computing is a potentially revolutionary technology in terms of its impact on data security. In confidential computing, data remains encrypted not just at rest and in transit but also in use, allowing analytics and machine learning (ML) to be performed on the data while maintaining its confidentiality. The capability to encrypt data in use opens up a wide range of real-world scenarios and has major implications for the future of data security.
VentureBeat spoke with Raluca Ada Popa about her research and work in developing practical solutions for confidential computing. Popa is an associate professor at the University of California, Berkeley, and she is also cofounder and president of Opaque Systems.
Opaque Systems provides a software offering for the MC2 open-source confidential computing project. It is aimed at companies that want to use this technology but may not have the technical expertise to work at the hardware level.
Confidential computing’s journey
Popa walked through the history of confidential computing, its mechanics and its use cases. The problems that confidential computing is designed to address have been around for decades, with different people working to solve them. She explained that as early as 1978, Rivest et al. acknowledged the privacy, confidentiality and functionality benefits that would stem from being able to compute on encrypted data, although they didn’t develop a practical solution at that time.
In 2009, Craig Gentry developed the first construction, an entirely cryptographic solution, called fully homomorphic encryption (FHE). In FHE, the data remains encrypted, and computation is performed on the encrypted data.
However, Popa explained that FHE was “orders of magnitude too slow” to enable analytics and machine learning, and, although the technology has since been refined, its speed is still suboptimal.
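To make the idea of computing on encrypted data concrete, here is a minimal Python sketch. It is not FHE and not anything Popa or Gentry built: it uses unpadded textbook RSA, which happens to allow multiplication (and only multiplication) on ciphertexts. The tiny primes and the function names are assumptions chosen for illustration, and the scheme as written is not secure.

```python
# Toy demonstration of computing on encrypted data with textbook RSA.
# Assumptions: tiny hard-coded primes and no padding, so this is NOT secure
# and NOT fully homomorphic; it supports multiplication only.

P, Q = 61, 53                       # illustrative primes, far too small for real use
N = P * Q                           # public modulus
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N with the public key."""
    return pow(m, E, N)


def decrypt(c: int) -> int:
    """Decrypt a ciphertext with the private key."""
    return pow(c, D, N)


def multiply_encrypted(c1: int, c2: int) -> int:
    """Multiply two plaintexts by touching only their ciphertexts."""
    return (c1 * c2) % N


if __name__ == "__main__":
    a, b = 7, 12
    c_product = multiply_encrypted(encrypt(a), encrypt(b))
    # The party doing the multiplication never saw 7 or 12.
    print(decrypt(c_product) == a * b)
```

The result is only correct while the product stays below the modulus. A fully homomorphic scheme supports both addition and multiplication on ciphertexts, which is what makes arbitrary computation possible and also what makes it so costly.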
A best of both worlds approach
Popa’s research combines hardware enclaves, a hardware advancement that emerged within the past few years, with cryptography to form a practical solution. Hardware enclaves provide a trusted execution environment (TEE) wherein data is isolated from software and from the operating system. Popa described the hybrid approach of combining hardware enclaves with cryptography as the best of both worlds. Inside the TEE, the data is decrypted, and computation is performed on this data.
“As soon as it leaves the hardware box, it’s encrypted with a key fused in the hardware…” Popa said.
“It looks like it’s always encrypted from the point of view of any OS or administrator or hacker…[and] any software that runs on the machine…only sees encrypted data,” she added. “So it’s basically achieving the same effect as the cryptographic mechanisms, but it has processor speeds.”
Combining hardware enclaves with cryptographic computation enables faster analytics and machine learning, and Popa said that for the “first time we really have a practical solution for analytics and machine learning on confidential data.”
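The flow Popa describes (decrypt inside the box, compute at normal speed, re-encrypt on the way out) can be sketched in plain Python. Everything here is a simulation under stated assumptions: `SimulatedEnclave`, `seal` and `unseal` are hypothetical names, the cipher is a toy built from SHA-256 and HMAC rather than a vetted scheme, and the key is handed to the enclave directly, whereas real enclaves rely on hardware keys and remote attestation. It does not use the SGX, SEV or Nitro interfaces.

```python
import hashlib
import hmac
import json
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a counter (illustrative only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and tag data so that it is opaque outside the enclave."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(p ^ k for p, k in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def unseal(key: bytes, blob: bytes) -> bytes:
    """Verify the tag, then decrypt; raises ValueError on tampering."""
    nonce, tag, body = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("sealed data was modified")
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ k for c, k in zip(body, stream))


class SimulatedEnclave:
    """Stand-in for a TEE: plaintext exists only inside run()."""

    def __init__(self, key: bytes):
        # Assumption: the key is passed in; real hardware never exposes it.
        self._key = key

    def run(self, sealed_input: bytes, fn) -> bytes:
        values = json.loads(unseal(self._key, sealed_input))  # decrypt inside
        result = fn(values)                                   # compute on plaintext
        return seal(self._key, json.dumps(result).encode())   # re-encrypt on exit


if __name__ == "__main__":
    key = os.urandom(32)
    enclave = SimulatedEnclave(key)
    sealed_in = seal(key, json.dumps([3, 5, 8]).encode())
    sealed_out = enclave.run(sealed_in, sum)   # the host handles only sealed bytes
    print(json.loads(unseal(key, sealed_out)))
```

The point of the sketch is the shape of the data flow: the untrusted host passes sealed bytes in and gets sealed bytes back, while the computation itself runs on ordinary plaintext, which is why this approach can reach processor speeds.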
Hardware enclave vendors compete
To develop and implement this technology, Popa explained that she and her team at UC Berkeley’s RISELab “received early access from Intel to its SGX hardware enclave, the pioneer enclave,” and during their research determined that “the right use case” for this technology is confidential computing. Today, in addition to Intel, several other vendors, including AMD and Amazon Web Services (AWS), have come out with their own processors with hardware enclave technology.
Some differences do exist among the vendors’ products, however, in terms of speed and integrity as well as user experience. According to Popa, Intel SGX tends to have stronger integrity guarantees, whereas the AMD SEV enclave tends to be faster.
She added that AWS’ Nitro enclaves are mostly based on software and do not have the same level of hardware protection as Intel SGX. Intel SGX requires code refactoring to run legacy software, whereas AMD SEV and Amazon Nitro enclaves are more suitable for legacy applications. Each of the three major cloud providers, Microsoft, Google and Amazon, has enclave offerings as well.
Hardware enclave technology is “very raw, they offer a very low-level interface,” she explained. To address this, Opaque Systems provides an “analytics platform purpose-built for confidential computing” designed to optimize the open-source MC2 confidential computing project for companies looking to make use of this technology to “facilitate collaboration and analytics” on confidential data. The platform includes multi-layered security, policy management, governance and assistance in setting up and scaling enclave clusters.
Confidential computing has the potential to change the game for access controls, as well. Popa explained that “the next step that encryption enables, is not to give access to just the data, but to some function result on it.” For example, not giving access “to [the] whole data, but only to a model trained on [the] data. Or maybe to a query result, to some statistic, to some analytics query based on [the] data.”
In other words, instead of giving access to specific rows and columns of data, access would be given to an aggregate, a specific kind of output, or a byproduct of the data.
“This is where confidential computing and encryption really comes into play… I encrypt the data and you do confidential computing, and compute the right function while keeping [the data] encrypted… and only the final result gets revealed,” Popa said.
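A rough sketch of this function-based access control, written in Python under loud assumptions: `FunctionGate` and the sample rows are hypothetical, and the rows are held in an ordinary Python object rather than encrypted inside an enclave. The sketch shows only the policy idea, in which callers name an approved function and receive its result, never the underlying rows.

```python
from statistics import mean
from typing import Callable, Dict, List

Row = Dict[str, float]


class FunctionGate:
    """Holds private rows and releases only the results of approved functions.

    Assumption: in a real deployment the rows would stay encrypted and this
    logic would run inside an enclave; here privacy is by convention only.
    """

    def __init__(self, rows: List[Row],
                 approved: Dict[str, Callable[[List[Row]], float]]):
        self._rows = rows
        self._approved = approved

    def query(self, name: str) -> float:
        """Run an approved function by name and return only its result."""
        if name not in self._approved:
            raise PermissionError(f"function {name!r} is not approved")
        return self._approved[name](self._rows)


if __name__ == "__main__":
    rows = [
        {"age": 34, "balance": 1200.0},
        {"age": 51, "balance": 800.0},
        {"age": 29, "balance": 1000.0},
    ]
    gate = FunctionGate(rows, {
        "mean_balance": lambda rs: mean(r["balance"] for r in rs),
        "row_count": lambda rs: len(rs),
    })
    print(gate.query("mean_balance"))   # an aggregate, not the rows
    try:
        gate.query("dump_rows")         # not on the approved list
    except PermissionError as err:
        print(err)
```

The design choice is that access is granted per function rather than per row or column, so the data owner decides in advance which statistics, queries or trained models may leave the protected environment.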
Function-based access control also has implications for ethics, because machine learning models could be trained on encrypted data without compromising any personal or private data or revealing any information that might lead to bias.
Real-world scenarios of confidential computing
Enabling companies to take advantage of analytics and machine learning on confidential data, and enabling access to data functions, together open up a wide range of possible use cases. The most significant of these are situations in which organizations that previously could not work together, due to the mutually confidential nature of their data, can now collaborate.
For example, Popa explained that, “traditionally, banks cannot share their confidential data with each other;” however, with its platform to help companies take advantage of confidential computing, Opaque Systems enables banks to pool their data confidentially while analyzing patterns and training models to detect fraud more effectively.
Additionally, she said, “healthcare institutions [can] pool together their patient data to find better diagnoses and treatment for diseases,” without compromising data protection. Confidential computing also helps break down walls between departments or teams with confidential data within the same company, allowing them to collaborate where they previously could not.
Charting a course
The potential of confidential computing with hardware enclaves to revolutionize the world of computing was recognized this summer when Popa won the 2021 ACM Grace Murray Hopper Award.
“The fact that the ACM community recognizes the technology of computing on encrypted data … as an outstanding result that revolutionizes computing … gives a lot of credibility to the fact that this is a very important problem, that we should be working on,” Popa said. Her research and work have provided a practical solution to that problem.
“It will help because of this confirmation for the problem, and for the contribution,” she said.