The General Data Protection Regulation (GDPR), the far-reaching personal data privacy regulation going into effect Friday, aims to give individuals more control over their personal information. Among other requirements, it establishes the right to erasure, that is, the right of an individual to request that any personal data about them be permanently deleted. This right, also known as the “right to be forgotten,” has caused some notable heartache in the blockchain community, as it appears to go against one of the fundamental tenets of blockchain: immutability of posted information.
So how do we reconcile this right with the structural element of blockchain that essentially makes it impossible to delete data once it is entered onto the chain? Given current technology, there appear to be only two leading viable solutions:
- Convincing regulators that “erasure” doesn’t have to mean data is literally deleted, and that making data permanently inaccessible without deleting it should produce the same effect.
- Figuring out a way to use blockchain while keeping sensitive data “off chain.”
The final workable solution might be some combination of the two.
Making data inaccessible
“Hashing” is one of the fundamental elements of blockchain and, in very abbreviated terms, means that data is transformed in such a way that it cannot be reverse-engineered into its original state. GDPR limits the definition of personal data to information that is linked or could be linked to a specific person. If data is completely anonymized so that it cannot be re-linked to a person, even with additional external information, it falls outside the definition of personal data and hence outside the scope of GDPR. Therefore, if all data that links to an individual is stored on a blockchain only in hashed form, one could argue that the existence of the hashes on the chain does not constitute a GDPR violation, because the data is sufficiently anonymized to fall outside the definition of personal data.
While that argument seems to have a reasonable basis, it is unclear whether it will hold up on its own; even if it is mathematically sound, it will need to be tested by the legal system. Some may assert that it is theoretically possible to obtain the underlying data through a “brute force attack,” in which an attacker makes an extremely large number of guesses about what the data could be until one guess is correct, thereby exposing personal data.
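To make the brute force concern concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the personal data is a 4-digit PIN and that only its plain SHA-256 hash is posted on chain; the PIN value and the function names are hypothetical. Because there are only 10,000 possible inputs, an attacker can hash every one of them and compare.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Return the SHA-256 digest of a string as a hex string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def brute_force(target_digest: str, candidates):
    """Hash each candidate until one matches the target digest."""
    for candidate in candidates:
        if sha256_hex(candidate) == target_digest:
            return candidate
    return None  # no candidate produced the target digest


# Hypothetical on-chain record: only the hash of the PIN is stored.
on_chain_digest = sha256_hex("4271")

# The attacker enumerates the whole (small) input space: 0000 to 9999.
all_pins = (f"{n:04d}" for n in range(10_000))
recovered = brute_force(on_chain_digest, all_pins)
```

The hash function itself is never "reversed" here. The attack works only because the set of plausible inputs is small enough to enumerate, which is often the case for personal data such as phone numbers or dates of birth.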
In fact, there is some guidance, particularly from the Article 29 Working Party, an independent European advisory body on data protection and privacy, stating that hashing is a form of “pseudonymization,” a technique for protecting personal data, rather than “anonymization,” a process that results in data so disconnected from the possibility of being linked to a specific individual that it is no longer considered personal data. This conclusion (Opinion 05/2014 on Anonymisation Techniques) seems to be based, at least partially, on the mathematical observation that hashing still leaves some small possibility of a successful brute force attack. Time will tell whether this conclusion is mathematically valid, but some would argue that the opinion did not cover the whole spectrum of hashing techniques (such as a “salted hash” with a secret key) and therefore comes to an incorrect conclusion.
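One way to read “salted hash with secret key” is as a keyed hash such as HMAC. The Python sketch below is an illustration under that assumption, not a description of any particular blockchain's scheme; the 4-digit value and the names are hypothetical. It shows that someone who does not hold the key cannot confirm guesses by hashing candidates on their own.

```python
import hashlib
import hmac
import secrets


def keyed_digest(secret_key: bytes, value: str) -> str:
    """Return HMAC-SHA256 of the value under the secret key, as hex."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Secret key held off chain by the data controller (assumption).
secret_key = secrets.token_bytes(32)

# Only the keyed digest of the hypothetical value "4271" goes on chain.
on_chain = keyed_digest(secret_key, "4271")

# The key holder can still check a value against the on-chain digest.
matches = hmac.compare_digest(on_chain, keyed_digest(secret_key, "4271"))

# An attacker without the key hashes every 4-digit value with plain
# SHA-256; none of those digests equals the keyed digest on chain.
plain_guesses = {
    hashlib.sha256(f"{n:04d}".encode("utf-8")).hexdigest() for n in range(10_000)
}
attacker_hit = on_chain in plain_guesses
```

Note that whoever holds the key can still re-link the digest to the person, so this sketch does not settle whether such data counts as anonymized or only pseudonymized. That remains the legal question discussed above.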
Similarly, one could argue that encryption offers a way to implement effective “deletion” of data that is workable on blockchain. Encrypting all personal data with a key and deleting the key in response to a request for erasure would render the data inaccessible to anyone, which in layman’s terms is the same as deletion. However, GDPR does not define what it means to “erase” something, and in the absence of a definition, the legal tendency is to revert to the literal reading of the word.
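The encrypt-then-delete-the-key idea can be sketched in a few lines of Python. Everything here is hypothetical: the `KeyStore` class, the list standing in for a blockchain, and the sample email address. To keep the example within the standard library, it uses a random one-time key XORed with the data as a stand-in cipher; a real system would more likely use an authenticated cipher such as AES-GCM.

```python
import secrets


class KeyStore:
    """Hypothetical off-chain, mutable store of one key per person."""

    def __init__(self):
        self._keys = {}

    def create(self, person_id: str, length: int) -> bytes:
        key = secrets.token_bytes(length)  # random key as long as the data
        self._keys[person_id] = key
        return key

    def get(self, person_id: str):
        return self._keys.get(person_id)  # None once the key is erased

    def erase(self, person_id: str) -> None:
        self._keys.pop(person_id, None)


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Stand-in cipher: XOR the data with an equally long one-time key."""
    return bytes(d ^ k for d, k in zip(data, key))


keys = KeyStore()
chain = []  # append-only list standing in for a blockchain

plaintext = "alice@example.com".encode("utf-8")
chain.append(xor_bytes(plaintext, keys.create("alice", len(plaintext))))

# While the key exists, the on-chain ciphertext can be decrypted.
readable_before = xor_bytes(chain[0], keys.get("alice")).decode("utf-8")

# Erasure request: the key is deleted; the ciphertext stays on chain.
keys.erase("alice")
key_after = keys.get("alice")
```

After `erase`, the ciphertext is still on the chain but nothing in the system can decrypt it. This only holds if every copy of the key is destroyed, including backups.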
Keeping personal data off chain
Blockchain is a field of extreme innovation, and while allowing some data to be kept off chain may seem somewhat counterintuitive to the purpose of this technology, we may find that a compromise between network viability/utility and regulation is an optimal solution for some. While you are likely to forfeit some utility, convenience, or functionality of the network, figuring out a way to store personal data off chain so that you can delete it (in a literal sense) upon request may be a worthwhile sacrifice for businesses unwilling to live with regulatory uncertainty. To this end, some blockchain technologists are exploring structures that store personal data off chain but post references to that data on the chain.
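A rough Python sketch of the off-chain pattern follows. It assumes a dictionary standing in for an ordinary, deletable database and a list standing in for the chain; the function names, the record layout, and the sample data are all hypothetical. Only a random record ID and a salted hash of the data are posted on chain, and the data itself (with its salt) can be literally deleted on request.

```python
import hashlib
import secrets

off_chain_db = {}  # mutable store: rows can be deleted
chain = []         # append-only list standing in for a blockchain


def commitment(salt: bytes, personal_data: str) -> str:
    """Salted SHA-256 digest that serves as the on-chain reference."""
    return hashlib.sha256(salt + personal_data.encode("utf-8")).hexdigest()


def store_record(personal_data: str) -> str:
    """Keep the data off chain and post only a reference on chain."""
    record_id = secrets.token_hex(16)
    salt = secrets.token_bytes(16)
    off_chain_db[record_id] = {"data": personal_data, "salt": salt}
    chain.append({"record_id": record_id,
                  "commitment": commitment(salt, personal_data)})
    return record_id


def verify_record(record_id: str) -> bool:
    """Check that the off-chain data still matches the on-chain reference."""
    row = off_chain_db.get(record_id)
    entry = next((e for e in chain if e["record_id"] == record_id), None)
    if row is None or entry is None:
        return False
    return commitment(row["salt"], row["data"]) == entry["commitment"]


def erase_record(record_id: str) -> None:
    """Erasure request: literally delete the off-chain row and its salt."""
    off_chain_db.pop(record_id, None)


rid = store_record("Alice Example, 1 Main St")
ok_before = verify_record(rid)  # data present and matches the chain
erase_record(rid)
ok_after = verify_record(rid)   # nothing left off chain to verify
```

The chain entry survives the erasure, so this design still leaves a hash on chain. Whether that leftover reference counts as personal data is the same question raised in the hashing discussion above.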
We are in the early stages of paving the way for GDPR-compliant blockchains, particularly since most of the large existing blockchain networks are permissionless and fully decentralized, making it impossible to figure out whom to hold liable even if there is liability. But as blockchain-based projects continue to commercialize and more networks evolve with identifiable stakeholders, it will be interesting to see which solution to the GDPR question dominates in this groundbreaking space.
Anna Fridman is cofounder and General Counsel for Spring Labs, the company behind the Spring Network.