In an era of big data and AI, what are the roles of decentralized internet and data storage concepts? The tensions and contradictions of these parallel developments were unpacked at SXSW in a compelling talk, Designing For the Next 30 Years of the Web, by Justin Bingham (CTO of Janeiro Digital) and John Bruce (Co-founder and CEO of Inrupt). They presented a whole new way of storing data, one that breaks with the current privacy paradigm, and their approach merits discussion beyond a single tech conference.
Decentralizing the web
Data is the core of the internet. However, as the internet has evolved, the way data is exchanged has shifted significantly from the intentions of one of its creators, Sir Tim Berners-Lee, who had envisioned an internet where information exchange did not include the transfer of actual data to the requesting party. Instead, he believed data would only exist with its owner and the internet would consist of links to it for reading and writing purposes.
That’s why Berners-Lee started the Solid project, which defines standards for a decentralized internet. Personal data is kept by the individual user and not stored centrally with each service supplier.
Late last year, open source startup Inrupt built an application based on the Solid standards that enables a more peer-to-peer internet with Personal Online Data Stores (Pods) for everyone. The Solid network is built entirely around these Pods, each of which contains all the data of one person, whether it be your bank account or latest social posts. In this case, the data referring to you is fully owned by you. Both Inrupt and people within the Solid Community provide Pods that run on their respective servers, but you can also create your own Pod on your own server for ultimate privacy. There is no central owner of these Pods since this would undermine the Solid principle.
One single integration
With Pods, all services, from your favorite taxi company to your insurance company, would communicate with your personal data through one API. Each service would have its own read and write permissions on different parts of that data, and several services could read and write at the same time. To cater to this, Inrupt started working with Janeiro Digital to create an open standard that all applications can work with. The beauty of this is that applications only need to learn one standard and integrate with the Pod to provide a data-driven service. Integrations between different services are no longer required.
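To make the idea of separate read and write access per service more concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the Solid API: the `Pod` class, the `grant`, `read`, and `write` methods, the service names, and the resource paths are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Toy in-memory stand-in for a Pod (hypothetical, not the Solid API)."""
    owner: str
    data: dict = field(default_factory=dict)    # resource path -> value
    grants: dict = field(default_factory=dict)  # (service, path) -> set of modes

    def grant(self, service: str, path: str, *modes: str) -> None:
        """The owner gives a service 'read' and/or 'write' access to one path."""
        self.grants.setdefault((service, path), set()).update(modes)

    def _check(self, service: str, path: str, mode: str) -> None:
        if mode not in self.grants.get((service, path), set()):
            raise PermissionError(f"{service} may not {mode} {path}")

    def read(self, service: str, path: str):
        self._check(service, path, "read")
        return self.data.get(path)

    def write(self, service: str, path: str, value) -> None:
        self._check(service, path, "write")
        self.data[path] = value


pod = Pod(owner="alice")
pod.grant("taxi", "/location", "read", "write")   # taxi app manages location
pod.grant("insurer", "/policy", "read", "write")  # insurer manages the policy
pod.grant("insurer", "/location", "read")         # insurer may only read location

pod.write("taxi", "/location", "Amsterdam")
insurer_view = pod.read("insurer", "/location")
```

In this sketch both services talk to the same interface, and the owner decides per resource what each of them may do. The insurer can see the location the taxi app wrote, but any attempt by the insurer to overwrite it raises a `PermissionError`.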
Imagine writing an application that could combine and show posts from different social networks; one would have to retrieve data from each of them. Instead, if each of these social networks stored their posts in the Pod, this new application could simply be granted access to all posts by its owner, reducing the number of integrations to just one. Furthermore, if this new application wants to combine posts with other personal data, it could easily grab that information from your Pod.
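As a rough illustration of that single integration, the sketch below merges posts from two networks once they all sit in one Pod. The data layout, the network names, and the `combined_timeline` function are assumptions made for this example; a real Pod would expose such data through the Solid standards rather than a Python dictionary.

```python
from datetime import datetime

# Hypothetical Pod contents: each social network writes its posts under its own key.
pod_posts = {
    "network_a": [
        {"text": "Hello from A", "posted": "2019-03-10T09:00:00"},
        {"text": "Second A post", "posted": "2019-03-12T18:30:00"},
    ],
    "network_b": [
        {"text": "Hello from B", "posted": "2019-03-11T12:15:00"},
    ],
}


def combined_timeline(posts_by_network: dict) -> list:
    """Merge the posts of every network into one list, newest first."""
    merged = [
        {**post, "network": network}
        for network, posts in posts_by_network.items()
        for post in posts
    ]
    return sorted(
        merged,
        key=lambda post: datetime.fromisoformat(post["posted"]),
        reverse=True,
    )


timeline = combined_timeline(pod_posts)
for post in timeline:
    print(post["posted"], post["network"], post["text"])
```

The application never contacts either social network. It reads one data source, which is the point of the single integration.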
Big data & AI
So how would big data, machine learning, and AI work within such a Pod-based world? All of these concepts rely heavily on centralized storage, and Pods are anything but that; especially when Pods are hosted all over the world, with no guarantees on network availability.
If data cannot be accumulated and instead needs to be fetched and interpreted across millions of Pods, how would it be possible to perform any machine learning without a significant performance penalty? And even if the data could be replicated and combined with more data, wouldn’t this contradict the whole idea of Solid in the first place? And even if replication were feasible, if only temporarily, wouldn’t users simply set their preferences so their data couldn’t be used for data mining?
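The sketch below makes these questions concrete. It assumes, purely for illustration, that each Pod could compute a small local summary and hand over only that summary instead of its raw data. Nothing in the talk or in Solid prescribes this: the `PodStub` class, its `allows_mining` and `reachable` flags, and the `mean_steps` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PodStub:
    """Hypothetical stand-in for a remote Pod holding daily step counts."""
    steps: List[int]
    allows_mining: bool = True  # owner preference (assumed setting)
    reachable: bool = True      # simulates network availability

    def local_summary(self) -> Optional[Tuple[int, int]]:
        """Return (sum, count) computed inside the Pod, or None if unavailable."""
        if not self.reachable or not self.allows_mining:
            return None
        return sum(self.steps), len(self.steps)


def mean_steps(pods: List[PodStub]) -> Tuple[Optional[float], int]:
    """Average over all Pods that answered; also report how many contributed."""
    total, count, contributing = 0, 0, 0
    for pod in pods:
        summary = pod.local_summary()
        if summary is None:
            continue  # opted out or offline: this Pod simply drops out
        total += summary[0]
        count += summary[1]
        contributing += 1
    mean = total / count if count else None
    return mean, contributing


pods = [
    PodStub([4000, 6000]),
    PodStub([10000], allows_mining=False),    # owner opted out of data mining
    PodStub([8000, 12000], reachable=False),  # Pod is offline
    PodStub([5000]),
]
mean, contributing = mean_steps(pods)
```

Even in this tiny example only two of the four Pods contribute, so the result depends on who happens to be online and who has opted in. Every Pod also has to be contacted individually, which is the performance concern raised above.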
The big players
The companies that service a huge chunk of the current centralized internet are the ones that most rely on possessing our data. The majority of their turnover, which drives their shareholder value, is based on the data they collect from us; data they will never willingly give up for the purpose of the greater good. These companies will not embrace initiatives like Solid.
On the other hand, Bruce and Bingham also explained how Pods can introduce new benefits to companies and customers by enabling instant access to more data. One example is the combination of wearable data with that of an insurance policy, where the step-counter of your smartwatch could instigate a lower premium. (Of course, that’s an over-simplified example, since in the real world, the user would likely also have to consent to sharing other data, such as the food they purchase, which could then result in an increased premium.) All in all, it is likely companies will use Pods to trade consents, where certain services will only be available if certain consents are given. It is up to you to decide if the benefit is worth the trade-off. But how fragile will this freedom of choice be when it comes to basic services like healthcare?
The beauty of Solid lies in its simplicity, but that same simplicity shows how incompatible it is with current, complex website structures and their profit model of collecting data. The internet has become extremely vast and consists of many established platforms. Trying to change that will take an enormous amount of time, development effort, and most of all, goodwill. A completely new approach that sidelines all existing applications can only succeed if it grows to a similar size or bigger. Still, the Solid project is young; hopefully it will gain traction. Since Inrupt launched, the project has already attracted a lot of attention, so the potential is certainly there.
Sandor Voordes is a Technical Director at Dept (Design & Technology Netherlands). A former developer, he now works as a technical lead and architect on large digital platforms. Sandor is a strong believer in a best-of-breed approach, where solutions combine several integrations. Over the last two years, he was also part of the Dept task team to introduce GDPR and help Dept’s clients become GDPR compliant.