Investigating privacy-respecting online identity, data ownership & control solutions

micheleminno · December 3, 2018, 4:05pm

Original title: Proposal for a centralised control and market space where we can edit our third-party accessible social media profiles

I know there are a lot of available tools and ongoing projects about it (see this thread and this site in particular). I’d like to add a proposal that I think goes a little further, closing the loop from users to social media.

Users, with their activity on social sites and apps, feed the social media AI algorithms with simple atomic data, which are then computed into large, complex user profiles that infer a lot of information not explicitly present in the original user data. These profiles are then sold to third-party companies and entities, and they can affect our present and future lives in negative ways we cannot foresee.

What I propose is to force the creation of a central user data control and market room (we can call it “VPE”, for validation, privacy and economic value), recognised by governments and by the social media platforms themselves. Using the VPE app, users can see their full profiles, the output of the AI algorithms run by social media. For each social media platform they can edit any information in the profile, delete it, or adjust its privacy setting. The loop closes as follows: once the profiles are validated by users, they feed back into the social media platforms, which from that moment can use only that information, both internally for their AI engines and externally, when selling it to third-party customers. The same would apply to any noncommercial, political, etc. entity. The user will see in the VPE app how much their newly edited profiles are worth in terms of money, compared to the original ones. They will have to compensate the social media platforms for the less rich profiles with an amount of money shown in the app. That would be the price for their privacy and ‘right to be forgotten’.

What do you think?

patm · December 3, 2018, 4:37pm

Sounds good. I’m intrigued by the idea of users paying for privacy. This would be part of each entity’s profit, and in the case of governments, it would be a form of revenue.

However, governments would already have such a database, right? For example, the social security database of the U.S.

aschrijver · December 3, 2018, 8:28pm

Yes, they are interesting ideas @micheleminno. Very complex in many ways, but - as you say - a lot of the groundwork is being done in a large number of projects.

With regard to the pricing: the concept that you pay for the loss of value of your profile is a very innovative idea. It is the opposite of the much-talked-about selling of your own personal data, which is discussed now and then on Hacker News. The consensus there is that it is probably not attractive for users to do this, as there is so little money to be earned (your prices are in the correct range), and your most valuable data (name, address, phone, etc.) can only be sold once.

When you turn it around, it is a bit easier, more feasible. It is very hard to accurately determine price, though. But the social network could set the price using their own price models.

A thing to consider: If I want privacy guarantees, why wouldn’t I go with a social network that offers a paid subscription model? – though you are probably thinking of a model that would be acceptable to the big tech platforms, like FB.

The biggest issue, I think, is enforcement. How do you check whether the social media network provides all the aggregated data? And once they sell the data to 3rd parties, how do you still have any control over it? How do you know what was sold?

The answer is that you can only enforce this with proper regulation: government, the law, and high fines for breaches of contract. Interesting in that regard is Estonia, often called the most digitalized country in the world, where you can use all government services entirely online, except marrying and buying a house. They have created a framework of laws to support their digital services.

Another interesting thing is the initiative started by Tim Berners-Lee - the inventor of the World Wide Web - and his Solid framework. Contrary to what Estonia offers and to your idea, this technology is decentralized, but still gives you control of your own data. After long preparation there was a launch last month and the first commercial initiatives have started. See: https://solid.inrupt.com/how-it-works and https://solid.mit.edu/

healthyswimmer · December 4, 2018, 6:15am

I’m sticking my neck out here in a field I may be misunderstanding. But should we be careful not to create a market in something we should have a civil right to in the first place (our privacy)?

aschrijver · December 4, 2018, 10:27am

@healthyswimmer, I think both are needed and equally important. It is not an either/or choice. On the one hand we need civil rights (see the Digital Human Rights Declaration project idea) to lay the foundation, and on the other hand ideas and projects similar to what @micheleminno proposes, to build the tools that comply with these same rights.

Note: I also think both of these initiatives are too big for our community to handle, but we can be the facilitators, the connecting / communication medium for them, laying the groundwork for fertile discussions.

micheleminno · December 4, 2018, 3:53pm

I hope also that we can try to design and maybe develop a prototype of this…

micheleminno · December 4, 2018, 3:58pm

Yes, and governments would also be controlled by users for the data they share, e.g. the number in the US representing how good you are at paying bills and making payments, which nowadays is shared with recruitment systems and so on (see the book ‘Weapons of Math Destruction’ by Cathy O’Neil).

Free · December 4, 2018, 10:13pm

Thank you, this is an excellent proposal. However, I personally wouldn’t want to have ANY profile at all with any company.

What’s the use of social media profiles to users? Social media is supposed to be about communication, not about profiles and spying. The latter were probably just created to make money by selling information about users to marketeers, and have little real use to users at all.

micheleminno · December 5, 2018, 8:36am

Thank you @Free, yes I agree, but we’re still far from that ideal situation. My proposal would be a possible first step in that direction.

aschrijver · December 5, 2018, 10:28am

It depends on what you see as a profile. On any platform where you are not wholly anonymous there is the need to store some information about you, if only your username and/or IP address, email for password recovery, etc.

(Note: A fully anonymous social network should be possible, where your profile is tied to a (cryptographic) key provided by a trusted 3rd-party that vouches it relates to a real person, similar to what @micheleminno is proposing).

Depending on the features of the app or platform, more profile information is needed. For example, an email service may maintain a list of stored contacts for your convenience. Still, this information could and should (as proposed) be under your full control, and preferably be stored somewhere outside of the platform itself.

In @micheleminno’s proposal I do not think that the monetary part of the solution - the value increase/decrease of the data - is the most relevant. I’d propose to drop that from the solution, as it provides no guarantees.

I see more value in a solution based on a combination of regulation and cryptography:

As a user of a certain platform or service I define a data contract that:
- Determines which data points the service provider is allowed to use
- Determines for what purposes the service provider may use my data (e.g. prohibit 3rd-party resales)

This data contract is signed with my personal secret key, and a key from the service provider:
- Regulation prescribes that wherever my data is used, it must be accompanied by this signature
- If the signature is missing, or it is invalid, then the data contract is breached and whoever uses the data is in violation of the law
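
The data contract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DataContract` class, its field names and the `may_use` check are hypothetical, and HMAC-SHA256 with a per-party secret is used as a stand-in for real digital signatures. A real system would use asymmetric signatures (e.g. Ed25519), so that anyone can verify with a public key.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract: which data points may be used, and for what purposes."""
    user: str
    provider: str
    allowed_data_points: frozenset
    allowed_purposes: frozenset

    def canonical_bytes(self) -> bytes:
        # Sort keys and values so both parties sign exactly the same bytes.
        body = {
            "user": self.user,
            "provider": self.provider,
            "allowed_data_points": sorted(self.allowed_data_points),
            "allowed_purposes": sorted(self.allowed_purposes),
        }
        return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(contract: DataContract, secret_key: bytes) -> str:
    # Assumption: HMAC-SHA256 stands in for a real digital signature scheme.
    return hmac.new(secret_key, contract.canonical_bytes(), hashlib.sha256).hexdigest()


def is_valid(contract: DataContract, signature: str, secret_key: bytes) -> bool:
    return hmac.compare_digest(sign(contract, secret_key), signature)


def may_use(contract, user_sig, provider_sig, user_key, provider_key,
            data_point: str, purpose: str) -> bool:
    """Use is allowed only if both signatures verify and the contract covers the use."""
    if not (is_valid(contract, user_sig, user_key)
            and is_valid(contract, provider_sig, provider_key)):
        return False  # missing or invalid signature: the contract is breached
    return (data_point in contract.allowed_data_points
            and purpose in contract.allowed_purposes)
```

Because the signature covers the canonical contract bytes, a provider that quietly adds a purpose such as "resale" ends up with a contract that no longer matches the user's signature.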

Maybe what I have just described already aligns with Solid from Tim Berners-Lee. Have to check that out still.

Note that I think that this cryptographic solution offers more benefits, e.g. in the fight against fake news. For this last subject I was thinking of creating a separate topic for it, but I can just as well post the outline of the idea here:

Cryptographic Keys and Key Providers

- Every citizen in the world gets the opportunity to create one or more cryptographic keys that are in long-term storage at trusted key providers.
- The key providers are decentralized, and there can be any number of providers. I can self-host my own provider, if I want.
- Other key providers offer the facility to back up keys from another location, so when you lose your keys, there are backups.
- Key providers also offer the ability to revoke and invalidate / delete keys, e.g. when one of them gets compromised / hacked.
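
As a rough illustration of the three provider operations listed above (create, back up, revoke), here is a minimal in-memory sketch. The `KeyProvider` class and its method names are hypothetical. A real provider would need durable storage, access control and encrypted backups, and would generate proper key pairs rather than random bytes.

```python
import secrets


class KeyProvider:
    """Hypothetical in-memory key provider (illustration only)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._keys = {}        # key id -> key material
        self._revoked = set()  # key ids that were revoked

    def create_key(self) -> str:
        key_id = secrets.token_hex(8)
        # Random bytes stand in for real key material (assumption).
        self._keys[key_id] = secrets.token_bytes(32)
        return key_id

    def backup_to(self, other: "KeyProvider", key_id: str) -> None:
        # A real backup would be encrypted before leaving this provider.
        other._keys[key_id] = self._keys[key_id]

    def revoke(self, key_id: str) -> None:
        # Delete the key material and remember that the id is no longer valid.
        self._keys.pop(key_id, None)
        self._revoked.add(key_id)

    def is_valid(self, key_id: str) -> bool:
        return key_id in self._keys and key_id not in self._revoked
```

In this sketch a revocation is local to one provider. Propagating it to every provider that holds a backup is one of the harder parts of the design and is left out here.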

Keys and Identity

My internet freedoms allow me 3 possible ways to interact with the internet:

Anonymous identity
Pseudonymous identity
Validated identity

When anonymous, I need no key at all. Whatever information I submit cannot be traced to an identity. This type of information is untrusted. It can be fake news.
When pseudonymous, the information I submit can be traced to a valid key in a key provider
- The provider may store additional Claims regarding the identity
- Some of the Claims may be obtained / cached from other key providers
- The provider can also have links to other key providers that hold Claims about me
With a validated identity there is not only a valid key in a key provider, but authoritative Claims that prove my real identity
- The Authority of the key provider needs to be established.
- E.g. only a government key provider may have the authority to issue the claim of my Nationality
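
The three identity levels can be expressed as a small classification over keys and Claims. The sketch below is an assumption-laden illustration: the `Claim` fields and the `authoritative` flag are hypothetical names, and how authority is actually established is deliberately left open.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IdentityLevel(Enum):
    ANONYMOUS = "anonymous"
    PSEUDONYMOUS = "pseudonymous"
    VALIDATED = "validated"


@dataclass(frozen=True)
class Claim:
    subject_key: str             # key id the claim is about
    name: str                    # e.g. "nationality"
    value: str
    issuer: str                  # key provider that issued the claim
    authoritative: bool = False  # True if the issuer has authority for this claim


def identity_level(key_id: Optional[str], key_is_valid: bool, claims: list) -> IdentityLevel:
    # No key, or a key the provider does not recognise: untrusted, anonymous.
    if key_id is None or not key_is_valid:
        return IdentityLevel.ANONYMOUS
    own_claims = [c for c in claims if c.subject_key == key_id]
    # A valid key plus at least one authoritative claim proves a real identity.
    if any(c.authoritative for c in own_claims):
        return IdentityLevel.VALIDATED
    # A valid key alone (with or without ordinary claims) is pseudonymous.
    return IdentityLevel.PSEUDONYMOUS
```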

Fighting fake news

What is needed to fight fake news is:

A recognized key identity system as outlined above
Government regulation and laws for dealing with breaches / violations
Internet apps (e.g. social media platforms) and hardware basing the veracity of information on Keys + Claims

Some examples:

If I am a journalist, and I film a newsworthy event, then I want to have a USB stick with my Validated identity attached to my camera, so that everything I film is automatically signed, and cannot be altered in any way without becoming invalid.

If I am posting pseudonymously to a social network, the Key and Claims could state that I am a real person, living in the UK, and working as a professor at Oxford. The key providers at Oxford and of the UK government vouch for that fact.

Control of my profile

Back to the original post: I can use a pseudonymous identity key and have my profile fields as Claims attached to it, either for global use, or for a whitelisted number of platforms & services. If a platform infers some aggregated data from it (using AI or whatever) and does not post back that data to my key provider, then there are no Claims for it. The data is invalid and the platform is in breach of the law.
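
The check described here, that every data point a platform holds must be backed by a Claim at my key provider, could look roughly like this. All names (`ProfileClaim`, `unclaimed_fields`, the example platforms and fields) are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProfileClaim:
    field: str
    value: str
    platforms: Optional[frozenset] = None  # None means the claim is for global use


def covers(claim: ProfileClaim, platform: str) -> bool:
    return claim.platforms is None or platform in claim.platforms


def unclaimed_fields(platform: str, platform_profile: dict, claims: list) -> set:
    """Return the profile fields a platform holds without a matching Claim."""
    claimed = {(c.field, c.value) for c in claims if covers(c, platform)}
    return {f for f, v in platform_profile.items() if (f, v) not in claimed}


claims = [
    ProfileClaim("city", "Utrecht"),                                # global use
    ProfileClaim("age_range", "30-39", frozenset({"photo-app"})),   # whitelisted
]
profile = {"city": "Utrecht", "age_range": "30-39", "political_leaning": "inferred"}

# Inferred data that was never posted back to the key provider has no Claim.
print(unclaimed_fields("photo-app", profile, claims))
```

Any non-empty result means the platform holds data that is invalid under the scheme, either because it was inferred and never posted back or because the platform is not on the whitelist for that field.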

Before starting a project we need to do some research on what is already happening in this field. Maybe we need to bring existing initiatives closer together. A problem in the space of cryptography and decentralized web, is that it is very fragmented and many developments happen out of view of the mainstream.

A good resource for a Web of Trust is http://www.weboftrust.info/ and especially the research collected in a number of GitHub repositories:

Additionally there is the W3C Credentials Community Group:

The mission of the W3C Credentials Community Group is to explore the creation, storage, presentation, verification, and user control of credentials. We focus on a verifiable credential (a set of claims) created by an issuer about a subject—a person, group, or thing—and seek solutions inclusive of approaches such as: self-sovereign identity; presentation of proofs by the bearer; data minimization; and centralized, federated, and decentralized registry and identity systems. Our tasks include drafting and incubating Internet specifications for further standardization and prototyping and testing reference implementations.

The working group is evolving a number of standards such as Decentralized Identifiers (DIDs) and Verifiable Claims which elaborates on some indicative use cases:

(Note: Some of the work in this space is related to blockchain technology, which I am not very much a fan of… yet, at least)

aschrijver · December 5, 2018, 1:58pm

There is an additional complexity in the system outlined in the previous post, and in @micheleminno’s proposal, where the goal is to have full control of your own data: there are legitimate cases where you should not have full control.

If you can edit and approve every data point in your profile, then you filter out all the negatives and keep only positive facts about you. If you misbehave on a platform - or are an outright troll - then you should not be able to remove all the flags and reporting about your behavior.

To handle this in the system, the platform should have a Terms of Service where the rules can also be interpreted by code. The flags are a form of aggregated data, and this time, when sending it to your profile storage, the platform attaches a data contract of its own, which you must accept. This contract could state that you cannot delete or edit this data as long as you are a member of the platform, but that only you and the platform admins are allowed to read it.

But there are more, and different, cases. In real life, if you apply for a job, your potential future employer could contact your boss from a previous position on your CV and ask about your positive and negative sides. If there are negatives you will not be able to suppress them. You can only react to them, if you are invited for a talk.
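
A Terms of Service rule that "can also be interpreted by code" might be as small as the following sketch. The `FlagContract` class and its fields are hypothetical, and the two rules are exactly the ones stated above: no edits or deletes while you are a member, and read access only for you and the platform admins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlagContract:
    """Hypothetical platform-attached contract for moderation flags."""
    owner: str                  # the user the flags are about
    platform_admins: frozenset  # admins allowed to read the flags
    owner_is_member: bool       # whether the owner is still a platform member


def can_read(contract: FlagContract, actor: str) -> bool:
    # Only the owner and the platform admins may read the flags.
    return actor == contract.owner or actor in contract.platform_admins


def can_edit_or_delete(contract: FlagContract, actor: str) -> bool:
    # The owner may only change the data after leaving the platform.
    return actor == contract.owner and not contract.owner_is_member
```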

If you and your future employer used an automated platform to help with this - say LinkedIn - and it used the system outlined here, then how would that work?

It could work something like this: On behalf of your potential future employer a job evaluation request is sent to your former boss. The former boss fills in the request and adds a bullet list of positives and negatives, which are sent back to the platform upon submission. Though the request is signed with the Validated identity of your former boss, the information in it only reflects their opinion of you (it isn’t necessarily factual; she/he may hold a grudge against you). So the platform first sends the request as a number of Claims to your profile storage, and allows you - via the data contract - to attach your own opinion / reaction to each of the claims, before it is sent back to your future employer. This way you are able to defend yourself. But you could also in turn smear your boss. So when you submit your reaction to the platform, it could be sent both to your future employer and to your former boss.

This is quite a complicated process flow (and could have further steps than outlined), but it is also application-specific, and that is fine. The data contracts on this information exchange could state that the information may only be shared between the 3 parties involved, or risk violation of the law.
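
The reference-check flow can be sketched as a tiny state machine. This is only one possible reading of the steps described above. The class names, the stage names and the `deliver` function are hypothetical, and signing of the individual messages is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceClaim:
    text: str
    positive: bool
    reaction: str = ""  # the candidate's reply to this claim


@dataclass
class ReferenceRequest:
    candidate: str
    former_boss: str
    employer: str
    claims: list = field(default_factory=list)
    stage: str = "requested"

    def may_view(self, actor: str) -> bool:
        # The data contract limits sharing to the three parties involved.
        return actor in {self.candidate, self.former_boss, self.employer}


def submit_reference(req: ReferenceRequest, author: str, claims: list) -> None:
    if author != req.former_boss or req.stage != "requested":
        raise PermissionError("only the former boss can answer an open request")
    req.claims = list(claims)
    req.stage = "awaiting_reaction"


def react(req: ReferenceRequest, author: str, index: int, reaction: str) -> None:
    if author != req.candidate or req.stage != "awaiting_reaction":
        raise PermissionError("only the candidate can react, and only before delivery")
    req.claims[index].reaction = reaction


def deliver(req: ReferenceRequest) -> dict:
    if req.stage != "awaiting_reaction":
        raise ValueError("nothing to deliver yet")
    req.stage = "delivered"
    # Claims plus reactions go to the employer and back to the former boss.
    return {req.employer: req.claims, req.former_boss: req.claims}
```

Sending the reactions to the former boss as well is what discourages the candidate from using the reply to smear them.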

There are more cases where you should not be in control of your own information. For instance, if you are a convicted criminal who has just been released on parole, another party should be able to find out whether you are trustworthy before bestowing trust on you based on your data.

aschrijver · December 6, 2018, 8:56am

Would like to mention the open-source Unomi project just started at the Apache Foundation. It is a user profile server with some interesting aspects: Apache Unomi:

Apache Unomi is a Java Open Source customer data platform, a Java server designed to manage customers, leads and visitors data and help personalize customers experiences while also offering features to respect visitor privacy rules (such as GDPR)

Apache Unomi is also the reference implementation of the upcoming OASIS Context Server (CXS) standard to help standardize personalization of customer experience while promoting ethical web experience management and increased user privacy controls.

aschrijver · December 6, 2018, 3:37pm

This is a great video that you should watch to understand more of the underlying complexities, and what is already going on in the field of “Self-Sovereign Identity” - the mechanism that allows control of your own data:

There is also a shorter version of the above, but I think you need the longer one for better understanding:

JeDI · December 9, 2018, 4:23pm

There is a related effort started by Tim Berners-Lee called “SOLID” seeking to let users control their personal/profile data (mentioned by @aschrijver). There is also a discussion for technologists in the area, hosted by IEEE, that may be of interest.
“Ownership” of data and “control” of data are critical aspects of the 21st century economy.

aschrijver · December 9, 2018, 10:35pm

Yes, @JeDI, I know about Solid. I mentioned it above. Do you have practical experience with it? I saw there is work on creating ReactJS components that incorporate the technology and hide the intricate complexities. Very interesting.

I am not sure if I want to sign up to IEEE, though they have many interesting publications. Is it worth it, do you think, or will I still bump into numerous blocked, paid articles?

zincfoam · December 10, 2018, 12:01pm

I love the overall thought level that has gone into this work, and the way that many use-cases and scenarios have been thought out. Ultimately, it’s a technical approach to a political problem, and that’s why I think it has no chance.

A realpolitik view of this brings up some crucial political/legal challenges:

“This data contract is signed with my personal secret key”
People can’t manage passwords reliably. It is unrealistic to think that personal key management would be used by anyone outside the tech industry.

“Regulation prescribes that wherever my data is used…”
We can’t even get rid of binding arbitration clauses in the USA. There’s no chance at all that the force of law will come to the side of consumers in this way. You’re assuming GDPR+ here, and while that may be possible in the EU, it’s simply not conceivable in the US now or in the foreseeable future.

“Every citizen in the world gets the opportunity to create one or more cryptographic keys”
This is such a deeply “western” POV that assumes so much about rights, law and culture. This concept is already illegal in places like China, the Middle East, and a pretty large swath of Southeast Asia. I’ll toss much of Eastern Europe into that mix as well.
Also a 3rd party recoverable key is a compromised key.

“only a government key provider may have the authority to issue the claim of my Nationality”

Again, in the US, we have no “national identity card” (arguably a passport is such a document, but only 42% of Americans hold a passport in the first place). Yes, we have state-level identity cards (driver’s licenses and “non-driver” ID cards), but you’re under no legal obligation to get any of these, and I’d suggest that making identity cards of any kind “mandatory” would spark a political storm of epic proportions.

From the POV of the US Citizen, I think that the problems of identity and data are not technological, they are legal.

Our “digital person” does not have the same protections under our constitution as our physical person. The “3rd Party Doctrine” basically says once you give your data - directly or indirectly - to a 3rd party, you have no right or expectation of privacy.

There are no meaningful penalties for leaking data. Increasingly, there are no social penalties (the Ashley Madison breach came and went quickly enough).

So, I think until the weight of legal and financial pain is brought to bear on those who collect and mis-handle data, there is no need for more technological solutions that regular people can’t and won’t use.

aschrijver · December 11, 2018, 10:19am

Good points @zincfoam! Let me address each of them in turn. You may know about these concepts already, but I’ll add some additional explanation for others to understand too.

Handling keys by non-technical users

Cryptographic keys are different beasts than passwords. Cryptographic technologies mostly exist in a layer that is hidden from view of regular, non-technical users. You use them without being aware of them, like when you browse secure (HTTPS) websites. Under the hood, keys, certificates and truststores do the work of ensuring your connection is secure. If two ProtonMail users exchange emails, then the service ensures that the mails are signed and encrypted. Etcetera.

Keys are not meant to be human-readable and memorizable (they are very long random-looking character sequences).

There is some management of keys that is similar to handling passwords, however. For example, when the key exchange mechanisms in this ‘Web of Trust’ use public-key technology, there are private keys that are strictly private and must be protected, while public keys can be exchanged freely.

A key provider must provide secure access to your keys, and could use a password mechanism for this, accompanied by e.g. two-factor authentication (like confirming access using your phone). When there is a need to carry private keys around, they could be stored on a bank card and protected by a PIN code (where 3 failed access attempts lock the card). There are many methods to deal with keys securely.
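
The bank-card idea, a private key protected by a PIN that locks after 3 failed attempts, can be illustrated with a short sketch. The `KeyCard` class is hypothetical. On a real smart card this logic runs in tamper-resistant hardware and the key never leaves the chip, whereas this example simply returns it.

```python
import hashlib
import hmac
import secrets


class KeyCard:
    """Hypothetical PIN-protected key holder that locks after repeated failures."""

    MAX_ATTEMPTS = 3

    def __init__(self, private_key: bytes, pin: str) -> None:
        self._private_key = private_key
        self._salt = secrets.token_bytes(16)
        self._pin_hash = self._hash(pin)  # store a salted hash, never the PIN itself
        self._failed = 0

    def _hash(self, pin: str) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", pin.encode("utf-8"), self._salt, 100_000)

    @property
    def locked(self) -> bool:
        return self._failed >= self.MAX_ATTEMPTS

    def unlock(self, pin: str) -> bytes:
        if self.locked:
            raise PermissionError("card is locked")
        if not hmac.compare_digest(self._hash(pin), self._pin_hash):
            self._failed += 1
            raise ValueError("wrong PIN")
        self._failed = 0  # a successful attempt resets the counter
        return self._private_key
```

Once the card is locked, even the correct PIN is refused. How a locked card gets unlocked again (for example by the issuing key provider) is outside this sketch.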

Laws and regulation

There is no explicit need for all the appropriate laws and regulations to be in place. The Web of Trust technology can stand on its own. But it would certainly help if regulation was designed in support of the technology.

Besides the GDPR, many countries already have other laws in place that could be applicable to breaches of trust. For example, when you steal someone’s key and gain illegal access to personal information, this may constitute a cybercrime.

Laws - if they exist - can be transformed to Claims in the technology layer. This means that as an end-user I can make an informed decision whether I trust a 3rd-party with my data. If the service I want to invoke can’t make any valid Claims, because e.g. the server is hosted in North Korea, then I can decide not to use it.

The amount and nature of the valid claims a service can provide thus establishes its Authoritativeness, its reputation, if you will.

Accessibility and scope of Web of Trust

Yes, you are right in stating that large parts of the world do not have access to technology, like we do in the West.

An important point, however, is that the Web of Trust pertains only to the Web, i.e. the internet and those who already have access to it. The identity system outlined in my previous comment is not meant to be a universal system for identity that also extends to the ‘real world’ (outside of the web). Nothing changes there, and you have passports, bank cards, birth certificates, etc. to prove your physical identity. Web of Trust is about your Digital Identity.

This does not preclude a translation of the Universal Declaration of Human Rights to the Digital Realm, that states that every person in the world has the right to have a Digital Identity.

Cryptography, identity and encryption

Do not equate cryptography with encryption. Encryption is only one application of cryptography.

I can use a key to sign content that was created by me, which allows others to establish with confidence that I was indeed the creator. This mechanism also extends to verifying that the content I receive was not tampered with or modified by some man-in-the-middle, a nefarious actor. So keys establish Identity and Authenticity.

I am sure that countries such as China do not have a problem with the above use of key cryptography. Many governments including that of China, but also Australia (see: Australia anti-encryption law) and the US, however, have issues with Encryption. Under the guise of fighting terrorism they want to be able to spy on anyone’s information exchanges on the internet.

But encryption is an optional next step that can be achieved with key cryptography. Ensuring online privacy of communication (using encryption) is a universal right that we should fight for, but it does hamper the Web of Trust concept (though weak encryption means weaker assurances of trust).

Issuing Claims and establishing Authority

You are once again right about large parts of the world population not having government-issued identity cards. I should clarify that a government Claim of your Nationality is just an example (therefore the ‘may’ in my sentence).

Anyone can issue Claims, and there may be more ways to establish your Nationality. Note that on many occasions you wouldn’t need to state that claim to establish trust. The Web of Trust in that respect is very similar to how trust works in the real world.

If I want to approach a friend of yours whom I don’t know, then - for her/him to trust me - it would be sufficient for me to show that person a valid Claim provided by you to me, stating that I am your friend.

If on the other hand I would be posting an article to Bloomberg, stating that I was “Barack Obama”, then Bloomberg would require me to provide a number of really strong Claims to prove that fact. If the only claimable fact was a server IP address in Nigeria, then Bloomberg would immediately reject my article (and flag me as untrustworthy).

Note that the Key Providers are decentralized, just as the web is inherently decentralized. This means that I could run my own key provider server, or host one with the people in my neighborhood. I can create as many keys as I want, but they are not of much value without claims attached to them.

To establish Nationality, instead of my government, my bank may be willing to provide it. The claim may be less trustworthy. Maybe you don’t trust my bank. It could also be provided by e.g. the UN, or Unicef, or any NGO or even commercial party, if I can convince them to provide me that claim (e.g. by showing registration papers of my city, or whatever).
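
The idea that the same Claim is worth more or less depending on who issued it, and on who is judging it, can be sketched as follows. The `IssuedClaim` class, the trust weights and the thresholds are all made-up illustrations. Each verifier would keep its own table of how much it trusts each issuer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IssuedClaim:
    name: str
    value: str
    issuer: str


def claim_strength(claims: list, name: str, value: str, issuer_trust: dict) -> float:
    """Highest trust among issuers asserting name == value; 0.0 if nobody does."""
    return max(
        (issuer_trust.get(c.issuer, 0.0)
         for c in claims if c.name == name and c.value == value),
        default=0.0,
    )


def accept(claims: list, name: str, value: str, issuer_trust: dict, threshold: float) -> bool:
    # The verifier decides how strong a claim must be for its own purpose.
    return claim_strength(claims, name, value, issuer_trust) >= threshold


# Hypothetical trust table of one verifier (values are assumptions).
trust = {"government": 1.0, "my-bank": 0.6, "self-hosted": 0.1}
my_claims = [
    IssuedClaim("nationality", "NL", "my-bank"),
    IssuedClaim("nationality", "NL", "self-hosted"),
]

print(accept(my_claims, "nationality", "NL", trust, threshold=0.5))  # a forum might accept this
print(accept(my_claims, "nationality", "NL", trust, threshold=0.9))  # a news outlet might not
```

Taking the maximum is just one simple policy. A verifier could equally require several independent issuers before accepting a claim.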

Breaches of trust and Security

Last point, also mentioned earlier: the technology and the law go hand in hand. Insufficient law does not prevent the Web of Trust technology from being used.

Regarding actual breaches of the law: Data breaches and stolen identities (like Ashley Madison, Marriott, etc.) are very much in the news these days. More and more data breaches occur. Every hacker, government and commercial entity is out to get our personal data.

But these breaches mostly occur because the systems the information was stolen from are inherently insecure. There is much sloppiness in their implementation in general, because monetary incentives prevailed when creating them, not privacy and security.

Embedding well-designed cryptographic solutions into these systems would greatly increase security and privacy. Cryptography is a very complex subject, and can easily be implemented the wrong way. But it is good to know that experts in these fields are pushing the technology and creating applications, libraries, projects and best-practices for application developers that hide these complexities.

Adoption of these solutions is important for the Web of Trust to come about, and this is a slow process unfortunately, because of the need for standardization and interoperability of systems.

aschrijver · December 11, 2018, 6:57pm

The links above to standards like Decentralized Identifiers (DID) and Verifiable Claims are applied in the concept of Self Sovereign Identity.

Apparently all solutions of Self Sovereign Identities (SSI) assume some form of blockchain technology. And this - for me - is problematic, as I am not convinced and very sceptical about blockchain technology in general, and so are many others in the developer world. So far there are no real viable solutions based on blockchain. Maybe SSI is a good use case for blockchain, once / if that matures in the future.

SSI is therefore not part of my considerations if blockchains are involved! I asked a question about it in the DID W3C repository.

Free · December 14, 2018, 10:27am

Even if we own the data about us and keep it in a private basket (so to speak) there is still the issue of non-anonymous identifiers that are used to track and match us.

For example, your home IP, your wireless connection ids and so on are unique ids that point to just you (or your household). These are necessary to connect to networks. Some companies use them to match people, for example by installing scanners in physical places to track every person’s device in that place. The same technique is even more common in apps and websites, which match these ids to follow a person around. The observations are then linked together: you were in x locations, performed y app actions or visited z webpages, so all of that can be added to your profile without knowing anything else about you besides the unique ids you use to connect online.

aschrijver · December 14, 2018, 11:27am

Yes, that is true, and to a certain extent you will always place trust somewhere.

But note that with this new technology, and decentralization in general, it becomes much harder to facilitate this tracking of users and the collection of their data without their knowledge.

Take for example the Fediverse (with e.g. the Twitter replacement Mastodon). Here you have thousands of federated servers. A user only communicates directly with their own server (the one that holds their user account), and everyone can host their own server or choose one that they trust. Your IP is not exchanged in cross-server synchronization (plus this communication is encrypted).

On top of that, the software we use in future can incorporate functionality from VPNs (the target server sees a different IP) and disallow cookies (making 3rd-party tracking much harder) in standardized ways.

And for those places that you do send your IP to, it becomes part of the data contract I outlined above.

That is how far technology can take you (which is quite far), the rest should be regulation and law.
