Operationalizing Privacy in a Big Data Context

Ann Racuya-Robbins

Operationalizing privacy is largely about understanding the nature of the data you intend to analyze. Understanding the nature of the data involves intuition, ethics, and some ICT technical knowledge. An average adult has the ability to understand and make decisions on these matters.

Once the nature of the data is understood intuitively, ethically, and in a general technical way, privacy requirements for Big Data can be delineated in alignment with national standards, laws, and regulations, followed by the further specification of technical details that meet those privacy requirements in deployment. Importantly, because this technical knowledge is based on intuitive and ethical parameters, the average person can understand the relationships between the privacy parameters and the technical ones, and those relationships should remain clear.

Privacy Risk Assessment, Management, Prevention and Mitigation

Goal: Privacy Preserving Information Systems using Big Data, Big Data Analytics

Scoping the Privacy Context – A Question and Answer Tree.

To be applied both pre- and post-Big Data processing/analytics.

QUESTION: The most important question is: Does your prospective data set(s) contain personal data*?

ANSWER: Yes.

FOLLOWUP QUESTION 1: How do you know?

FOLLOWUP ANSWER: Metadata Personal Data Tag

FOLLOWUP ANSWER: Provenance Report from Data Vendor or Agency

FOLLOWUP QUESTION 2: Can you verify? Reproduce?

FOLLOWUP ANSWER:

 

ANSWER: No.

FOLLOWUP QUESTION 1: How do you know?

FOLLOWUP ANSWER: Data Vendor Reports no Personal Data.

FOLLOWUP QUESTION 2: Can you verify?

FOLLOWUP ANSWER: No. Data Vendor maintains proprietary status of data.

ANSWER: I don’t know.

FOLLOWUP QUESTION 1: How can you find out?

 

QUESTION: Does your data set contain “raw” data?

QUESTION: How large is your Data Set(s) Cluster?

< 100 GB

< 1.5 TB

< 100 TB

etc.

QUESTION: Will more than one data set be linked and analyzed?

QUESTION: What is the anticipated rate of arrival of the data? At what velocity will the Data Set(s) Cluster be processed/analyzed?

QUESTION: Is the data irregular and of multiple data types?

QUESTION: Will the processing/analytics be used for real-time decision-making?

More QUESTIONS to be determined.
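The question-and-answer tree above can be sketched as a simple data structure. A minimal illustration in Python follows; the class name, node shape, and exact question wording are assumptions for illustration only, not a fixed schema:

```python
# Minimal sketch of the privacy-scoping question tree.
# Structure and wording are illustrative, not prescriptive.

class QuestionNode:
    """A question with follow-up questions keyed by answer."""
    def __init__(self, text, followups=None):
        self.text = text
        self.followups = followups or {}  # answer -> list of QuestionNode

personal_data = QuestionNode(
    "Does your prospective data set(s) contain personal data?",
    followups={
        "Yes": [QuestionNode("How do you know?"),
                QuestionNode("Can you verify? Reproduce?")],
        "No": [QuestionNode("How do you know?"),
               QuestionNode("Can you verify?")],
        "I don't know": [QuestionNode("How can you find out?")],
    },
)

def walk(node, answer):
    """Return the follow-up questions prompted by a given answer."""
    return [q.text for q in node.followups.get(answer, [])]

followup_questions = walk(personal_data, "Yes")
```

A structure like this would let the scoping questionnaire be extended (size, velocity, variety, real-time use) without changing the traversal logic.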

 

*Appendix

Definitions TBD

Personal Data/Information

Data Actions

Collection

Processing

Use

Logging

Disclosure

Generation

Retention

Transformation

Transfer

Inference

Extrusion

Pollution

Manageability
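These data actions might be represented as an enumeration so that a system can record which actions were performed on personal data. A hedged sketch follows; the enumeration and the audit-log record shape are assumptions for illustration, not part of any standard:

```python
from enum import Enum

# Illustrative enumeration of the data actions listed above,
# usable for audit logging. The log-entry shape is assumed.

class DataAction(Enum):
    COLLECTION = "Collection"
    PROCESSING = "Processing"
    USE = "Use"
    LOGGING = "Logging"
    DISCLOSURE = "Disclosure"
    GENERATION = "Generation"
    RETENTION = "Retention"
    TRANSFORMATION = "Transformation"
    TRANSFER = "Transfer"
    INFERENCE = "Inference"
    EXTRUSION = "Extrusion"
    POLLUTION = "Pollution"
    MANAGEABILITY = "Manageability"

def log_action(dataset_id, action):
    """Produce a minimal audit-log entry for one data action."""
    return {"dataset": dataset_id, "action": action.value}

entry = log_action("ds-001", DataAction.COLLECTION)
```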

 

Personal Data Metadata Tags

General Personal Data = PD-G

Very Sensitive Personal Data = PD-VS
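One way these tags might be attached to records is as machine-readable metadata. A minimal sketch in Python, assuming a simple dictionary record and a `_pd_tag` field name (both assumptions for illustration):

```python
# Illustrative sketch: attaching personal-data sensitivity tags
# (PD-G, PD-VS) to records as metadata. Field names are assumed.

PD_GENERAL = "PD-G"          # General Personal Data
PD_VERY_SENSITIVE = "PD-VS"  # Very Sensitive Personal Data

def tag_record(record, tag):
    """Return a copy of the record carrying a personal-data tag."""
    if tag not in (PD_GENERAL, PD_VERY_SENSITIVE):
        raise ValueError(f"Unknown personal-data tag: {tag}")
    tagged = dict(record)  # copy; original record is left untouched
    tagged["_pd_tag"] = tag
    return tagged

record = {"name": "A. Example", "diagnosis": "example"}
tagged = tag_record(record, PD_VERY_SENSITIVE)
```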

 

Privacy Rights Risks, Harms and Mitigations (Controls)

Rights TBD

 

Harms

Appropriation: Personal information is used in ways that deny a person self-determination or fair value exchange.

Breach of Trust: Breach of an implicit or explicit trusted relationship, including a breach of a confidential relationship.

Distortion: The use or dissemination of inaccurate or misleadingly incomplete personal information.

Exclusion: Denial of knowledge about or access to personal data. Includes denial of service.

Induced Disclosure: Pressure to divulge information.

Insecurity: Exposure to future harm, including tangible harms such as identity theft or stalking.

Loss of Liberty: Improper exposure to arrest or detainment.

Power Imbalance: Acquisition of personal information about a person which creates an inappropriate power imbalance, or takes unfair advantage of or abuses a power imbalance between the acquirer and the person.

Stigmatization: Personal information is linked to an actual identity in such a way as to create a stigma.

Surveillance: Collection or use, including tracking or monitoring, of personal information in a way that can create a restriction on free speech and/or other permissible activities.

Unanticipated Revelation: Non-contextual use of data reveals or exposes person or facets of a person in unexpected ways.

To Be Defined:

Data Inference

Extrusion

Pollution

Bias

Discrimination

Data Subjects Intellectual Property

 

Preventions, Mitigations, Controls

 

Big Data Guidelines Repository at WIPO or

 

Data Ownership in the Big Data Context

Who owns the data in the Big Data context? What does ownership in the Big Data context mean?

 

Data Ownership –

Ann Racuya-Robbins 2016 06 10

Data ownership means that the data subject owns the majority of the revenues generated from data that emanated from, or was built upon, the data subject's data. A data subject is a living being. This kind of ownership would mean that the data subject has authority over decisions concerning the data subject's data, including its development and disposition.
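As a minimal arithmetic illustration of the revenue rule described above: the text only requires that the data subject's share be a majority, so the 60/40 split below is an assumed example, not a prescribed value.

```python
# Sketch: allocating revenue so the data subject receives the
# majority share, per the ownership definition above.
# The 60% default is an assumed example value.

def allocate_revenue(total, subject_share=0.60):
    """Split revenue between the data subject and the data processor."""
    if not 0.5 < subject_share <= 1.0:
        raise ValueError("Data subject must receive a majority share")
    subject = round(total * subject_share, 2)
    return {"data_subject": subject,
            "processor": round(total - subject, 2)}

split = allocate_revenue(1000.0)  # {'data_subject': 600.0, 'processor': 400.0}
```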

Also see the Individuals Trust Framework.

How to communicate justice and generosity

My thoughts are along the line of:
To IDESG Legal Counsel:
The Identity Ecosystem Framework the IDESG is working to create is in important respects a new kind of community and organization based on a set of principles agreed to across a broad and diverse set of stakeholders. We would like our agreements/contracts to reflect its unique character. In the IDESG and the IDEF it is well known that documents like Terms of Use are frequently too long and complex and are frequently clicked through without understanding or some would argue informed consent. The IDESG would like to innovate in creating TOU that more effectively communicate our character. To that end we would like to be able to have the essential liability protection, perhaps in the form of a disclaimer, but not one that is a buyer beware notice. Rather because we need and hope for broad adoption of IDESG NSTIC* guided services and products we want to indicate that as a community all of us including the IDESG are in this together not trying to simply gain some special advantage over our service/product participants and users.
We would like your guidance on how to balance these needs in the TOU for our first product/service, "the SALS," in order to set the desired tone and to provide a model of how IDESG policies will unfold in the future.

NSTIC Guiding Principles

Big Data Governance

Big Data Governance

However large and complex Big Data ultimately becomes in terms of data volume, velocity, variety, and variability, Big Data Governance will in some important conceptual and actual dimensions be much larger. Data Governance will need to persist across the data lifecycle: at rest, in motion, and in incomplete stages and transactions, all the while serving the privacy and security of the young and the old, of individuals as well as companies, to be an emergent force for good. It will need to ensure economy and innovation, and to enable freedom of action and individual and public welfare. It will need to rely on standards governing things we do not yet know while integrating the human element of our humanity with strange new interoperability capabilities. Data Governance will require new kinds and possibilities of perception, yet accept that our current techniques are notoriously slow. For example, even as of today we have not yet scoped in data types.

The reason we, so many of us, are gathering our energies and the multiplexity of our perspectives is that we know Big Data without Big Data Governance will be less likely to be a force for good. It may come to be said that the best use of Big Data is Big Data Governance.

What concept or concepts are powerful enough to organize, cohere, and form an actionable way forward? Are we brave enough to push forward a few concepts for our discussion? Some think data provenance, curation, and conformance are the way forward. I agree with those who think this ground deserves a fifth V — Value.

The Human Trust Experience in an Era of Big Data

Consumer, Manager, Domain Expert Proposal
Subtopic: Unmet Big Data requirements

[Image: Ann Racuya-Robbins]
[Graphic: tHTRX logo]

1. Title
The Human Trust Experience (HTX) in an Era of Big Data

2. Point of Contact (Name, affiliation, email address, phone)
Ann Racuya-Robbins
World Knowledge Bank: Human Trust Experience Initiative

3. Working Group URL
https://www.humantrustexperience.net

4. Proposed panel topic: Unmet Big Data requirements

5. Abstract
The Human Trust Experience Initiative’s mission is to use Big Data to explore and lay the groundwork for understanding the parameters, characteristics, attributes, information architecture, and reference and interaction models of the human trust experience in motion and at rest. Central premises of this work, to be evaluated and interpreted, are that:
• The human trust experience is foundational to privacy, to the uptake of ICT innovation and education, and to the challenges of democratic governance.
• The human trust experience is a central component of all human labor and of individual and community well-being and survival.
• The human trust experience can be a measure and standard by which we understand and prioritize problem solving.

6. Working Group summary
• Create the human trust experience use case.
• Create the human trust experience context.
• Create a semiotics and information architecture of the human trust experience.
• Facilitate, through the CMS, conversation about the tHTRX in a Big Data context.

7. Number of participants, date working group began, frequency of meetings
December 2013

8. Target Audience
Individuals, Consumers and Producers of Big Data, Businesses, Government

9. Current initiatives
The Human Trust Experience Initiative

10. Specific Big Data Challenges:
Value, Valuation, Contextual Veracity, Identity, Pseudonymity, Anonymity, Privacy, Vetting, Contextual Vetting

11. Urgent research needs

12. Related Projects or Artifacts The Human Trust Experience: Informed Valuation Project

13. Big Data metrics (describe your data to make a Big impression)
Search, discovery, revelation, creation and analysis of the human trust experience from cyberspace data.

14. Keywords
human trust experience, value, valuation, informed valuation, informed contextual value, informed contextual valuation, contextual veracity, identity, pseudonymity, anonymity, privacy, risk management