Data Ownership in Big Data

Data is emergent—appears—from any subject.

The subject does not require engineering for data to appear.

In this way it can be said that the subject is the a priori necessary and sufficient condition, the cause of data. That is, data owes its existence to the data subject. The subject’s existential value is expressed and stored in data. In this way data ownership lies existentially in the data subject.

Data emergent and apparent from a subject is the primary common attribute among all subjects. The emergent appearance of data from any subject is more or less a priori compelling for any other subject. Each datum and the emergent data from a subject is existentially valuable to the subject itself.

The stored value of data is realized through—is inculcated in—the understanding and well-being of subjects. The subject’s well-being can be measured through economic, social, physical and many other beneficial attributes.

Since the subject is the cause of the data, the subject must be the primary, but not the sole or exclusive, beneficial owner of the data.

Devices require purpose driven engineering to interact with data from subjects.

Finding a common ground for life

Executive Summary

Today much of the data (also referred to as information and knowledge) in Big Data comes from the living individual or the once living individual enmeshed in a living world environment. Today much of the scientific community and the human population at large see the earth and even the stars and space in general as an interconnected web of the living and the once living. In fact, in life science and biology an emergent third state of individual living, not just human, is beginning to be articulated: a dormant individual, in addition to an alive or dead individual. The full implications of the dormant state of individual living are beyond the scope of the present Big Data work but will at every turn have an effect on the development and governance of Big Data.

It is fair to say that few developments in contemporary culture, technology and science reveal more about the intractable interconnectivity of living individuals than Big Data. Big Data often shows how and how much individual life needs other individual life and that the interconnectivity has an essential social aspect. It is fair to see organizations and groups as an expression of living individuals’ desire and necessity to interact, to form a social contract among ourselves: a social contract that is trustworthy and that holds our aspiration to be mutually beneficial to all individuals. In fact, while living individuals associate and re-associate continuously, it is somewhat well understood that equality among living individuals, particularly in American democracy, must be an aspect of their individuality, not their association. The ability to perceive life as an individual whole is a pivotal mystery around which our social contract, science and associations are based. The curiosity about living and individuals appears to be an organizing principle for all manner of investigations, the source of economic well-being and an engine to further understanding in all forms.

Today we know that the individual comprises many parts, perhaps an endless series, all of which bear more or less on the character of the living individual. It is fair to say that living individuals share a sense of being impressed, even overwhelmed, by the muchness of living and all there is to understand and learn about life, such that no term has caught on more to express this point in time than a time of Big Data, even though we are but at the beginning of this undertaking.

Big Data is the creation of living individuals reflecting on their character through biology, life science, computation and information science. Without the ability to gather up, persistently record and review living, the full set of living details might well be lost. Ironically, in a time of Big Data the term “life cycle” seems a particularly inadequate frame of reference to guide our investigations, as noted above in the concept and life fact of dormancy.

From these features and the promise of greater understanding of living, Big Data gives rise to a classic dilemma for living individuals: who will own the wealth and power Big Data contains? Or, better yet, what is the best, most trustworthy and mutually beneficial way to share the wealth and power Big Data contains? What kind of social contract will we need, and what technical implementations and controls will best ensure the well-being of living individuals?

Our apparently endless curiosity about the life fact, along with the promise of large economic returns from processing living individual attributes, is creating stressors and pressures for inequity. Living individual attributes have been described as the new money.

Big Data is but the latest human invention that poses a challenge to the promise of an enduring and creative social contract and well-being. This social interconnectivity is the kernel of the ethical challenges and opportunities Big Data presents and that must be resolved.

Internal to the conundrum of Big Data is that the use of Big Data tends to create more data. Under current techniques and technologies, the more data and data sets on an aspect of living the less privacy for the living individual exists. Today there is no viable solution to creating and ensuring privacy for living individuals. And this is before fifty billion sensors have been released into the world. Yet living individuals must be able to weigh in on life without privacy before their privacy is removed. Currently there is no mechanism for living individuals to weigh in on privacy.

Big Data, for all its rapidity and volume, today requires thoughtfulness, self-examination and discipline in action, and the development of new kinds of choices, including ethical choices: a set of choices that makes concrete and real the principle that a living individual’s very mutability is a primary source of its value and that exploitation and inequity reduce the individual’s value. It is for this reason that the benefits of the ownership, value and governance of a living subject’s data must be drawn out and studied in a time frame adequate to go beyond limited self-interest. There must be standards and mechanisms for checks and balances to have effect.

All parties in this new domain (engineers, scientists, citizens, mothers, fathers and children) need to understand and have a role in its governance if the fullest measure of Big Data’s value is to be realized.

At present many entities treat individual living data as a “free” natural resource, where the first one that “captures” it has a profound and apparently irreversible advantage. There is a kind of stampede and gold rush underway to “mine” it, to possess and own it. All of this is happening before the data has given anything back to the living or shown the capacity to add commensurate value to living.

Many—but not all—of the conundrums of Big Data such as ownership, value and privacy have created a third party stampede and tussling over the possession of the individual subject’s data.

Today in many sectors of our human community such as healthcare, financial services, education and government it is customary to assert—and not without reason—that more data about an individual subject will lead to better decisions about how to manage the business and innovation of that sector in profitable ways. In part this has led to competition to acquire more and more data about living subjects. In healthcare, for example, this often means that the most useful data is embodied in individual human behavior. Because most living is more or less individuated and the subject of the data, the problem of data privacy and governance is built into Big Data analytics and its usefulness. Certain key questions and concerns then are central to the aspirations and application of Big Data.

The purpose of this White Paper is to flesh out these questions, concerns, solutions and aspirations as they affect living individuals and communities of living individuals, particularly in the un-and-underserved and a few primary archetypal use cases.

At present among the key questions to be highlighted are—

Does a particular use of Big Data increase the survival and well-being of living individuals?
What are the implications for life and living individuals from a sensor driven extractive and capturing approach to Big Data?
Is the Big Data Subject alive?
Should our solutions be the same for bad actors and the malevolent?
What other ways are there to harness the power and wealth of living details while furthering the freedom living needs for creativity? How can the possibility of realizing these new ways be built into Big Data Analytics and the Reference Architecture?

THE CHALLENGE

How to govern and value the data that is captured and/or collected in databases and other ICT systems so as to realize the wealth inherent in large stores of data. This white paper will demonstrate solutions on two emblematic use case clusters, first the un-and-underserved and second 3 primary archetypal use cases, while exploring the role and power of inclusion in data sets. (To be attached).

THE SOLUTION

The NIST Big Data Public Working Group practice guide <title> demonstrates how data scientists and citizens can instantiate the NIST Big Data Reference Architecture (NBD-RA) to address the un-and-underserved and 3 primary archetypal use cases in the financial services, healthcare and nonprofit sectors.

This White Paper further demonstrates how the central questions and concerns including security, privacy, data governance, ownership and value can be addressed and supported throughout the governance of Big Data analytics lifecycle. This includes specifically how to interact with NBD-RA components – Big Data Definitions and Taxonomies, Big Data Governance including Provenance, Curation, Preservation and Processing, the Data Provider, Big Data Analytic Provider, Big Data Framework Provider, and Big Data Consumer and Big Data Subject’s Experience.

The guide:

Identifies key areas for innovation needed to sufficiently analyze the given dataset.
Identifies the use case characteristics needed to sufficiently govern and analyze the given dataset with (analytics algorithm).

Maps un-and-underserved and primary archetype characteristics to Big Data Analytic Provider
<others…>

BENEFITS

The white paper is based on an ethical approach and assumption that the benefits of Big Data analysis, including monetary benefits, need to be:

A: inclusive of all the living interests of the data subjects and stakeholders from which the benefits are drawn.

B: equitable to all the data subjects and stakeholders from which the benefits are drawn.

C: demonstrated, through metrics to be discussed in due course, to enhance the capacity for life and individual living by and through the collection, development, analysis and governance of the data.

Implications for Life in a Time of Big Data

Executive Summary

Today much of the data (also referred to as information and knowledge) in Big Data comes from the living individual or the once living individual enmeshed in a living world environment. Today much of the scientific community and the human population at large see the earth and even the stars and space in general as an interconnected web of the living and the once living. In fact, in life science and biology an emergent third state of individual living, not just human, is beginning to be articulated: a dormant individual, in addition to an alive or dead individual. The full implications of the dormant state of individual living are beyond the scope of the present Big Data work but will at every turn have an effect on the development and governance of Big Data.

It is fair to say that few developments in contemporary culture, technology and science reveal more about the intractable inter-connectivity of living individuals than Big Data. Big Data often shows how and how much individual life needs other life and that the inter-connectivity has an essential social aspect. This social inter-connectivity is the kernel of the ethical challenges and opportunities Big Data presents and that must be resolved.

Big Data, for all its rapidity and volume, today requires thoughtfulness, self-examination, discipline in action and the development of new kinds of choices. A living individual’s very mutability is a primary source of its value but must not be a source of exploitation and inequity. It is for this reason that the benefits of the ownership, value and governance of a living subject’s data must be drawn out and studied in a time frame adequate to go beyond limited self-interest. Today this has created a situation where individual living data is treated as a natural resource, where the first one that “captures” it has a profound and apparently irreversible advantage. There is a kind of stampede and gold rush underway to “mine” it, to possess and own it. All of this is happening before the data has given anything back to the living or shown the capacity to add commensurate value to living.

Many, but not all, of the conundrums of Big Data, such as ownership, value and privacy, are created by a third-party stampede and tussle over the possession of the individual subject’s data. Possession is not an appropriate measure of ownership once the data has been taken from the original owner, the data subject, whether the third party is a commercial enterprise, a government, a scientific endeavor or a clan.

At present among the key questions to be highlighted are—

What are the implications for life from an extractive and capturing approach to Big Data?

Is the Big Data Subject alive?

Can these questions and their answers be built into the Big Data Analytics and Reference Architecture?

THE CHALLENGE

THE SOLUTION

The NIST Big Data Public Working Group practice guide <title> demonstrates how data scientists and citizens can instantiate the NIST Big Data Reference Architecture (NBD-RA) to address the un-and-underserved and 3 primary archetypal use cases in the financial services, healthcare and nonprofit sectors.

The guide:

Identifies key areas for innovation needed to sufficiently analyze the given dataset.
Identifies the use case characteristics needed to sufficiently govern and analyze the given dataset with (analytics algorithm).

Maps un-and-underserved and primary archetype characteristics to Big Data Analytic Provider

<others…>

BENEFITS

<The white paper is based on an ethical approach and assumption that the benefits of Big Data analysis, including monetary benefits, need to be:

A: inclusive of all the living interests of the data subjects and stakeholders from which the benefits are drawn.

B: equitable to all the data subjects and stakeholders from which the benefits are drawn.

1 Introduction

1.1 Goals, Methods, Models, Dilemmas and Opportunities for Life

<Goals statement Big Data Goals for Life — Survival?

Today the world store of human life has grown greatly. It is not clear that any other form of life has increased as rapidly, except perhaps the microbes and other life that cohabitates on/in human life. This increase has brought with it many concurrent and emergent problems and opportunities for life, not only human but all life. These problems and opportunities have simultaneously brought into view the limits of our creative capabilities in understanding human survival and the survival of life. Some of us have yelled fire, and millions of people and their technology are looking for answers and understanding. Generally speaking this development is a good thing; on some level every life wants to survive and even flourish and thrive. The question and the context then become: is our collective effort of gathering knowledge (data and information) for the survival of life?

For now it is important not to be distracted by, nor to make too much of, the differences in terminology among data, information and knowledge, as if in our case data were something fundamentally different from information and knowledge. It is not. It may be reasonable to point out that data and information are kinds of knowledge and/or contexts of knowledge without implying that these contextual differences are greater than the common ground of knowledge. We could claim our subject to be Big Knowledge or Big Information. For now Big Data may suffice.

Implicit

For Whom

For What

For When

Principles

Projected

For Whom

For What

For When

Principles >

1.2 Approach

< Living Methods and Models

The Role of Thinking

The Role of Reflection

The Role of Metaphor and Mapping

The Role of Security

The Role of Privacy …Approach description >

1.2.1 Technologies Used

1.3 Benefits

Operationalizing Privacy in a Big Data Context

Ann Racuya-Robbins

Operationalizing privacy is largely about understanding the nature of the data you are interested in analyzing. Understanding of the nature of the data involves intuition, ethics and some ICT technical knowledge. An average adult person has the ability to understand and make decisions on these matters.

Once the nature of the data is understood intuitively, ethically and in a general technical way, privacy requirements for Big Data can be delineated in alignment with national standards, laws and regulations, followed by the further specification of technical details that meet privacy requirements in deployment. Importantly, since this technical knowledge is based on intuitive and ethical parameters, the average person can understand the relationships between the privacy parameters and the technical details, and those relationships should remain clear.

Privacy Risk Assessment, Management, Prevention and Mitigation

Goal: Privacy Preserving Information Systems using Big Data, Big Data Analytics

Scoping the Privacy Context – A Question and Answer Tree.

Pre-Big Data Processing/Analytics and also Post-Big Data Processing/Analytics

QUESTION: The most important question is: Does your prospective data set(s) contain personal data*?

ANSWER: Yes.

FOLLOWUP QUESTION 1: How do you know?

FOLLOWUP ANSWER: Metadata Personal Data Tag

FOLLOWUP ANSWER: Provenance Report from Data Vendor or Agency

FOLLOWUP QUESTION 2: Can you verify? Reproduce?

FOLLOWUP ANSWER:

ANSWER: No.

FOLLOWUP QUESTION 1: How do you know?

FOLLOWUP ANSWER: Data Vendor Reports no Personal Data.

FOLLOWUP QUESTION 2: Can you verify?

FOLLOWUP ANSWER: No. Data Vendor maintains Proprietary Status of Data.

ANSWER: I don’t know.

FOLLOWUP QUESTION 1: How can you find out?

QUESTION: Does your data set contain “raw” data?

QUESTION: How large is your Data Set(s) Cluster?

< 100 GB

< 1.5 TB

< 100 TB

etc

QUESTION: Will more than one data set be linked and analyzed?

QUESTION: What is the anticipated rate of arrival of the data? At what velocity will the Data Set(s) Cluster be processed/analyzed?

QUESTION: Is the data irregular and of multiple data types?

QUESTION: Will the processing/analytics be used for real-time decision-making?
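The first branch of the question and answer tree above (personal data: yes, no, or I don't know, followed by "How do you know?" and "Can you verify?") can be sketched in code. This is only an illustrative sketch, not part of any NIST reference implementation. The class name, field names and the returned step labels are hypothetical, and what to do when a vendor's claim cannot be verified is an assumption, since the tree above leaves that answer open.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ScopingAnswers:
    # True = "Yes", False = "No", None = "I don't know"
    contains_personal_data: Optional[bool]
    # Evidence for the answer, e.g. "metadata_tag", "provenance_report",
    # "vendor_statement" (labels are hypothetical)
    evidence: Tuple[str, ...] = ()
    # Answer to "Can you verify?"; None = not yet asked
    verifiable: Optional[bool] = None


def next_step(answers: ScopingAnswers) -> str:
    """Return the next step in the scoping tree for the given answers."""
    if answers.contains_personal_data is None:
        return "find_out"                 # "How can you find out?"
    if not answers.evidence:
        return "ask_how_do_you_know"      # FOLLOWUP QUESTION 1
    if answers.verifiable is None:
        return "ask_can_you_verify"       # FOLLOWUP QUESTION 2
    if not answers.verifiable:
        # Assumption: an unverifiable claim is flagged rather than trusted.
        return "treat_as_unverified"
    if answers.contains_personal_data:
        return "personal_data_confirmed"
    return "no_personal_data_confirmed"


# A vendor reports no personal data but keeps the data proprietary.
print(next_step(ScopingAnswers(False, ("vendor_statement",), False)))
```

The remaining questions (raw data, size, linkage, velocity, data types, real-time use) could be added as further fields in the same way.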

Privacy in a time of Big Data

Ann Racuya-Robbins

The emergence and existence of Big Data technologies and techniques have scoped the challenge of ensuring privacy in contemporary life. It is fair to say that there is, roughly speaking, an inverse relationship between big data and privacy. That is, as data scales up, privacy challenges become more grave. The factors pressuring big data to scale, to get bigger, faster… are powerful, including a hoped-for competitive edge and the speed and cost reduction of the analytics used to acquire that edge under the name of patterns. While the term patterns has gained currency in the field, its meaning is not so well understood. It is important to state clearly that the patterns that are sought are themselves data that contain or create an advantage. Understanding is itself an advantage. By advantage is meant largely a competitive commercial monetary advantage gained by a third party other than the data subject.

Privacy is a subject of individual life and living.

Privacy is an expression of biologic specificity. Privacy properly ensured and governed preserves innovation, creativity and living development. In this way privacy is a key ingredient of survival and successful maturation. The pervasion of data that has thrown open the loss of privacy carried in computer and ICT infrastructures is a relatively new phenomenon. The concern for privacy is a recognition of the broadening value of all individual life. A recognition of the dignity and richness of every life. A recognition that individual life is not rightly an object or property of another.

Privacy cannot be reduced to personal information i.e. name, address and/or other factuals. PII is an obsolete moniker for our subject.

Let us stipulate that we will in this first instance be referring to living individual adults. Living has many stages and forms that must be addressed later.

Privacy is a living individual’s control over, and freedom and refuge from, data collection, capture, extraction, surveillance, analytics, predictions, excessive persuasive practices and communication of the living individual’s life, including external or internal bodily functions, creations, conditions, behavior, and social, political, familial and intimate interaction, including mental, neural and microbial functioning, unless sanctioned by civil and criminal law, and when sanctioned only under protocols where the ways and means of collection, capture, extraction, surveillance, analytics and communication, including new methods to emerge, are governed by appropriate social cooperation principles and safeguards embedded in ICT infrastructures and architectures overseen by democratic courts and by civil and community organizations and individual peers charged with ensuring proper conduct.

Living individuals own the data generated by or from their lives. Should revenues be generated from the collection, capture, extraction, surveillance, analytics and communication of the living individual’s data, the majority of the revenue generated from the living individual’s life belongs to the living individual. Data ownership, provenance, curation and governance, as well as the consequences of violations of privacy practices, must be encapsulated in or within the data, be auditable and travel in encrypted form with the data. Where possible, blockchain techniques shall be employed, as well as counterfactual strategies (processes), in engineering privacy.

Provenance is an accounting of the history of data in an ICT setting.
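One way to picture provenance that is auditable and travels with the data is a hash-chained history kept alongside the payload. The sketch below is a simplified illustration under stated assumptions. It is a plain hash chain and not a blockchain, it does not encrypt anything, and the function names, field names and example actors are all hypothetical.

```python
import hashlib
import json


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of one history entry."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_event(history: list, actor: str, action: str) -> list:
    """Return a new history with one more entry linked to the previous one."""
    prev = history[-1]["hash"] if history else ""
    body = {"actor": actor, "action": action, "prev": prev}
    return history + [{**body, "hash": _digest(body)}]


def verify(history: list) -> bool:
    """Audit the history: every entry must match its hash and its predecessor."""
    prev = ""
    for entry in history:
        body = {"actor": entry["actor"], "action": entry["action"],
                "prev": entry["prev"]}
        if entry["prev"] != prev or entry["hash"] != _digest(body):
            return False
        prev = entry["hash"]
    return True


def make_envelope(owner: str, payload: dict, history: list) -> dict:
    """Bundle the data with its owner and provenance so they travel together."""
    return {"owner": owner, "payload": payload, "provenance": history}


history = append_event([], "subject-001", "created")
history = append_event(history, "clinic-A", "collected")
envelope = make_envelope("subject-001", {"heart_rate": 62}, history)
print(verify(envelope["provenance"]))
```

Altering any earlier entry changes its digest, so the audit fails from that point on. Encryption of the envelope in transit and at rest would be a separate layer on top of this.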

Next Steps

Define further Data Governance, Data Provenance, Data Curation, Data Valuation. Integrate the principles and practices outlined above into an archetypal Privacy Use Case(s) and articulate the Privacy Use Case as it proceeds through the reference architecture.

Data Ownership in the Big Data Context

Who owns the data in the Big Data context? What does ownership in the Big Data context mean?

Data Ownership –

Ann Racuya-Robbins 2016 06 10

Data ownership means that the data subject owns the majority of the revenues generated from data that emanates or emanated from, or was built upon, the data subject’s data. A data subject is a living being. This kind of ownership would mean that the data subject has authority over decisions, including the development and disposition of the data subject’s data.
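The "majority of the revenues" rule can be made concrete with a small arithmetic sketch. The function name and the default share of 51 percent are assumptions chosen only for illustration. The definition above fixes no exact percentage, only that the data subject's share exceeds half.

```python
def split_revenue(total_cents: int, subject_share_percent: int = 51) -> tuple:
    """Split revenue so the data subject always receives a strict majority.

    Amounts are whole cents. The subject's share is rounded up so that
    rounding can never push it to half or below.
    """
    if total_cents < 0:
        raise ValueError("total_cents must not be negative")
    if not 50 < subject_share_percent <= 100:
        raise ValueError("the data subject's share must be a majority")
    # Ceiling division: -(-a // b) rounds a / b up to the next whole cent.
    subject = -(-total_cents * subject_share_percent // 100)
    return subject, total_cents - subject


subject_cents, third_party_cents = split_revenue(10_000)
print(subject_cents, third_party_cents)
```

Rounding in the subject's favor is a design choice: with very small totals, rounding down could leave the subject with half or less.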

Also see the Individuals Trust Frame Work

Characteristics of Trust in a Time of Big Data

Implications for Life in a Time of Big Data
Goals, Methods and Models, Dilemmas and Opportunities
Ann Racuya-Robbins
February 20160229 —Spring 2016
1. Big Data Goals for Life — Survival?
Today the world store of human life has grown greatly. It is not clear that any other form of life has increased as rapidly, except perhaps the microbes and other life that cohabitates on/in human life. This increase has brought with it many concurrent and emergent problems and opportunities for life, not only human but all life. These problems and opportunities have simultaneously brought into view the limits of our creative capabilities in understanding human survival and the survival of life. Some of us have yelled fire, and millions of people and their technology are looking for answers and understanding. Generally speaking this development is a good thing; on some level every life wants to survive and even flourish and thrive. The question and the context then become: is our collective effort of gathering knowledge (data and information) for the survival of life?
For now it is important not to be distracted by, nor to make too much of, the differences in terminology among data, information and knowledge, as if in our case data were something fundamentally different from information and knowledge. It is not. It may be reasonable to point out that data and information are kinds of knowledge and/or contexts of knowledge without implying that these contextual differences are greater than the common ground of knowledge. We could claim our subject to be Big Knowledge or Big Information. For now Big Data may suffice. Later there will be time and effort applied to pinning down the technological details of our project.
What makes data, knowledge or information Big? A hundred years hence?
What makes data Big Data? This is a second motive for our work here. To be sure, one cause is simply the increase in human life population. This increase has created an increase in the volume of knowledge from data collected. Volume is the first characteristic identified in the NBDPWG Volume One Definitions. Because the data/information/knowledge comes largely from and in association with life, it is full of variety, another characteristic of Big Data. Life is at every instance various and significant, unique and changeable. Variety is a form of knowledge that changes over time. Knowledge of life that changes over time can be a picture, a life pattern. Highly detailed life patterns that change over time identify, and in some aspects are, individual lives. Because of the volume and variety of knowledge from data there is both an apparent and a real need for speed and velocity to understand this volume and variety. This need is both an intuitive and a practical pressure being placed on technology to manage Bigness. Of course bigness is a relative and changeable term. More on this later. For today it might be more precise to say that human life is trying to find a strategy and technology for bringing together, in an intelligible way, differences in the speed and velocity of knowledge creation.

Implicit
For Whom
For What
For When
Principles
Projected
For Whom
For What
For When
Principles

2. Living Methods and Models
The Role of Thinking
The Role of Reflection
The Role of Metaphor and Mapping
The Role of Security
The Role of Privacy

3. Dilemmas and Opportunities for Life
Concurrency, Simultaneity, Parallelism and the Scientific Method
Uncertainty
Is it obsolete as an organizing principle?
Provenance
What history? From when?
Ownership
Orchestration and Orchestrator
Governance and Government
Emergence
PII

Individual Human Well Being in an Era of Intangible Dominance and Platform Economics

I am now beginning to wonder if we are chasing a false choice or dichotomy. Is the choice really between privacy based on individual rights vs individuals’ belief that they have been harmed? Is broader better or worse? If we are making a choice based on a fear that commercial interests won’t participate in the IDESG IDEF, what is that fear based on? Outside the digital divide, if we go the highly automated (high velocity) route to identity management, the issues and choices will likely become invisible to the individual. Risk management may help. As I understand this privacy approach, it is based on both rights and harms. From the perspective of the individual human being, do not harms require a higher burden of proof (in time and money) than rights? Without portability there are currently no remedies, so is not portability an essential piece of our requirements? Human Rights

The sense that is emerging is that we need a conjunction of the rights and harms language along with a portability requirement.

How to communicate justice and generosity

My thoughts are along the line of:
To IDESG Legal Counsel:
The Identity Ecosystem Framework the IDESG is working to create is in important respects a new kind of community and organization based on a set of principles agreed to across a broad and diverse set of stakeholders. We would like our agreements/contracts to reflect its unique character. In the IDESG and the IDEF it is well known that documents like Terms of Use are frequently too long and complex and are frequently clicked through without understanding or, some would argue, informed consent. The IDESG would like to innovate in creating TOU that more effectively communicate our character. To that end we would like to have the essential liability protection, perhaps in the form of a disclaimer, but not one that is a buyer beware notice. Rather, because we need and hope for broad adoption of IDESG NSTIC* guided services and products, we want to indicate that as a community all of us, including the IDESG, are in this together, not simply trying to gain some special advantage over our service/product participants and users.
We would like your guidance in how to balance these needs in our first product/service “the SALS” TOU in order to set the desired tone, and to provide a model of how IDESG policies will unfold in the future.

NSTIC Guiding Principles
