Implications for Life in a Time of Big Data

Executive Summary

Today much of the data (also referred to as information and knowledge) in Big Data comes from the living individual, or the once living individual, enmeshed in a living world environment. Today much of the scientific community and the human population at large see the earth, and even the stars and space in general, as an interconnected web of the living and the once living. In fact, in life science and biology an emergent third state of individual living, not just human, is beginning to be articulated: the dormant individual, in addition to the alive or dead individual. The full implications of the dormant state of individual living are beyond the scope of the present Big Data work but will at every turn have an effect on the development and governance of Big Data.

It is fair to say that few developments in contemporary culture, technology and science reveal more about the intractable inter-connectivity of living individuals than Big Data. Big Data often shows how and how much individual life needs other life and that the inter-connectivity has an essential social aspect. This social inter-connectivity is the kernel of the ethical challenges and opportunities Big Data presents and that must be resolved.

Big Data, for all its rapidity and volume, today requires thoughtfulness, self-examination, discipline in action and the development of new kinds of choices. A living individual’s very mutability is a primary source of its value but must not be a source of exploitation and inequity. It is for this reason that the benefits of the ownership, value and governance of a living subject’s data must be drawn out and studied in a time frame adequate to go beyond limited self-interest. Today a situation has arisen where individual living data is treated as a natural resource, where the first one that “captures” it has a profound and apparently irreversible advantage. There is a kind of stampede and gold rush underway to “mine” it, to possess and own it. All of this is happening before the data has given anything back to the living or shown a commensurate capacity to add value to living.

Many, but not all, of the conundrums of Big Data, such as ownership, value and privacy, are created by a third-party stampede and tussle over the possession of the individual subject’s data. Possession is not an appropriate measure of ownership once the data has been taken from the original owner, the data subject, whether the third party is a commercial enterprise, a government, a scientific endeavor or a clan.

Today in many sectors of our human community, such as healthcare, financial services, education and government, it is customary to assert, and not without reason, that more data about an individual subject will lead to better decisions about how to manage the business and innovation of that sector in profitable ways. In part this has led to competition to acquire more and more data about living subjects. In healthcare, for example, this often means that the most useful data is embodied in individual human behavior. Because most living is more or less individuated and is the subject of the data, the problem of data privacy and governance is built into Big Data analytics and its usefulness. Certain key questions and concerns are therefore central to the aspirations and application of Big Data. The purpose of this White Paper is to flesh out these questions, concerns, solutions and aspirations as they bear on living individuals and communities of living individuals, particularly in the un-and-underserved and a few primary archetypal use cases.

At present, the key questions to be highlighted include:

What are the implications for life from an extractive and capturing approach to Big Data?

Is the Big Data Subject alive?

Can these questions and their answers be built into the Big Data Analytics and Reference Architecture?

THE CHALLENGE

How to govern and value the data that is captured and/or collected in databases and other ICT systems so as to realize the wealth inherent in large stores of data. This white paper will demonstrate solutions on two emblematic use case clusters: first the un-and-underserved, and second three primary archetypal use cases. (To be attached).

THE SOLUTION

The NIST Big Data Public Working Group practice guide <title> demonstrates how data scientists and citizens can instantiate the NIST Big Data Reference Architecture (NBD-RA) to address the un-and-underserved and three primary archetypal use cases in the financial services, healthcare and nonprofit sectors.

This White Paper further demonstrates how the central questions and concerns, including security, privacy, data governance, ownership and value, can be addressed and supported throughout the governance of the Big Data analytics lifecycle. This includes specifically how to interact with NBD-RA components: Big Data Definitions and Taxonomies; Big Data Governance including Provenance, Curation, Preservation and Processing; the Data Provider; the Big Data Analytic Provider; the Big Data Framework Provider; the Big Data Consumer; and the Big Data Subject’s Experience.

The guide:

Identifies key areas for innovation needed to sufficiently analyze the given dataset.
Identifies the use case characteristics needed to sufficiently govern and analyze the given dataset with (analytics algorithm.)

Maps un-and-underserved and primary archetype characteristics to Big Data Analytic Provider

<others…>

BENEFITS

<The white paper is based on an ethical approach and the assumption that the benefits, including monetary benefits, from Big Data analysis need to be:

A: inclusive of all the living interests of the data subjects and stakeholders from which the benefits are drawn.

B: equitable to all the data subjects and stakeholders from which the benefits are drawn.

C: shown, through metrics to be discussed in due course, to enhance the capacity for life and individual living by and through the collection, development, analysis and governance of the data.

1 Introduction

1.1 Goals, Methods, Models, Dilemmas and Opportunities for Life

<Goals statement Big Data Goals for Life — Survival?

Today the world store of human life has grown greatly. It is not clear that any other form of life has increased as rapidly, except perhaps the microbes and other life that cohabits on or in human life. This increase has brought with it many concurrent and emergent problems and opportunities for life, not only human life but all life. These problems and opportunities have simultaneously exposed the limits of our creative capabilities in understanding human survival and the survival of life. Some of us have yelled fire, and millions of people and their technology are looking for answers and understanding. Generally speaking this development is a good thing; on some level every life wants to survive and even flourish and thrive. The question, and the context, then becomes: is our collective effort of gathering knowledge (data and information) for the survival of life?

For now it is important not to be distracted by, nor to make too much of, the differences in terminology here among data, information and knowledge, as if in our case data were something fundamentally different from information and knowledge. It is not. It may be reasonable to point out that data and information are kinds of knowledge and/or contexts of knowledge without implying that these contextual differences are greater than the common ground of knowledge. We could claim our subject to be Big Knowledge or Big Information. For now Big Data may suffice.

Implicit

For Whom

For What

For When

Principles

Projected

For Whom

For What

For When

Principles >

1.2 Approach

< Living Methods and Models

The Role of Thinking

The Role of Reflection

The Role of Metaphor and Mapping

The Role of Security

The Role of Privacy …Approach description >

1.2.1 Technologies Used

1.3 Benefits

Privacy in a Time of Big Data

Ann Racuya-Robbins

The emergence and existence of Big Data technologies and techniques have changed the scope of the challenge of ensuring privacy in contemporary life. It is fair to say that there is, roughly speaking, an inverse relationship between big data and privacy. That is, as data scales up, privacy challenges become more grave. The factors pressuring big data to scale, to get bigger, faster… are powerful, including a hoped-for competitive edge and the speed and cost reduction of the analytics used to acquire these competitive edges under the name of patterns. While the term patterns has gained currency in the field, the term’s meaning is not so well understood. It is important to state clearly that the patterns that are sought are themselves data that contain or create an advantage. Understanding is itself an advantage. By advantage is meant largely a competitive commercial monetary advantage gained by a third party other than the data subject.

Privacy is a subject of individual life and living.

Privacy is an expression of biologic specificity. Privacy properly ensured and governed preserves innovation, creativity and living development. In this way privacy is a key ingredient of survival and successful maturation. The pervasiveness of data carried in computer and ICT infrastructures, which has thrown open the door to the loss of privacy, is a relatively new phenomenon. The concern for privacy is a recognition of the broadening value of all individual life. A recognition of the dignity and richness of every life. A recognition that individual life is not rightly an object or property of another.

Privacy cannot be reduced to personal information, i.e., name, address and/or other factual details. Personally Identifiable Information (PII) is an obsolete moniker for our subject.

Let us stipulate that we will in this first instance be referring to living individual adults. Living has many stages and forms that must be addressed later.

Privacy is a living individual’s control over, and freedom and refuge from, data collection, capture, extraction, surveillance, analytics, predictions, excessive persuasive practices and communication of the living individual’s life, including external or internal bodily functions, creations, conditions, behavior, and social, political, familial and intimate interaction, including mental, neural and microbial functioning. The exception is where these are sanctioned by civil and criminal law, and when sanctioned only under protocols where the ways and means of collection, capture, extraction, surveillance, analytics and communication, including new methods yet to emerge, are governed by appropriate social cooperation principles and safeguards embedded in ICT infrastructures and architectures, overseen by democratic courts and by civil and community organizations and individual peers charged with ensuring proper conduct.

Living individuals own the data generated by or from their lives. Should revenues be generated from the collection, capture, extraction, surveillance, analytics and communication of the living individual’s data, the majority of the revenue generated from the living individual’s life belongs to the living individual. Data ownership, provenance, curation and governance, as well as the consequences of violations of privacy practices, must be encapsulated in or within the data, be auditable and travel in encrypted form with the data. Where possible, blockchain techniques shall be employed, as well as counterfactual strategies (processes), in engineering privacy.

Provenance is an accounting of the history of data in an ICT setting.
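One way to picture an auditable accounting of a data item's history is a hash-chained log, the basic idea that blockchain techniques build on. The Python sketch below is illustrative only: the function names (append_event, verify), the field names and the sample events are assumptions made for this example, not part of any NIST or IDESG specification, and encryption of the record is deliberately left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    # Canonical JSON so the same entry always hashes to the same value.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(log, actor, action, timestamp):
    """Append one provenance event, linked to the previous entry by its hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"actor": actor, "action": action, "timestamp": timestamp, "prev_hash": prev_hash}
    log.append({**body, "hash": _digest(body)})
    return log


def verify(log):
    """Return True if every entry matches its own hash and links to its predecessor."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical history of one data item, starting with the data subject.
log = []
append_event(log, "data subject", "generated", "2016-06-10T09:00:00Z")
append_event(log, "clinic", "collected with consent", "2016-06-10T09:05:00Z")
append_event(log, "analytics provider", "analyzed", "2016-06-11T14:00:00Z")
```

Because each entry commits to the hash of the one before it, changing or reordering an earlier event makes verify(log) return False, which is what allows a later auditor to trust the recorded history. A sketch like this does not detect removal of the most recent entries; a real design would need an external anchor for that.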

Next Steps

Define further Data Governance, Data Provenance, Data Curation, Data Valuation. Integrate the principles and practices outlined above into an archetypal Privacy Use Case(s) and articulate the Privacy Use Case as it proceeds through the reference architecture.

Data Ownership in the Big Data Context

Who owns the data in the Big Data context? What does ownership in the Big Data context mean?

Data Ownership –

Ann Racuya-Robbins 2016 06 10

Data ownership means that the data subject owns the majority of the revenues generated from data that emanates or emanated from, or was built upon, the data subject’s data. A data subject is a living being. This kind of ownership would mean that the data subject has authority over decisions, including the development and disposition of the data subject’s data.

Also see the Individuals Trust Frame Work

Individual Human Well Being in an Era of Intangible Dominance and Platform Economics

I am now beginning to wonder if we are chasing a false choice or dichotomy. Is the choice really between privacy based on individual rights and privacy based on individuals’ belief that they have been harmed? Is broader better or worse? If we are making a choice based on a fear that commercial interests won’t participate in the IDESG IDEF, what is that fear based on? Outside the digital divide, if we go the highly automated (high velocity) route to identity management, the issues and choices will likely become invisible to the individual. Risk management may help. As I understand this privacy approach, it is based on both rights and harms. From the perspective of the individual human being, do harms not require a higher burden of proof (in time and money) than rights? Without portability there are currently no remedies, so is portability not an essential piece of our requirements?

The sense that is emerging is that we need a conjunction of the rights and harms language along with a portability requirement.

How to communicate justice and generosity

My thoughts are along the line of:
To IDESG Legal Counsel:
The Identity Ecosystem Framework the IDESG is working to create is in important respects a new kind of community and organization, based on a set of principles agreed to across a broad and diverse set of stakeholders. We would like our agreements/contracts to reflect its unique character. In the IDESG and the IDEF it is well known that documents like Terms of Use (TOU) are frequently too long and complex and are frequently clicked through without understanding or, some would argue, informed consent. The IDESG would like to innovate in creating TOU that more effectively communicate our character. To that end we would like to have the essential liability protection, perhaps in the form of a disclaimer, but not one that is a buyer beware notice. Rather, because we need and hope for broad adoption of IDESG NSTIC* guided services and products, we want to indicate that as a community all of us, including the IDESG, are in this together, not simply trying to gain some special advantage over our service/product participants and users.
We would like your guidance in how to balance these needs in our first product/service “the SALS” TOU in order to set the desired tone, and to provide a model of how IDESG policies will unfold in the future.

NSTIC Guiding Principles

The Terms of Big Data Getting it Right

Appendix B: Terms and Definitions

Big Data consists of extensive datasets, primarily in the characteristics of volume, variety, velocity, and/or variability, that result in new and unprecedented amounts and kinds of value, primarily economic and social, and that require a governed scalable architecture for the efficient and fair storage, manipulation, analysis and realization of this new value to increase the capability of living, individual social good and the well-being of society as a whole.

The Big Data paradigm consists of the distribution of data systems across horizontally coupled, independent resources to achieve the governed scalability needed for the efficient processing and fair realization of the value inherent in extensive datasets.

Big Data engineering is based on technical paradigms that tend to ignore or remain silent on the societal consequences of Big Data. This is why governance is needed: to guide the technical paradigms to use advanced techniques that not only harness independent resources for building scalable data systems, but also assure the just and fair realization of the societal value inherent in those datasets. Big Data engineering so guided will use advanced techniques that harness the value in independent resources for building governable and governed scalable data systems, so that when the characteristics of the datasets require new architectures for efficient and fair storage, manipulation and analysis, such architectures will also enable the fair realization of value for the capability of living, individual social good, and the well-being of society as a whole.

Data governance is part of an evolving and dynamic rule set for realizing the societal and economic value from datasets. Data governance involves but is not limited to risk management or administering, or formalizing, discipline (e.g., behavior patterns) around the management of data. Data governance is a reflection of the choices made among normative and competing values and ideals, such as efficiency (economic efficiency), autonomy (individual personal autonomy), distributive justice, corrective justice between the parties, fairness and the like, where parity or equality in bargaining power between the parties is the foremost aspiration.

Value refers to the inherent wealth, economic and social, embedded in any data set that must be governed in order to realize that wealth for all members of the society.

Best Practices for Human Attributes

How to Move towards Trustworthy ground with Human Attributes

Work on Human Attributes (all the aspects of a life) in online transaction environments should progress towards the creation of Standards for the attribute lifecycle. Such Standards should include how to respect, care for and creatively treat those attributes. I think this is the right direction.

I think there should be a base Standard of assurance that will allow for the greatest range of transactions by the greatest number of participants. More on this later. Such a base standard of assurance should be agreeable to all stakeholders, including individuals. This will require individuals to better understand the monetization of human attributes and the crucial complex of the meaning of human attributes.

To move towards and achieve Standards for the attribute lifecycle, a central challenge and dilemma must be taken on: to transparently articulate the relationship between Personally Identifiable Information (PII), attributes over a lifecycle, and attributes that create PII through aggregation, provenance or other time-related processes. We must acknowledge that PII and attributes are, more or less, on a continuum. The truth needs to be told that privacy requirements are not meaningful without taking on this challenge. I have some suggestions for standards in this area that I would like to forward at the proper time.
Here lie many perils and much promise.
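The claim that attributes can create PII through aggregation can be made concrete with a small counting exercise. The Python sketch below is a hedged illustration, not a proposed standard: the records are invented, the attribute names are arbitrary, and the function names (group_sizes, singled_out) are assumptions made for this example. It counts how many records share each combination of attributes and reports the records that stand alone.

```python
from collections import Counter


def group_sizes(records, attributes):
    """Count how many records share each combination of the given attributes."""
    return Counter(tuple(r[a] for a in attributes) for r in records)


def singled_out(records, attributes):
    """Return the records whose combination of attributes is unique in the dataset."""
    sizes = group_sizes(records, attributes)
    return [r for r in records if sizes[tuple(r[a] for a in attributes)] == 1]


# Invented records: no name or address, only two ordinary-looking attributes.
records = [
    {"birth_year": 1970, "gender": "F"},
    {"birth_year": 1970, "gender": "M"},
    {"birth_year": 1982, "gender": "F"},
    {"birth_year": 1982, "gender": "M"},
]

by_gender = singled_out(records, ["gender"])
by_birth_year = singled_out(records, ["birth_year"])
by_both = singled_out(records, ["birth_year", "gender"])
```

In this toy dataset neither attribute alone singles anyone out, yet the two together single out every record. That is the continuum between attributes and PII in miniature: identifiability is a property of the combination and of the surrounding dataset, not of any single field.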

The Human Trust Experience and The Importance of Economic Inclusion

Critical to the success of the NSTIC Strategy and the IDESG are the breadth and depth of the trust and confidence, innovation and economic progress released through the standards and certification they accredit. As Obama said in his release of the NSTIC Strategy, “…we cannot know what companies have not been launched, what products or services have not been developed…what we do know is this: by making online transactions more trustworthy and enhancing consumers’ privacy…we will foster growth and innovation, online and across our economy in ways we can scarcely imagine…ultimately, that is the goal of this strategy.”
Precisely because we cannot predict how innovations will emerge and from whom, we cannot rightfully leave anyone out by making barriers to participation or designing solutions without questioning our assumptions. We don’t know if innovations will come from the self-employed, the small business, the large business or even the unemployed. Evaluation of issues of economic inclusion must be central to the development of the Identity Ecosystem Framework, the identification and authentication standards risk models (NSTIC Objective 1.2), the administration of the standards development and accreditation (NSTIC Objective 1.4) and ultimately the promise of our democracy to govern and fulfill its promise of equal opportunity.