String Theory and The Five Dimensions of Data Ethics


April 1, 2022


Business’ “Theory of Everything”?

Physicists have long sought a Theory of Everything, a single theoretical framework that explains and ties together the physical world at every scale, from the sub-atomic to the cosmic. One of the leading theoretical frameworks advanced toward this end has been string theory.

In the business world, might we find a similar Theory of Everything that is the common underpinning for every successful business, from impecunious start-ups to Fortune 500 multinationals? Is there a string theory for modern business – maximizing market opportunities while minimizing market competition – that defines success for all companies?

Now imagine your best friend saying to you, “I’d like to introduce you to Sam. Sam is brilliant! And so funny and accomplished, plus a wonderful artist and athlete. Oh, but I should tell you that Sam has no ethics whatsoever. Zero!” Would any of us take this meeting? Of course not. No matter Sam’s other qualities, we humans value ethics most of all in our personal relationships. Without ethics, there cannot be trust, and without trust, we cannot have a relationship.

Consequently, ethics also becomes the final arbiter for success or failure in any business. Ethics is string theory in that it equally drives outcomes for businesses across every scale. Just as in our personal relationships we want nothing to do with unethical people, the same holds true in our business relationships. And in a world where data-driven business models are preeminent, data ethics is fast becoming a primary strategic consideration for businesses of all sizes.

From a competitive standpoint, all businesses today find themselves looking up at the giants of Big Tech, each with its hyper-dimensional array of Big Data. Entering into business combat with any of these behemoths is a daunting prospect. But the Achilles’ heel of Big Tech? All – whether Facebook, Google, Amazon, or Apple – have been censured for violations of ethics. Your best and most sustainable competitive weapon against Big Tech, therefore? Being an ethics-first business.

Data Ethics as Human Rights

To date, data ethics has principally been viewed through the lens of consumer privacy. Privacy is certainly a key area of concern, but data ethics is by no means such a single-dimensional issue. Data impacts all dimensions of human life, and consequently, so too does data ethics; it presents a multi-dimensional challenge to be addressed. Data ethics is properly viewed through all its many dimensions – here we’ll frame the five most salient – and ultimately through the lens of human rights.

1. Respect the human right to all forms of personal privacy

Privacy is arguably the first human right lost under any totalitarian regime. And data today can be so powerful and intrusive from a privacy standpoint that even something as seemingly innocent as images of ourselves might be used to correlate intimate information about our health, sexual orientation, or politics. What should we make of the ethical drivers behind the “voice-based determination of physical and emotional characteristics of users”, per the patent secured by the industry’s largest purveyor of smart speakers, including those embedded in our automobiles? As practitioners of data ethics, never should we abridge an individual’s human right to physical and emotional privacy, regardless of what the advances of technology afford us, and even if it’s possible to obscure transparency via the use of “dark patterns” to impel users to unwittingly consent to such intrusions.

2. Understand how data can perpetuate bias

Equity is also a dimension of human rights: we all have the right to be treated equitably. Being an informed practitioner of data ethics requires that we understand that using skewed data to drive machine learning will invariably produce skewed results, even if that inequity was never the intent. Examples of skewed training data driving biased results abound, including along lines of geography, gender, and race. An illustrative study of gender inequity through skewed data is the excellent work of Ryan Steed and Aylin Caliskan (2021), which should be required reading for all practitioners of data ethics.
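To make the mechanism concrete, here is a minimal sketch in Python. Everything in it is a made-up toy: the two groups, the single feature, and the one-threshold "model" are assumptions chosen for illustration, not drawn from any study cited here. It shows how a model fitted to data dominated by one group can be perfectly accurate for that group while misjudging the under-represented one.

```python
# Toy illustration only: groups, features, and labels are hypothetical.

def fit_threshold(samples):
    """Pick the integer threshold t (0..10) with the fewest training errors
    for the rule 'predict True when x >= t'. Ties go to the smallest t."""
    best_t, best_errors = None, None
    for t in range(0, 11):
        errors = sum((x >= t) != label for x, label in samples)
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


def error_rate(samples, t):
    """Fraction of samples the rule 'x >= t' gets wrong."""
    return sum((x >= t) != label for x, label in samples) / len(samples)


# Hypothetical ground truth: the right cutoff differs between the two groups.
group_a = [(x, x >= 5) for x in range(10)]  # correct cutoff for group A is 5
group_b = [(x, x >= 3) for x in range(10)]  # correct cutoff for group B is 3

# Skewed training set: group A is represented nine times as heavily as group B.
training = group_a * 9 + group_b

threshold = fit_threshold(training)
print("learned threshold:", threshold)
print("error rate, group A:", error_rate(group_a, threshold))
print("error rate, group B:", error_rate(group_b, threshold))
```

Nothing in the fitting step intends to disadvantage group B. The single learned cutoff simply follows the majority of the training data, and the minority group absorbs all of the error.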

3. Be aware of the consequences of your technology

The right to a livelihood – i.e., employment that allows us to satisfy the financial needs of our lives – is also a human right. Data-driven automation, however, threatens employment across industries and job functions. The negative impacts of robotic automation on both employment and income have been repeatedly documented, and the reach of AI in automating job functions seems unbounded. Thus, practicing data ethics may also mean confronting the difficult calculus – will it be Iron Man or Terminator? – of the employment impacts of the technology being commercialized.

4. Understand the consequences of manipulating behaviors

The fourth dimension of data ethics deals with the human right to think our own thoughts. By influencing how we think and act, data-driven behavioral economics can present a threat both to individual consumers and to societies. Certainly, behavioral economics can provide a wonderful “zero-subsidy” means of driving our positive behavior. But its application through dark-pattern, data-driven psychographic targeting to influence individual commercial behavior bears questioning. At the societal level, human beliefs and actions can also be deeply affected by how we choose to use data-driven behavioral economics, as has repeatedly been shown both through academic studies and in real life. Emotional contagion applied at the population scale via data is a singularly powerful force, magnified within homophilic groupings, and can be designed to have specific impacts. This sort of behavioral economics usage is quite bluntly the nuclear option within Big Data: practitioners of data ethics should never cross this boundary.

5. Consider the impact of Big Data on the environment

The fifth and currently most abstruse dimension of data ethics is in energy usage. The right to a sustainable future is also a human right, and the United Nations has found that the effects of climate change will be felt most acutely by those least able to afford them. But thus far, we have given little thought to the environmental ethics of our use of Big Data and machine learning, or to our popularization of other data technologies such as cryptocurrencies and NFTs. For example, AI’s continuing focus on computational accuracy rather than computational (i.e., energy) efficiency has significant environmental impacts, as has been discussed by Schwartz, Dodge, Smith and Etzioni (2020) and Strubell, Ganesh and McCallum (2019). Similarly, Bitcoin’s “attributed 2021 annual emissions will produce emissions responsible for around 19,000 future deaths”, while a single Ethereum transaction has an environmental footprint “equivalent to the power consumption of an average U.S. household over 9.06 days”. 

As practitioners of data ethics, we must understand the cost/benefit trade-offs of our use of computation. Seminal work in measuring impact – and must-reads for anyone wishing to address all dimensions of data ethics – has been done by Hernandez and Brown (2020) and Henderson, Hu, Romoff, et al. (2020). The environmental burden of our technology vanities will be borne by members of the human family who may derive little or no benefit from our hyper-accurate natural language processing models or our NFTs. Let’s not lose sight of this fact.
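As a starting point for that cost/benefit thinking, a back-of-envelope estimate can be sketched in a few lines of Python. This is a rough sketch under stated assumptions, not the method of any of the works cited above: the function names are hypothetical, and the power draw, run time, data-center overhead (PUE), and grid carbon intensity are placeholder numbers that you would replace with measured values for your own workload.

```python
# Back-of-envelope sketch; every number below is a hypothetical placeholder.

def energy_kwh(device_watts, device_count, hours, pue):
    """Energy drawn from the grid: device power x time, scaled by the
    data center's power usage effectiveness (PUE, overhead multiplier)."""
    return device_watts * device_count * hours * pue / 1000.0


def emissions_kg(kwh, kg_co2e_per_kwh):
    """Emissions implied by that energy at a given grid carbon intensity."""
    return kwh * kg_co2e_per_kwh


# Hypothetical training run: 8 accelerators at 300 W each for 24 hours,
# in a facility with PUE 1.5, on a grid emitting 0.4 kg CO2e per kWh.
run_kwh = energy_kwh(device_watts=300, device_count=8, hours=24, pue=1.5)
run_kg = emissions_kg(run_kwh, kg_co2e_per_kwh=0.4)

print(f"energy: {run_kwh:.1f} kWh")
print(f"emissions: {run_kg:.2f} kg CO2e")
```

Even an estimate this crude puts a number on one side of the ledger, which can then be weighed against the benefit gained from, say, a marginal improvement in model accuracy.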

So… What Does It All Mean?

The path to a successful business lies on an ethics-first bedrock. It’s only through a diligent adherence to all of the dimensions of data ethics that a modern business can be indefinitely sustainable. And while regulations such as GDPR and CCPA provide much-needed consumer privacy protections, they are far from complete in the face of the multi-dimensional reality of data ethics. We have no ethics regulations (yet) governing the carbon footprints of Big Data or job displacement, but these too are key issues of ethics, and each business must make its own decisions on the right things to do. In the absence of regulation, it’s imperative that businesses practice transparent self-regulation, with all five dimensions of data ethics considered. These precepts must be standard practices at all companies, whether at our own or at a giant of Big Tech.

Consider finally that we have three prevailing models of data ownership in the world: in one, the government owns the data; in the next, the technology provider owns the data; in the third, the individual owns the data. It’s only this third model that scales in acceptance to human societies worldwide. To build a truly internationalized business, one which will have the broadest adoption by individuals everywhere, data ethics must prevail.

The essential goals of every business are to maximize market opportunities while minimizing market competition. Data ethics fosters the Theory of Everything for achieving these goals. Companies practicing data ethics will not only find resonance with the broadest set of customers worldwide but will also wield the one competitive weapon that Big Tech cannot.

As Arthur C. Clarke wisely observed, “one cannot have superior science and inferior morals. The combination is unstable and self-destroying”. How might we recast this? We’re living in the AI Century, and data ethics triumphs. So says string theory.