By Shomit Ghose | June 5, 2019

Iron Man vs Terminator Revisited

In the landscape of artificial intelligence, we still find ourselves balanced precariously between the twin outcomes of Iron Man and Terminator.  In the former, we adopt technologies to empower our human selves — as with the Marvel comic book character — to overcome our foremost challenges.  In the latter, technology is deployed without deliberation or discernment — our inadvertent Skynet and Terminator — resulting in a dystopian world where the ruthless economics of full automation prevail.

All technology, AI included, exists for a single purpose: to make life better. To do so, it must deliver fundamental human rights, including food, education, healthcare, a livelihood, a clean environment, and not least of all, the right to personal privacy. Technology should not make life worse; it should not rob us of any human right. But today, our data trails (produced by anything with an electrical power source) and the AI that deciphers them enable us and threaten us in equal measure. No better example exists than Facebook with its continuing stumbles. But Facebook is just a symptom of a larger issue in technology and society: data privacy and the explicit need (today!) for ethics in AI.

Intrusiveness via Data Variety

Not all data trails are equal. Thus far, we’ve had justified concern about the protection of our “strong” data. “Strong” data is data that is strongly identifying in personal detail: my name, driver’s license number, salary, address. “Weak” data, on the other hand, is of wide variety and of seemingly meaningless utility: what movies I’ve watched, my views on spending for space exploration, whether I like to fish, whether I own a flashlight, or whether I voted in the election. But due to its sheer volume and variety, and the curve-fitting magic of machine learning, weak data can be correlated to reveal even more personal detail about me than the strong data. Indeed, the weak data listed above can be correlated to my educational level, gender, political orientation, race, and income level, respectively. Weak data is anything but weak, revealing intimate details of our personal identities and beliefs.
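To make the volume argument concrete, here is a minimal sketch of how many individually weak signals can add up to a strong inference. It is an illustration under stated assumptions, not a description of any company's actual model: the function name, the 50% prior, and the 1.5x likelihood ratio are all hypothetical, and the signals are assumed to be independent of one another (the simplification behind a naive Bayes classifier).

```python
import math

def posterior_probability(prior, likelihood_ratios):
    """Update a prior belief with a list of independent weak signals.

    Each likelihood ratio says how much more common a signal is among
    people who have the hidden attribute than among those who do not.
    Independence between signals is an assumption of this sketch.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)  # each signal nudges the odds a little
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical numbers for illustration only, not measured data.
PRIOR = 0.5              # no knowledge about the person to begin with
WEAK_SIGNAL_RATIO = 1.5  # each signal is only 1.5x more likely if the attribute holds

one_signal = posterior_probability(PRIOR, [WEAK_SIGNAL_RATIO])
twenty_signals = posterior_probability(PRIOR, [WEAK_SIGNAL_RATIO] * 20)

print(f"1 weak signal:   {one_signal:.3f}")
print(f"20 weak signals: {twenty_signals:.4f}")
```

Under these assumptions, one signal moves the estimate only from 0.50 to about 0.60, while twenty such signals push it above 0.999. Real signals are rarely fully independent, so the effect in practice would be weaker, but the direction of the argument holds: variety and volume compensate for the weakness of any single data point.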

Crucially, weak data has been deployed to bring new benefit to underserved populations. Companies such as SmartFinance, JUMO and Zest Finance correlate weak data signals to bring financial services to the underbanked. In healthcare, non-traditional data trails have been used not only to assist in the diagnosis of conditions as diverse as depression, Parkinson’s disease, cognitive function decline, pancreatic adenocarcinoma and arrhythmia, but even to enable self-help for people suffering from anxiety and depression. Data-driven AI is the world’s most ubiquitous agent of change, serving as the enabling platform to beneficially “status-hack” those who have previously been neglected by business and society.

It is because of the positive potentials of data and AI that the negative potentials bear such immediate attention. If we do not start to proscribe the unethical uses of data and AI today, dystopia will fast be upon us, and the resulting social and regulatory backlash will interdict the good uses of data along with the bad.

Appreciating AI’s Impact

We are accelerating into an increasingly data-driven world, but legacy businesses, governments and private citizens alike may not fully appreciate its implications. We rely more and more on automated (frequently real-time) decisions rendered by AI without a clear understanding of what is being done, to whom, and why.

Automobiles, for example, are undergoing the transition from being driver-centric platforms (with a premium on mechanical engineering) to passenger-centric platforms (with a premium on data and AI). This has clear implications for the technology of the automobile, and also for the limitless opportunities in its business model. Automobiles also present a prime example of the negative effects (and unintended impacts) of AI. Sweden’s Volvo was confounded in detecting kangaroos (a naturally occurring part of the training data set in Australia, but definitely not in Sweden), while autonomous vehicles have shown uniformly poorer performance in detecting pedestrians with darker skin tones.

Similarly, ethical impacts have been felt — unintended though they may have been — from the AI behind consumer lending, gender classification in facial analysis, Facebook’s targeting of job advertisements, and Amazon’s recruiting efforts.  UNESCO’s recent report, “I’d Blush If I Could”, further points out the gender ethics gap within AI.

Big Brother: More Than a Pesky Online Ad

The public discourse on Internet privacy tends to focus on data encryption or on those annoying online ads following us about as we browse. But in the end, these are trivial manifestations of the ethics and privacy threats posed by data and AI. The ultimate definition of privacy is being able to keep your thoughts and beliefs private from outside parties. The threat to privacy arises when a government or profit-maximizing entity knows what goes on inside your head, and goes even further by influencing it.

Weak data correlations as described above can reveal all manner of personal detail. Concerned about Apple selling iTunes listening data to third parties? Your music preferences can be related to a “wide array of personality dimensions (e.g., Openness), self-views (e.g., political orientation), and cognitive abilities (e.g., verbal IQ)”. Ever done a “meaningless” Facebook “Like”? Facebook Likes can be used to automatically and accurately predict a range of highly sensitive personal attributes, including sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender.

Encrypted data is no guarantor of privacy: it does not prevent the holder of the data from reaching intimate conclusions about you. And the issue of privacy is far deeper than a mere shoe ad appearing on every Web page you visit. Consider that the US Constitution guarantees “the right of the people peaceably to assemble.” If you become aware of Facebook’s patent linking your loan-worthiness to the credit ratings of the friends in your social network, does your right to assemble online with whomever you choose then become abridged? 

Faced with an incomplete picture of AI and weak data, unwieldy and impenetrably written opt-in privacy policies, and mobile apps that leak data, we as individuals are clearly overmatched.

What Would Bezos Do?

“Good intentions never work, you need good mechanisms to make anything happen.”

Jeff Bezos

Smart speakers are becoming ubiquitous in our private spaces, including in our homes and cars. These devices present a significant, 24×7 privacy threat. Not only can employees of profit-maximizing companies access the data, but the state of AI technology is such that the data trails in speech can reveal (and so compromise) everything from interpersonal dynamics to an individual’s physical and mental state. It should not surprise us that Amazon has patented the “voice-based determination of physical and emotional characteristics of users.”

Notably, the Campaign for a Commercial-Free Childhood has recently combined with other organizations to lodge a request with the US Federal Trade Commission to investigate Amazon’s Echo Dot for Kids for violating the Children’s Online Privacy Protection Act. 

Privacy and ethical issues extend beyond AI analysis of voice to facial analytics as well.  Facial images can be analyzed to determine the state of our health — including patented technology from both Google and Amazon — and our sexual orientation. There’s a spate of academic research worldwide on determining our emotions (i.e., knowing what’s going on inside our heads) via AI facial analysis. The ethics of this type of AI is an open question. Walmart has patented technology that analyzes video “to identify customers and measure customer biometric data,” to detect things such as customer dissatisfaction, with the customer biometric data then possibly being correlated to transaction data.

Potential sources of privacy threats continue to ramify. Even State Farm, a century-old company in the somnolent business of insurance, has patented the aggregation and correlation of “home data, vehicle data, and personal health data associated with the individual” for “life management purposes.” What should we then expect from the bonanza of 5G data that will flow to Google’s Stadia gaming platform, knowing that “commercial video games can be useful as ‘proxy’ tests of cognitive performance at a global population level”?

It can be argued that the first right lost under an oppressive regime — the regime’s best instrument of control — is the right to privacy. AI in the hands of profit-maximizing corporations is also threatening, the “good intentions” of those corporations notwithstanding. To take Mr Bezos’ counsel, let’s put some “good mechanisms” in place for the purposes of guaranteeing AI ethics and our privacy. 

Potemkin Village Privacy

Some of the most risible calls for data privacy regulation now come from those companies most able to intrude upon our privacy. Cynically, we might take a cue from Amazon’s 2018 federal tax obligation: $0 in taxes due on $11.2 billion in profit. In other words, data monopolies may be confident that they understand data and AI far better than any regulator ever will, and that they’ll be able to happily run rings around the regulations without breaking any laws. Not only will these large players be able to hide behind regulatory compliance, but the laws will prove burdensome enough for smaller companies that they end up acting as a competitive barrier.

The Data Transfer Project, touted as “an open-source, service-to-service data portability platform” by Facebook, Google, et al., is unlikely to achieve its stated aim given its assertion that “[data] portability should not extend to … data collected to improve a service, including data generated to improve system performance or train models that may be commercially sensitive or proprietary. This approach encourages companies to continue to support data portability, knowing that their proprietary technologies are not threatened by data portability requirements.” So consumer data is portable, except when the company decides that it’s not.

Redefining Monopolies for the AI Century

“The common law secures to each individual the right of determining, ordinarily, to what extent his thoughts, sentiments, and emotions shall be communicated to others.”

The Right to Privacy, Samuel D. Warren & Louis D. Brandeis, Harvard Law Review, Dec. 15, 1890

US Supreme Court Justice Louis Brandeis is celebrated for his work a century ago both in fighting monopolies and in championing individuals’ “right to be let alone.” Many of the ethical issues we face today in the age of AI stem from the fact that too much data is concentrated in too few monopolistic hands, allowing unprecedented data-driven intrusions into personal privacy. The data-driven economy vitiates existing anti-trust regulation, and the time has now come to redefine what constitutes a monopoly.

Heretofore, if a company controlled all of the sources of supply it had a monopoly on pricing, which merited government anti-trust sanction. Today, if a company controls all of the sources of data it has a monopoly on privacy, which constitutes a greater threat to the individual than simply a product’s price. Data monopolies too merit government anti-trust sanction.

Unchecked, today’s data monopolies will be the only beneficiaries as they continue to accrete ever greater volumes of data — with ever deeper intrusions into our privacy — through their own initiatives, by partnerships with traditional businesses, and through partnership with public entities.  Data monopolies also crowd out new entrants — who are unable to wield data in sufficient volume — thereby shutting down competition. Competition is a necessary component for any healthy economy, and without competition we may never get ethical business practices.

In addition to updating the definition of monopoly and anti-trust regulations, existing privacy regulations need to be strengthened. We need to move from data “protection” regimes to ones that prohibit the unethical use of data. If it’s illegal for a company to ask personal questions in a job interview (politics, religion, health, sexual orientation), should not the same prohibitions apply to the semantics of our data streams? All businesses, online and offline, should explain what data they’re collecting, and provide an explicit list of the psychographic conclusions they’ll draw from that data.

Finally, data privacy is an issue of ethics, not profit. Calls for imposing a supplemental tax on data monopoly companies via a “data dividend”, while well-intentioned, miss the point. Our privacy should not have a clearing price. Corporate practices that are unethical and intrude upon individual privacy should be prohibited, not just taxed.

Avoiding the Mistakes of the 20th Century

Businesses are viable only if they are sustainable. If a business is unethical it will, in the end, be unsustainable. With this in mind it becomes clear that the only way to build a sustainable, data-driven, AI-based business for the 21st Century is to engineer ethics into the business model from Day 1.

Amazon has succeeded in commoditizing those who sell the products, thereby creating a monopsony in e-commerce. The gig economy now threatens to commoditize human labor. Is ride-hailing a monopsony too, having turned us all into “meat robots” who are merely keeping the seats warm till the real robots arrive and start driving (more cheaply) in our place? And if so, is the ride-hailing business model sustainable? We “meat robots” will find our livelihoods threatened in more and more fields as the real robots solve their battery power problem. When that day arrives, will it be ethical (and sustainable) to obsolete entire classes of employment for the sole purpose of cost-efficiency?

It’s incumbent on all of us to drive the cause of ethics within AI. Governments can consider new regulations (akin to the Glass-Steagall Act) and new regulatory bodies (similar to the FDA). Academics — well-versed in the technologies, laws and social impacts — can take policy positions. Employees can apply duties of ethical care, just as in other professions. New entrants into the data economy, both start-ups and legacy companies, can found their value propositions on data ethics. And existing data monopoly companies can become part of the solution, not the problem.

Critically, “self-regulation” is the most important factor. In the face of technology’s meteoric velocity, corporate and governmental regulations will never keep pace. It’s only through the application of our own moral compasses each and every day that we can hew to the path of ethical AI; it’s about knowing to do what’s right and not just doing what’s possible.

The greatest breakthrough needed in the field of AI is not in technology but in ethics. If we don’t incorporate ethics into AI today, before we pass AI’s event horizon, we will never be able to do so.  The dawn of the last century saw enormous breakthroughs in technology: in physics, in chemistry, in engineering. Alas, these advancements often found their employment not in the banishment of hunger, disease, ignorance and poverty, but in war, pollution, oppression and inequity. Let’s not let the 21st Century’s breakthroughs in data and AI run a similar course.