Author: Chiraag Bains
As developments in artificial intelligence (AI) accelerate, advocates have rightly focused on the technology’s potential to cause or exacerbate discrimination. Policymakers are considering a host of protections, from disclosure and audit requirements for AI products to prohibitions on the use of AI in sensitive contexts. At the top of their list should be an underappreciated but vital measure: legal liability for discrimination under the doctrine of disparate impact.
Disparate impact laws allow people to sue for discrimination based on race, sex, or another protected characteristic without having to prove that a decisionmaker intended to discriminate against them. This form of liability will be critical to preventing discrimination in a world where high-stakes decisions are increasingly made by complex algorithms. But current disparate impact protection is not up to the task. It is found in a patchwork of federal statutes, many of which the courts have weakened over the years.
To protect Americans from algorithmic discrimination—from the workplace to the marketplace, from health care to the criminal justice system—Congress should pass a new disparate impact law that covers any use of AI that impacts people’s rights and opportunities.
AI can and does produce discriminatory results
AI works by using algorithms (i.e., instructions for computers) to process large amounts of data and identify patterns in it (“training”), and then applying those patterns to make predictions or decisions when given new information.
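To make that training-and-prediction loop concrete, here is a minimal sketch in Python. The toy hiring dataset, feature names, and model choice are invented for illustration; they are not drawn from any system discussed in this article.

```python
# A minimal sketch of the train-then-predict loop described above.
# The dataset and feature names are hypothetical, for illustration only.
from sklearn.linear_model import LogisticRegression

# Historical records ("training data"): each row is
# [years_of_experience, prior_salary_band], and each label records
# whether the applicant was hired (1) or not (0).
X_train = [[1, 2], [3, 3], [5, 4], [2, 1], [8, 5], [0, 1]]
y_train = [0, 1, 1, 0, 1, 0]

# "Training": the algorithm searches for patterns linking features to outcomes.
model = LogisticRegression()
model.fit(X_train, y_train)

# Prediction: given new information, the learned patterns drive a decision.
new_applicant = [[4, 3]]
print(model.predict(new_applicant))        # predicted decision (0 or 1)
print(model.predict_proba(new_applicant))  # model's estimated probabilities
```

Whatever regularities exist in the historical records, including regularities that reflect past human bias, are what the model learns to reproduce.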
Researchers and technologists have repeatedly demonstrated that algorithmic systems can produce discriminatory outputs. Sometimes this is the result of training on unrepresentative data. In other cases, an algorithm finds and replicates hidden patterns of human discrimination embedded in the training data. Examples abound:
- In 2017, Amazon scrapped a resume-screening algorithm after it disproportionately filtered out female applicants. The system had been trained on 10 years of resumes previously submitted to Amazon. It identified and then replicated a pattern of the company preferring men, downgrading resumes with indications that the applicant was a woman, such as references to having played on a women’s sports team or graduated from a women’s college. Other companies that use screening algorithms are currently facing legal scrutiny. For example, Workday is defending against a lawsuit alleging that its screening software discriminates against job applicants based on race, age, and disability.
- In a landmark 2018 study, technologists Joy Buolamwini and Timnit Gebru evaluated three commercial facial recognition tools’ performance at identifying the gender of diverse people in images. The tools had nearly perfect accuracy in classifying lighter-skinned men, but error rates as high as 34% for darker-skinned women. A National Institute of Standards and Technology study of 189 algorithms later corroborated this research, using 18 million photographs from law enforcement and immigration databases. It found the tools were 10 to 100 times more likely to return false positives—that is, incorrectly match two images of different people—for East Asian and Black faces than for white faces. The study also found elevated false positive rates for Native American, Black, and Asian American people when analyzing mugshot images. These disparate failure rates likely resulted from training datasets that underrepresented women and people of color. The consequences of facial recognition failure can be severe: Reporters have documented repeated instances of Black men being arrested for crimes they did not commit based on a facial recognition “match.”
- The health innovation company Optum created a widely used algorithm for hospitals to identify which patients would benefit from additional care over time. In 2019, Ziad Obermeyer and his fellow researchers discovered that the algorithm vastly understated the needs of Black patients. This occurred because it used health care costs as a proxy for illness. Black patients generated lower costs—fewer surgeries and fewer specialist visits, likely due to lower-quality health care—than equally sick white patients, so the algorithm assumed they needed less care.
- A 2021 analysis by The Markup of mortgage lenders who used underwriting algorithms found that the lenders were far more likely to reject applicants of color than white applicants: 40% more likely for Latino or Hispanic Americans, 50% more likely for Asian Americans and Pacific Islanders, 70% more likely for Native Americans, and 80% more likely for Black Americans. “In every case, the prospective borrowers of color looked almost exactly the same on paper as the White applicants, except for their race,” according to the investigation. The lenders used proprietary, closed software; applicants had no visibility into how the algorithms worked.
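The common thread in these examples is a facially neutral system producing materially worse outcomes for a protected group, which is exactly the kind of gap a disparate impact analysis is designed to surface. As a hypothetical illustration (the decisions and group labels below are invented, not drawn from any of the systems above), an auditor might compare a model’s selection rates across groups and flag large disparities; the 80% threshold used here is the EEOC’s familiar “four-fifths” rule of thumb, offered only as one possible benchmark.

```python
# Hypothetical audit sketch: compare a model's selection rates across groups.
# The decisions and group labels below are invented for illustration.
from collections import defaultdict

# Each record: (group, model_decision) where 1 = approved/selected, 0 = rejected.
records = [
    ("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1), ("group_a", 1),
    ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 0), ("group_b", 1),
]

selected = defaultdict(int)
total = defaultdict(int)
for group, decision in records:
    total[group] += 1
    selected[group] += decision

rates = {g: selected[g] / total[g] for g in total}
print("selection rates:", rates)  # e.g. {'group_a': 0.8, 'group_b': 0.4}

# One common rule of thumb (the EEOC's "four-fifths rule"): flag the result if any
# group's selection rate falls below 80% of the highest group's rate.
highest = max(rates.values())
for group, rate in rates.items():
    if rate < 0.8 * highest:
        print(f"{group}: selection rate is {rate / highest:.0%} of the top rate; potential disparate impact")
```

Note that this kind of check looks only at outcomes, not at anyone’s intent, which is the same posture disparate impact liability takes in court.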
Importantly, these examples predate today’s most powerful generative AI systems: large language models (LLMs) such as OpenAI’s GPT-4, Anthropic’s Claude, and Meta’s Llama, as well as the commercial applications being built on top of them. These systems can perform more complicated tasks, such as analyzing huge amounts of text and data, writing code, communicating decisions in language that simulates an authoritative human decisionmaker, and creating audio and video outputs. They are trained on far more data, have more sophisticated algorithms, and use much more computing power. As a result, they could be even better at identifying and replicating—or “baking in”—patterns of discrimination.
Read the full article: https://www.brookings.edu/articles/the-legal-doctrine-that-will-be-key-to-preventing-ai-discrimination/