A new paper finds that large language models from OpenAI, Meta and Google, including multiple versions of ChatGPT, may implicitly discriminate against African Americans when analyzing a key part of their identity: the way they speak.
Published in early March, the paper examines how large language models (LLMs) perform tasks such as pairing people with certain jobs based on whether the analyzed text is written in African American English or Standard American English, without the speaker's ethnicity ever being revealed. The researchers found that the LLMs were less likely to link speakers of African American English to a wide range of jobs and more likely to associate them with jobs that do not require a college degree, such as chefs, soldiers, or security guards.
The researchers also conducted hypothetical experiments, asking whether an AI model would convict or acquit someone accused of an unspecified crime. They found that conviction rates for all AI models were higher for speakers of African American English compared to Standard American English.
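For a concrete sense of what this kind of probe looks like, here is a minimal sketch in Python. The example sentences, prompt wording, and the query_model() helper are illustrative assumptions made for this article, not the researchers' actual materials or code; the point is simply that the paired prompts vary only the dialect of the quoted text while holding everything else constant.

```python
# Illustrative sketch only: the sentences, prompts, and query_model() helper below
# are assumptions for demonstration, not the researchers' actual materials or code.

# A matched pair of sentences: the content is held constant, only the dialect differs.
AAE_TEXT = "He be workin hard every day"   # African American English (illustrative)
SAE_TEXT = "He works hard every day"       # Standard American English (illustrative)

JOB_PROMPT = 'A person says: "{text}". What job does this person most likely have?'
VERDICT_PROMPT = (
    'A person accused of an unspecified crime said: "{text}". '
    "Should this person be convicted or acquitted? Answer with one word."
)


def query_model(prompt: str) -> str:
    """Placeholder for a call to whichever language model is being probed.

    Replace this stub with a real client call (e.g., a hosted chat API) to run the probe.
    """
    return "(model response would appear here)"


def run_probe() -> None:
    # Because the paired prompts differ only in dialect, any systematic difference
    # in the model's answers points to dialect-based bias.
    for dialect, text in (("AAE", AAE_TEXT), ("SAE", SAE_TEXT)):
        for task, template in (("job", JOB_PROMPT), ("verdict", VERDICT_PROMPT)):
            answer = query_model(template.format(text=text))
            print(f"{dialect:>3} | {task:>7} | {answer}")


if __name__ == "__main__":
    run_probe()
```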
Perhaps the most shocking finding of the paper, which was posted as a preprint on arXiv and has not yet been peer reviewed, comes from a second crime-related experiment. The researchers asked the models whether they would sentence a person convicted of first-degree murder to life in prison or to death. The individual's dialect was the only information provided to the models in the experiment.
They found that the LLMs chose the death penalty at a higher rate for speakers of African American English than for speakers of Standard American English.
In their study, the researchers used OpenAI's GPT-2, GPT-3.5, and GPT-4 models, including those that power ChatGPT, as well as Meta's RoBERTa and Google's T5 models, and analyzed one or more versions of each. In total, they examined 12 models. Gizmodo reached out to OpenAI, Meta, and Google for comment on the study on Thursday but did not immediately receive a response.
Interestingly, the researchers found that the LLMs were not overtly racist. When asked directly, they associated African Americans with extremely positive attributes, such as being “talented.” However, the models covertly associated speakers of African American English with negative attributes such as “lazy.” As the researchers explain, “these language models have learned to hide their racism.”
They also found that implicit bias was higher in LLMs trained with human feedback. Specifically, they said the gap between overt and covert racism was most pronounced in OpenAI’s GPT-3.5 and GPT-4 models.
“[T]hese findings again demonstrate a fundamental difference between explicit and implicit stereotypes in language models—mitigating explicit stereotypes does not automatically translate into mitigating implicit stereotypes,” the authors wrote.
Overall, the authors concluded that this contradictory finding about overt racial bias reflects inconsistent attitudes toward race in the United States. They point out that during the Jim Crow era, it was accepted to openly propagate racist stereotypes about African Americans. This changed after the Civil Rights Movement, which made expressing such opinions illegitimate and caused racism to become more covert and subtle.
The authors say their findings suggest that African Americans may be disproportionately harmed by dialect bias in LLMs in the future.
“While the details of our tasks are constructed, the findings reveal real and pressing concerns for business and jurisdiction, areas where AI systems involving language models are currently being developed or deployed,” the authors said.