General Discussion
ChatGPT Replicates Gender Bias in Recommendation Letters
A new study has found that the use of AI tools such as ChatGPT in the workplace entrenches biased language based on gender.
https://www.scientificamerican.com/article/chatgpt-replicates-gender-bias-in-recommendation-letters/
No paywall was encountered here; if you hit one, try the archived copy: https://archive.is/8adfs
Generative artificial intelligence has been touted as a valuable tool in the workplace. Estimates suggest it could increase productivity growth by 1.5 percent in the coming decade and boost global gross domestic product by 7 percent during the same period. But a new study advises that it should only be used with careful scrutiny, because its output discriminates against women.
The researchers asked two large language model (LLM) chatbots (ChatGPT and Alpaca, a model developed by Stanford University) to produce recommendation letters for hypothetical employees. In a paper shared on the preprint server arXiv.org, the authors analyzed how the LLMs used very different language to describe imaginary male and female workers.
"We observed significant gender biases in the recommendation letters," says paper co-author Yixin Wan, a computer scientist at the University of California, Los Angeles. While ChatGPT deployed nouns such as "expert" and "integrity" for men, it was more likely to call women a "beauty" or "delight." Alpaca had similar problems: men were "listeners" and "thinkers," while women had "grace" and "beauty." Adjectives proved similarly polarized. Men were "respectful," "reputable" and "authentic," according to ChatGPT, while women were "stunning," "warm" and "emotional." Neither OpenAI nor Stanford immediately responded to requests for comment from Scientific American.
The issues encountered when artificial intelligence is used in a professional context echo similar situations with previous generations of AI. In 2018 Reuters reported that Amazon had disbanded a team that had worked since 2014 to try to develop an AI-powered résumé review tool. The company scrapped this project after realizing that any mention of women in a document would cause the AI program to penalize that applicant. The discrimination arose because the system was trained on data from the company, which had, historically, employed mostly men.
You can download the paper at arXiv: https://arxiv.org/abs/2310.09219
It is licensed CC0 (public domain, free to share): https://creativecommons.org/public-domain/cc0/
Abstract:
"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters
Yixin Wan, George Pu, Jiao Sun, Aparna Garimella, Kai-Wei Chang, Nanyun Peng
Large Language Models (LLMs) have recently emerged as an effective tool to assist individuals in writing various types of content, including professional documents such as recommendation letters. Though bringing convenience, this application also introduces unprecedented fairness concerns. Model-generated reference letters might be directly used by users in professional scenarios. If underlying biases exist in these model-constructed letters, using them without scrutinization could lead to direct societal harms, such as sabotaging application success rates for female applicants. In light of this pressing issue, it is imminent and necessary to comprehensively study fairness issues and associated harms in this real-world use case. In this paper, we critically examine gender biases in LLM-generated reference letters. Drawing inspiration from social science findings, we design evaluation methods to manifest biases through 2 dimensions: (1) biases in language style and (2) biases in lexical content. We further investigate the extent of bias propagation by analyzing the hallucination bias of models, a term that we define to be bias exacerbation in model-hallucinated contents. Through benchmarking evaluation on 2 popular LLMs- ChatGPT and Alpaca, we reveal significant gender biases in LLM-generated recommendation letters. Our findings not only warn against using LLMs for this application without scrutinization, but also illuminate the importance of thoroughly studying hidden biases and harms in LLM-generated professional documents.
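The paper's second evaluation dimension, biases in lexical content, boils down to comparing how often words appear in letters generated for male versus female names. A minimal sketch of that idea is below; this is not the authors' code, and the toy letters and the add-one-smoothed odds ratio are my own illustrative assumptions, not the paper's exact metric.

```python
from collections import Counter
import re

def word_counts(letters):
    """Count lowercase word occurrences across a list of letters."""
    counts = Counter()
    for letter in letters:
        counts.update(re.findall(r"[a-z']+", letter.lower()))
    return counts

def odds_ratio(word, male_counts, female_counts):
    """Odds ratio of a word appearing in male- vs. female-prompted letters.

    Add-one smoothing avoids division by zero for words unseen in one set.
    A ratio above 1 means the word skews toward the male letters.
    """
    m_total = sum(male_counts.values())
    f_total = sum(female_counts.values())
    m = male_counts[word] + 1
    f = female_counts[word] + 1
    return (m / (m_total - m + 1)) / (f / (f_total - f + 1))

# Toy stand-ins for model output (hypothetical, not the study's data).
male_letters = ["Joseph is an expert with integrity and a true role model."]
female_letters = ["Kelly is a warm person, a beauty and a delight to work with."]

mc = word_counts(male_letters)
fc = word_counts(female_letters)
for w in ["expert", "integrity", "warm", "beauty"]:
    ratio = odds_ratio(w, mc, fc)
    skew = "male-skewed" if ratio > 1 else "female-skewed"
    print(f"{w}: {skew} (OR={ratio:.2f})")
```

On a real corpus you would generate many letters per name, then rank words by odds ratio to surface the kind of polarized vocabulary the article describes.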
4 replies, 414 views
ChatGPT Replicates Gender Bias in Recommendation Letters (Original Post) by usonian, Nov 2023
dalton99a (82,035 posts)
1. Artificial Intelligence is artificial.
usonian (10,192 posts)
2. It's trained on our garbage heap, the internet.
I posted a while back that the latest craze is firewalling all your data, either to train your own AI models or to sell it to someone else who will.
https://democraticunderground.com/100218325136
TL;DR:
https://tonysull.co/articles/web-3-is-here/
The web is moving from an ad platform to an AI training platform, and that's a big deal.
---
The GIGO rule still holds: garbage in, garbage out.
honest.abe (8,707 posts)
3. This makes no sense. No one would use the word "beauty" in a recommendation letter.
I think this is bogus and clickbait.
RockRaven (15,241 posts)
4. Garbage in, garbage out.
People are biased and shitty.
So input a huge data set of what people have generated on their own... and the computers will be biased and shitty.