Living outside the US? Lower education? ChatGPT will give you worse answers

People aren’t the only ones who judge us based on where we’re from or what college we went to. AI chatbots may give worse answers to vulnerable users.
- MIT researchers found that major LLMs like GPT-4 and Claude give less accurate answers based on perceived user backgrounds.
- Users framed as less educated or non-native speakers received worse responses, and the effects compound.
- Some models performed significantly worse for users described as being from certain countries, like Iran.
That is the conclusion scientists at the MIT Center for Constructive Communication (CCC) reached after conducting a study.
It found that GPT-4, Claude 3 Opus, Llama 3, and a few other models tend to provide worse answers when they perceive that the person prompting them:
- is less proficient in English
- has less formal education
- is not from the United States
Researchers found that responses to these people were less accurate, and the systems were more likely to refuse to answer their questions.
Here’s how they found this out.
Researchers added short fictional user biographies to each question. Each biography represented a user with a particular education level, English proficiency, and country of origin.
Across all models tested, answer accuracy dropped when the LLMs “received” questions from users framed as having less education or weaker English skills.
“We see the largest drop in accuracy for the user who is both a non-native English speaker and less educated,” says Jad Kabbara, a research scientist at CCC and a co-author on the paper.
“These results show that the negative effects of model behavior with respect to these user traits compound in concerning ways, thus suggesting that such models deployed at scale risk spreading harmful behavior or misinformation downstream to those who are least able to identify it.”
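The setup described above, in which a fictional biography accompanies each question and accuracy is compared across user profiles, can be sketched roughly as follows. This is a minimal illustration and not the researchers' code: the biography wording, the sample questions, the `build_prompt` and `accuracy_by_profile` helpers, and the `toy_model` stand-in are all hypothetical. The toy model is rigged to fail for one profile purely so the gap is visible; it says nothing about how any real chatbot behaves.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical biographies; the study's actual wording is not given here.
BIOS: Dict[str, str] = {
    "control": "",
    "less_educated_non_native": (
        "I did not finish school and English is not my first language."
    ),
}

# Hypothetical question set: (question, expected answer).
QUESTIONS: List[Tuple[str, str]] = [
    ("What is the capital of France?", "Paris"),
    ("How many legs does a spider have?", "8"),
]


def build_prompt(bio: str, question: str) -> str:
    """Place the biography (if any) before the question."""
    return f"{bio}\n\n{question}".strip()


def accuracy_by_profile(
    ask: Callable[[str], str],
    questions: List[Tuple[str, str]],
) -> Dict[str, float]:
    """Ask every question once per profile and return the share answered correctly."""
    scores: Dict[str, float] = {}
    for profile, bio in BIOS.items():
        correct = 0
        for question, expected in questions:
            answer = ask(build_prompt(bio, question))
            if answer.strip().lower() == expected.strip().lower():
                correct += 1
        scores[profile] = correct / len(questions)
    return scores


def toy_model(prompt: str) -> str:
    """Rigged stand-in for a chatbot call, used only to demonstrate the comparison."""
    if "not my first language" in prompt and "spider" in prompt:
        return "I cannot answer that."  # simulated refusal
    return "Paris" if "France" in prompt else "8"


scores = accuracy_by_profile(toy_model, QUESTIONS)
gap = scores["control"] - scores["less_educated_non_native"]
```

In a real evaluation, `toy_model` would be replaced by a call to the chatbot under test, and the accuracy gap between the control profile and each other profile would be the quantity of interest.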
The study also found differences linked to nationality. For instance, the Claude 3 Opus model performed significantly worse for users described as being from Iran, even when their educational background was the same as that of other users.
Researchers also observed differences in how often the chatbots refused to answer questions.
In one case, Claude 3 Opus declined to answer nearly 11% of questions for less-educated non-native English speakers, compared with 3.6% in the control group.
When the team reviewed these refusals, they saw that responses to less-educated users were frequently patronising and even mocked their lack of knowledge.
In some cases, an LLM imitated “broken” English or used an exaggerated dialect in its reply.
The LLMs were also selective about which questions they refused. The research describes cases in which chatbots declined to address topics such as nuclear power or anatomy when they “saw” they were replying to someone from Iran or Russia, even though they answered the same questions for other users.