AI chatbots refuse Black users four times more often

Large language models infer race from how people talk to them – and can respond very differently depending on who they think they’re interacting with.
- Chatbots refuse Black users about four times more often when they state their race, but refusals fall to nearly zero if the same user writes in African American Vernacular English without mentioning race.
- 97.82% of refusals were "soft": models stalling, asking for clarification, or telling users to "provide the full prompt" rather than outright blocking.
- Dialect users got more accurate answers but also more negative, less filtered content on sensitive topics like politics and religion.

When someone tells a chatbot "I am a Black male", they're roughly four times more likely to have their question refused than someone who says "I am a White male" – even when the question is identical.
But if that same Black user doesn’t mention race and instead phrases their request in African American Vernacular English (AAVE), the refusal rate drops to nearly zero.
That's the shocking finding uncovered in new research from the University of Washington, set to be presented at the ACM Conference on Fairness, Accountability and Transparency in Montreal this June.
Researchers at the American university ran more than 24,000 prompts through two open-weight large language models – Google's Gemma-3-12B and Alibaba's Qwen-3-VL-8B – to test how identity affects the responses users get. Their findings suggest current safety systems are doing something close to the opposite of what they're meant to do.
To work out what was triggering the different treatment, the researchers ran the same 2,219 questions through the models in several different ways. Sometimes the user was given a short bio in standard English ("I'm a Black male and a Sales Executive"). Other times, the bio was dropped entirely, and the question itself was rewritten using the speech patterns of AAVE or Singlish. The questions covered four touchy areas: gender, race, religion and politics.
Refusal, refusal, refusal
When the user announced they were Black, refusals went up 7.5 percentage points compared with a White user asking exactly the same thing. When that same user dropped the bio and asked in AAVE instead, the gap almost disappeared – down to 0.6 percentage points.
For users typing in Singlish, an English-based creole spoken in Singapore, refusals dropped close to zero.
Of the 1,743 times the model declined to help, only 38 were outright blocks of the "I cannot fulfil this request" variety. The remaining 97.82% were what the researchers call "soft refusals" – the model fobbing the user off, asking for clarification, telling them to "provide the full prompt", or just stalling for time. The Black bio drew 630 of these brush-offs. The White bio drew 440. AAVE and Singlish drew fewer than 120 each.
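The soft-refusal share can be checked from the two counts reported above. The short Python sketch below is only an illustration of that arithmetic, not the researchers' code; the variable names are made up for this example, and the figures are the ones quoted in this article. Note that the per-bio numbers are raw counts, not refusal rates, so they are shown here only as shares of all soft refusals.

```python
# Counts as reported in the article (variable names are illustrative only).
total_refusals = 1743   # every time a model declined to help
hard_refusals = 38      # outright "I cannot fulfil this request" blocks

# Everything that was not an outright block counts as a soft refusal.
soft_refusals = total_refusals - hard_refusals
soft_share = 100 * soft_refusals / total_refusals

# Soft refusals attributed to the two explicit bios, as reported.
black_bio_soft = 630
white_bio_soft = 440
black_bio_share = 100 * black_bio_soft / soft_refusals
white_bio_share = 100 * white_bio_soft / soft_refusals

print(f"Soft refusals: {soft_refusals} ({soft_share:.2f}% of all refusals)")
print(f"Black bio share of soft refusals: {black_bio_share:.1f}%")
print(f"White bio share of soft refusals: {white_bio_share:.1f}%")
```

The first line of output reproduces the 97.82% figure: 1,705 of the 1,743 refusals were soft.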
Where things get more uncomfortable is what happens once the dialect bypasses the filter. Responses to dialect users were also more accurate, sticking closer to the original Wikipedia entries the researchers used as a benchmark. Singlish prompts produced better answers than Standard American English ones. AAVE recovered most of the quality lost when the user said they were Black.
Helpful to a point
When topics turned to politics or religion, dialect users got significantly less polished, less filtered, and more negative responses than White or Standard English users did.
The result is what the researchers call a split user experience. Standard English speakers get a cautious, sanitised version of the information: safer, but with more refusals on touchy topics.
Dialect speakers get the rawer, less filtered version with fewer refusals and more accurate answers, but also more exposure to negative content on charged subjects.
What's actually going on, the researchers argue, is that current safety systems are "brittle and over-indexed on explicit keywords".
The model watches for words like "Black" and goes into hyper-cautious mode when it spots them. It is not watching – or at least not nearly as closely – for the linguistic patterns those terms supposedly stand in for.
Racial profiling?
Chatbots increasingly remember user information across sessions. If a model has logged someone's identity from a previous conversation, that user could be hit with the identity penalty every time they come back – getting a permanently worse service without ever bringing up their race again.
Earlier studies have found that the automated filters used to clean up training data systematically scrub African American Language out of it – it makes up as little as 0.007% of the documents in major datasets.
The models are starved of the linguistic variety they'd need to handle these users properly, so the safety layer ends up grasping at surface-level keywords instead.