AI is permeating the foundation stones of society, from criminal justice to medicine to art. In doing so, it has opened a debate that’s currently raging.
Rather than being hailed as a dispassionate algorithm that removes bias from decision-making, AI has been described as dystopian and a danger to humanity. But is this hyperbole, or are AI algorithms as biased as they are made out to be?
The biased human
Bias isn't just an AI issue. Human society is built on social norms, and conforming to those norms relies on evolved behaviors. These behaviors help establish group cooperation, making cognitive bias an inherent part of being human.
You might fancy yourself an independent thinker, a thoughtful and objective exception to the rule. Tests such as Harvard's Implicit Association Test (IAT) can show you just how biased you are. Our in-built biases are thought to stem from "cognitive shortcuts" that help us make fast decisions. The result is that these biases have seeped into pretty much everything humans do, for example:
Medicine has a long history of bias, with no AI required. Gender bias in medicine goes back a long way, with clinical trials often consisting only of male participants. But male and female bodies are fundamentally different, and this is borne out in how they respond to disease and medication. For example, autoimmune conditions affect 8% of the population, but 75% of those affected are female. The FDA and other medical authorities now encourage clinical trial developers to include more women.
Recruitment has a similarly long history of bias. Affinity bias is when an interviewer favors the candidate most like themselves, and biased recruiters are the reason diversity programs came about. However, bias in recruitment (and other areas) is often unconscious. Bias comes in many forms: unconscious sexism, racism, and ageism are common themes in attempts to remove bias from hiring.
So, if bias is human, can we really expect AI to be non-biased?
How bias creeps into AI algorithms
I recently received a preprint of an academic paper exploring bias in LLMs (Large Language Models). The paper is by two researchers in evolutionary psychology and cultural evolution, Alberto Acerbi and Joseph Stubbersfield, and it describes an experiment designed to determine whether LLMs exhibit bias.
The experiment used a principle of cultural evolution called "transmission chains." Transmission chains are useful for observing how information is passed between people and whether any changes occur during the transfer. The transmission chain model has been used to understand how cumulative and systematic biases influence cultural transmission and evolution. There's some interesting work on this aspect of cultural evolution, well worth looking up.
But back to the preprint and LLM biases. The researchers asked the LLM to summarize a story, fed the summarized version back to the model in the next step, and continued the process iteratively. They wanted to see whether the LLM showed the same biases that are inherent in humans. It turns out it does – a result that may not be so surprising. However, there were some important differences between the LLM's biases and human ones, particularly around gender bias, as the authors summarize:
“On the other side, for the same reason, those biases could be more difficult to recognize, and they could have consequential downstream effects, by magnifying the pre-existing human tendencies. We might anticipate that, without human intervention, LLMs could enable negative gender stereotypes to persist in potentially harmful ways.”
This worrying magnification of biases has serious implications for humans. If we imbue our systems with LLMs and use them to make decisions, any inherent bias, no matter how insignificant it may seem, will be magnified over time, compounding to generate inaccurate or harmful decisions.
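To make the transmission-chain setup concrete, here is a minimal Python sketch of the iterated-summarization loop, assuming a hypothetical `generate(prompt)` function wrapping whatever LLM API you have access to. This is an illustration of the general method, not the authors' actual code or prompts.

```python
# Minimal sketch of a transmission chain run with an LLM.
# `generate(prompt)` is a hypothetical wrapper around an LLM API of your choice;
# the real experiment used its own materials, prompts, and analysis.

def generate(prompt: str) -> str:
    """Placeholder: call your LLM here and return its text response."""
    raise NotImplementedError("Wire this up to an actual LLM API.")

def transmission_chain(seed_story: str, generations: int = 5) -> list[str]:
    """Repeatedly ask the model to retell the previous output, like a game of telephone."""
    chain = [seed_story]
    current = seed_story
    for _ in range(generations):
        prompt = (
            "Read the following story and summarise it in your own words:\n\n"
            + current
        )
        current = generate(prompt)
        chain.append(current)
    return chain

# The analysis step (not shown) compares which details survive across generations,
# e.g. whether stereotype-consistent content is retained more often than
# stereotype-inconsistent content.
```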
The transmission chain experiment shines a light on compounding errors and bias build-up, but the many places where bias enters AI are the very places where preventing bias can begin. Here is a flavor of some types of AI bias:
Data bias: if the data an algorithm is trained on is biased, the AI system will inherit and perpetuate that bias. An obvious example: if the training data comes mostly from a specific demographic, the AI system may struggle to understand or respond accurately to inputs from other demographics (a simple check for this kind of imbalance is sketched after this list). There are numerous examples of this type of bias, from an automated soap dispenser that seemed unable to dispense soap to anyone without white skin, to voice-activated digital assistants that can't understand regional accents. I can attest to the latter, as I have yet to find a virtual assistant that understands my accent.
Algorithmic bias: the algorithms and techniques used in developing AI systems can themselves introduce bias. During the Covid pandemic, a form of algorithmic bias was identified in a system used to grade school exams. The algorithm was found to be biased against poorer children, giving them lower grades than expected compared with their wealthier peers. An algorithm's design choices, feature selection, or optimization objectives, like those of any software system, can inadvertently introduce or amplify biases in the system's behavior.
Representation bias: AI models can reinforce existing societal biases and stereotypes present in the data used to train them. Any imbalances or inaccuracies in the training data will be reflected and, as the transmission chain experiment showed, may be amplified. This is of particular concern in the justice system, where it could lead to unfair or discriminatory outcomes.
Confirmation bias: AI systems may exhibit confirmation bias by prioritizing or favoring information that aligns with pre-existing beliefs or assumptions present in the training data. This bias can result in the system reinforcing existing perspectives rather than providing objective or diverse viewpoints. An infamous example is an AI recruitment tool developed by Amazon. The tool, dropped in 2017, was blatantly sexist, preferring male candidates because it was trained on a male-dominated set of resumes.
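As a concrete illustration of how data and representation bias can be surfaced before a model is ever trained, here is a small Python sketch that compares how strongly each demographic group is represented in a training set and how often each group receives a positive label (a rough demographic-parity check). The dataset and column names are hypothetical, purely for illustration.

```python
import pandas as pd

# Toy training set with a hypothetical 'group' column and a binary 'label'.
df = pd.DataFrame({
    "group": ["A", "A", "A", "A", "B", "B"],
    "label": [1, 1, 1, 0, 0, 0],
})

# 1. Representation: what share of the data does each group contribute?
representation = df["group"].value_counts(normalize=True)
print("Share of training data per group:\n", representation)

# 2. Outcome rates: how often does each group receive the positive label?
positive_rate = df.groupby("group")["label"].mean()
print("Positive-label rate per group:\n", positive_rate)

# Large gaps in either number are a warning sign: a model trained on this data
# is likely to inherit and reproduce the imbalance.
```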
Can we do anything to remove AI bias?
Of course, the question must be: can we prevent bias from slipping into AI systems? People beget AI systems. AI is trained on human data, and humans develop the algorithms that use that data. It may not always be this way, but for now we have some control.
Data is paramount to building unbiased AI, and synthetic data is finding some traction in addressing bias. A synthetic dataset is not real data, but it maintains the statistical characteristics of real data, which makes it attractive for both AI development and bias removal.
Synthetic data can help ensure that datasets are more representative of a diverse population, with the synthetic dataset corrected for balance. Sometimes synthetic data is used to 'fill in the gaps' of a training set known to contain bias. But you must be careful: badly designed synthetic data can reproduce the very bias it is meant to prevent.
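As a toy illustration of the 'fill in the gaps' idea, here is a hedged Python sketch that tops up an under-represented group with synthetic rows generated from the statistics of that group's real records. Real synthetic-data pipelines (SMOTE-style oversampling, generative models, and so on) are far more sophisticated; the groups, columns, and numbers here are invented for the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Toy dataset in which group "B" is under-represented.
real = pd.DataFrame({
    "group": ["A"] * 80 + ["B"] * 20,
    "score": np.concatenate([rng.normal(60, 10, 80), rng.normal(55, 12, 20)]),
})

def synthesize(df: pd.DataFrame, group: str, n_new: int) -> pd.DataFrame:
    """Draw synthetic rows that mimic the mean/std of the real group's numeric feature."""
    subset = df[df["group"] == group]
    samples = rng.normal(subset["score"].mean(), subset["score"].std(), n_new)
    return pd.DataFrame({"group": [group] * n_new, "score": samples})

# Top up group "B" so both groups are equally represented.
deficit = (real["group"] == "A").sum() - (real["group"] == "B").sum()
balanced = pd.concat([real, synthesize(real, "B", deficit)], ignore_index=True)

print(balanced["group"].value_counts())
# The caveat from the text applies: if the real "B" records are themselves biased,
# the synthetic rows will faithfully reproduce that bias.
```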
Taking a multidisciplinary approach to bias in AI is opening new pathways to improve accuracy and diversity. The paper discussed above, which provides insights into how AI bias can be exacerbated, comes from the social sciences, not computer science. Fields such as the behavioral sciences and ethics can inform algorithm design, better use of data, and human-machine interaction.
The bottom line is that artificial intelligence still depends on humans to define its boundaries. We just need to understand where those boundaries exist.
An important lesson comes from the world of cybersecurity: anyone who has worked in the field knows that as soon as you close one door, a hacker will open another. The world of AI is no different. OpenAI has added mechanisms such as human feedback to try to remove some of the bias and ensure that images generated by DALL·E are more representative of the world's diversity. However, as OpenAI adds restrictions to ChatGPT, methods to circumvent those restrictions appear – just Google "how to circumvent ChatGPT restrictions."
A last word from ChatGPT
I'll leave the last word to ChatGPT (GPT-4). I asked the LLM about bias in AI, and it said this:
“Addressing AI biases is an ongoing challenge in AI development. Researchers and practitioners are actively working on techniques such as bias detection, data preprocessing, algorithmic fairness, and diverse training data collection to mitigate and minimize biases in AI systems. Ethical guidelines, regulatory frameworks, and inclusive practices also play a crucial role in ensuring fairness, transparency, and accountability in AI deployment.”
Thanks, ChatGPT. Hopefully, we humans will heed your advice.