Images that distort AI’s vision affect humans too

A recent study by Google reveals that slight alterations made to digital images – which are aimed at misleading computer vision systems – can also impact the way humans perceive them.

A group of researchers have concluded a series of experiments on the effects of adversarial images on human perception.

What’s an adversarial image? It’s an image in which individual pixels have been subtly altered, causing an AI model to misclassify its contents completely. For example, an image of a vase could be tinkered with slightly, in a way that’s not visible to the human eye, to make the AI categorize it as a cat.

In an RGB digital image, each pixel’s color channels are stored on a 0-255 scale, with each value representing the intensity of that channel. An adversarial attack (an intentional deception) can be effective even if no pixel is modulated by more than 2 levels on that scale.
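To make the "no more than 2 levels" constraint concrete, here is a minimal Python sketch. It is not the attack used in the study: random shifts stand in for the carefully optimized shifts a real attack would compute, and the names `perturb`, `max_change`, and the sample values are hypothetical.

```python
import random

MAX_LEVEL = 255  # 8-bit channel values run from 0 to 255
EPSILON = 2      # largest allowed change per value, as described in the article


def perturb(pixels, epsilon=EPSILON, seed=0):
    """Return a copy of `pixels` with every value shifted by at most `epsilon`.

    `pixels` is a flat list of 8-bit channel values. The shifts here are
    random placeholders, not an optimized adversarial pattern.
    """
    rng = random.Random(seed)
    out = []
    for value in pixels:
        shifted = value + rng.randint(-epsilon, epsilon)
        # Clip so the result is still a valid 8-bit value.
        out.append(min(MAX_LEVEL, max(0, shifted)))
    return out


def max_change(original, perturbed):
    """Largest absolute per-value difference between the two images."""
    return max(abs(a - b) for a, b in zip(original, perturbed))


image = [0, 12, 128, 200, 254, 255]  # hypothetical channel values
adversarial = perturb(image)
print("largest change:", max_change(image, adversarial))
```

Whatever pattern is chosen, the check in `max_change` is what the "2 levels" limit refers to: no single value moves further than that from the original.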

While human perception of the adversarial image is generally assumed to be unaffected by such seemingly irrelevant noise, artificial neural networks (ANNs) can be easily misled. One possible explanation is that machine perception is heavily influenced by texture, whereas human perception is guided by shape.

Although humans are often assumed to be impervious to adversarial perturbations that fool ANNs, research published in Nature Communications presents evidence that human judgments can be systematically influenced by them.

Researchers performed controlled behavioral experiments with ANNs and humans. To start with, they took a series of original images and carried out two adversarial attacks on each, producing many pairs of perturbed images. The model misclassified the perturbed images with high confidence.

ANN experiment
Left: The ANN correctly classifies the image as a vase. When the image is perturbed by a seemingly random pattern across the entire picture (middle, with the intensity magnified for illustrative purposes), the resulting image (right) is confidently misclassified as a cat. | Source: Google
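Why can such a tiny change flip a model's answer? One common intuition is that many small shifts, each pushed in the direction the model is most sensitive to, add up across thousands of pixels. The Python sketch below illustrates that idea on a toy linear scorer. The model, its weights, the "vase"/"cat" threshold, and the function names are all hypothetical, and this is not the attack or the network used in the study.

```python
def score(weights, pixels, bias):
    """Toy linear model: a positive score means "cat", otherwise "vase"."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias


def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def attack(weights, pixels, epsilon=2):
    """Shift each pixel by `epsilon` in the direction that raises the score."""
    return [
        min(255, max(0, p + epsilon * sign(w)))  # keep values within 0-255
        for w, p in zip(weights, pixels)
    ]


# Hypothetical setup: 1,000 mid-gray pixels and small alternating weights.
n = 1000
weights = [0.01 if i % 2 == 0 else -0.01 for i in range(n)]
pixels = [128] * n
bias = -10.0

perturbed = attack(weights, pixels)

for name, img in (("original", pixels), ("perturbed", perturbed)):
    label = "cat" if score(weights, img, bias) > 0 else "vase"
    print(name, "->", label)
```

In this toy setup no pixel moves by more than 2 levels, yet the summed effect is enough to push the score across the decision threshold.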

Following this, the researchers presented pairs of pictures to human participants and posed a specific question: "Which image appears more cat-like?" Even though neither image resembled a cat, participants were required to make a choice, often expressing a sense of arbitrariness in distinguishing between nearly identical images.

If brain activations were insensitive to subtle adversarial attacks, people would choose each picture 50% of the time, on average. However, researchers found that the choice rate was reliably above chance for a wide variety of perturbed picture pairs, even when no pixel was adjusted by more than 2 levels on the 0-255 scale.
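One way to check whether a choice rate is "reliably above chance" is an exact binomial test against the 50% baseline. The Python sketch below shows the idea. The counts are made up for illustration and are not data from the study, and the article does not say which statistical test the researchers actually used.

```python
from math import comb


def binomial_p_value(successes, trials, chance=0.5):
    """One-sided exact p-value: the probability of seeing at least
    `successes` out of `trials` if every choice were made at `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical counts, not taken from the study.
trials = 1000        # total forced choices
chose_target = 540   # times the "more cat-like" perturbed image was picked

p = binomial_p_value(chose_target, trials)
print(f"choice rate = {chose_target / trials:.2f}, p = {p:.4f}")
```

A small p-value means a result this far above 50% would be unlikely if participants were simply guessing.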

Left: Examples of pairs of adversarial images. The top pair of images is subtly perturbed, at a maximum magnitude of 2 pixel levels, to cause a neural network to misclassify them as a “truck” and a “cat”, respectively. A human volunteer is asked, “Which is more cat-like?” The lower pair of images is more obviously manipulated, at a maximum magnitude of 16 pixel levels, to be misclassified as a “chair” and a “sheep”. The question this time is, “Which is more sheep-like?” | Source: Google

The discovery highlights a similarity between human and machine vision. However, while researchers found that adversarial manipulations reliably altered human perception, the change observed is not as drastic as the complete shift in classification decisions seen in artificial neural networks.

“Our primary finding that human perception can be affected – albeit subtly – by adversarial images raises critical questions for AI safety and security research, but by using formal experiments to explore the similarities and differences in the behavior of AI visual systems and human perception, we can leverage insights to build safer AI systems,” the scientists said.

According to them, their findings could guide future research aimed at enhancing the resilience of computer vision models by aligning them more closely with human visual representations. Assessing human vulnerability to adversarial perturbations may serve as a valuable metric for evaluating the alignment of various computer vision architectures.
