Artists fighting back against AI by poisoning their images


There are tools available that can poison data and cause AI models to malfunction. The question is whether using them is a justified response by artists to copyright infringement or a potential menace to cybersecurity.

In October, researchers at the University of Chicago introduced "Nightshade," a data poisoning technique designed to disrupt the training process of AI models.

The study suggests that Nightshade and similar tools that disrupt AI could serve as a last line of defense for content creators against web scrapers, emphasizing implications for both model trainers and content creators.


In their research paper, the researchers explain that Nightshade is a type of data poisoning attack that manipulates the training process of text-to-image generative models. Contrary to the previous understanding of poisoning attacks, the research demonstrates that successful attacks don't require the injection of millions of poison samples.

Images generated from different prompts by a model in which the concept “dog” is poisoned; nearby concepts, though not targeted, are also corrupted. SD-XL model poisoned with 200 poison samples. Source: research paper

Nightshade focuses on prompt-specific poisoning attacks: poison samples are optimized so that they look identical to benign images matching their text prompts, while their features are shifted toward a different concept. Injected in even small numbers, these optimized samples can significantly degrade the stability and performance of generative models, disrupting their ability to produce meaningful images.
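To make that mechanism concrete, here is a minimal sketch of feature-space poisoning, assuming a stand-in setup rather than the authors' own code: it perturbs a benign image within a small pixel budget so that a feature extractor "sees" it as an image of a different concept. The ResNet-50 extractor, the perturbation budget, the step count, and the file paths are all placeholder assumptions; the actual Nightshade attack optimizes against the generative model's own image encoder and uses a perceptual similarity constraint.

```python
# Conceptual sketch of prompt-specific data poisoning. This is NOT the
# authors' Nightshade implementation: the feature extractor, perturbation
# budget, and file paths below are placeholder assumptions for illustration.

import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in feature extractor; the real attack targets the generative
# model's own image encoder.
feature_extractor = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
feature_extractor.fc = torch.nn.Identity()   # keep features, drop the classifier
feature_extractor.eval().to(device)

to_tensor = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

def load(path):
    return to_tensor(Image.open(path).convert("RGB")).unsqueeze(0).to(device)

def craft_poison(benign_path, anchor_path, eps=8 / 255, steps=200, lr=0.01):
    """Perturb the benign image (e.g. a 'dog' photo) so its features resemble
    the anchor image (e.g. a 'cat' photo), while staying visually unchanged."""
    benign = load(benign_path)   # keeps its original caption when scraped
    anchor = load(anchor_path)   # image of the concept the attacker targets

    with torch.no_grad():
        target_features = feature_extractor(normalize(anchor))

    delta = torch.zeros_like(benign, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        poisoned = (benign + delta).clamp(0, 1)
        loss = F.mse_loss(feature_extractor(normalize(poisoned)), target_features)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # A small L-infinity budget keeps the poison visually identical
        # to the original image.
        delta.data.clamp_(-eps, eps)

    return (benign + delta).detach().clamp(0, 1)
```

A poisoned image produced this way would keep its original caption (for example, "a photo of a dog") when scraped into a training set, which is how, at scale, a model's notion of "dog" can drift toward "cat."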

The researchers tested the attack on Stable Diffusion's newest models and on an AI model they trained themselves. When they fed Stable Diffusion just 50 poisoned dog images and asked it to generate more dog images, the results came out warped, with creatures sporting too many limbs and cartoonish faces. With 300 poisoned samples, an attacker can fool Stable Diffusion into producing dog images that resemble cats.

Along with Nightshade, other tools designed to disrupt AI training on artists’ imagery, such as Glaze and Aspose, are already available on the market.

Examples of images generated by the Nightshade-poisoned SD-XL models and the clean SD-XL model. Source: research paper

A threat to cybersecurity?

Because the poisoning technique disrupts how AI models work, it could also have cybersecurity implications.


One of the researchers, Ben Y. Zhao, told MIT Technology Review that there is a risk that threat actors could abuse the data poisoning technique for malicious purposes.

Nevertheless, he emphasized that causing significant harm to larger, more robust models would require thousands of poisoned samples, since those models are trained on billions of data samples.

Creators’ rights controversy

With AI legislation still in its infancy, AI companies have faced backlash from creators for using their content to train AI models.

AI companies like Stability AI and OpenAI, which have created generative text-to-image models, offer artists the option to exclude their images from the training of future model versions. However, artists argue that this gesture is insufficient.

A lawsuit filed by visual artists in January 2023 accused Stability AI, Midjourney, and DeviantArt of misusing their copyrighted work in connection with the companies' generative artificial intelligence systems. However, a California federal court largely dismissed the lawsuit in October 2023, ruling that the images the systems created likely did not infringe the artists' copyrights.

The stir caused by AI delving into the realm of human creativity has resonated across many industries. Last year, for example, thousands of members of the Writers Guild of America (WGA) went on strike, with regulation of AI use among their demands. The strikers wanted to bar AI from writing or rewriting literary material.
