Here's how many people it takes to weaponize TikTok's "Not Interested" button and degrade everyone's recommendations

The systems that underpin our social media, video and music platforms can be disrupted by as few as 40 coordinated users, according to new research.
- Just 40 coordinated users weaponizing the "Not Interested" button can degrade recommendation quality by 20% for everyone on platforms like TikTok and Spotify.
- Adversarial reports don't break the system – they tighten it, eating into the algorithm's "risk budget" and forcing it to filter more aggressively, resulting in up to 80% repeated content.
- Attackers can't suppress specific content (like certain creators or topics) because the math doesn't know what's being flagged – only that users marked it "not interested."
- Solution exists: calibrating safety thresholds individually per user instead of across the entire platform neutralizes attacks while cutting repeated content by 99%.

Almost everything we consume is filtered through algorithms these days – and when they occasionally go wrong, or take a sharp turn, as we saw with the takeover of Twitter by Elon Musk in October 2022, we notice just how much control we cede to computers.
But we ought to care more: new research suggests that a coordinated group representing as little as 1% of a platform's users can degrade the quality of recommendations shown to everyone else by up to 20%, just by weaponising the "Not Interested" button to game the algorithm.
The findings, published by Giovanni De Toni at Italy's Fondazione Bruno Kessler and colleagues at the European Commission's Joint Research Centre, suggest that a new generation of what are often called "risk-controlling" recommender systems may be more vulnerable than previously thought.
These systems are designed to give users provable guarantees about how much unwanted content they encounter, yet small, organised groups acting in bad faith can exploit the way those guarantees are enforced.
Warping the world’s view
Recommendation engines shape what nearly half the world's population sees online, from videos on TikTok to playlists on Spotify. To rein in their tendency to amplify harmful or unwanted material, platforms have increasingly handed users interface-level controls designed to let them shape their own feeds.
The newest systems use a statistical technique called conformal risk control to mathematically bound, in expectation, how often unwanted items appear in a user's feed.
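To make the idea concrete, here is a minimal Python sketch of how a conformal risk control threshold might be calibrated. It is an illustration under stated assumptions, not the researchers' implementation: the function name `calibrate_threshold`, the toy scores, and the choice of loss (the share of a user's candidate items that are both shown and flagged) are all hypothetical. The only part taken from the general technique is the calibration rule itself: keep the loosest threshold for which (sum of calibration losses + maximum loss) / (n + 1) stays at or below the target risk.

```python
# Toy sketch of conformal risk control for a feed filter (all names hypothetical).
# An item is shown when its predicted "unwanted" score is below the threshold t.

SCORES = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
GRID = [i / 10 for i in range(1, 11)]  # candidate thresholds 0.1 ... 1.0


def calibrate_threshold(users, alpha, grid):
    """Return the loosest threshold whose conformal bound is <= alpha.

    users: one list per calibration user of (score, flagged) pairs.
    The per-user loss is the fraction of candidate items that are shown
    (score < t) and flagged, so it can only grow as t grows and is at most 1.
    Returns None when even the strictest threshold on the grid fails.
    """
    n = len(users)
    best = None
    for t in sorted(grid):
        losses = [
            sum(1 for score, flagged in items if score < t and flagged) / len(items)
            for items in users
        ]
        # Conformal risk control bound with maximum loss B = 1.
        bound = (sum(losses) + 1.0) / (n + 1)
        if bound <= alpha:
            best = t
        else:
            break  # the loss is non-decreasing in t, so larger t fails too
    return best


# Nine calibration users who flag every item scored above 0.6.
calibration_users = [[(s, s > 0.6) for s in SCORES] for _ in range(9)]
threshold = calibrate_threshold(calibration_users, alpha=0.2, grid=GRID)
print("calibrated threshold:", threshold)
```

The point to notice is that the threshold depends only on aggregate flag counts, which is why extra flags from any source push it towards stricter filtering.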
But that mathematical technique has a weakness that can be exploited. "Their reliance on aggregate feedback signals makes them inherently susceptible to coordinated adversarial user behaviour," the authors write, in what they describe as a pre-deployment audit of the technology.
Using a publicly released dataset from the Chinese short-form video platform Kuaishou, the team simulated what would happen if a small collective of users decided to coordinate their reports.
They found that a group of just 40 users, each flagging at most 1% of the items they were shown, dragged a standard recommendation quality metric called nDCG (normalised discounted cumulative gain) down by 20% in the worst case. Recall, which measures how well the system surfaces relevant items, fell by up to 60% in some scenarios.
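For readers unfamiliar with the two metrics, the following Python sketch shows one common way to compute them for a single ranked feed. It uses the standard textbook definitions with binary or graded relevance labels; the function names and the example feeds are made up for illustration, and the study's exact evaluation setup may differ.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: relevance discounted by log2 of the rank."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg_at_k(ranked_relevances, k):
    """DCG of the top-k items divided by the DCG of the best possible top-k."""
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg(ideal[:k])
    return dcg(ranked_relevances[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0


def recall_at_k(ranked_relevances, k):
    """Share of all relevant items that appear in the top-k."""
    total = sum(1 for rel in ranked_relevances if rel > 0)
    hits = sum(1 for rel in ranked_relevances[:k] if rel > 0)
    return hits / total if total else 0.0


# Hypothetical feeds: 1 = relevant to the user, 0 = not relevant.
healthy_feed = [1, 1, 0, 1, 0, 0]
over_filtered_feed = [1, 0, 0, 0, 1, 1]  # relevant items pushed out of the top slots

for name, feed in [("healthy", healthy_feed), ("over-filtered", over_filtered_feed)]:
    print(name, "nDCG@3:", ndcg_at_k(feed, 3), "recall@3:", recall_at_k(feed, 3))
```

Both metrics fall when relevant items are pushed out of the top positions, which is the effect over-aggressive filtering has on a feed.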
Silent but deadly
What makes it worse is that adversarial reports don’t break the system's safety guarantees; they tighten them.
Each strategically reported item eats into the system's "risk budget", forcing the algorithm to filter more aggressively across the entire user base. In some experiments, up to 80% of the videos shown to users were repeats.
While the ease with which a feed could be waylaid is a concern, there is one form of mischief-making the systems can avoid. Coordinating users can’t reliably suppress a specific category of content, such as videos about a particular topic or creator, because the mathematics being exploited never sees the content of the items that get flagged.
Still, there’s a risk for platforms because the EU's Digital Services Act requires very large online platforms to assess risks arising from "intentional manipulation" of their services, including coordinated activity.
Solving that issue matters if platforms are to avoid falling foul of regulators.
The authors propose a fix: calibrating the system's safety threshold for each user individually, rather than across the whole population. In their simulations, the attacks were neutralised while the safety guarantees still held, and friction fell too, with the share of repeated items cut by roughly 99%.
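The difference between platform-wide and per-user calibration can be sketched in a few lines of Python. This is a toy model built on assumptions, not a reproduction of the study: the user names, scores, session counts and the attacker behaviour (flagging every item shown) are invented, and `calibrate` is a hypothetical helper that applies the generic conformal risk control rule to whatever sessions it is given.

```python
# Toy comparison of pooled vs per-user calibration (all names and data hypothetical).

SCORES = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
GRID = [i / 10 for i in range(1, 11)]
ALPHA = 0.2  # target rate of shown-and-flagged items


def calibrate(sessions, alpha, grid):
    """Loosest threshold whose conformal bound on the flagged rate is <= alpha."""
    n = len(sessions)
    best = None
    for t in sorted(grid):
        losses = [
            sum(1 for score, flagged in s if score < t and flagged) / len(s)
            for s in sessions
        ]
        if (sum(losses) + 1.0) / (n + 1) <= alpha:
            best = t
        else:
            break  # loss is non-decreasing in t
    return best


def honest_session():
    # Honest users flag only items the model already scores as likely unwanted.
    return [(s, s > 0.6) for s in SCORES]


def attacker_session():
    # Assumed attack: flag every item, whatever its score.
    return [(s, True) for s in SCORES]


history = {f"honest_{i}": [honest_session() for _ in range(9)] for i in range(8)}
history.update(
    {f"attacker_{i}": [attacker_session() for _ in range(9)] for i in range(2)}
)

honest_only = [s for user, ss in history.items() if user.startswith("honest") for s in ss]
everyone = [s for ss in history.values() for s in ss]

pooled_clean = calibrate(honest_only, ALPHA, GRID)      # no attackers present
pooled_attacked = calibrate(everyone, ALPHA, GRID)      # one threshold for all
per_user = {user: calibrate(ss, ALPHA, GRID) for user, ss in history.items()}

print("pooled, no attack:", pooled_clean)
print("pooled, attacked: ", pooled_attacked)
print("per-user, honest: ", per_user["honest_0"])
print("per-user, attacker:", per_user["attacker_0"])
```

In this toy setup the attackers' flags drag the shared threshold down for everyone, while under per-user calibration they only tighten their own feeds and honest users keep the threshold they would have had without the attack.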