Date: 2025-08-19

Even though facial recognition is used ever more widely each year, researchers now say these systems have mostly been tested in the lab and thus aren’t actually as effective on the street.

It’s among us, but it hasn’t really been tested among us. Yep, facial recognition technology is being deployed en masse, and its advocates claim the systems are performing extremely well and keep getting better.

Typically, the US National Institute of Standards and Technology’s (NIST) Face Recognition Technology Evaluation (FRTE), a performance benchmark, has been used to justify the deployment of facial recognition systems in the US, the UK, and elsewhere.

But University of Oxford academics Teo Canmetin, Luc Rocher, and Juliette Zaccour now say that the benchmark is actually problematic.

According to their post on the Tech Policy Press website, these tests reflect performance in laboratory conditions rather than in the real world. On the street, the technology has already failed multiple times.

For example, a Black man named Robert Williams was wrongfully arrested in Detroit in 2020 after being misidentified by facial recognition software. Police later admitted they were misled by a poor-quality surveillance image.

In 2024, Shaun Thompson, a London activist, was also wrongfully identified by live facial rec tech as a criminal suspect and “aggressively” stopped by police.

The facial rec technology has already been shown multiple times to be unreliable for people of color, women, and older folks. Researchers and activists have long claimed that the potential for errors is too great and that mistakes could result in the jailing of innocent people.

Still, “despite these repeated failures, the technology is rapidly being integrated into our daily lives, in airports, retail stores, and policing,” the Oxford academics point out in their article.

Moreover, the technology’s deployment is often justified by impressive accuracy statistics. For the latest and best-performing models, standardized evaluations now report figures as high as 99.95% accuracy. That’s the aforementioned FRTE benchmark.

“While invaluable for tracking advances in the field, these benchmarks are not necessarily a good choice for anticipating how systems cope with real-world challenges,” the researchers caution.

That’s because these are lab evaluations conducted in controlled settings, “creating a misleading picture of how these systems truly perform when confronted with diverse, messy, and unpredictable real-world environments.”

According to the Oxford trio, these types of evaluation fail to reflect real-world conditions, where images may be blurred or obscured – like a rainy street or a crowded stadium, for example.

Besides, the benchmark datasets are simply too small, which allows for a greater chance of misidentification. They also fail to reflect real-world demographic diversity.

Even so, facial recognition is by now pretty common across police departments worldwide.

In London, an independent review of the Metropolitan Police's live facial recognition trials recently found that only eight of 42 matches could be confirmed as absolutely accurate.

But last week, the Home Office still announced the rollout of 10 new live facial recognition vans to seven forces across the country, equipping officers with this cutting-edge – and controversial – technology to catch criminals.