An age-old problem: how age estimation tools let kids see porn


Is age estimation software up to the task of protecting our kids? If not, what can be done to improve the method and stop children from accessing inappropriate content?

I was recently talking to a young family member, let's call him Dan, about technology. As he is in his mid-teens, we discussed access to age-related content and the UK's Online Safety Bill.

Dan is a big fan of YouTube. He told me he could see any over-18 video on YouTube as he always passed the age estimation checks. He demonstrated this, and sure enough, he could see an explicit video meant for over 18s. Dan is several years under 18.

You may say, "Well, teenagers, you gotta love 'em; they'll always find a way to see stuff they're not supposed to."

But this isn't really the issue. Sure, even when I was a teen, we found "dirty mags" that we really shouldn't have seen, we asked our older cousins to buy us bottles of cider from the off-licence, and so on. However, there is no excuse for any global tech giant to allow kids to see inappropriate content. Identifying someone's age to ensure they view age-appropriate content should be a fundamental part of humanity's tech maturation.

However, this isn't about YouTube; it's about age-estimation software failing kids. Can age be accurately determined using software alone, or should we rely on other social cues and processes to control the impact of access to inappropriate content? Is the answer to age checks a socio-technical one?

Why do we need age-estimation?

It goes without saying that accessing online content can open a can of worms. When children enter the (often) murky online world, they do not have the experience and knowledge of an adult to navigate safely.

Even adults struggle with some of the nastier sides of social media. However, minors can be exposed to disturbing images that can be deeply upsetting and even affect their worldview and how they interact with others.

Regulations and laws can help establish the framework to protect children from the more toxic nature of the internet, and in particular, social media nasties.

In the UK, this takes the form of the Online Safety Bill, which is intended to make the online world safer for children. This laudable goal is at various stages of execution worldwide.

Another example of a regulation that includes the protection of minors is the EU's Digital Services Act (DSA). Several states in the USA are enacting laws covering age restrictions. However, the situation is somewhat complicated in the USA, with requirements varying across states.

Many recent regulations suggest the use of age verification or age estimation tools. The UK Online Safety Bill, for example, says that sites that display pornographic images must use tools that are “highly effective in establishing whether a user is a child or not.” The problem is that establishing age is not as easy or as accurate as it may seem.

Age estimation or age verification?

Age estimation is not age verification. Estimating someone's age has become a tool of choice because it achieves better privacy and improved usability.

Unlike age verification, age estimation does not require personal data such as identity documents to be shared. Instead, the software typically uses an image of the individual's face to estimate their age. The estimate comes from an AI model, typically one trained on large sets of face images.

This balance between privacy and usability comes at a cost. For example, many age estimation tools proudly state they do not hold copies of people's faces. However, this is a double-edged sword: without a stored record, you cannot later cross-reference access to inappropriate content with the user. As with the older cousin buying alcohol, someone who is over 18, or at least looks it, can pass the check before handing the device to a child.

The age-old problem of human beings being really good at circumventing barriers means that anything short of tying a person to an event or transaction is bound to fail, at least some of the time. The counterargument is that you can never be 100% fail-proof, which is true. However, in the case of protecting children, the bar should be as high as possible.

Humans estimating age

We all know that humans circumvent technology's best efforts, but how good are humans at estimating age in real life? If we know this, we may have clues to better designing digital versions of age estimation.

As a young woman in my 20s, I was routinely asked to leave pubs as I looked underage (those days are long gone).

Estimating age is something humans do instinctively when we meet other people. As such, human-to-human age estimation occurs constantly in everyday transactions.

However, my trouble getting served in pubs when I was 25 may be unusual. Research by Liebst et al. at the University of Copenhagen has found that real-world age estimation is surprisingly accurate. The paper looked at "naturalistic observation" of age, i.e., observing people in their natural surroundings. The researchers found that age estimations by human subjects were accurate and reliable across the raters assessing age.

However, the paper stresses that accuracy may be attributable to a "range of contextual and behavioral information" available to age raters. This is an important factor to note – having multiple variables may also be the key to improving age estimation performed by software tools.

Interestingly, the study found a "tendency to overestimate the age of younger persons and underestimate the age of older persons." The paper found estimation errors for faces were around five years, and for voices, around ten years. The paper also notes estimation errors based on ethnicity, age, and sex. Weighting an age estimation algorithm with these factors will help enhance accuracy.

Indeed, many age estimation tools caveat their accuracy claims by recommending a configuration that allows for these potential errors.

A recent example demonstrates the importance of correctly configuring age estimation. A misconfiguration by OnlyFans of its age estimation technology led to non-compliance with age restrictions. Instead of the suggested limit of blocking anyone who looks under 23 years old, the firm had set the limit to 20 years old. Ofcom is currently investigating OnlyFans to see if it is in dereliction of its duty to prevent under-18s from accessing the site. Still, blocking everyone who looks under 23 seems a blunt tool.
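To see why the threshold setting matters, here is a minimal Python sketch of a challenge-age check. The function names, the outcome labels, and the five-year overestimation margin are illustrative assumptions, not any vendor's actual product or API; only the 18, 20, and 23 figures come from the example above.

```python
LEGAL_AGE = 18


def face_check(estimated_age: float, challenge_age: int) -> str:
    """Return 'allow' if the face estimate clears the challenge age, else 'verify'."""
    if estimated_age >= challenge_age:
        return "allow"
    # Hypothetical fallback: ask for a stronger check, such as an ID document.
    return "verify"


def youngest_admitted(challenge_age: int, max_overestimate: float) -> float:
    """Youngest true age that could still pass if the tool can
    overestimate a face by up to max_overestimate years."""
    return challenge_age - max_overestimate


# Assumption: the estimator may overestimate a young face by up to 5 years.
ASSUMED_MARGIN = 5

for challenge_age in (23, 20):
    youngest = youngest_admitted(challenge_age, ASSUMED_MARGIN)
    status = "at or above" if youngest >= LEGAL_AGE else "below"
    print(f"challenge age {challenge_age}: youngest admitted {youngest} ({status} legal age)")
```

Under that assumed margin, a challenge age of 23 keeps the youngest person who could slip through at 18, while a challenge age of 20 would admit someone as young as 15. The buffer between the legal age and the challenge age is what absorbs the estimation error.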

Improving on the natural world

Age estimation technology is an intrinsic part of online life, and its continued improvement in estimation capabilities is a must. However, the next generation of age checks will be a risk-based approach to age estimation. The Liebst et al. paper demonstrates that even humans, who have evolved to recognize facial subtleties, must utilize multiple variables to improve accuracy. But what about computer vision and AI in age estimation? Is this seemingly ultimate method of checking someone's age actually in need of support?

Instead of relying on AI face training to solve the age estimation issue, we should look at multiple other variables, like behavior and even dress, to identify someone's age. Adding variables based on the transaction's risk level may provide the checks and balances that age estimation based on face alone cannot. However, as is always the case, this is where the yin hits the yang. Usability is one of the most important aspects of face-based age estimation. A quick face check is almost 100% accurate at estimating that 13–17-year-olds are under 25. A risk-based approach could be the way to balance usability and accuracy.
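One way to picture a risk-based approach is the Python sketch below. Everything in it is hypothetical: the risk tiers, the challenge ages attached to them, the second "behaviour" signal, and the outcome labels are assumptions made for illustration, not a description of any existing tool.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical challenge ages per risk tier; real values would need tuning.
CHALLENGE_AGE = {"low": 18, "medium": 21, "high": 25}


@dataclass
class Signals:
    face_estimate: float                         # age estimated from a face image
    behaviour_estimate: Optional[float] = None   # optional age estimate from a second signal


def decide(signals: Signals, risk: str) -> str:
    """Return 'allow' when every available estimate clears the tier's
    challenge age; otherwise return 'step_up' to request a stronger check."""
    threshold = CHALLENGE_AGE[risk]
    estimates = [signals.face_estimate]
    if signals.behaviour_estimate is not None:
        estimates.append(signals.behaviour_estimate)
    # Be conservative: the lowest estimate decides the outcome.
    if min(estimates) >= threshold:
        return "allow"
    return "step_up"


print(decide(Signals(face_estimate=24.0), "low"))    # allow
print(decide(Signals(face_estimate=24.0), "high"))   # step_up
print(decide(Signals(face_estimate=27.0, behaviour_estimate=19.0), "high"))  # step_up
```

The design choice here is that low-risk content keeps the quick, usable face check, while higher-risk content raises the bar and lets any additional signal veto the face estimate.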

However, if your 14-year-old can access OnlyFans or an upsetting video on social media, then the 99%+ estimation accuracy means nothing. This critical aspect of the digital world is about to evolve further. If this stops my young family member from seeing inappropriate content, this is great, but it may also stop a young person from being groomed and abused, in which case it is essential.