Researchers extract audio from still images and silent videos

What if you could hear photos? Impossible, right? Not anymore – with the help of artificial intelligence (AI) and machine learning, researchers can now get audio from photos and silent videos.

Academics from four US universities have teamed up to develop a technique called Side Eye that can extract audio from static photos and silent – or muted – videos.

The technique targets the image stabilization technology that is now virtually standard across most modern smartphones.

To avoid blurry photos, cameras use small springs to hold the lens suspended in liquid. Sensors detect camera shake, and an electromagnet then pushes the lens in the equal and opposite direction to compensate.

What Side Eye does is analyze how camera lenses adjust to movements caused by sound waves, and extract that information from the recorded photo or video to reproduce the original sound.

That’s because whenever someone speaks near a camera lens, the sound causes tiny vibrations in the springs and bends the light slightly. The angle changes almost imperceptibly, but the changes can be detected, researchers said.

The audio reconstruction accuracy of the Side Eye technique ranges from 80% to 99%, depending on the amount and complexity of the sound being reconstructed, said Kevin Fu, professor of electrical and computer engineering and computer science at Northeastern University.

Researchers say that Side Eye currently doesn't work with speech from human voices and was only tested with sound from powerful speakers. But it isn’t hard to imagine this being possible in the future, of course – and from the cybersecurity perspective, it’s quite dangerous.

“Our analysis and experiments with ten smartphones demonstrate how malicious parties with knowledge of camera hardware structure can extract fine-grained acoustic information from recorded videos, achieving digit, speaker, and gender recognition,” researchers said.

This is why their paper is called “Side Eye: Characterizing the Limits of POV Acoustic Eavesdropping from Smartphone Cameras with Rolling Shutters and Movable Lenses.”

The general expectation of smartphone users is that no information can be stolen through sound when smartphone microphone access is disabled – but they easily grant camera access to apps simply because they’re not aware of the possibility of acoustic eavesdropping.

But if the app is malicious (and there are loads of them), trouble awaits, researchers say. Yes, the audio is very muffled, but machine learning and AI can help clean it up and turn it into useful information.

Countermeasures include using lower-quality cameras and keeping devices away from speakers. Adding vibration-isolating damping materials is also advised, and manufacturers keen on securing future camera devices should focus on improving camera design, the paper said.
