Baby wears camera for 18 months to teach language to AI

While ChatGPT had the entire internet to train on, researchers have tried limiting an AI model's input to only what a single infant sees and hears. Surprisingly, the AI learned to link words to objects remarkably well.

A group of scientists at New York University used 61 hours of footage from a camera attached to an Australian baby named Sam to gather experiences from the infant’s perspective.

Sam wore the camera for about one hour twice a week, from six months old until approximately two years of age. The researchers then trained a neural network, a type of AI inspired by the brain's structure, on this footage; the input amounted to roughly 1% of the baby's waking hours.

The training material included frames from the video recordings of the baby’s environment and transcribed words spoken to Sam. In total, the model was exposed to 250,000 words and corresponding images depicting various activities like playing, reading, and eating.

The model used a contrastive learning technique, where an AI learns to identify which images and words tend to occur together and which do not. The AI learned only by building associations between the images and words it saw together, and it was not programmed with any other prior knowledge about language.
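The core of this contrastive objective can be sketched in a few lines: embeddings of images and of the words spoken alongside them are compared, and matching pairs are pushed to score higher than mismatched ones. This is a minimal NumPy sketch of the general technique, not the study's actual code; the function name, dimensions, and temperature value are illustrative assumptions.

```python
import numpy as np

def contrastive_loss(image_embs, word_embs, temperature=0.07):
    """Contrastive (CLIP-style) loss: row i of image_embs and row i of
    word_embs are a matching image-word pair; all other pairings are
    treated as mismatches."""
    # Normalize embeddings to unit length so the dot product is cosine similarity
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    wrd = word_embs / np.linalg.norm(word_embs, axis=1, keepdims=True)
    # Similarity of every image to every word, scaled by a temperature
    logits = img @ wrd.T / temperature
    # Softmax over candidate words for each image (max subtracted for stability)
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)
    # Cross-entropy on the diagonal: matching pairs should dominate their row
    return -np.log(np.diag(probs)).mean()
```

Training with such a loss nudges the embeddings so that co-occurring images and words land close together, which is all the association-building the model needs; no grammar or word meanings are built in.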

To assess the AI's performance, researchers conducted a word-to-image matching test, a common method used to evaluate children's language skills. The AI achieved a 62% accuracy rate, significantly surpassing the 25% expected by chance and similar to another AI model trained on 400 million image-text pairs.
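A test of this kind shows the model a word and four candidate images, and counts how often the model's highest-similarity pick is the correct image; with four options, guessing yields 25%. The sketch below is a hypothetical illustration of that scoring procedure, not the researchers' evaluation code.

```python
import numpy as np

def matching_accuracy(sim, targets):
    """Score a four-alternative word-to-image matching test.

    sim: array of shape (trials, 4), similarity of the probe word to
         each of the four candidate images on every trial.
    targets: index (0-3) of the correct image on each trial.
    """
    # The model "chooses" the image most similar to the word
    picks = sim.argmax(axis=1)
    return (picks == np.asarray(targets)).mean()
```

Running this over random similarity scores converges to the 25% chance level, which is why the model's 62% accuracy is a meaningful result.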

The AI could also generalize, identifying unfamiliar instances of words like 'apple' and 'dog' that humans recognize easily, with a 35% success rate on average. Its ability to recognize objects out of context was stronger for items that occurred frequently in the training data.

However, the researchers acknowledge that real-world language learning is far more complex than what their AI model experienced. The model's training was limited to still frames and transcribed text, so it lacked many of the interactions babies have in real life.

For example, the AI model struggled to learn the word 'hand.' Discovering their own hands is one of babies' earliest milestones, but the model had no access to that embodied experience.

The research, published in Science on February 1, 2024, suggests that AI could help explain how humans learn language in infancy. The fact that the model learned from such a limited dataset brings AI training closer to the conditions of real-life learning.

While large language models such as ChatGPT train on billions of data points, babies learn from the limited input available to them: the view from their cribs and their interactions with family members.