What One Dad Learned When He Recorded 230,000 Hours Of His Son Learning To Talk

"What's emerging is an ability to see new social structures and dynamics that have previously not been seen."

Most parents hope to catch their child's first words on film, but Deb Roy, a tenured professor at MIT and former chief media scientist at Twitter, had something a bit different in mind when he brought his newborn son home from the hospital. He wanted to understand how a child learns language, a project he recounted during a 2011 TED Talk.

To do that, the new dad (along with his wife, Rupal) installed cameras and microphones throughout his house to continuously capture all the goings-on. Over the course of three years, the equipment recorded 8-10 hours per day and amassed a dataset of 90,000 hours of video and 140,000 hours of audio. From that data, Roy was able to chronicle how his son first said "gaga" to mean water soon after he turned one. Over the next half-year, the microphones captured how the toddler slowly learned to approximate the proper adult form, "water."

By the time Roy's son turned two, he had learned 503 words, and the video and audio data helped shed light on why and how the young child learned certain utterances before others. One interesting pattern Roy found was that each time his son learned a word, the complexity of caregiver speech would systematically dip to a minimum, making language as simple as possible, and then slowly climb back up. To put that another way, while Roy's son was obviously learning from his linguistic environment, the environment was also learning from him.


Roy also examined the video data, which helped him understand how social interactions affect the way language is learned and established the connection between language and events. While the data showed the word "water" was spoken predominantly in the kitchen, the word "bye" was uttered mostly near the entrance to Roy's home.

This connection between language and events was then applied to public media, and helped illustrate how television influences what people discuss on social media. In that analysis, Roy noticed several interesting patterns. "A piece of content, an event, causes someone to talk. They talk to other people. That drives tune-in behavior back into mass media, and you have these cycles that drive the overall behavior," he explained.

Roy also noted that sometimes it's content, such as President Obama's 2011 State of the Union Address, that drives the conversation. "We can X-ray and get a real-time pulse of a nation, real-time sense of the social reactions in the different circuits in the social graph being activated by content," he noted.

On a larger scale, Roy drew a connection between conversation and context, and how that relationship is illuminating new social structures. "As our world becomes increasingly instrumented and we have the capabilities to collect and connect the dots between what people are saying and the context they're saying it in, what's emerging is an ability to see new social structures and dynamics that have previously not been seen," he concluded. "It's like building a microscope or telescope and revealing new structures about our own behavior around communication. And I think the implications here are profound, whether it's for science, for commerce, for government, or perhaps most of all, for us as individuals."

