The Human Speechome Project

Authors:
Deb Roy;Rupal Patel;Philip DeCamp;Rony Kubat;Michael Fleischman;Brandon Roy;Nikolaos Mavridis;Stefanie Tellex;Alexia Salata;Jethran Guinness;Michael Levit;Peter Gorniak
Affiliations:
Cognitive Machines Group, MIT Media Laboratory;Communication Analysis and Design Laboratory, Northeastern University;Cognitive Machines Group, MIT Media Laboratory;Cognitive Machines Group, MIT Media Laboratory;Cognitive Machines Group, MIT Media Laboratory;Cognitive Machines Group, MIT Media Laboratory;Cognitive Machines Group, MIT Media Laboratory;Cognitive Machines Group, MIT Media Laboratory;Cognitive Machines Group, MIT Media Laboratory;Cognitive Machines Group, MIT Media Laboratory;Cognitive Machines Group, MIT Media Laboratory;Cognitive Machines Group, MIT Media Laboratory
Venue:
EELC'06 Proceedings of the Third International Conference on Emergence and Evolution of Linguistic Communication: Symbol Grounding and Beyond
Year:
2006

Abstract

The Human Speechome Project is an effort to observe and computationally model the longitudinal course of language development for a single child at an unprecedented scale. We are collecting audio and video recordings of the first three years of one child's life, in its near entirety, as it unfolds in the child's home. A network of ceiling-mounted video cameras and microphones is generating approximately 300 gigabytes of observational data each day from the home. One of the world's largest single-volume disk arrays is under construction to house the approximately 400,000 hours of audio and video recordings that will accumulate over the three-year study. To analyze this massive data set, we are developing new data mining technologies that help human analysts rapidly annotate and transcribe recordings using semi-automatic methods, and that detect and visualize salient patterns of behavior and interaction. To make sense of large-scale patterns that span months or even years of observations, we are developing computational models of language acquisition that are able to learn from the child's experiential record. By creating and evaluating machine learning systems that step into the shoes of the child and sequentially process long stretches of perceptual experience, we will investigate possible language learning strategies used by children, with an emphasis on early word learning.