Naive physics, event perception, lexical semantics, and language acquisition
Mean Shift: A Robust Approach Toward Feature Space Analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence
A Maximum-Likelihood Approach to Visual Event Classification
ECCV '96: Proceedings of the 4th European Conference on Computer Vision, Volume II
Visual Event Classification via Force Dynamics
Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence
When push comes to shove: a computational model of the role of motor control in the acquisition of action verbs
Experience-based language acquisition: a computational model of human language acquisition
This paper introduces an open computational framework for visual perception and grounded language acquisition called Experience-Based Language Acquisition (EBLA). EBLA can "watch" a series of short videos and acquire a simple language of nouns and verbs corresponding to the objects and object-object relations in those videos. Upon acquiring this protolanguage, EBLA can perform basic scene analysis to generate descriptions of novel videos.

The performance of EBLA has been evaluated based on the accuracy and speed of protolanguage acquisition as well as on the accuracy of generated scene descriptions. For a test set of simple animations, EBLA had average acquisition success rates as high as 100% and average description success rates as high as 96.7%. For a larger set of real videos, EBLA had average acquisition success rates as high as 95.8% and average description success rates as high as 65.3%. The lower description success rate for the real videos is attributed to the wide variance in the appearance of objects across the test set.

While there have been several systems capable of learning object or event labels for videos, EBLA is the first known system to acquire both nouns and verbs using a grounded computer vision system.