Distinctive Image Features from Scale-Invariant Keypoints
International Journal of Computer Vision
In this paper, we describe a novel method of word acquisition through multimodal interaction between a humanoid robot and humans. The robot acquires words, specifically verbs, from raw multimodal sensory stimuli by watching the movement of given objects and listening to human utterances, without symbolic representations of semantics. In addition, the robot can utter the learned words based on its own phonemes, which correspond to the categorical phonetic feature map. We consider that words bind directly to non-symbolic perceptual features, such as the visual features of a given object and the acoustic features of a given utterance, independently of symbolic representations of semantics.