An Introduction to Variational Methods for Graphical Models
Machine Learning
Learning words from sights and sounds: a computational model
The Journal of Machine Learning Research
Distinctive Image Features from Scale-Invariant Keypoints
International Journal of Computer Vision
A Bayesian Hierarchical Model for Learning Natural Scene Categories
CVPR '05 Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Volume 2
One-Shot Learning of Object Categories
IEEE Transactions on Pattern Analysis and Machine Intelligence
Using Multiple Segmentations to Discover Objects and their Extent in Image Collections
CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
On the integration of grounding language and learning objects
AAAI'04 Proceedings of the 19th National Conference on Artificial Intelligence
Developing a user recommendation engine on twitter using estimated latent topics
HCII'11 Proceedings of the 14th international conference on Human-computer interaction: design and development approaches - Volume Part I
In this paper, we propose an LDA-based framework for multimodal categorization and word grounding for robots. The robot uses its physical embodiment to grasp and observe an object from various viewpoints, and to listen to the sound the object makes during the observation period. This multimodal information is used to categorize objects and form multimodal concepts. At the same time, words acquired during the observation period are connected to the related concepts through multimodal LDA. We also introduce a relevance measure that encodes the degree of connection between words and modalities. The proposed algorithm is implemented on a robot platform, and experiments are carried out to evaluate it. We also demonstrate a simple conversation between a user and the robot based on the learned model.
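The core idea of multimodal LDA in the abstract can be sketched as follows: each object observation becomes a single bag-of-words document whose tokens come from several modalities (visual, auditory, and spoken-word channels), and a standard LDA inference procedure then discovers shared latent topics that tie words to the other modalities. This is only a minimal illustrative sketch, not the authors' implementation: the function name `gibbs_lda`, the word-id layout, and the toy data are all hypothetical, and a plain collapsed Gibbs sampler stands in for whatever inference the paper actually uses.

```python
import random

def gibbs_lda(docs, n_topics, n_vocab, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over bag-of-words documents.

    docs: list of documents, each a list of integer word ids; here a
    "document" is the concatenation of word ids from all modalities
    observed for one object (a hypothetical encoding, for illustration).
    Returns (ndk, nkw): per-document topic counts and per-topic word counts.
    """
    rng = random.Random(seed)
    ndk = [[0] * n_topics for _ in docs]            # document-topic counts
    nkw = [[0] * n_vocab for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                             # tokens per topic
    z = []                                          # topic assignment per token
    for d, doc in enumerate(docs):                  # random initialization
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
        z.append(zs)
    for _ in range(iters):                          # Gibbs sweeps
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # remove this token's current assignment from the counts
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                # resample its topic proportional to p(topic | everything else)
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta)
                           / (nk[t] + n_vocab * beta) for t in range(n_topics)]
                r = rng.random() * sum(weights)
                for t, wt in enumerate(weights):
                    r -= wt
                    if r <= 0:
                        k = t
                        break
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return ndk, nkw

# Toy multimodal corpus (hypothetical word ids): visual words {0, 1},
# auditory words {2, 3}, spoken object names {4, 5}. One object category
# emits ids {0, 2, 4}; the other emits {1, 3, 5}.
docs = [[0, 0, 2, 2, 4]] * 3 + [[1, 1, 3, 3, 5]] * 3
ndk, nkw = gibbs_lda(docs, n_topics=2, n_vocab=6, iters=100)
```

Because topics are shared across modalities, a spoken word and a visual feature that co-occur across observations end up concentrated in the same topic, which is the sense in which words become grounded in the object categories; a word-modality relevance measure like the one the abstract mentions could then be read off the topic-word counts.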