Focusing computational visual attention in multi-modal human-robot interaction
International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction
Color naming models for color selection, image editing and palette design
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
An assistive vision system for the blind that helps find lost things
ICCHP'12 Proceedings of the 13th international conference on Computers Helping People with Special Needs - Volume Part II
In recent years, natural verbal and non-verbal human-robot interaction has attracted increasing interest. Models that robustly detect and describe visual attributes of objects, such as color, are therefore of great importance. However, learning robust models of visual attributes requires large annotated data sets. To overcome the shortage of annotated training data, we acquire images from the Internet and propose a method for robustly learning natural color models. Its novel aspects with respect to prior art are: first, a randomized HSL transformation that reflects the slight color variations and noise observed with real-world imaging sensors; second, a probabilistic ranking and selection of the training samples, which removes a considerable number of outliers from the training data. Together, these two techniques allow us to estimate robust color models that better resemble the variances seen in real-world images. Experimental evaluations confirm the advantages of the proposed method over the current state-of-the-art technique, which uses the training data without such transformation and selection. In combination, for models learned with pLSA-bg and HSL, the proposed techniques reduce the number of mislabeled objects by 19.87% on the well-known E-Bay data set.
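The two techniques named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the noise magnitudes are assumed parameters, and the paper ranks samples under a learned color model (pLSA-bg), for which a simple per-value Gaussian stands in here.

```python
# Sketch of (1) a randomized HSL perturbation mimicking sensor noise and
# slight color variations, and (2) likelihood-based ranking/selection of
# training samples to discard outliers.  Parameter values are illustrative
# assumptions, not the paper's published settings.
import colorsys
import math
import random
import statistics

def randomized_hsl(rgb, sigma_h=0.02, sigma_s=0.05, sigma_l=0.05, rng=None):
    """Perturb an RGB color (components in [0, 1]) in HSL space."""
    rng = rng or random.Random()
    h, l, s = colorsys.rgb_to_hls(*rgb)  # colorsys uses HLS ordering
    h = (h + rng.gauss(0.0, sigma_h)) % 1.0              # hue wraps around
    l = min(1.0, max(0.0, l + rng.gauss(0.0, sigma_l)))  # clamp lightness
    s = min(1.0, max(0.0, s + rng.gauss(0.0, sigma_s)))  # clamp saturation
    return colorsys.hls_to_rgb(h, l, s)

def select_inliers(samples, keep_fraction=0.8):
    """Rank scalar samples by log-likelihood under a fitted Gaussian and
    keep the top-scoring fraction, dropping probable outliers."""
    mu = statistics.fmean(samples)
    sigma = statistics.stdev(samples) or 1e-9
    def loglik(x):
        return (-0.5 * math.log(2 * math.pi * sigma ** 2)
                - (x - mu) ** 2 / (2 * sigma ** 2))
    ranked = sorted(samples, key=loglik, reverse=True)
    return ranked[:max(1, int(len(ranked) * keep_fraction))]
```

Applying `randomized_hsl` several times to each downloaded training image's pixels would yield augmented samples with sensor-like variation, while `select_inliers` illustrates how ranking by model likelihood can filter mislabeled web images before the final color model is estimated.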