Combining speech and haptics for intuitive and efficient navigation through image databases

  • Authors:
  • Thomas Käster, Michael Pfeiffer, Christian Bauckhage

  • Affiliations:
  • Bielefeld University, Bielefeld, Germany

  • Venue:
  • Proceedings of the 5th International Conference on Multimodal Interfaces
  • Year:
  • 2003

Abstract

Given the size of today's professional image databases, the standard approach to object- or theme-related image retrieval is to interactively navigate through the content. But as most users of such databases are designers or artists who do not have a technical background, navigation interfaces must be intuitive to use and easy to learn. This paper reports on efforts towards this goal. We present a system for intuitive image retrieval that features different modalities for interaction. Apart from conventional input devices like mouse or keyboard, it is also possible to use speech or haptic gestures to indicate what kind of images one is looking for. Seeing a selection of images on the screen, the user provides relevance feedback to narrow the choice of motifs presented next. This is done either by scoring whole images or by selecting certain image regions. In order to derive consistent reactions from multimodal user input, asynchronous integration of modalities and probabilistic reasoning based on Bayesian networks are applied. After addressing technical details, we discuss a series of usability experiments, which we conducted to examine the impact of multimodal input facilities on interactive image retrieval. The results indicate that users appreciate multimodality. While we observed little decrease in task performance, measures of user satisfaction exceeded those for conventional input devices.
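To make the fusion idea in the abstract concrete, the following is a minimal, hypothetical sketch of asynchronous multimodal integration via Bayesian inference: a posterior over the user's target motif is updated from whichever modality observations (speech, gesture) have arrived so far. The category names, observation labels, and probability tables are invented for illustration and do not reproduce the paper's actual network structure or parameters.

```python
# Hypothetical sketch: P(intent | speech, gesture) is proportional to
# P(speech | intent) * P(gesture | intent) * P(intent), assuming the
# modalities are conditionally independent given the user's intent.

# Prior over the user's target motif (invented categories).
PRIOR = {"landscape": 0.4, "portrait": 0.3, "architecture": 0.3}

# Per-modality likelihood tables P(observation | intent) (invented values).
SPEECH_LIKELIHOOD = {
    "show me mountains": {"landscape": 0.7, "portrait": 0.1, "architecture": 0.2},
    "more faces":        {"landscape": 0.1, "portrait": 0.8, "architecture": 0.1},
}
GESTURE_LIKELIHOOD = {
    "points_at_sky":     {"landscape": 0.6, "portrait": 0.1, "architecture": 0.3},
    "circles_building":  {"landscape": 0.1, "portrait": 0.1, "architecture": 0.8},
}

def fuse(observations):
    """Combine whatever modality observations are available.

    Integration is asynchronous: each observation multiplies its
    likelihood into the posterior as it arrives, so a modality that
    has produced nothing yet (None) simply contributes no evidence.
    """
    posterior = dict(PRIOR)
    for table, obs in observations:
        if obs is None:  # this modality has not been observed yet
            continue
        likelihood = table[obs]
        for intent in posterior:
            posterior[intent] *= likelihood[intent]
    total = sum(posterior.values())  # renormalize to a distribution
    return {intent: p / total for intent, p in posterior.items()}

# Speech arrives first; the gesture channel is still silent.
print(fuse([(SPEECH_LIKELIHOOD, "show me mountains"),
            (GESTURE_LIKELIHOOD, None)]))

# Later, both modalities are present and jointly sharpen the posterior.
print(fuse([(SPEECH_LIKELIHOOD, "show me mountains"),
            (GESTURE_LIKELIHOOD, "points_at_sky")]))
```

The design point this illustrates is that a Bayesian formulation lets partial input remain useful: a missing modality leaves the posterior unchanged rather than blocking the response, which is what allows speech and haptic gestures to be integrated asynchronously.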