Integration and synchronization of input modes during multimodal human-computer interaction
Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems
Mutual disambiguation of recognition errors in a multimodal architecture
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
HLT '01: Proceedings of the First International Conference on Human Language Technology Research
The NESPOLE! speech-to-speech translation system
HLT '02: Proceedings of the Second International Conference on Human Language Technology Research
Multimodal interactive maps: designing for human performance
Human-Computer Interaction
The NESPOLE! Multimodal Interface for Cross-lingual Communication - Experience and Lessons Learned
ICMI '02: Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
Language engineering and the pathway to healthcare: a user-oriented view
MST '06: Proceedings of the Workshop on Medical Speech Translation
Multimodal interfaces, which combine two or more input modes (speech, pen, touch, ...), are expected to be more efficient, natural, and usable than single-input interfaces. However, the advantage of multimodal input has only been ascertained under highly controlled experimental conditions [4, 5, 6]; in particular, we lack data about what happens in 'real' human-human, multilingual communication systems. In this work we discuss the results of an experiment aimed at evaluating the added value of multimodality in a "true" speech-to-speech translation system, the NESPOLE! system, which provides multilingual and multimodal communication in the tourism domain, allowing users to interact over the Internet while sharing maps, web pages, and pen-based gestures. We compared two experimental conditions differing in whether multimodal resources were available: a speech-only condition (SO) and a multimodal condition (MM). Most of the data show a tendency for MM to outperform SO.