Building an application framework for speech and pen input integration in multimodal learning interfaces

  • Authors:
  • M. T. Vo; C. Wood

  • Affiliations:
  • Interactive Systems Laboratories, Carnegie Mellon University, Pittsburgh, PA, USA

  • Venue:
  • ICASSP '96: Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 06
  • Year:
  • 1996

Abstract

While significant advances have been made in improving the performance of speech, gesture, and handwriting recognition, speech- and pen-based systems have still not found broad acceptance in everyday life. One reason for this is the inflexibility of each input modality when used alone. Human communication is natural and flexible because we can draw on a multiplicity of communication signals working in concert, supplying complementary information or increasing robustness through redundancy. We present a multimodal interface capable of jointly interpreting speech, pen-based gestures, and handwriting in the context of an appointment scheduling application. The interpretation engine, based on semantic frame merging, correctly interprets 80% of a multimodal data set assuming perfect speech and gesture/handwriting recognition; in the presence of recognition errors, interpretation performance falls to the range of 35-62%. A dialog processing scheme uses task domain knowledge to guide the user in supplying information and permits human-computer interactions to span several related multimodal input events.
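
The semantic frame merging the abstract mentions can be illustrated with a minimal sketch. The Python code below is not from the paper: the Frame class, the slot names, and the confidence-based conflict rule are all assumptions, intended only to show how partial interpretations from speech and pen input might be combined into one joint frame.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Hypothetical semantic frame: a bag of named slots plus a recognizer
# confidence. Slot names used below are illustrative, not from the paper.
@dataclass
class Frame:
    slots: Dict[str, Any] = field(default_factory=dict)
    confidence: float = 1.0

def merge_frames(speech: Frame, pen: Frame) -> Frame:
    """Combine two partial frames into one joint interpretation.

    Complementary slots are unioned; when both modalities fill the same
    slot with different values, the higher-confidence modality wins.
    """
    merged = dict(speech.slots)
    for name, value in pen.slots.items():
        if name not in merged:
            merged[name] = value      # complementary information
        elif merged[name] != value and pen.confidence > speech.confidence:
            merged[name] = value      # resolve redundant-but-conflicting slot
    return Frame(merged, min(speech.confidence, pen.confidence))

# Example: "Move this to Friday" spoken while circling an appointment.
speech_frame = Frame({"action": "reschedule", "day": "Friday"}, confidence=0.8)
pen_frame = Frame({"object": "appointment_42"}, confidence=0.9)

joint = merge_frames(speech_frame, pen_frame)
print(joint.slots)
# {'action': 'reschedule', 'day': 'Friday', 'object': 'appointment_42'}
```

A working system would additionally need to align input events in time and constrain which frames are allowed to merge; the sketch elides both.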