Putting people first: specifying proper names in speech interfaces

Authors:
Matt Marx;Chris Schmandt
Affiliations:
Speech Research Group, MIT Media Laboratory, 20 Ames St., Cambridge, MA;Speech Research Group, MIT Media Laboratory, 20 Ames St., Cambridge, MA
Venue:
UIST '94 Proceedings of the 7th annual ACM symposium on User interface software and technology
Year:
1994

Abstract

Communication is about people, not machines. But as firms and families alike spread out geographically, we rely increasingly on telecommunications tools to keep us “connected”. The challenge for such systems is to enable conversation between individuals without the computational infrastructure getting in the way. This paper compares two speech-based communication systems, Phoneshell and Chatter, in how they deal with the keys to communication: proper names. Chatter, a conversational system using speech recognition, improves upon the hierarchical nature of the touch-tone-based Phoneshell by maintaining context and enabling the use of anaphora. Proper names can present particular problems for speech recognizers, so an interface algorithm for reliable name specification by spelling is offered. Since recognition of individual letters is not robust, Chatter implicitly disambiguates strings of letters based on context. We hypothesize that the right interface can make faulty speech recognition as usable as TouchTones, or even more so.
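
The idea of disambiguating spelled letters by context can be illustrated with a minimal sketch. It is not the algorithm from the paper, whose details the abstract does not give. It assumes that the recognizer may confuse letters that sound alike, and that the "context" is simply a list of names currently relevant to the conversation. The confusable-letter groups, the function names, and the sample names are all hypothetical.

```python
# Hypothetical groups of letters that sound alike when spoken.
# These groups are an illustrative assumption, not data from the paper.
CONFUSABLE_GROUPS = ["BCDEGPTVZ", "MN", "FS", "AJK", "IY"]


def confusable_with(letter):
    """Return the set of letters the recognizer might have misheard as `letter`."""
    letter = letter.upper()
    matches = {letter}
    for group in CONFUSABLE_GROUPS:
        if letter in group:
            matches.update(group)
    return matches


def matches_spelling(recognized, name):
    """True if `recognized` could be a (possibly misheard) prefix of `name`."""
    name = name.upper()
    if len(recognized) > len(name):
        return False
    return all(name[i] in confusable_with(ch) for i, ch in enumerate(recognized))


def disambiguate(recognized, context_names):
    """Keep only the names in context that are consistent with the letters heard."""
    return [n for n in context_names if matches_spelling(recognized, n)]


# Example: the recognizer hears "N-A-R-X" while the user spelled "M-A-R-X".
candidates = disambiguate("NARX", ["Marx", "Schmandt", "Ly"])
print(candidates)
```

In this sketch, a single misrecognized letter does not derail the interaction as long as only one name in the current context is consistent with what was heard. When several names remain, an interface would need a further step, such as asking the user to choose.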