Using confidence scores to improve hands-free speech based navigation in continuous dictation systems

Authors:
Jinjuan Feng; Andrew Sears
Affiliations:
UMBC, Baltimore, MD; UMBC, Baltimore, MD
Venue:
ACM Transactions on Computer-Human Interaction (TOCHI)
Year:
2004

Abstract

Speech recognition systems have improved dramatically, but recent studies confirm that error correction activities still account for 66-75% of the users' time, and 50% of that time is spent just getting to the errors that need to be corrected. While researchers have suggested that confidence scores could prove useful during the error correction process, the focus is typically on error detection. More importantly, empirical studies have failed to confirm any measurable benefits when confidence scores are used in this way within dictation-oriented applications. In this article, we provide data that explains why confidence scores are unlikely to be useful for error detection. We propose a new navigation technique for use when speech-only interactions are strongly preferred and common, desktop-sized displays are available. The results of an empirical study that highlights the potential of this new technique are reported. An informal comparison between the current study and previous research suggests the new technique reduces time spent on navigation by 18%. Future research should include additional studies that compare the proposed technique to previous non-speech and speech-based navigation solutions.