As performance gains in automatic speech recognition systems plateau, improvements to existing applications of speech recognition technology seem more likely to come from better user interface design than from further progress in core recognition components. Among applications of speech recognition, the usability of systems for transcribing spontaneous speech is particularly sensitive to high word error rates. This paper presents a series of approaches to improving the usability of such applications. We propose new mechanisms for error correction, the use of contextual information, and the use of 3D visualisation techniques to improve user interaction with a recogniser and maximise the impact of user feedback. These proposals are illustrated through several prototypes targeting tasks such as off-line transcript editing, dynamic transcript editing, and real-time visualisation of recognition paths. An evaluation of our dynamic transcript editing system demonstrates the gains that can be made by adding corrected words to the recogniser's dictionary and then propagating the user's corrections.
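For readers unfamiliar with the word error rate mentioned above, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recogniser's output, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration; this is not the evaluation code used in the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: tokenisation is a plain whitespace split (an assumption),
    with no case folding or punctuation handling.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, the value can exceed 1.0 when the recogniser outputs many more words than the reference contains.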