A principal problem in speech recognition is distinguishing between words and phrases that sound similar but have different meanings. Speech recognition programs produce a list of weighted candidate hypotheses for a given audio segment and choose the "best" candidate. When the choice is incorrect, the user must invoke a correction interface that displays the list of hypotheses and select the desired one. This correction step is time-consuming and accounts for much of the frustration with today's dictation systems. Conventional dictation systems prioritize hypotheses using language models derived from statistical techniques such as n-grams and Hidden Markov Models. We propose a supplementary method for ordering hypotheses based on commonsense knowledge. We filter acoustic and word-frequency hypotheses by testing their plausibility against a semantic network derived from 700,000 statements about everyday life. This often filters out possibilities that "don't make sense" from the user's viewpoint and leads to improved recognition. Reducing the hypothesis space in this way also makes possible streamlined correction interfaces that improve the overall throughput of dictation systems.
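The filtering idea described above can be pictured as a re-ranking step applied to the recognizer's hypothesis list. The sketch below is illustrative only: the toy semantic network, the pairwise-relatedness plausibility score, and the linear score combination are assumptions standing in for the paper's actual commonsense resource and method.

```python
def relatedness(a, b, network):
    """Symmetric lookup of concept relatedness in a toy network; 0 if unrelated.

    `network` maps (concept, concept) pairs to a strength in [0, 1] --
    a stand-in for a large commonsense resource such as ConceptNet.
    """
    return max(network.get((a, b), 0.0), network.get((b, a), 0.0))

def plausibility(words, network):
    """Average pairwise relatedness of the words in one hypothesis."""
    pairs = [(w1, w2) for i, w1 in enumerate(words) for w2 in words[i + 1:]]
    if not pairs:
        return 0.0
    return sum(relatedness(a, b, network) for a, b in pairs) / len(pairs)

def rerank(hypotheses, network, weight=0.5):
    """Blend each hypothesis's acoustic score with its commonsense
    plausibility, then sort best-first.

    `hypotheses` is a list of (word_list, acoustic_score) pairs; `weight`
    (an assumed tuning knob) controls how much the semantic check matters.
    """
    scored = [
        (words, (1 - weight) * acoustic + weight * plausibility(words, network))
        for words, acoustic in hypotheses
    ]
    return sorted(scored, key=lambda t: t[1], reverse=True)

# Classic confusable pair: the acoustically preferred hypothesis makes
# less sense, so the semantic score flips the ranking.
toy_network = {("recognize", "speech"): 1.0}
hyps = [(["wreck", "a", "nice", "beach"], 0.60),
        (["recognize", "speech"], 0.55)]
best = rerank(hyps, toy_network)[0][0]
```

With these toy numbers, "recognize speech" overtakes "wreck a nice beach" after re-ranking, which is the effect the abstract describes: semantically implausible candidates drop down (or out of) the list before the user ever sees a correction interface.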