Third-party error detection support mechanisms for dictation speech recognition

  • Authors:
  • Lina Zhou; Yongmei Shi; Andrew Sears

  • Affiliations:
  • Department of Information Systems, UMBC, Baltimore, MD 21250, USA; Tetherless World Constellation, Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA; Department of Information Systems, UMBC, Baltimore, MD 21250, USA

  • Venue:
  • Interacting with Computers
  • Year:
  • 2010

Abstract

Although speech recognition has improved significantly in recent years, its adoption continues to be limited, in part, by the effort and frustration associated with correcting speech recognition errors. Error detection is a particularly challenging issue in third-party error correction, where different individuals are responsible for the original dictation and for correcting the resulting text. This research aims to address the difficulty experienced in third-party error detection by developing and evaluating a variety of support mechanisms. Drawing on a growing body of literature on human-computer interaction and speech recognition, four support mechanisms were designed and evaluated, namely indexed audio, speech summarization, error prediction, and the presentation of alternative hypotheses. A user study assessed the impact of these support mechanisms on both performance and perceptions during error detection tasks. Performance measures included effectiveness and efficiency, and perception measures included confidence, perceived usefulness, and cognitive workload. The results provide strong support for the use of indexed audio in the context of third-party error detection. The results also confirm that consecutive error rate, or the percentage of recognition errors immediately adjacent to another error, has a negative impact on the effectiveness of third-party error detection. The other support mechanisms failed to improve either effectiveness or perceptions, but they did counteract this negative impact as the consecutive error rate increased. These findings have significant implications for speech recognition error detection research and the design of error detection support solutions.
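
The abstract defines consecutive error rate as the percentage of recognition errors immediately adjacent to another error. The following is a minimal sketch of one way that definition could be computed. It is not the authors' implementation, and it rests on assumptions not stated in the abstract: errors are marked per word, "adjacent" means the neighbouring word position, and the rate is taken to be 0 when a transcript has no errors. The function name `consecutive_error_rate` and the `is_error` input are hypothetical.

```python
from typing import Sequence


def consecutive_error_rate(is_error: Sequence[bool]) -> float:
    """Percentage of errors with another error immediately before or after.

    `is_error[i]` is True when word i of the recognized text is an error
    (assumption: word-level error labels are available).
    """
    error_positions = [i for i, flag in enumerate(is_error) if flag]
    if not error_positions:
        # Assumption: define the rate as 0 when there are no errors at all.
        return 0.0

    adjacent = 0
    for i in error_positions:
        prev_is_error = i > 0 and is_error[i - 1]
        next_is_error = i + 1 < len(is_error) and is_error[i + 1]
        if prev_is_error or next_is_error:
            adjacent += 1

    return 100.0 * adjacent / len(error_positions)


# Hypothetical transcript: errors at positions 1, 2, 4, 7, 8 and 9.
flags = [False, True, True, False, True, False, False, True, True, True]
rate = consecutive_error_rate(flags)
```

In this made-up example, five of the six errors sit next to another error; only the isolated error at position 4 does not count toward the rate.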