Layering predictions: flexible use of dialog expectation in speech recognition

  • Authors:
  • Sheryl R. Young, Wayne H. Ward, Alexander G. Hauptmann

  • Affiliations:
  • Carnegie Mellon University, School of Computer Science, Pittsburgh, PA (all authors)

  • Venue:
  • IJCAI'89: Proceedings of the 11th International Joint Conference on Artificial Intelligence - Volume 2
  • Year:
  • 1989


Abstract

When computer speech recognition is used for problem solving or any plan-based task, predictable features of the user's behavior can be inferred and used to aid recognition of the speech input. The MINDS system generates expectations of what will be said next and uses them to assist speech recognition. Since a user does not always conform to system expectations, MINDS also handles violated expectations. We use pragmatic knowledge to dynamically derive constraints on what the user is likely to say next, then loosen those constraints in a principled manner to generate layered sets of predictions ranging from very specific to very general. To let the speech system give priority to recognizing what a user is most likely to say, a grammar is dynamically generated from each prediction set and used by the speech recognizer. A new set of grammars is created after each user utterance. The grammars are tried in order, most specific first, until an acceptable parse is found. This yields optimal performance when users behave predictably and graceful degradation when they do not.
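The fallback strategy the abstract describes can be sketched as a loop over grammars ordered tightest-first. The sketch below is a toy illustration, not the MINDS implementation: `Grammar`, `parse`, `recognize_layered`, and the example utterances are hypothetical stand-ins (MINDS operated in a naval resource-management domain with real recognizer grammars, not sets of literal strings).

```python
# Toy sketch of the layered-prediction fallback described in the abstract:
# after each system turn, derive prediction sets from most specific to most
# general, compile each into a grammar, and try them in that order until
# one yields an acceptable parse. All names here are hypothetical.

from typing import Optional, Sequence

Grammar = frozenset  # toy grammar: the set of utterances it accepts

def parse(grammar: Grammar, utterance: str) -> Optional[str]:
    """Stand-in for the recognizer's grammar-constrained decoding step."""
    return utterance if utterance in grammar else None

def recognize_layered(utterance: str,
                      layers: Sequence[Grammar]) -> Optional[str]:
    """Try grammars from most specific to most general.

    A predictable user is matched by an early, tightly constrained layer;
    a user who violates expectations falls through to looser layers,
    giving graceful degradation instead of outright failure.
    """
    for grammar in layers:  # layers are ordered tightest-first
        result = parse(grammar, utterance)
        if result is not None:
            return result
    return None  # no layer produced an acceptable parse

# Hypothetical example: the dialog most strongly expects a status question,
# but progressively looser layers admit other plausible utterances.
layers = [
    Grammar({"what is the status of the enterprise"}),  # most specific
    Grammar({"what is the status of the enterprise",
             "where is the kennedy"}),                  # looser
    Grammar({"what is the status of the enterprise",
             "where is the kennedy",
             "list all ships in the atlantic"}),        # most general
]
print(recognize_layered("where is the kennedy", layers))
```

In this sketch the second, looser layer recovers an utterance the most specific grammar would have rejected, mirroring the paper's point: specific grammars are tried first for speed and accuracy, and general ones serve as a safety net when expectations are violated.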