A new method for OOV detection using hybrid word/fragment system

  • Authors:
  • Ariya Rastrow; Abhinav Sethy; Bhuvana Ramabhadran

  • Affiliations:
  • Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, USA; IBM T.J. Watson Research Center, Yorktown Heights, NY, USA; IBM T.J. Watson Research Center, Yorktown Heights, NY, USA

  • Venue:
  • ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
  • Year:
  • 2009

Abstract

In this paper, we propose a new method for detecting regions with out-of-vocabulary (OOV) words in the output of a large vocabulary continuous speech recognition (LVCSR) system. The proposed method uses a hybrid system that combines words with data-driven, variable-length subword units. Using a single feature, the posterior probability of subword units, this method outperforms existing methods published in the literature. We also present a recipe for discriminatively training a hybrid language model to improve the OOV detection rate. Results are presented on the RT04 broadcast news task.
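
The single-feature idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the hybrid recognizer's output has already been split into regions, each holding competing units (words or subword fragments) with posterior probabilities that sum to one. The underscore marker for fragments, the function names, the toy posteriors, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from typing import Dict, List

# Assumption: fragment (subword) units are marked with a leading underscore.
# This naming convention is hypothetical, not taken from the paper.
FRAGMENT_PREFIX = "_"


def fragment_posterior(region: Dict[str, float]) -> float:
    """Sum the posterior mass that one region assigns to fragment units."""
    return sum(p for unit, p in region.items() if unit.startswith(FRAGMENT_PREFIX))


def detect_oov_regions(regions: List[Dict[str, float]],
                       threshold: float = 0.5) -> List[bool]:
    """Flag a region as likely OOV when its fragment posterior reaches the threshold.

    The threshold value is an illustrative assumption; in practice it would be
    tuned on held-out data to trade off misses against false alarms.
    """
    return [fragment_posterior(r) >= threshold for r in regions]


# Toy regions: each maps a competing unit to its posterior probability.
regions = [
    {"the": 0.9, "_dh_ax": 0.1},               # mostly word mass
    {"_k_ae": 0.6, "_t_ax": 0.2, "cat": 0.2},  # mostly fragment mass
    {"sat": 1.0},                              # no fragment mass
]
flags = detect_oov_regions(regions)
```

Sweeping the threshold over a range of values would trace out a detection curve of misses against false alarms, which is one common way to compare such a detector with others.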