A comparison of rule-based and statistical methods for semantic language modeling and confidence measurement

  • Authors: Ruhi Sarikaya; Yuqing Gao; Michael Picheny

  • Affiliations: IBM T.J. Watson Research Center, Yorktown Heights, NY (all authors)

  • Venue: HLT-NAACL-Short '04, Proceedings of HLT-NAACL 2004: Short Papers
  • Year: 2004

Abstract

This paper compares a rule-based and a statistical technique for semantic information modeling. For the rule-based method we employ Embedded Grammar (EG) tagging, and for the statistical method we use a previously proposed Semantic Structured Language Modeling (SSLM) technique. Both EG and SSLM achieve around 15% relative improvement in speech recognition performance over the baseline dialog state-based trigram language model in a financial transaction domain. Combining EG and SSLM using linear interpolation yields further improvement. We also use the features obtained from EG and SSLM for confidence measurement. Word-level confidence measurement experiments using EG- and SSLM-based semantic features combined with posterior probability show over 20% relative improvement in correct acceptance (CA) rate at a 5% false alarm (FA) rate over the posterior probability feature alone. In both language model rescoring and confidence measurement experiments, SSLM outperforms EG by a small margin.
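
The linear interpolation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state at which level the two models are combined or how the weight is chosen, so the version below assumes word-level probabilities from each model and a single fixed weight `lam`. The function names, the hypothesis layout, and all numbers are hypothetical and are not taken from the paper.

```python
import math


def interpolate(p_eg, p_sslm, lam):
    """Linearly interpolate two probabilities assigned to the same word.

    lam is the weight on the EG-based model (assumed fixed here);
    the SSLM probability receives the remaining weight 1 - lam.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_eg + (1.0 - lam) * p_sslm


def sentence_log_prob(word_probs, lam):
    """Sum the log of the interpolated probability over all words.

    word_probs is a list of (p_eg, p_sslm) pairs, one pair per word.
    """
    return sum(math.log(interpolate(p_eg, p_sslm, lam))
               for p_eg, p_sslm in word_probs)


def rescore(nbest, lam):
    """Sort hypotheses by interpolated LM log-probability, best first.

    Each hypothesis is a dict with a "text" string and a "word_probs" list.
    Acoustic scores are deliberately left out of this sketch.
    """
    return sorted(nbest,
                  key=lambda hyp: sentence_log_prob(hyp["word_probs"], lam),
                  reverse=True)


# Toy N-best list with made-up probabilities, for illustration only.
nbest = [
    {"text": "transfer fifty dollars", "word_probs": [(0.20, 0.30), (0.10, 0.05), (0.40, 0.50)]},
    {"text": "transfer fifteen dollars", "word_probs": [(0.20, 0.30), (0.02, 0.04), (0.40, 0.50)]},
]
best = rescore(nbest, lam=0.5)[0]
```

In practice the weight would typically be tuned on held-out data, and a rescoring pass would also take the acoustic score of each hypothesis into account; both are omitted here to keep the sketch focused on the interpolation step.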