A large-vocabulary continuous speech recognition algorithm and its application to a multi-modal telephone directory assistance system

Authors:
Yasuhiro Minami;Kiyohiro Shikano;Osamu Yoshioka;Satoshi Takahashi;Tomokazu Yamada;Sadaoki Furui
Affiliations:
NTT Human Interface Laboratories, Musashino-shi, Tokyo, Japan;NTT Human Interface Laboratories, Musashino-shi, Tokyo, Japan;NTT Human Interface Laboratories, Musashino-shi, Tokyo, Japan;NTT Human Interface Laboratories, Musashino-shi, Tokyo, Japan;NTT Human Interface Laboratories, Musashino-shi, Tokyo, Japan;NTT Human Interface Laboratories, Musashino-shi, Tokyo, Japan
Venue:
HLT '94 Proceedings of the workshop on Human Language Technology
Year:
1994

Abstract

This paper describes an accurate and efficient algorithm for very-large-vocabulary continuous speech recognition based on the HMM-LR algorithm. The HMM-LR algorithm uses a generalized LR parser as a language model and hidden Markov models (HMMs) as phoneme models. To reduce the search space without pruning the correct candidate, we use forward and backward trellis likelihoods, an adjusting window that selects only the probable part of the trellis for each predicted phoneme, and an algorithm that merges candidates having the same allophonic phoneme sequences and the same context-free grammar states. Candidates are also merged at the meaning level. The algorithm is applied to a telephone directory assistance system that recognizes spontaneous speech containing the names and addresses of more than 70,000 subscribers (the vocabulary size is about 80,000). The experimental results show that the system performs well in spite of the large perplexity. The algorithm was also applied to a multi-modal telephone directory assistance system, which was evaluated from the human-interface point of view. To cope with background noise, an HMM composition technique that combines a noise-source HMM and a clean phoneme HMM into a noise-added phoneme HMM was investigated and incorporated into the system.
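
The candidate-merging step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` class, its field names, the integer grammar-state ids, and the choice to keep only the best-scoring candidate in each group (a Viterbi-style rule) are all assumptions made for illustration. The abstract states only that candidates with the same allophonic phoneme sequence and the same context-free grammar state are merged.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical search candidate; field names are illustrative only."""
    phonemes: Tuple[str, ...]   # allophonic phoneme sequence so far
    grammar_state: int          # assumed integer id of the grammar state
    log_likelihood: float       # score accumulated along the trellis


def merge_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Keep one candidate per (phoneme sequence, grammar state) pair.

    Assumption: the survivor of each group is the candidate with the
    highest log-likelihood. The source does not say how scores of merged
    candidates are combined.
    """
    best: Dict[Tuple[Tuple[str, ...], int], Candidate] = {}
    for cand in candidates:
        key = (cand.phonemes, cand.grammar_state)
        kept = best.get(key)
        if kept is None or cand.log_likelihood > kept.log_likelihood:
            best[key] = cand
    # Return survivors with the highest-scoring candidate first.
    return sorted(best.values(), key=lambda c: c.log_likelihood, reverse=True)


candidates = [
    Candidate(("t", "o"), 3, -12.5),
    Candidate(("t", "o"), 3, -11.0),  # same key as above, better score
    Candidate(("t", "o"), 7, -13.0),  # same phonemes, different grammar state
    Candidate(("k", "o"), 3, -14.2),
]
merged = merge_candidates(candidates)
```

In this toy input the first two candidates share both the phoneme sequence and the grammar state, so they collapse into one entry, while the third survives because its grammar state differs. Merging of this kind shrinks the candidate list without discarding any distinct (sequence, state) pair, which matches the stated goal of reducing the search space without pruning the correct candidate.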