Recent topics in speech recognition research at NTT laboratories

Authors:
Sadaoki Furui;Kiyohiro Shikano;Shoichi Matsunaga;Tatsuo Matsuoka;Satoshi Takahashi;Tomokazu Yamada
Affiliations:
NTT Human Interface Laboratories, Midori-cho, Musashino-shi, Tokyo, Japan (all authors)
Venue:
HLT '91 Proceedings of the workshop on Speech and Natural Language
Year:
1992

Abstract

This paper introduces three recent topics in speech recognition research at NTT (Nippon Telegraph and Telephone) Human Interface Laboratories.

The first topic is a new HMM (hidden Markov model) technique that uses VQ-code bigrams to constrain the output probability distribution of the model according to the VQ codes of previous frames. Because the output probability distribution changes with the previous frames even within the same state, this method reduces the overlap between the feature distributions of different phonemes.

The second topic concerns approaches for adapting a syllable trigram model to a new task in Japanese continuous speech recognition. An approach that uses the most recent input phrases for adaptation is effective in reducing perplexity and improving phrase recognition rates.

The third topic is stochastic language models for sequences of Japanese characters, for use in a Japanese dictation system with unlimited vocabulary. Japanese characters consist of Kanji (Chinese characters) and Kana (Japanese syllabaries), and each Kanji has several readings depending on the context. Our dictation system uses character-trigram probabilities, obtained from a text database consisting of both Kanji and Kana, as a source model, and generates Kanji-and-Kana sequences directly from input speech.
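
The idea behind the second topic, adapting a trigram model with the most recent input, can be illustrated with a minimal Python sketch. The abstract does not specify the adaptation formula, so this sketch assumes a simple linear interpolation between a base trigram model and a "cache" trigram model built from recent phrases; it is not the authors' method. The class `TrigramModel`, the functions `interpolated_prob` and `perplexity`, the additive smoothing constant `alpha`, the weight `lam`, and the toy romanized syllables are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


class TrigramModel:
    """Trigram model with additive smoothing (an assumed, simple smoother)."""

    def __init__(self, vocab, alpha=0.5):
        self.vocab = sorted(set(vocab))
        self.alpha = alpha
        self.tri = Counter()  # counts of (a, b, c)
        self.ctx = Counter()  # counts of context (a, b) followed by any unit

    def update(self, units):
        # Accumulate counts from a sequence of units (syllables or characters).
        for a, b, c in zip(units, units[1:], units[2:]):
            self.tri[(a, b, c)] += 1
            self.ctx[(a, b)] += 1

    def prob(self, a, b, c):
        # P(c | a, b) with additive smoothing; sums to 1 over the vocabulary.
        v = len(self.vocab)
        return (self.tri[(a, b, c)] + self.alpha) / (self.ctx[(a, b)] + self.alpha * v)


def interpolated_prob(base, cache, lam, a, b, c):
    # Assumed adaptation rule: mix the base model with the recent-input model.
    return (1.0 - lam) * base.prob(a, b, c) + lam * cache.prob(a, b, c)


def perplexity(prob_fn, units):
    # exp of the average negative log-probability over all trigrams in `units`.
    logs = [math.log(prob_fn(a, b, c)) for a, b, c in zip(units, units[1:], units[2:])]
    return math.exp(-sum(logs) / len(logs))


# Toy data: romanized syllables, invented for this example.
vocab = ["ka", "ki", "no", "to", "wa"]

base = TrigramModel(vocab)
base.update(["ka", "ki", "no"] * 4)      # stands in for the original task

cache = TrigramModel(vocab)
cache.update(["to", "no", "wa"] * 3)     # stands in for recent input phrases

new_phrase = ["to", "no", "wa", "to", "no", "wa"]  # next phrase in the new task

base_ppl = perplexity(base.prob, new_phrase)
adapted_ppl = perplexity(
    lambda a, b, c: interpolated_prob(base, cache, 0.5, a, b, c), new_phrase
)
print(f"base perplexity: {base_ppl:.3f}, adapted perplexity: {adapted_ppl:.3f}")
```

In this toy setting the adapted model assigns higher probability to the new-task phrase than the base model alone, which is the qualitative effect the abstract reports as reduced perplexity. The same `TrigramModel` class could in principle be trained on mixed Kanji-and-Kana character sequences to sketch the character-trigram source model of the third topic, since it treats its units as opaque symbols.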