AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition

Authors:
Satoshi Nakamura; Kazuya Takeda; Kazumasa Yamamoto; Takeshi Yamada; Shingo Kuroiwa; Norihide Kitaoka; Takanobu Nishiura; Akira Sasou; Mitsunori Mizumachi; Chiyomi Miyajima; Masakiyo Fujimoto; Toshiki Endo
Affiliations:
The authors are with the ATR Spoken Language Translation Research Laboratories, "Keihanna Science City", Kyoto-fu, 619-0288 Japan. E-mail: satoshi.nakamura@atr.jp; The authors are with Nagoya University, Nagoya-shi, 464-8603 Japan; The author is with Shinshu University, Nagano-shi, 380-8553 Japan; The author is with University of Tsukuba, Tsukuba-shi, 305-8573 Japan; The author is with University of Tokushima, Tokushima-shi, 770-8506 Japan; The author is with Toyohashi University of Technology, Toyohashi-shi, 441-8580 Japan; The author is with Ritsumeikan University, Kusatsu-shi, 525-8577 Japan; The author is with National Institute of Advanced Industrial Science and Technology, Tsukuba-shi, 305-8568 Japan
Venue:
IEICE - Transactions on Information and Systems
Year:
2005

Abstract

This paper introduces AURORA-2J, an evaluation framework for Japanese noisy speech recognition. Speech recognition systems still need to become more robust to noisy environments, and that improvement requires standard evaluation corpora and assessment technologies. Recently, the Aurora 2, 3, and 4 corpora and their evaluation scenarios have had a significant impact on noisy speech recognition research. AURORA-2J is a Japanese connected-digit corpus, and its evaluation scripts are designed in the same way as those of Aurora 2, with the help of the European Telecommunications Standards Institute (ETSI) AURORA group. This paper describes the data collection, the baseline scripts, and the baseline performance. We also propose a new performance analysis method that considers differences in recognition performance among speakers. This method is based on the word accuracy per speaker and reveals the degree of individual difference in recognition performance. We also propose a categorization of the modifications applied to the original HTK baseline system, which helps in comparing systems and in identifying the technologies that improve performance most within the same category.
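
The abstract does not give the details of the per-speaker analysis, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes the usual HTK-style word accuracy, (N - S - D - I) / N, computed separately for each speaker, and then summarizes the spread across speakers. The `Counts` container, the function names, the speaker IDs, and the error counts are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Counts:
    """Hypothetical per-speaker error counts from a scoring tool."""
    n_ref: int  # N: number of reference words (digits)
    subs: int   # S: substitution errors
    dels: int   # D: deletion errors
    ins: int    # I: insertion errors


def word_accuracy(c: Counts) -> float:
    """HTK-style word accuracy in percent: (N - S - D - I) / N * 100."""
    return 100.0 * (c.n_ref - c.subs - c.dels - c.ins) / c.n_ref


def per_speaker_summary(counts: dict[str, Counts]) -> dict[str, float]:
    """Summarize how word accuracy varies across speakers."""
    values = [word_accuracy(c) for c in counts.values()]
    return {
        "mean": mean(values),
        "std": pstdev(values),  # population standard deviation over speakers
        "min": min(values),
        "max": max(values),
    }


# Made-up counts for two hypothetical speakers, for illustration only.
example = {
    "spk_A": Counts(n_ref=100, subs=5, dels=3, ins=2),
    "spk_B": Counts(n_ref=100, subs=10, dels=5, ins=5),
}

if __name__ == "__main__":
    for speaker, c in example.items():
        print(speaker, word_accuracy(c))
    print(per_speaker_summary(example))
```

A large standard deviation or a wide gap between the minimum and maximum would indicate that a system's average accuracy hides substantial differences between individual speakers, which is the kind of effect a per-speaker analysis is meant to expose.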