A recurrent neural fuzzy network for word boundary detection invariable noise-level environments

  • Authors:
  • Gin-Der Wu;Chin-Teng Lin

  • Affiliations:
  • Dept. of Electr. & Control. Eng., Nat. Chiao Tung Univ., Hsinchu;-

  • Venue:
  • IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper discusses the problem of automatic word boundary detection in the presence of variable-level background noise. Commonly used robust word boundary detection algorithms always assume that the background noise level is fixed. In fact, the background noise level may vary during the procedure of recording. This is the major reason that most robust word boundary detection algorithms cannot work well in the condition of variable background noise level. In order to solve this problem, we first propose a refined time-frequency (RTF) parameter for extracting both the time and frequency features of noisy speech signals. The RTF parameter extends the (time-frequency) TF parameter proposed by Junqua et al. from single band to multiband spectrum analysis, where the frequency bands help to make the distinction between speech signal and noise clear. The RTF parameter can extract useful frequency information. Based on this RTF parameter, we further propose a new word boundary detection algorithm by using a recurrent self-organizing neural fuzzy inference network (RSONFIN). Since RSONPIN can process the temporal relations, the proposed RTF-based RSONFIN algorithm can find the variation of the background noise level and detect correct word boundaries in the condition of variable background noise level. As compared to normal neural networks, the RSONFIN can always find itself an economic network size with high-learning speed. Due to the self-learning ability of RSONFIN, this RTF-based RSONFIN algorithm avoids the need for empirically determining ambiguous decision rules in normal word boundary detection algorithms. Experimental results show that this new algorithm achieves higher recognition rate than the TF-based algorithm which has been shown to outperform several commonly used word boundary detection algorithms by about 12% in variable background noise level condition, It also reduces the recognition error rate due to endpoint detection to about 23%, compared to an average of 47% obtained by the TF-based algorithm in the same condition