Investigation of supervised dimensionality reduction methods for phonetic classification

  • Authors:
  • Heyun Huang;Yang Liu;Lou Boves

  • Affiliations:
  • Radboud University Nijmegen, Erasmuslaan, HT, Nijmegen, the Netherlands;The Hong Kong Polytechnic University, Hong Kong, P. R. China;Radboud University Nijmegen, Erasmuslaan, Nijmegen, the Netherlands

  • Venue:
  • Proceedings of the Third International Conference on Internet Multimedia Computing and Service
  • Year:
  • 2011

Abstract

Automatic Speech Recognition (ASR) depends crucially on acoustic models of speech units such as phones. A weakness of popular acoustic models is that they do not capture speech continuity information. Stacking the short-term features of consecutive frames may retain sufficient articulatory information. Unfortunately, the resulting high-dimensional feature space still contains much redundant information and exposes subsequent acoustic modeling to the curse of dimensionality. Motivated by this and by recent research [4, 15], this paper investigates supervised dimensionality reduction methods to answer two research questions: whether local structures exist in the feature space formed by stacking frames, and whether these local structures help acoustic modeling. Experimental results on TIMIT phonetic classification show that the assumed local structures do exist in the feature space and can be best described by nearest neighbor graphs.
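The two ideas the abstract relies on, stacking consecutive frames and describing local structure with a nearest neighbor graph, can be illustrated with a minimal, dependency-free sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `stack_frames` and `knn_graph`, the edge handling (repeating the first and last frame), the squared Euclidean distance, and the toy input values. The abstract does not state the feature type, context width, or graph construction the authors used.

```python
def stack_frames(frames, context):
    """Concatenate each frame with `context` neighbours on each side."""
    last = len(frames) - 1
    stacked = []
    for t in range(len(frames)):
        vector = []
        for offset in range(-context, context + 1):
            # Clamp the index so edge frames reuse the first or last frame
            # (an assumed padding choice, not specified in the abstract).
            source = min(max(t + offset, 0), last)
            vector.extend(frames[source])
        stacked.append(vector)
    return stacked


def knn_graph(points, k):
    """Return a symmetric k-nearest-neighbour graph as a list of neighbour sets."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    edges = [set() for _ in points]
    for i, p in enumerate(points):
        # Rank every other point by squared Euclidean distance to point i.
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: sq_dist(p, points[j]))
        for j in others[:k]:
            edges[i].add(j)
            edges[j].add(i)  # symmetrise: keep the edge in both directions
    return edges


# Toy input: 6 frames of 2 made-up coefficients each.
frames = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1],
          [5.0, 5.0], [5.1, 5.0], [5.2, 5.1]]
stacked = stack_frames(frames, context=1)   # each vector has 3 * 2 = 6 values
graph = knn_graph(stacked, k=1)             # neighbour sets, one per stacked frame
```

With a context of one frame on each side, the stacked dimensionality triples, which shows how quickly the feature space grows as the window widens. In a supervised setting, such a graph would typically also use the class labels when selecting neighbors; that step is omitted here because the abstract does not describe it.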