Optimization of Speech Recognition by Clustering of Phones

  • Authors:
  • Agnieszka Nowak; Alicja Wakulicz-Deja; Sebastian Bachliński

  • Affiliations:
  • University of Silesia, Institute of Computer Science, Będzińska 39, 41-200 Sosnowiec, Poland (all three authors). E-mails: nowak@us.edu.pl, wakulicz@us.edu.pl, sbachlin@us.edu.pl

  • Venue:
  • Fundamenta Informaticae - SPECIAL ISSUE ON CONCURRENCY SPECIFICATION AND PROGRAMMING (CS&P 2005), Ruciane-Nida, Poland, 28-30 September 2005
  • Year:
  • 2006

Abstract

Optimization of the speech recognition process aims at a short classification time (in a speech-to-text system) while preserving the content of the speech signal description and all details of the signal that the considered application needs. The goal of parametrizing human speech is to eliminate those physical features of the speech signal that carry no useful information (e.g., the frequency of the laryngeal tone or the timbre of the voice), and thereby to minimize the volume of information to be analyzed. Our experiments suggest that cluster analysis with an agglomerative hierarchical technique is very helpful in finding relationships between speech phones. It lets us accelerate speech recognition, simply because it is no longer necessary to analyze each phone separately and compare it with an unclassified object. This principle has been carried over to hidden Markov models. To organize those models we use cluster analysis with hierarchical techniques. Each model represents a single sequence of speech (probably a phone sequence). At the "top" of the structure are models of phones in the most general context. Moving through the structure toward the bottom, we find models of phones in particular contexts. By context we mean the juxtaposition of different phones.
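The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional feature vectors, the phone labels, the choice of average linkage with Euclidean distance, and the helper names `agglomerate`, `centroid`, and `classify` are all assumptions made for illustration. The sketch first groups phones bottom-up, then classifies an unknown object by comparing it with cluster centroids before looking at individual phones, so that not every phone has to be examined.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean Euclidean distance over all cross-cluster pairs of labels."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Agglomerative clustering: merge the two closest clusters repeatedly.

    points maps a label to a feature vector (tuple of floats).
    Returns the final clusters and the list of merges in the order made.
    """
    clusters = [[label] for label in points]
    history = []
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest average-linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], points),
        )
        merged = sorted(clusters[i] + clusters[j])
        history.append(merged)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(clusters), history


def centroid(cluster, points):
    """Component-wise mean of the feature vectors in a cluster."""
    dims = zip(*(points[label] for label in cluster))
    return tuple(sum(d) / len(cluster) for d in dims)


def classify(x, clusters, points):
    """Choose the nearest cluster centroid, then the nearest phone inside it."""
    best = min(clusters, key=lambda c: dist(x, centroid(c, points)))
    return min(best, key=lambda label: dist(x, points[label]))


# Hypothetical feature vectors; real systems would use parametrized speech frames.
phones = {
    "a": (0.0, 0.0),
    "o": (0.1, 0.2),
    "s": (5.0, 5.0),
    "z": (5.1, 4.9),
    "m": (10.0, 0.0),
}

clusters, history = agglomerate(phones, 3)
label = classify((4.9, 5.2), clusters, phones)
print(clusters, history, label)
```

With these toy values the unknown object is compared with three centroids and then with the two phones of the selected cluster, instead of with all five phones. The saving grows with the number of phones, at the cost of a possible misclassification when the nearest phone lies in a cluster whose centroid is not the nearest.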