Use of Haar wavelet transform based multiple template matching for analyses of speech voice

  • Authors:
  • Shinji Karasawa;Hiroshi Sakuraba

  • Affiliations:
  • Miyagi National College of Technology;Miyagi National College of Technology

  • Venue:
  • EATIS '07 Proceedings of the 2007 Euro American conference on Telematics and information systems
  • Year:
  • 2007

Abstract

Wavelet transform techniques were used in JPEG2000 and will be available for CODECs. This work investigates key cues for voice recognition using the multi-resolution Haar wavelet representation (H-WR). The template for a phoneme differs from that for a syllable, and the optimum accuracy of a feature depends on how the template-matching (TM) analysis is segmented. For recognition of a phoneme, the 64 Haar wavelet coefficients (H-WC) can be reduced to the 15 lower-frequency components. Here, each data set begins at the peak value within a pitch period, the sampling frequency is 10 kHz, and the segment length for a phoneme is 6.4 ms (64 samples). Segmentation of phonemes in speech can be checked using the fact that the ratio r between SWC values (SWC: the sum of the absolute values of the wavelet coefficients in one scale) becomes r = 1 at a transition. SWC can also serve as a component in vector quantization for a syllable. Short syllables are decoded using 8 SWC values, where the SWC values are obtained from a data set of 1024 samples per syllable (sampling frequency 5 kHz, extraction period 204.8 ms).
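
The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the orthonormal 1/sqrt(2) normalization is an assumption (the abstract does not state one), and reading "15 components with lower frequency" as the detail coefficients of the four coarsest scales (1 + 2 + 4 + 8) is likewise an assumption. The abstract does not say which 8 of the available scales supply the SWC values for a syllable, so the sketch simply returns one SWC per scale.

```python
import math


def haar_decompose(samples):
    """Full Haar decomposition of a power-of-two-length sequence.

    Returns (approximation, details), where details[0] is the coarsest
    scale, so the detail levels have sizes 1, 2, 4, ...
    Orthonormal scaling is assumed here, not taken from the paper.
    """
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(x) for x in samples]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Differences give the detail coefficients of the current scale.
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        # Sums give the approximation passed on to the next coarser scale.
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    details.reverse()  # coarsest scale first
    return approx[0], details


def swc_per_scale(details):
    """SWC: sum of absolute wavelet coefficients, one value per scale."""
    return [sum(abs(c) for c in level) for level in details]


def low_frequency_coefficients(details, count=15):
    """Keep the `count` coarsest detail coefficients (assumed reading)."""
    flat = [c for level in details for c in level]
    return flat[:count]


def segment_duration_ms(num_samples, sampling_rate_hz):
    """Duration of a segment in milliseconds."""
    return 1000.0 * num_samples / sampling_rate_hz


# Synthetic 64-sample segment standing in for one phoneme frame.
segment = [math.sin(2 * math.pi * k / 64) for k in range(64)]
approximation, details = haar_decompose(segment)
features = low_frequency_coefficients(details)
swc = swc_per_scale(details)
print(len(features), len(swc))
print(segment_duration_ms(64, 10_000), segment_duration_ms(1024, 5_000))
```

A 64-sample segment yields six detail scales, and a 1024-sample segment yields ten; the durations follow directly from the sample counts and sampling frequencies given in the abstract.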