Automatic segmentation and labeling for continuous number recognition

  • Authors:
  • S. A. R. Al-Haddad;Salina Abdul Samad;Aini Hussein;K. A. Ishak;A. A. Azid;R. Ghaffar;D. Ramli;M. R. Zainal;M. K. A. Abdullah

  • Affiliations:
  • Lab Signal Processing, Dept. Electrical, Electronic and System Engineering, Faculty of Engineering, National University of Malaysia, Selangor, Malaysia;Lab Signal Processing, Dept. Electrical, Electronic and System Engineering, Faculty of Engineering, National University of Malaysia, Selangor, Malaysia;Lab Signal Processing, Dept. Electrical, Electronic and System Engineering, Faculty of Engineering, National University of Malaysia, Selangor, Malaysia;Lab Signal Processing, Dept. Electrical, Electronic and System Engineering, Faculty of Engineering, National University of Malaysia, Selangor, Malaysia;Lab Signal Processing, Dept. Electrical, Electronic and System Engineering, Faculty of Engineering, National University of Malaysia, Selangor, Malaysia;Lab Signal Processing, Dept. Electrical, Electronic and System Engineering, Faculty of Engineering, National University of Malaysia, Selangor, Malaysia;Lab Signal Processing, Dept. Electrical, Electronic and System Engineering, Faculty of Engineering, National University of Malaysia, Selangor, Malaysia;Lab Signal Processing, Dept. Electrical, Electronic and System Engineering, Faculty of Engineering, National University of Malaysia, Selangor, Malaysia;Department Computer and Communication System Engineering, Faculty of Engineering, Putra University of Malaysia, Selangor, Malaysia

  • Venue:
  • ISCGAV'06 Proceedings of the 6th WSEAS International Conference on Signal Processing, Computational Geometry & Artificial Vision
  • Year:
  • 2006

Abstract

This study addresses continuous spoken-number recognition, with the aim of distinguishing speech from non-speech segments and isolating each speech segment as a single digit. It proposes an algorithm for automatic segmentation of male and female voiced speech. Log energy and zero-crossing rate are computed from the speech samples to perform the segmentation. The thresholds are set based on maximum likelihood so that parts of speech (POS) are labeled accurately. The algorithm achieves 95% correct segmentation for male speakers and 72.5% for female speakers.
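
As a rough illustration of the kind of processing the abstract describes, the sketch below computes short-time log energy and zero-crossing rate per frame and merges consecutive speech frames into segments. It is not the authors' implementation. The function names (`frame_features`, `segment`), the frame length, the hop size, the `margin_db` parameter, and the way the two features are combined are all assumptions made for this example. The thresholds are passed in as fixed values, whereas the paper sets them by maximum likelihood, a step not reproduced here.

```python
import math


def frame_features(samples, frame_len=200, hop=100):
    """Return a list of (log_energy_db, zero_crossing_rate), one per frame.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Short-time energy in dB; the small constant avoids log10(0).
        energy = sum(x * x for x in frame)
        log_energy = 10.0 * math.log10(energy + 1e-12)
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((log_energy, crossings / (frame_len - 1)))
    return feats


def segment(samples, energy_thresh, zcr_thresh,
            frame_len=200, hop=100, margin_db=10.0):
    """Return (start, end) sample ranges judged to contain speech.

    Assumed decision rule (hypothetical, for illustration only):
    a frame is speech if its log energy exceeds energy_thresh, or if its
    zero-crossing rate exceeds zcr_thresh while its log energy is within
    margin_db below energy_thresh (e.g. a weak fricative).
    """
    feats = frame_features(samples, frame_len, hop)
    segments = []
    seg_start = None
    for i, (log_energy, zcr) in enumerate(feats):
        is_speech = (
            log_energy > energy_thresh
            or (zcr > zcr_thresh and log_energy > energy_thresh - margin_db)
        )
        if is_speech and seg_start is None:
            seg_start = i * hop  # first sample of the first speech frame
        elif not is_speech and seg_start is not None:
            # Close the segment at the end of the previous (speech) frame.
            segments.append((seg_start, (i - 1) * hop + frame_len))
            seg_start = None
    if seg_start is not None:
        segments.append((seg_start, (len(feats) - 1) * hop + frame_len))
    return segments
```

In a continuous digit string, each returned range would correspond to one candidate digit, provided the speaker pauses between digits. Because frames overlap, segment boundaries are accurate only to within about one frame length.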