Automatic phone segmentation techniques based on model selection criteria are studied. We investigate how efficiently entropy-based and Bayesian model selection criteria detect phone boundaries in continuous speech, building on the DISTBIC hybrid segmentation algorithm. DISTBIC is a text-independent, bottom-up approach that identifies sequential model changes by combining metric distances with statistical hypothesis testing. Using robust statistics and small-sample corrections in the baseline DISTBIC algorithm significantly improves phone boundary detection accuracy while reducing false alarms. We also demonstrate a further improvement in phonemic segmentation by taking into account how the model parameters are related, both in the probability density functions of the underlying hypotheses and in model selection via the information complexity criterion, and by employing M-estimators of the model parameters. The proposed DISTBIC variants are tested on the NTIMIT database, and the achieved F1 measure is 74.7% using a 20-ms tolerance in phonemic segmentation.
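
To make the hypothesis-testing step concrete, the following is a minimal sketch of a BIC-based change test of the kind DISTBIC-style algorithms use to validate a candidate boundary. It is not the authors' implementation. It assumes a single scalar feature per frame modeled by one Gaussian, and it omits the distance-based candidate search, the robust M-estimators, the small-sample corrections, and the information complexity criterion described above. The function names, the `margin` parameter, and the `penalty_weight` parameter are hypothetical. The sign convention assumed here is that a positive score favors a change.

```python
import math


def variance(xs):
    """Maximum-likelihood variance, floored to avoid log(0)."""
    mean = sum(xs) / len(xs)
    return max(sum((x - mean) ** 2 for x in xs) / len(xs), 1e-12)


def delta_bic(window, t, penalty_weight=1.0):
    """Score the hypothesis that the model changes at index t.

    Compares two Gaussians (window[:t] and window[t:]) against one
    Gaussian for the whole window. Positive values favor a change
    (assumed convention).
    """
    n = len(window)
    left, right = window[:t], window[t:]
    # Log-likelihood gain from splitting the window into two models.
    gain = 0.5 * (
        n * math.log(variance(window))
        - len(left) * math.log(variance(left))
        - len(right) * math.log(variance(right))
    )
    # The split adds 2 free parameters (one mean, one variance).
    penalty = 0.5 * penalty_weight * 2 * math.log(n)
    return gain - penalty


def best_change_point(window, margin=5, penalty_weight=1.0):
    """Return (index, score) of the best change point, or (None, score)
    if no candidate has a positive score."""
    scores = {
        t: delta_bic(window, t, penalty_weight)
        for t in range(margin, len(window) - margin + 1)
    }
    t_best = max(scores, key=scores.get)
    if scores[t_best] > 0:
        return t_best, scores[t_best]
    return None, scores[t_best]


if __name__ == "__main__":
    # Synthetic feature track with a level shift at frame 50.
    step = [0.0, 1.0] * 25 + [10.0, 11.0] * 25
    print(best_change_point(step))
    # Homogeneous track: no boundary should be reported.
    flat = [0.0, 1.0] * 50
    print(best_change_point(flat))
```

In this sketch, `penalty_weight` trades missed boundaries against false alarms: larger values make the test more conservative. Real systems would use multivariate features such as cepstral vectors, for which the penalty term depends on the feature dimension.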