Combining evidence from temporal and spectral features for person recognition using humming

Authors:
Hemant A. Patil;Maulik C. Madhavi;Rahul Jain;Alok K. Jain
Affiliations:
Dhirubhai Ambani Institute of Information and Communication Technology, Gujarat, India;Dhirubhai Ambani Institute of Information and Communication Technology, Gujarat, India;Hindustan Institute of Technology and Management, Agra, Uttar Pradesh, India;Nikhil Institute of Engineering and Management, Mathura, Uttar Pradesh, India
Venue:
PerMIn'12 Proceedings of the First Indo-Japan conference on Perception and Machine Intelligence
Year:
2012

Abstract

In this paper, a person's hum is used to identify that person automatically by machine. In addition, novel temporal features (such as zero-crossing rate and short-time energy) and spectral features (such as spectral centroid and spectral flux) are proposed for the person recognition task. Feature-level fusion of each of these features with the state-of-the-art spectral feature set, viz., Mel Frequency Cepstral Coefficients (MFCC), is found to give better recognition performance than MFCC alone. In addition, it is shown that the person identification rate is competitive with that of the baseline MFCC. Furthermore, a reduction in equal error rate (EER) of 1.46% is obtained when a feature-level fusion system combines evidence from MFCC, temporal, and the proposed spectral features.
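
As a rough illustration of what feature-level fusion of such features could look like, the following sketch computes per-frame zero-crossing rate, short-time energy, spectral centroid, and spectral flux with NumPy and concatenates them with an MFCC matrix. The abstract does not give the authors' exact definitions or settings, so the frame length, hop size, Hamming window, flux normalisation, and all function names here are assumptions chosen for illustration; the MFCC matrix is a zero-filled placeholder standing in for coefficients computed elsewhere.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def temporal_spectral_features(x, sr, frame_len=400, hop=160):
    """Return per-frame [ZCR, short-time energy, spectral centroid, spectral flux].

    frame_len and hop are assumed values (25 ms / 10 ms at 16 kHz),
    not settings taken from the paper.
    """
    frames = frame_signal(x, frame_len, hop)

    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

    # Short-time energy: mean squared amplitude per frame.
    energy = np.mean(frames ** 2, axis=1)

    # Magnitude spectrum of each Hamming-windowed frame.
    mag = np.abs(np.fft.rfft(frames * np.hamming(frame_len), axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)

    # Spectral centroid: magnitude-weighted mean frequency in Hz.
    centroid = (mag @ freqs) / (mag.sum(axis=1) + 1e-12)

    # Spectral flux: squared change between successive normalised spectra.
    norm = mag / (mag.sum(axis=1, keepdims=True) + 1e-12)
    flux = np.zeros(len(frames))
    flux[1:] = np.sum(np.diff(norm, axis=0) ** 2, axis=1)

    return np.column_stack([zcr, energy, centroid, flux])


def fuse(mfcc, extra):
    """Feature-level fusion: concatenate per-frame vectors along the feature axis."""
    n = min(len(mfcc), len(extra))
    return np.hstack([mfcc[:n], extra[:n]])


# Toy input: 1 s of a 200 Hz tone standing in for a hum recording.
sr = 16000
t = np.arange(sr) / sr
hum = 0.5 * np.sin(2 * np.pi * 200 * t)

extra = temporal_spectral_features(hum, sr)
mfcc = np.zeros((extra.shape[0], 13))  # placeholder for MFCCs computed elsewhere
fused = fuse(mfcc, extra)
print(fused.shape)
```

Each fused row is one frame's feature vector, which would then be passed to whatever classifier the recognition system uses. In practice the added features would likely need scaling to a range comparable with the MFCCs before modelling, since energy and centroid live on very different numeric scales.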