Dynamic Database Creation for Speaker Recognition System

Authors:
Bhushan Dayaram Patil;Yogesh Manav;Pavan Sudheendra
Affiliations:
Mobile Protocols and Platforms Group, Samsung R&D Institute India, Bangalore(SRIB), ORION Building, Bangalore, INDIA;Mobile Protocols and Platforms Group, Samsung R&D Institute India, Bangalore(SRIB), ORION Building, Bangalore, INDIA;Mobile Protocols and Platforms Group, Samsung R&D Institute India, Bangalore(SRIB), ORION Building, Bangalore, INDIA
Venue:
Proceedings of International Conference on Advances in Mobile Computing & Multimedia
Year:
2013

References:
DISTBIC: a speaker-based segmentation for audio data indexing
Speech Communication - Special issue on accessing information in spoken audio

Adaptive speaker identification with audiovisual cues for movie content analysis
Pattern Recognition Letters - Video computing

Segregation of speakers for speech recognition and speaker identification
ICASSP '91: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing

Efficient Speaker Change Detection Using Adapted Gaussian Mixture Models
IEEE Transactions on Audio, Speech, and Language Processing

Abstract

Classical speaker identification algorithms give acceptable results when training is done offline using a good-quality database [5]. Although there has been a substantial amount of research on speaker recognition, most of it has focused on the offline training scenario. However, in scenarios that require real-time speaker recognition, such as viewer-preference-based presentation or playback of media content, offline training is not possible because there is no prior information about the subjects/speakers present in the content. A run-time training approach is required to generate a dynamic feature database, which can then support functions such as viewer-preference-based seek or zoom to a specific subject/speaker during media playback. In this paper we propose a speaker recognition system that uses a dynamically created database. We treat speaker recognition as a classification problem in which speakers are classified based on speech features. The proposed system uses MFCCs (Mel Frequency Cepstral Coefficients) as features and polynomial or GMM (Gaussian Mixture Model) classifiers. Our analysis presents the pros and cons of these algorithms when they are used with dynamic database creation. Test results show that the proposed system can achieve approximately 96% accuracy on content containing 5 speakers.
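
The idea of a database built at run time can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so everything below is an assumption made for illustration. Each speaker is modelled by a single diagonal-covariance Gaussian (a one-component stand-in for a GMM), the feature frames are assumed to be precomputed MFCC vectors, and the names `DynamicSpeakerDB`, `identify`, `threshold` and `synthetic_segment` are hypothetical. A segment that no stored model explains well enough is enrolled as a new speaker; otherwise the best-matching model is refined with the new frames.

```python
import math
import random


class DynamicSpeakerDB:
    """Speaker models created at run time, with no offline enrollment.

    Assumption: one diagonal Gaussian per speaker, a simplification of a GMM.
    """

    def __init__(self, threshold, var_floor=1e-3):
        self.threshold = threshold  # hypothetical minimum average log-likelihood for a match
        self.var_floor = var_floor  # keeps variances away from zero
        self.models = []            # one dict per speaker discovered so far

    @staticmethod
    def _fit(frames, var_floor):
        """Estimate per-dimension mean and variance from feature frames."""
        n, dim = len(frames), len(frames[0])
        mean = [sum(f[d] for f in frames) / n for d in range(dim)]
        var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, var_floor)
               for d in range(dim)]
        return mean, var

    @staticmethod
    def _avg_loglik(frames, mean, var):
        """Average per-frame log-likelihood under a diagonal Gaussian."""
        total = 0.0
        for f in frames:
            for x, m, v in zip(f, mean, var):
                total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        return total / len(frames)

    def identify(self, frames):
        """Return a speaker index, enrolling a new speaker if nothing matches."""
        best_id, best_score = None, -math.inf
        for i, model in enumerate(self.models):
            score = self._avg_loglik(frames, model["mean"], model["var"])
            if score > best_score:
                best_id, best_score = i, score
        if best_id is None or best_score < self.threshold:
            # No stored model explains this segment: add a new database entry.
            mean, var = self._fit(frames, self.var_floor)
            self.models.append({"frames": list(frames), "mean": mean, "var": var})
            return len(self.models) - 1
        # Known speaker: refine the matched model with the new frames.
        model = self.models[best_id]
        model["frames"].extend(frames)
        model["mean"], model["var"] = self._fit(model["frames"], self.var_floor)
        return best_id


def synthetic_segment(rng, center, n_frames=50, dim=3):
    """Stand-in for MFCC frames: Gaussian noise around a speaker-specific center."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n_frames)]
```

In this sketch, calling `identify` on successive segments grows the database only when an unseen speaker appears. The threshold would need tuning on real features, and a multi-component GMM or the polynomial classifier named in the abstract would replace the single Gaussian in practice.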