In this paper we propose a method that uses averaged Mel-frequency cepstral coefficients (MFCCs) and linear discriminant analysis (LDA) to automatically identify animals from their sounds. First, each syllable, corresponding to one piece of vocalization, is segmented. The MFCCs are then averaged over all frames in a syllable to form the vocalization features. LDA, which finds a transformation matrix that minimizes the within-class distance and maximizes the between-class distance, is used to increase classification accuracy while reducing the dimensionality of the feature vectors. In our experiments, the average classification accuracy is 96.8% for 30 kinds of frog calls and 98.1% for 19 kinds of cricket calls.
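The pipeline described above (average the per-frame features of each syllable, then apply LDA) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-frame vectors are synthetic stand-ins for MFCCs, the names `average_frames`, `fit_lda`, and `predict` are hypothetical, and the nearest-centroid rule in the projected space is an assumed classifier, since the abstract does not say which decision rule is used.

```python
import numpy as np

def average_frames(frames):
    """Collapse a (n_frames, n_coeffs) matrix of per-frame features into one vector."""
    return np.asarray(frames, dtype=float).mean(axis=0)

def fit_lda(X, y, n_components):
    """Fisher LDA: maximize between-class scatter relative to within-class scatter."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - overall_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Leading eigenvectors of pinv(Sw) @ Sb form the transformation matrix.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    W = eigvecs[:, order].real
    centroids = np.array([X[y == c].mean(axis=0) @ W for c in classes])
    return W, classes, centroids

def predict(X, W, classes, centroids):
    """Assumed decision rule: nearest class centroid in the LDA-projected space."""
    Z = X @ W
    dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Synthetic data standing in for real MFCC frames (assumption: 3 species,
# 12 coefficients per frame, 40 syllables per species, 20 frames per syllable).
rng = np.random.default_rng(0)
n_classes, n_coeffs, n_syllables, n_frames = 3, 12, 40, 20
class_means = rng.normal(0, 3, size=(n_classes, n_coeffs))
X, y = [], []
for c in range(n_classes):
    for _ in range(n_syllables):
        frames = class_means[c] + rng.normal(0, 1, size=(n_frames, n_coeffs))
        X.append(average_frames(frames))  # one feature vector per syllable
        y.append(c)
X, y = np.array(X), np.array(y)

# Train on even-indexed syllables, evaluate on odd-indexed ones.
W, classes, centroids = fit_lda(X[::2], y[::2], n_components=n_classes - 1)
accuracy = (predict(X[1::2], W, classes, centroids) == y[1::2]).mean()
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

With C classes, LDA yields at most C - 1 useful discriminant directions, which is why the sketch projects 12-dimensional vectors down to 2 dimensions. The accuracy printed here reflects only the synthetic data and says nothing about the figures reported in the abstract.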