SVM-based audio classification for content-based multimedia retrieval

Authors:
Yingying Zhu;Zhong Ming;Qiang Huang
Affiliations:
Faculty of Information Engineering, Shenzhen University, Shenzhen, P.R. China and Software Engineering Ltd. of Harbin Institute of Technology, Harbin, China;Faculty of Information Engineering, Shenzhen University, Shenzhen, P.R. China;Faculty of Information Engineering, Shenzhen University, Shenzhen, P.R. China
Venue:
MCAM'07 Proceedings of the 2007 international conference on Multimedia content analysis and mining
Year:
2007

Abstract

Audio classification is important for multimedia retrieval tasks such as audio indexing, audio analysis, and content-based video retrieval. In this paper, we propose a clip-based support vector machine (SVM) approach that classifies audio signals into six classes: pure speech, music, silence, environmental sound, speech with music, and speech with environmental sound. The classification results are then used to partition a video into homogeneous audio segments, which in turn are used to analyze and retrieve the video's higher-level content. The experimental results show that the proposed system improves classification accuracy and outperforms classification systems based on a decision tree (DT), K-Nearest Neighbor (K-NN), and a Neural Network (NN).
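The partitioning step described in the abstract can be illustrated with a minimal sketch. It assumes that some per-clip classifier (for example, the SVM) has already assigned one of the six class labels to each fixed-length clip. The function name `segment_clips`, the one-second clip length, and the sample labels are hypothetical and are not taken from the paper.

```python
from itertools import groupby

# The six classes named in the abstract.
CLASSES = (
    "pure speech",
    "music",
    "silence",
    "environmental sound",
    "speech with music",
    "speech with environmental sound",
)


def segment_clips(labels, clip_seconds=1.0):
    """Merge runs of identical per-clip labels into homogeneous segments.

    labels: sequence of class labels, one per consecutive clip.
    clip_seconds: clip duration (an assumed value, not from the paper).
    Returns a list of (start_time, end_time, label) tuples.
    """
    segments = []
    start = 0  # index of the first clip in the current run
    for label, run in groupby(labels):
        n = sum(1 for _ in run)  # number of consecutive clips with this label
        segments.append((start * clip_seconds, (start + n) * clip_seconds, label))
        start += n
    return segments


# Hypothetical classifier output for six consecutive clips.
labels = ["music", "music", "pure speech", "pure speech", "pure speech", "silence"]
segments = segment_clips(labels)
for begin, end, label in segments:
    print(f"{begin:5.1f}s to {end:5.1f}s  {label}")
```

A real system would likely smooth isolated misclassified clips before merging. The sketch omits that step because the abstract does not describe one.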