A generic audio classification and segmentation approach for multimedia indexing and retrieval

Authors:
S. Kiranyaz;Ahmad Farooq Qureshi;M. Gabbouj
Affiliations:
Inst. of Signal Process., Tampere Univ. of Technol., Finland;-;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2006

Citing 0
Cited 10

Classification of audio signals using SVM and RBFNN
Expert Systems with Applications: An International Journal

Semantic concept annotation based on audio PLSA model
MM '09 Proceedings of the 17th ACM international conference on Multimedia

Audio segmentation in AAC domain for content analysis
WiCOM'09 Proceedings of the 5th International Conference on Wireless communications, networking and mobile computing

Audio signal representations for indexing in the transform domain
IEEE Transactions on Audio, Speech, and Language Processing

Classification of audio signals using AANN and GMM
Applied Soft Computing

Audio query by example using similarity measures between probability density functions of features
EURASIP Journal on Audio, Speech, and Music Processing - Special issue on scalable audio-content analysis

Pattern classification models for classifying and indexing audio signals
Engineering Applications of Artificial Intelligence

Environmental sound classification for scene recognition using local discriminant bases and HMM
MM '11 Proceedings of the 19th ACM international conference on Multimedia

Dynamic and scalable audio classification by collective network of binary classifiers framework: An evolutionary approach
Neural Networks

Fusing audio vocabulary with visual features for pornographic video detection
Future Generation Computer Systems

Abstract

We address generic, automatic audio classification and segmentation for audio-based multimedia indexing and retrieval applications. In particular, we present a fuzzy approach to a hierarchical audio classification and global segmentation framework based on automatic audio analysis, which provides robust, bi-modal, efficient, and parameter-invariant classification over global audio segments. The input audio is split into segments, each classified as speech, music, fuzzy, or silent. The proposed method minimizes critical misclassification errors through fuzzy region modeling, thereby increasing the efficiency of both pure and fuzzy classification. Experimental results show that critical errors are minimized and that the proposed framework significantly increases the efficiency and accuracy of audio-based retrieval, especially in large multimedia databases.
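
The four-way labeling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the features, thresholds, or decision rules, so the `Segment` fields, the threshold values, and the function names below are all hypothetical. The sketch only shows the general idea of a fuzzy region: a segment whose speech-versus-music score is ambiguous is labeled `fuzzy` rather than forced into `speech` or `music`, which avoids the critical error of confusing the two pure classes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    # Both fields are hypothetical stand-ins for real audio features.
    energy: float        # mean energy, assumed normalized to [0, 1]
    speech_score: float  # assumed score in [0, 1]; 1 = speech-like, 0 = music-like


def classify_segment(
    seg: Segment,
    silence_threshold: float = 0.05,  # assumed value, for illustration only
    fuzzy_low: float = 0.35,          # assumed lower bound of the fuzzy region
    fuzzy_high: float = 0.65,         # assumed upper bound of the fuzzy region
) -> str:
    """Label one segment as 'silent', 'speech', 'music', or 'fuzzy'."""
    if seg.energy < silence_threshold:
        return "silent"
    if seg.speech_score >= fuzzy_high:
        return "speech"
    if seg.speech_score <= fuzzy_low:
        return "music"
    # Ambiguous score: defer the decision instead of risking a
    # speech/music confusion.
    return "fuzzy"


def merge_runs(labels: List[str]) -> List[Tuple[str, int, int]]:
    """Merge consecutive identical labels into (label, start, end) runs.

    The end index is exclusive.
    """
    runs: List[Tuple[str, int, int]] = []
    for i, label in enumerate(labels):
        if runs and runs[-1][0] == label:
            runs[-1] = (label, runs[-1][1], i + 1)
        else:
            runs.append((label, i, i + 1))
    return runs


segments = [
    Segment(energy=0.01, speech_score=0.9),
    Segment(energy=0.50, speech_score=0.9),
    Segment(energy=0.50, speech_score=0.8),
    Segment(energy=0.50, speech_score=0.5),
    Segment(energy=0.50, speech_score=0.1),
]
labels = [classify_segment(s) for s in segments]
runs = merge_runs(labels)
```

Widening the gap between `fuzzy_low` and `fuzzy_high` trades coverage for safety in this sketch: more segments are labeled `fuzzy`, but fewer are assigned to the wrong pure class.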