Audio classification based on MPEG-7 spectral basis representations

Authors:
Hyoung-Gook Kim; N. Moreau; T. Sikora
Affiliations:
Commun. Syst. Group, Tech. Univ. of Berlin, Germany (listed for the first author only)
Venue:
IEEE Transactions on Circuits and Systems for Video Technology
Year:
2004

Cited by (12):
Spectral Anticipations. Computer Music Journal.
Nericell: rich monitoring of road and traffic conditions using mobile smartphones. Proceedings of the 6th ACM conference on Embedded network sensor systems.
Similarity search in animal sound databases. IEEE Transactions on Multimedia.
Automatic music genre classification based on modulation spectral analysis of spectral and cepstral features. IEEE Transactions on Multimedia.
On feature combination for music classification. SSPR&SPR'10 Proceedings of the 2010 joint IAPR international conference on Structural, syntactic, and statistical pattern recognition.
Audio signal processing using time-frequency approaches: coding, classification, fingerprinting, and watermarking. EURASIP Journal on Advances in Signal Processing - Special issue on time-frequency analysis and its applications to multimedia signals.
Music genre classification based on MPEG-7 audio features. ICIMCS '10 Proceedings of the Second International Conference on Internet Multimedia Computing and Service.
Music classification via the bag-of-features approach. Pattern Recognition Letters.
First steps to an audio ontology-based classifier for telemedicine. ADMA'06 Proceedings of the Second international conference on Advanced Data Mining and Applications.
A self-similarity approach to repairing large dropouts of streamed music. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP).
Optimizing cepstral features for audio classification. IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence.
A real-time stream storage and analysis platform for underwater acoustic monitoring. IBM Journal of Research and Development.

Abstract

In this paper, we present an MPEG-7-based audio classification and retrieval technique targeted at the analysis of film material. The technique consists of low-level descriptors and high-level description schemes. For the low-level descriptors, low-dimensional features such as the audio spectrum projection, which is based on audio spectrum basis descriptors, are produced in order to find a balanced tradeoff between reducing dimensionality and retaining maximum information content. High-level description schemes are used to describe the modeling of the reduced-dimension features and the procedures for audio classification and retrieval. A classifier based on continuous hidden Markov models is applied. The state path of the sound model selected by maximum likelihood is stored in an MPEG-7 sound database and used as an index for query applications. Various experiments are presented in which speaker- and sound-recognition rates are compared for different feature extraction methods. In a speaker recognition system, independent component analysis gave better results than either the normalized audio spectrum envelope or principal component analysis. In audio classification experiments, sounds are classified into selected sound classes in real time with an accuracy of 96%.
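The classification step described in the abstract (score a sequence of reduced-dimension feature frames against one continuous hidden Markov model per sound class, pick the most likely model, and keep its state path as an index) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single diagonal-covariance Gaussian per state and uses the Viterbi path score as the likelihood. All names (`log_gauss`, `viterbi`, `classify`, `make_model`) and all toy parameters are hypothetical and chosen only for illustration.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def make_model(pi, a, means, variances):
    """Bundle HMM parameters, storing probabilities in the log domain."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")
    return {"log_pi": [log(p) for p in pi],
            "log_a": [[log(p) for p in row] for row in a],
            "means": means,
            "vars": variances}


def viterbi(frames, model):
    """Return (log score of the best state path, the path) for one HMM."""
    log_pi, log_a = model["log_pi"], model["log_a"]
    means, variances = model["means"], model["vars"]
    n = len(log_pi)
    delta = [log_pi[s] + log_gauss(frames[0], means[s], variances[s])
             for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for x in frames[1:]:
        prev, delta, ptr = delta, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log_a[r][s])
            ptr.append(best)
            delta.append(prev[best] + log_a[best][s]
                         + log_gauss(x, means[s], variances[s]))
        back.append(ptr)
    state = max(range(n), key=lambda s: delta[s])
    score, path = delta[state], [state]
    for ptr in reversed(back):  # trace the best path backwards
        state = ptr[state]
        path.append(state)
    path.reverse()
    return score, path


def classify(frames, models):
    """Pick the class whose HMM scores highest; return its label and state path."""
    scored = {name: viterbi(frames, m) for name, m in models.items()}
    best = max(scored, key=lambda name: scored[name][0])
    return best, scored[best][1]


# Toy two-state models over 2-D features (illustrative values, not from the paper).
pi = [0.5, 0.5]
a = [[0.8, 0.2], [0.2, 0.8]]
var = [[0.25, 0.25], [0.25, 0.25]]
models = {
    "speech": make_model(pi, a, [[0.0, 0.0], [1.0, 1.0]], var),
    "music": make_model(pi, a, [[5.0, 5.0], [6.0, 6.0]], var),
}
frames = [[0.1, -0.1], [0.2, 0.0], [0.9, 1.1], [1.0, 1.0]]
label, path = classify(frames, models)
print(label, path)
```

In a retrieval setting along the lines the abstract describes, the returned state path would be what gets stored as the index for a sound, so two sounds could be compared through their paths rather than their raw features.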