Hierarchical classification of audio data for archiving and retrieving

Authors:
Tong Zhang;C.-C. J. Kuo
Affiliations:
Integrated Media Syst. Center, Univ. of Southern California, Los Angeles, CA, USA
Venue:
ICASSP '99: Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 06
Year:
1999

Citing 0
Cited 28

A singer identification technique for content-based classification of MP3 music objects

Proceedings of the eleventh international conference on Information and knowledge management
Audio classification in speech and music: a comparison between a statistical and a neural approach

EURASIP Journal on Applied Signal Processing - Image analysis for multimedia interactive services - part I
Speech/music segmentation using entropy and dynamism features in a HMM classification framework

Speech Communication
Search audio data with the wavelet pyramidal algorithm

Information Processing Letters - Devoted to the rapid publication of short contributions to information processing
Multimodal Video Indexing: A Review of the State-of-the-art

Multimedia Tools and Applications
Verifier-tuple for audio-forensic to determine speaker environment

MM&Sec '05 Proceedings of the 7th workshop on Multimedia and security
Inferring similarity between music objects with application to playlist generation

Proceedings of the 7th ACM SIGMM international workshop on Multimedia information retrieval
Online audio background determination for complex audio environments

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
Audio classification in speech and music: a comparison between a statistical and a neural approach

EURASIP Journal on Applied Signal Processing
Perceptual audio hashing functions

EURASIP Journal on Applied Signal Processing
Sound classification in hearing aids inspired by auditory scene analysis

EURASIP Journal on Applied Signal Processing
A concurrency and an access control system for a web based multimedia distance education environment

MMACTE'05 Proceedings of the 7th WSEAS International Conference on Mathematical Methods and Computational Techniques In Electrical Engineering
Dynamic privacy assessment in a smart house environment using multimodal sensing

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
A decision-tree-based algorithm for speech/music classification and segmentation

EURASIP Journal on Audio, Speech, and Music Processing
A combination of data mining method with decision trees building for Speech/Music discrimination

Computer Speech and Language
Content-based scene segmentation scheme for efficient multimedia information retrieval

International Journal of Wireless and Mobile Computing
A combination of data mining method with context-based state transfer for speech/music discrimination

WiCOM'09 Proceedings of the 5th International Conference on Wireless communications, networking and mobile computing
Indexing and retrieval scheme for content-based multimedia applications

TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
SVM-based audio classification for content-based multimedia retrieval

MCAM'07 Proceedings of the 2007 international conference on Multimedia content analysis and mining
Audio data model for multi-criteria query formulation and retrieval

Proceedings of the 7th International Conference on Advances in Mobile Computing and Multimedia
A hooking method running on MHAP environment

ACE'10 Proceedings of the 9th WSEAS international conference on Applications of computer engineering
Digital video scenes identification using audiovisual features

WebMedia '09 Proceedings of the XV Brazilian Symposium on Multimedia and the Web
An access control mechanism based on situation-aware ubiquitous computing for seamless multimedia view sharing

ICIC'06 Proceedings of the 2006 international conference on Intelligent computing: Part II
Identifying the classical music composition of an unknown performance with wavelet dispersion vector and neural nets

Information Sciences: an International Journal
Comparison of methods for language-dependent and language-independent query-by-example spoken term detection

ACM Transactions on Information Systems (TOIS)
A reliable qos model for festival constraint running on MHAP in festival site

FGIT'12 Proceedings of the 4th international conference on Future Generation Information Technology
Dictionary learning based sparse coefficients for audio classification with max and average pooling

Digital Signal Processing
Introducing the use of depth data for fall detection

Personal and Ubiquitous Computing

Quantified Score

Hi-index	0.00

Abstract

This paper presents a hierarchical system for audio classification and retrieval based on audio content analysis. The system consists of three stages. The first stage, coarse-level audio classification and segmentation, classifies and segments audio recordings into speech, music, several types of environmental sounds, and silence, based on morphological and statistical analysis of the temporal curves of short-time features of the audio signal. In the second stage, environmental sounds are further classified into finer classes such as applause, rain, and bird sounds. This fine-level classification is based on time-frequency analysis of the audio signal and uses a hidden Markov model (HMM) as the classifier. The third stage implements query-by-example audio retrieval, in which sounds similar to an input audio sample are retrieved. The proposed system achieves an accuracy higher than 90% for coarse-level audio classification. Examples of fine-level classification and audio retrieval are also provided.
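
The first stage works on temporal curves of short-time features. The abstract does not say which features are used, so the sketch below assumes two common ones, short-time energy and zero-crossing rate, and computes their per-frame curves for a synthetic signal. The function names, the frame length, the hop size, and the silence threshold are all illustrative choices, not values from the paper, and the threshold rule is only a stand-in for the morphological and statistical analysis the authors describe.

```python
import math

def frames(signal, frame_len, hop):
    """Split a sequence of samples into overlapping fixed-length frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def feature_curves(signal, frame_len=256, hop=128):
    """Return the energy curve and the zero-crossing-rate curve."""
    fs = frames(signal, frame_len, hop)
    return ([short_time_energy(f) for f in fs],
            [zero_crossing_rate(f) for f in fs])

def label_silence(energy_curve, threshold=1e-4):
    """Hypothetical rule: frames with energy below the threshold are silence."""
    return ["silence" if e < threshold else "non-silence"
            for e in energy_curve]

# Synthetic test signal (an assumption for illustration):
# 1024 samples of silence followed by 1024 samples of a 440 Hz tone at 8 kHz.
sample_rate = 8000
silence = [0.0] * 1024
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(1024)]

energy, zcr = feature_curves(silence + tone)
labels = label_silence(energy)
```

A real coarse-level classifier would analyze the shape and statistics of such curves over longer windows to separate speech, music, and environmental sounds; the single threshold here only separates silence from everything else.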