Over the last several years, major efforts have been made to develop methods for extracting information from audiovisual media so that they can be stored in and retrieved from databases automatically, based on their content. This work deals with the characterization of an audio signal, which may be part of a larger audiovisual system or may stand alone, as in the case of an audio recording stored digitally on disk. Our goal was to develop a system that first segments the audio signal and then classifies each segment into one of two main categories: speech or music. The system is required to be fast and to operate in a real-time environment with a short response delay. Because classification is restricted to two classes, few characteristics need to be extracted, and the required computations are straightforward. Experimental results show that the system is highly efficient computationally, without sacrificing classification performance. Segmentation is based on the distribution of the mean signal amplitude, whereas classification uses an additional frequency-related characteristic. The classification algorithm may be used either in conjunction with the segmentation algorithm, in which case it verifies or refutes a music-speech or speech-music change, or autonomously, on given audio segments. The basic characteristics are computed in 20 ms intervals, so segment limits are specified to within 20 ms. The smallest segment length is one second. The segmentation and classification algorithms were benchmarked on a large data set, with correct segmentation about 97% of the time and correct classification about 95% of the time.
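The frame-level processing described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the exact amplitude statistic, the frequency-related characteristic, or any decision rule. Here the per-frame RMS amplitude stands in for the amplitude measure, the zero-crossing count stands in for the frequency-related characteristic, and `candidate_changes` with its `ratio` threshold is a hypothetical change detector that works only at one-second resolution, coarser than the 20 ms accuracy reported above. Only the 20 ms frame length and the one-second minimum segment length come from the text.

```python
import math

FRAME_MS = 20                          # feature interval stated in the abstract
FRAMES_PER_SECOND = 1000 // FRAME_MS   # 50 frames = smallest segment length (1 s)


def frame_features(samples, sample_rate):
    """Return one (rms, zero_crossings) pair per non-overlapping 20 ms frame.

    RMS and zero-crossing count are assumed stand-ins for the amplitude and
    frequency characteristics, which the abstract does not define.
    """
    frame_len = sample_rate * FRAME_MS // 1000
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        )
        feats.append((rms, crossings))
    return feats


def candidate_changes(feats, ratio=2.0):
    """Hypothetical detector: flag boundaries between adjacent one-second
    windows whose mean RMS differs by more than a factor of `ratio`.

    Returns boundary times in seconds. The rule and threshold are
    illustrative assumptions, not taken from the paper.
    """
    means = []
    for i in range(0, len(feats) - FRAMES_PER_SECOND + 1, FRAMES_PER_SECOND):
        window = feats[i:i + FRAMES_PER_SECOND]
        means.append(sum(r for r, _ in window) / FRAMES_PER_SECOND)
    changes = []
    for k in range(1, len(means)):
        lo, hi = sorted((means[k - 1], means[k]))
        if hi > ratio * max(lo, 1e-12):   # guard against silent windows
            changes.append(float(k))
    return changes
```

A call such as `candidate_changes(frame_features(samples, 8000))` would return the one-second boundaries at which the mean amplitude changes sharply. In the system described above, each such candidate would then be verified or refuted by the classifier.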