Audio classification is an important task for applications such as automatic keyword spotting and content-based audio-visual query systems. In this paper, we describe a fast and accurate audio classification method that operates in the MPEG coded data domain. First, silent segments are detected using an approach that is robust to different recording conditions. The non-silent segments are then classified into three types (music, speech, and applause) using the temporal density, bandwidth, and center frequency of the subband energy. To remain robust across as wide a variety of audio sources as possible, we use a Bayes discriminant function for multivariate Gaussian distributions instead of manually adjusting a threshold for each discriminator. In the experiment, each one-second segment of MPEG audio data is classified, and about 90% of audio and speech segments are successfully detected. The detection requires less than 20% of the processing power needed for MPEG audio decoding.
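The Bayes discriminant step can be illustrated with a minimal sketch. It is not the paper's implementation: the feature values in `TRAINING` are invented for illustration, the function names are hypothetical, and the priors are simply taken from the class proportions. Each class is modelled as a multivariate Gaussian over three features (temporal density, bandwidth, center frequency), and a segment is assigned to the class with the largest discriminant value g(x) = -1/2 (x - mu)^T Sigma^-1 (x - mu) - 1/2 ln|Sigma| + ln P(class).

```python
import math

def mean_and_cov(samples):
    """Sample mean vector and (unbiased) covariance matrix of feature vectors."""
    n, d = len(samples), len(samples[0])
    mu = [sum(x[j] for x in samples) / n for j in range(d)]
    cov = [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov

def cholesky(a):
    """Lower-triangular L with L * L^T = a (a must be symmetric positive definite)."""
    d = len(a)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def discriminant(x, mu, L, prior):
    """Gaussian Bayes discriminant g(x) for one class, given the Cholesky factor L."""
    d = len(x)
    diff = [x[i] - mu[i] for i in range(d)]
    # Forward substitution solves L z = diff; z.z is the squared Mahalanobis distance.
    z = []
    for i in range(d):
        z.append((diff[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    maha = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(d))
    return -0.5 * maha - 0.5 * log_det + math.log(prior)

def train(labelled):
    """labelled maps a class name to its list of training feature vectors."""
    total = sum(len(v) for v in labelled.values())
    model = {}
    for name, samples in labelled.items():
        mu, cov = mean_and_cov(samples)
        model[name] = (mu, cholesky(cov), len(samples) / total)
    return model

def classify(x, model):
    """Return the class whose discriminant value is largest for x."""
    return max(model, key=lambda name: discriminant(x, *model[name]))

# Hypothetical per-second feature vectors:
# [temporal density, bandwidth, center frequency] in arbitrary units.
TRAINING = {
    "music": [[0.90, 12.0, 6.0], [0.85, 11.0, 5.5], [0.95, 13.0, 6.2],
              [0.88, 12.5, 5.4], [0.92, 11.5, 6.6]],
    "speech": [[0.50, 4.0, 1.5], [0.45, 3.5, 1.2], [0.55, 4.5, 1.9],
               [0.48, 4.4, 1.3], [0.52, 3.6, 1.6]],
    "applause": [[0.98, 20.0, 10.0], [0.96, 19.0, 9.5], [0.99, 21.0, 10.6],
                 [0.97, 20.6, 9.6], [1.00, 19.4, 10.3]],
}

model = train(TRAINING)
```

Calling `classify(features, model)` on the feature vector of a one-second segment returns one of the three labels. Because the decision boundaries follow from the estimated means and covariances, no per-feature threshold has to be tuned by hand, which is the motivation the abstract gives for this choice.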