Sound analysis using MPEG compressed audio

Authors:
G. Tzanetakis;P. Cook
Affiliations:
Dept. of Comput. Sci., Princeton Univ., NJ, USA;-
Venue:
ICASSP '00 Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 02
Year:
2000

Citing 0
Cited 11

Pause concepts for audio segmentation at different semantic levels
MULTIMEDIA '01 Proceedings of the ninth ACM international conference on Multimedia

Music analysis and retrieval systems for audio signals
Journal of the American Society for Information Science and Technology - Music information retrieval

A Survey of MPEG-1 Audio, Video and Semantic Analysis Techniques
Multimedia Tools and Applications

An initial usability assessment for symbolic haptic rendering of music parameters
ICMI '05 Proceedings of the 7th international conference on Multimodal interfaces

Complexity-scalable beat detection with mp3 audio bitstreams
Computer Music Journal

Audio signal representations for indexing in the transform domain
IEEE Transactions on Audio, Speech, and Language Processing

Robust audio identification for MP3 popular music
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

How sparsely can a signal be approximated while keeping its class identity?
Proceedings of 3rd international workshop on Machine learning and music

A two level strategy for audio segmentation
Digital Signal Processing

Shot classification and scene segmentation based on MPEG compressed movie analysis
PCM'04 Proceedings of the 5th Pacific Rim conference on Advances in Multimedia Information Processing - Volume Part I

Towards extracting emotions from music
IMTCI'04 Proceedings of the Second international conference on Intelligent Media Technology for Communicative Intelligence

Abstract

A huge amount of available audio data is compressed using the MPEG audio compression standard. Sound analysis is based on the computation of short-time feature vectors that describe the instantaneous spectral content of the sound. An interesting possibility is to calculate these features directly from the compressed data. Since the bulk of the feature calculation is performed during the encoding stage, this approach has a significant performance advantage when the available data is already compressed. Combining decoding and analysis in one stage is also very important for audio streaming applications. In this paper, we describe the calculation of features directly from MPEG audio compressed data. Two of the basic processes of sound analysis are segmentation and classification. To illustrate the effectiveness of the calculated features, we have implemented two case studies: a general audio segmentation algorithm and a music/speech classifier. Experimental data is provided to show that the results obtained are comparable with those of sound analysis algorithms working directly with audio samples.
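
The abstract does not say which features are computed, so the following is only an illustrative sketch of what short-time features taken from compressed-domain data might look like. It assumes a decoder has already exposed, for each analysis frame, one magnitude per subband of the 32-band MPEG audio filterbank. The function names (`subband_features`, `spectral_flux`), the particular features (centroid, rolloff, RMS, flux), and the 0.85 rolloff fraction are assumptions made for illustration, not details taken from the paper.

```python
import math

NUM_SUBBANDS = 32  # MPEG-1 audio uses a 32-band polyphase filterbank


def subband_features(frame, rolloff_fraction=0.85):
    """Return (centroid, rolloff, rms) for one frame of subband magnitudes.

    `frame` is a hypothetical per-frame vector of 32 subband values;
    how it is pulled out of the bitstream is outside this sketch.
    """
    if len(frame) != NUM_SUBBANDS:
        raise ValueError("expected %d subband values" % NUM_SUBBANDS)
    mags = [abs(x) for x in frame]
    total = sum(mags)
    if total == 0.0:
        return 0.0, 0, 0.0  # silent frame

    # Centroid: magnitude-weighted mean subband index (1-based).
    centroid = sum((k + 1) * m for k, m in enumerate(mags)) / total

    # Rolloff: lowest subband index (1-based) at which the cumulative
    # magnitude reaches `rolloff_fraction` of the total.
    threshold = rolloff_fraction * total
    running = 0.0
    rolloff = NUM_SUBBANDS
    for k, m in enumerate(mags):
        running += m
        if running >= threshold:
            rolloff = k + 1
            break

    # RMS of the subband magnitudes, a rough loudness measure.
    rms = math.sqrt(sum(m * m for m in mags) / NUM_SUBBANDS)
    return centroid, rolloff, rms


def spectral_flux(prev_frame, frame):
    """Squared difference between the normalized magnitudes of two frames."""
    def normalize(values):
        mags = [abs(x) for x in values]
        total = sum(mags)
        return [m / total for m in mags] if total else mags

    a, b = normalize(prev_frame), normalize(frame)
    return sum((y - x) ** 2 for x, y in zip(a, b))
```

In a sketch like this, a segmentation step could look for abrupt changes in the feature trajectory (for example peaks in `spectral_flux`), and a music/speech classifier could be trained on statistics of the per-frame features. Both uses are again assumptions about how such features would typically be applied.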