Audio Feature Extraction and Analysis for Scene Segmentation and Classification

Authors:
Zhu Liu; Yao Wang; Tsuhan Chen
Affiliations:
Polytechnic University, Brooklyn, NY 11201; Polytechnic University, Brooklyn, NY 11201; Carnegie Mellon University, Pittsburgh, PA 15213
Venue:
Journal of VLSI Signal Processing Systems - special issue on multimedia signal processing
Year:
1998


Abstract

Understanding the scene content of a video sequence is very important for content-based indexing and retrieval of multimedia databases. Research in this area in the past several years has focused on the use of speech recognition and image analysis techniques. As a complementary effort to the prior work, we have focused on using the associated audio information (mainly the nonspeech portion) for video scene analysis. As an example, we consider the problem of discriminating five types of TV programs, namely commercials, basketball games, football games, news reports, and weather forecasts. A set of low-level audio features is proposed for characterizing the semantic content of short audio clips. The linear separability of different classes under the proposed feature space is examined using a clustering analysis. The effective features are identified by evaluating the intracluster and intercluster scattering matrices of the feature space. Using these features, a neural net classifier was successful in separating the above five types of TV programs. By evaluating the changes between the feature vectors of adjacent clips, we can also identify scene breaks in an audio sequence quite accurately. These results demonstrate the capability of the proposed audio features for characterizing the semantic content of an audio sequence.
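
The abstract names two analysis steps without giving their exact form: scoring features with intracluster and intercluster scatter matrices, and detecting scene breaks from changes between adjacent clip feature vectors. The sketch below illustrates both under stated assumptions. The separability score trace(Sw^-1 Sb) is one standard textbook criterion, and the Euclidean distance with a fixed threshold is a simple stand-in for the change measure; neither is confirmed as the authors' choice. The function names and the threshold parameter are hypothetical.

```python
import numpy as np


def scatter_separability(X, y):
    """Score class separability as trace(Sw^-1 Sb).

    X: (n_clips, n_features) feature vectors; y: class label per clip.
    ASSUMPTION: this criterion is a common textbook choice, not
    necessarily the one used in the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # intracluster (within-class) scatter
    Sb = np.zeros((d, d))  # intercluster (between-class) scatter
    for label in np.unique(y):
        Xc = X[y == label]
        prior = len(Xc) / len(X)
        class_mean = Xc.mean(axis=0)
        centered = Xc - class_mean
        # Prior-weighted covariance of this class around its own mean.
        Sw += prior * (centered.T @ centered) / len(Xc)
        # Prior-weighted outer product of the class-mean offset.
        diff = (class_mean - overall_mean)[:, None]
        Sb += prior * (diff @ diff.T)
    # pinv keeps the score defined even if Sw is singular.
    return float(np.trace(np.linalg.pinv(Sw) @ Sb))


def find_scene_breaks(features, threshold):
    """Return indices of clips whose feature vector differs from the
    previous clip's by more than `threshold` (Euclidean distance).

    ASSUMPTION: distance measure and fixed threshold are illustrative.
    """
    features = np.asarray(features, dtype=float)
    dists = np.linalg.norm(np.diff(features, axis=0), axis=1)
    # dists[i] compares clip i with clip i + 1, so shift indices by one.
    return (np.flatnonzero(dists > threshold) + 1).tolist()
```

A larger score from scatter_separability suggests that the chosen features spread the program classes apart relative to their internal variation, so candidate feature subsets can be compared by this value. In find_scene_breaks, each returned index marks the first clip after a suspected break.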