Film segmentation and indexing using autoassociative neural networks

Authors:
K. Sreenivasa Rao;Dipanjan Nandi;Shashidhar G. Koolagudi
Affiliations:
School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur, India 721302;School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur, India 721302;Department of Computer Science and Engineering, National Institute of Technology Karnataka, Surathkal, India 575025
Venue:
International Journal of Speech Technology
Year:
2014

Abstract

In this paper, autoassociative neural network (AANN) models are explored for segmenting and indexing films (movies) using audio features. A two-stage method is proposed for segmenting a film into a sequence of scenes and then indexing them. In the first stage, the music segments and the speech-plus-music segments of the film are separated, and music segments are labelled as title or fighting scenes based on their position. In the second stage, speech-plus-music segments are classified into normal, emotional, comedy and song scenes. Mel-frequency cepstral coefficients (MFCCs), zero-crossing rate and intensity are used as the audio features for segmentation and indexing. The proposed method is evaluated on manually segmented Hindi films. The evaluation results show that title, fighting and song scenes are segmented and indexed without any errors, and that most errors occur in discriminating comedy scenes from normal scenes. The performance of the proposed AANN models is also compared with that of hidden Markov models, Gaussian mixture models and support vector machines.
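
To make the pipeline in the abstract concrete, the following is a minimal Python sketch of two of the named audio features (zero-crossing rate and intensity) and of the two-stage labelling flow. It is an illustration under stated assumptions, not the authors' implementation. The function names, the framing parameters, the use of mean squared amplitude as the intensity measure, and in particular the position rule (a music-only segment at index 0 is the title, any later one is a fighting scene) are all hypothetical, since the abstract does not specify them. MFCC extraction and the AANN models themselves are omitted.

```python
import numpy as np

SPEECH_PLUS_MUSIC_LABELS = {"normal", "emotional", "comedy", "song"}

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, dropping any trailing partial frame."""
    n_frames = max(0, 1 + (len(x) - frame_len) // hop)
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def intensity(frames):
    """Mean squared amplitude per frame (an assumed, simple proxy for intensity)."""
    return np.mean(frames ** 2, axis=1)

def index_segment(is_music_only, segment_index, stage2_label=None):
    """Two-stage labelling flow.

    Stage 1: music-only segments are labelled by position.
             ASSUMPTION: index 0 is the title, later ones are fighting scenes.
    Stage 2: speech-plus-music segments take the label produced by a
             separate classifier (passed in here as stage2_label).
    """
    if is_music_only:
        return "title" if segment_index == 0 else "fighting"
    if stage2_label not in SPEECH_PLUS_MUSIC_LABELS:
        raise ValueError("stage2_label must be one of %s" % sorted(SPEECH_PLUS_MUSIC_LABELS))
    return stage2_label

# Toy signal that flips sign at every sample.
x = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
frames = frame_signal(x, frame_len=4, hop=2)
zcr = zero_crossing_rate(frames)
energy = intensity(frames)
```

In a full system the per-frame features would be stacked with MFCCs and passed to one model per scene class, and the stage-2 label would come from whichever model scores the segment best.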