This paper presents an online Bayesian formulation for detecting and describing the most significant key-frames and shot boundaries of a video sequence. Visual information is encoded with a reduced number of degrees of freedom to provide robustness to noise, gradual transitions, flashes, camera motion and illumination changes. In the online algorithm, images are classified according to their appearance content (pixel values plus shape information) to obtain a structured representation from sequential information. This representation is laid out on a grid whose nodes correspond to the location of the representative image of each cluster. Because the estimation process simultaneously takes into account the clustering and the nodes' locations in the representation space, key-frames are placed according to their visual similarity to their neighbors. The result is not only a powerful tool for video navigation but also an organization suitable for subsequent higher-level analysis, such as identifying news items, interviews, etc.
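The idea of assigning frames online to clusters whose representatives sit on a grid can be illustrated with a small sketch. It is not the Bayesian formulation described above: it swaps in a plain self-organizing-map style update as a stand-in. The names `OnlineGrid` and `segment`, the evenly spaced initial prototypes, the parameters `lr` and `sigma`, and the use of short numeric lists as frame descriptors are all assumptions made for illustration.

```python
import math


class OnlineGrid:
    """Toy online grid of cluster prototypes (SOM-style stand-in, not the paper's model)."""

    def __init__(self, rows, cols, dim, lr=0.5, sigma=0.5):
        self.lr, self.sigma = lr, sigma
        n = rows * cols
        # Assumption: prototypes start as evenly spaced constant vectors in [0, 1].
        self.nodes = {
            (r, c): [(r * cols + c) / max(n - 1, 1)] * dim
            for r in range(rows)
            for c in range(cols)
        }

    def best_node(self, x):
        """Return the grid coordinate whose prototype is closest to x."""
        return min(
            self.nodes,
            key=lambda k: sum((a - b) ** 2 for a, b in zip(self.nodes[k], x)),
        )

    def update(self, x):
        """Assign x to its closest node, then pull that node and its grid neighbours toward x."""
        win = self.best_node(x)
        for (r, c), w in self.nodes.items():
            d2 = (r - win[0]) ** 2 + (c - win[1]) ** 2
            # The pull strength decays with squared distance on the grid.
            h = self.lr * math.exp(-d2 / (2 * self.sigma ** 2))
            for i in range(len(w)):
                w[i] += h * (x[i] - w[i])
        return win


def segment(frames, grid):
    """Process frames in order; report a boundary whenever the assigned node changes."""
    labels, boundaries = [], []
    for t, x in enumerate(frames):
        node = grid.update(x)
        if labels and node != labels[-1]:
            boundaries.append(t)
        labels.append(node)
    return labels, boundaries


# Two synthetic "shots": ten dark frames followed by ten bright frames.
frames = [[0.1, 0.1, 0.1]] * 10 + [[0.9, 0.9, 0.9]] * 10
labels, boundaries = segment(frames, OnlineGrid(2, 2, 3))
print(boundaries)
```

With this 2x2 grid and these settings, the only boundary reported is at frame 10, where the synthetic sequence switches from dark to bright. Because neighbouring nodes are pulled together during each update, visually similar clusters tend to end up near each other on the grid, which is the property the abstract relies on for navigation.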